25. Allow next unit to refresh
In the same Juju event after (Kubernetes) the workload has been allowed to start or (machines) the snap has been refreshed, the charm code must attempt to:
1. Start the workload
2. Check if the application and the unit are healthy
3. If they are both healthy, set next_unit_allowed_to_refresh = True
If next_unit_allowed_to_refresh is not set to True (#3) because
- starting the workload (#1) failed,
- checking if the application and the unit were healthy (#2) failed,
- either the application or the unit was unhealthy in #2, or
- the charm code raised an uncaught exception later in the same Juju event,
then the charm code must retry #1-#3, as applicable, in every Juju event until next_unit_allowed_to_refresh is set to True and an uncaught exception is not raised by the charm code later in the same Juju event.
"Every Juju event" includes Juju events that the charm code may not currently observe.
If #2 fails or if either the application or the unit is unhealthy in #2, the charm code must set a unit status to indicate what is unhealthy.
The next_unit_allowed_to_refresh attribute can be read to determine if (any part of) #1-#3 must be retried.
It must only be read for that purpose.
next_unit_allowed_to_refresh can only be set to True.
When the unit is refreshed, next_unit_allowed_to_refresh will be automatically reset to False.
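The rules above can be sketched with a small stand-in object. This is only an illustration: FakeRefresh, simulate_unit_refreshed and handle_event are hypothetical names invented here, not part of the charm_refresh API, and raising ValueError on an attempt to set False is an assumption about how the restriction might be enforced.

```python
class FakeRefresh:
    """Hypothetical stand-in for the refresh object; not the charm_refresh API."""

    def __init__(self):
        self._next_unit_allowed_to_refresh = False

    @property
    def next_unit_allowed_to_refresh(self) -> bool:
        return self._next_unit_allowed_to_refresh

    @next_unit_allowed_to_refresh.setter
    def next_unit_allowed_to_refresh(self, value: bool) -> None:
        # Charm code may only ever set the flag to True (assumed to raise otherwise)
        if value is not True:
            raise ValueError("next_unit_allowed_to_refresh can only be set to True")
        self._next_unit_allowed_to_refresh = True

    def simulate_unit_refreshed(self) -> None:
        # Stands in for the automatic reset that happens when the unit is refreshed
        self._next_unit_allowed_to_refresh = False


def handle_event(refresh: FakeRefresh, healthy: bool) -> str:
    """Run steps #1-#3 once and return the status the charm would set."""
    if refresh.next_unit_allowed_to_refresh:
        # The flag is read only to decide whether #1-#3 still need to be retried
        return "active"
    # Step #1 (start the workload) is omitted in this sketch
    if not healthy:
        # Step #2 found a problem: report it and leave the flag unset
        return "blocked: unit unhealthy"
    # Step #3: both the application and the unit are healthy
    refresh.next_unit_allowed_to_refresh = True
    return "active"
```

Calling handle_event repeatedly with healthy=False leaves the flag at False, so the steps are retried on the next event; the first call with healthy=True sets it, and later calls skip the steps until simulate_unit_refreshed resets the flag.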
Kubernetes
The charm code must first execute #1-#3 in the first Juju event where workload_allowed_to_start is True.
    class PostgreSQLCharm(ops.CharmBase):
        def reconcile(self, event):
            if self.refresh.workload_allowed_to_start:
                ensure_workload_service_is_enabled()
                if not self.refresh.next_unit_allowed_to_refresh:
                    try:
                        ensure_application_and_unit_are_healthy()
                    except Unhealthy as exception:
                        self.unit.status = ops.BlockedStatus(exception.reason)
                    else:
                        self.refresh.next_unit_allowed_to_refresh = True
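The example above uses an Unhealthy exception with a reason attribute and a health-check helper without defining them. A minimal sketch of what they might look like follows. Both definitions are assumptions for illustration: the keyword parameters are placeholders (the helper in the example takes no arguments), and a real charm would query its workload instead.

```python
class Unhealthy(Exception):
    """Hypothetical exception carrying a human-readable reason for the unit status."""

    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


def ensure_application_and_unit_are_healthy(
    *, unit_is_healthy: bool, unhealthy_peers: list[str]
) -> None:
    """Raise Unhealthy if this unit or any other unit of the application is unhealthy.

    The parameters are placeholders for whatever checks the workload needs.
    """
    if not unit_is_healthy:
        raise Unhealthy("This unit is unhealthy")
    if unhealthy_peers:
        raise Unhealthy(f"Unhealthy units: {', '.join(sorted(unhealthy_peers))}")
```

Putting the message on the exception lets the event handler pass exception.reason straight to the unit status, which covers the requirement to indicate what is unhealthy.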
Machines
After the snap is successfully refreshed, refresh_snap will not be called again on the unit (until the next juju refresh [e.g. rollback]).
This is true even if the charm code raised an uncaught exception in the same Juju event where the snap was successfully refreshed. Also (unlike |
However, even if the snap was successfully refreshed, #1-#3 (on this page) still must be retried until next_unit_allowed_to_refresh is set to True and an uncaught exception is not raised by the charm code later in the same Juju event.
There are two common approaches to accomplish this:
- For charm code with an event handler that is executed for every Juju event, add #1-#3 to that event handler.

  Example

      class PostgreSQLCharm(ops.CharmBase):
          def reconcile(self, event):  # (1)
              ensure_workload_service_is_enabled()
              try:
                  ensure_application_and_unit_are_healthy()
              except Unhealthy as exception:
                  self.unit.status = ops.BlockedStatus(exception.reason)
              else:
                  self.refresh.next_unit_allowed_to_refresh = True

  (1) Event handler that is executed for every Juju event.

  During the Juju event that the snap is refreshed in, the event handler must be executed.
- Create a method (e.g. post_snap_refresh) that runs in refresh_snap and is retried as needed in the ops.CharmBase __init__ method.

  Example

      @dataclasses.dataclass(eq=False)
      class MachinesPostgreSQLRefresh(charm_refresh.CharmSpecificMachines):
          def refresh_snap(
              self,
              *,
              snap_name: str,
              snap_revision: str,
              refresh: charm_refresh.Machines,
          ) -> None:
              # [...]  (1)
              self._charm.post_snap_refresh(refresh)


      class PostgreSQLCharm(ops.CharmBase):
          def post_snap_refresh(self, refresh: charm_refresh.Machines):
              ensure_workload_service_is_enabled()
              try:
                  ensure_application_and_unit_are_healthy()
              except Unhealthy as exception:
                  self.unit.status = ops.BlockedStatus(exception.reason)
              else:
                  refresh.next_unit_allowed_to_refresh = True

          def __init__(self, *args):
              # [...]
              self.refresh = charm_refresh.Machines(
                  # [...]
              )
              # [...]
              if not self.refresh.next_unit_allowed_to_refresh:
                  if self.refresh.in_progress:
                      self.post_snap_refresh(self.refresh)
                  else:
                      self.refresh.next_unit_allowed_to_refresh = True

  (1) Implemented in 24. Implement refresh_snap
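The second approach works because ops creates a new charm instance for every Juju event, so logic in __init__ runs on each one. The sketch below simulates that with plain Python objects. FakeMachinesRefresh, FakeCharm and the stored dict are hypothetical stand-ins invented for this illustration; they do not reproduce how charm_refresh.Machines stores its state.

```python
class FakeMachinesRefresh:
    """Hypothetical stand-in for charm_refresh.Machines backed by a plain dict."""

    def __init__(self, stored: dict):
        # `stored` plays the role of state that survives between Juju events
        self._stored = stored

    @property
    def in_progress(self) -> bool:
        return self._stored["in_progress"]

    @property
    def next_unit_allowed_to_refresh(self) -> bool:
        return self._stored["next_unit_allowed_to_refresh"]

    @next_unit_allowed_to_refresh.setter
    def next_unit_allowed_to_refresh(self, value: bool) -> None:
        if value is not True:
            raise ValueError("next_unit_allowed_to_refresh can only be set to True")
        self._stored["next_unit_allowed_to_refresh"] = True


class FakeCharm:
    """One instance is created per simulated Juju event."""

    def __init__(self, stored: dict, workload_healthy: bool):
        self.status = "active"
        self._workload_healthy = workload_healthy
        self.refresh = FakeMachinesRefresh(stored)
        # Retry #1-#3 on every event until the flag has been set
        if not self.refresh.next_unit_allowed_to_refresh:
            if self.refresh.in_progress:
                self.post_snap_refresh(self.refresh)
            else:
                self.refresh.next_unit_allowed_to_refresh = True

    def post_snap_refresh(self, refresh: FakeMachinesRefresh) -> None:
        # Steps #1 and #2 are reduced to a boolean for the simulation
        if not self._workload_healthy:
            self.status = "blocked: workload unhealthy"
        else:
            refresh.next_unit_allowed_to_refresh = True
```

With stored = {"in_progress": True, "next_unit_allowed_to_refresh": False}, constructing FakeCharm(stored, workload_healthy=False) leaves the flag unset and reports a blocked status; constructing it again with workload_healthy=True, as a later event would, sets the flag.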