Commit 04531ba

Merge pull request #4521 from evankanderson/unconventional
Update Condition guidance
2 parents 7d534bd + 0a9d1eb commit 04531ba

1 file changed, +76 -23 lines changed

contributors/devel/sig-architecture/api-conventions.md

@@ -308,17 +308,77 @@ response reduces the complexity of these clients.

 ##### Typical status properties

-**Conditions** represent the latest available observations of an object's
-state. They are an extension mechanism intended to be used when the details of
-an observation are not a priori known or would not apply to all instances of a
-given Kind. For observations that are well known and apply to all instances, a
-regular field is preferred. An example of a Condition that probably should
-have been a regular field is Pod's "Ready" condition - it is managed by core
-controllers, it is well understood, and it applies to all Pods.
+**Conditions** provide a standard mechanism for higher-level status reporting
+from a controller. They are an extension mechanism which allows tools and other
+controllers to collect summary information about resources without needing to
+understand resource-specific status details. Conditions should complement more
+detailed information about the observed status of an object written by a
+controller, rather than replace it. For example, the "Available" condition of a
+Deployment can be determined by examining `readyReplicas`, `replicas`, and
+other properties of the Deployment. However, the "Available" condition allows
+other components to avoid duplicating the availability logic in the Deployment
+controller.

 Objects may report multiple conditions, and new types of conditions may be
 added in the future or by 3rd party controllers. Therefore, conditions are
-represented using a list/slice, where all have similar structure.
+represented using a list/slice of objects, where each condition has a similar
+structure. This collection should be treated as a map with a key of `type`.
+
+Conditions are most useful when they follow some consistent conventions:
+
+* Conditions should be added to explicitly convey properties that users and
+components care about rather than requiring those properties to be inferred
+from other observations. Once defined, the meaning of a Condition can not be
+changed arbitrarily - it becomes part of the API, and has the same backwards-
+and forwards-compatibility concerns of any other part of the API.
+
+* Controllers should apply their conditions to a resource the first time they
+visit the resource, even if the `status` is Unknown. This allows other
+components in the system to know that the condition exists and the controller
+is making progress on reconciling that resource.
+
+* Not all controllers will observe the previous advice about reporting
+"Unknown" or "False" values. For known conditions, the absence of a
+condition status should be interpreted the same as `Unknown`, and
+typically indicates that reconciliation has not yet finished (or that the
+resource state may not yet be observable).
+
+* For some conditions, `True` represents normal operation, and for some
+conditions, `False` represents normal operation. ("Normal-true" conditions
+are sometimes said to have "positive polarity", and "normal-false" conditions
+are said to have "negative polarity".) Without further knowledge of the
+conditions, it is not possible to compute a generic summary of the conditions
+on a resource.
+
+* Condition type names should make sense for humans; neither positive nor
+negative polarity can be recommended as a general rule. A negative condition
+like "MemoryExhausted" may be easier for humans to understand than
+"SufficientMemory". Conversely, "Ready" or "Succeeded" may be easier to
+understand than "Failed", because "Failed=Unknown" or "Failed=False" may
+cause double-negative confusion.
+
+* Condition type names should describe the current observed state of the
+resource, rather than describing the current state transitions. This
+typically means that the name should be an adjective ("Ready", "OutOfDisk")
+or a past-tense verb ("Succeeded", "Failed") rather than a present-tense verb
+("Deploying"). Intermediate states may be indicated by setting the status of
+the condition to `Unknown`.
+
+* For state transitions which take a long period of time (rule of thumb: > 1
+minute), it is reasonable to treat the transition itself as an observed
+state. In these cases, the Condition (such as "Resizing") itself should not
+be transient, and should instead be signalled using the
+`True`/`False`/`Unknown` pattern. This allows other observers to determine
+the last update from the controller, whether successful or failed. In cases
+where the state transition is unable to complete and continued
+reconciliation is not feasible, the Reason and Message should be used to
+indicate that the transition failed.
+
+* When designing Conditions for a resource, it's helpful to have a common
+top-level condition which summarizes more detailed conditions. Simple
+consumers may simply query the top-level condition. Although they are not a
+consistent standard, the `Ready` and `Succeeded` condition types may be used
+by API designers for long-running and bounded-execution objects, respectively.

 The `FooCondition` type for some resource type `Foo` may include a subset of the
 following fields, but must contain at least `type` and `status` fields:
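
Reviewer sketch (not part of the commit): the hunk above says the condition list "should be treated as a map with a key of `type`", that controllers should publish a condition on their first visit even when the status is `Unknown`, and that an absent condition reads as `Unknown`. The Go below is one minimal way those three rules could fit together. `FooCondition`, `SetCondition`, and `StatusOf` are hypothetical names chosen for illustration, and the reason strings are made up; none of this is the Kubernetes library API.

```go
package main

import "fmt"

// ConditionStatus mirrors the three status values named in the guidance.
type ConditionStatus string

const (
	ConditionTrue    ConditionStatus = "True"
	ConditionFalse   ConditionStatus = "False"
	ConditionUnknown ConditionStatus = "Unknown"
)

// FooCondition is a hypothetical condition type carrying the two required
// fields (Type, Status) plus the optional Reason and Message.
type FooCondition struct {
	Type    string
	Status  ConditionStatus
	Reason  string
	Message string
}

// SetCondition treats the slice as a map keyed by Type: it replaces an
// existing entry with the same Type, or appends a new one.
func SetCondition(conds []FooCondition, c FooCondition) []FooCondition {
	for i := range conds {
		if conds[i].Type == c.Type {
			conds[i] = c
			return conds
		}
	}
	return append(conds, c)
}

// StatusOf returns the status for a condition type, reading an absent
// condition the same as Unknown.
func StatusOf(conds []FooCondition, condType string) ConditionStatus {
	for _, c := range conds {
		if c.Type == condType {
			return c.Status
		}
	}
	return ConditionUnknown
}

func main() {
	var conds []FooCondition
	// First visit: publish the condition even though nothing is known yet.
	conds = SetCondition(conds, FooCondition{Type: "Ready", Status: ConditionUnknown, Reason: "Reconciling"})
	// Later visit: the same Type is updated in place, not appended again.
	conds = SetCondition(conds, FooCondition{Type: "Ready", Status: ConditionTrue, Reason: "AllChecksPassed"})

	// "Resizing" was never set, so it is reported as Unknown.
	fmt.Println(len(conds), StatusOf(conds, "Ready"), StatusOf(conds, "Resizing"))
	// prints: 1 True Unknown
}
```

A linear scan keeps the wire format a plain list while still giving one entry per `type`; for the handful of conditions a resource usually carries, that is likely simpler than maintaining a separate index.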
@@ -347,20 +407,13 @@ Use of the `Reason` field is encouraged.
 Use the `LastHeartbeatTime` with great caution - frequent changes to this field
 can cause a large fan-out effect for some resources.

-Conditions should be added to explicitly convey properties that users and
-components care about rather than requiring those properties to be inferred from
-other observations. Once defined, the meaning of a Condition can not be
-changed arbitrarily - it becomes part of the API, and has the same backwards-
-and forwards-compatibility concerns of any other part of the API.
+Condition types should be named in PascalCase. Short condition names are
+preferred (e.g. "Ready" over "MyResourceReady").

 Condition status values may be `True`, `False`, or `Unknown`. The absence of a
 condition should be interpreted the same as `Unknown`. How controllers handle
 `Unknown` depends on the Condition in question.

-Condition types should indicate state in the "abnormal-true" polarity. For
-example, if the condition indicates when a policy is invalid, the "is valid"
-case is probably the norm, so the condition should be called "Invalid".
-
 The thinking around conditions has evolved over time, so there are several
 non-normative examples in wide use.

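
Reviewer sketch (not part of the commit): with the "abnormal-true" rule removed in the hunk above, a consumer can no longer infer from the name alone whether `True` or `False` is the healthy value for a condition. One possible consequence, sketched in Go below, is that a generic tool needs its own per-type polarity table and has to skip types it does not recognize. The `normalStatus` table, the `Abnormal` function, and the "VendorSpecific" type are assumptions made for illustration only.

```go
package main

import "fmt"

// Condition holds the two required fields of a condition.
type Condition struct {
	Type   string
	Status string // "True", "False", or "Unknown"
}

// normalStatus is a hypothetical, consumer-maintained table recording which
// status means "normal operation" for each condition type it understands.
var normalStatus = map[string]string{
	"Ready":           "True",  // positive polarity
	"MemoryExhausted": "False", // negative polarity
}

// Abnormal returns the types of known conditions whose status is definitely
// not the normal one. Unknown statuses and unrecognized types are skipped,
// since neither can be classified without further knowledge.
func Abnormal(conds []Condition) []string {
	var out []string
	for _, c := range conds {
		normal, known := normalStatus[c.Type]
		if !known || c.Status == "Unknown" {
			continue
		}
		if c.Status != normal {
			out = append(out, c.Type)
		}
	}
	return out
}

func main() {
	conds := []Condition{
		{Type: "Ready", Status: "True"},
		{Type: "MemoryExhausted", Status: "True"},
		{Type: "VendorSpecific", Status: "False"},
	}
	fmt.Println(Abnormal(conds))
	// prints: [MemoryExhausted]
}
```

Whether `Unknown` should count as abnormal is itself condition-specific, as the context lines above note; this sketch simply leaves it unclassified.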
@@ -371,12 +424,12 @@ we define comprehensive state machines for objects, nor behaviors associated
 with state transitions. The system is level-based rather than edge-triggered,
 and should assume an Open World.

-An example of an oscillating condition type is `Ready` (despite it running
-afoul of current guidance), which indicates the object was believed to be fully
-operational at the time it was last probed. A possible monotonic condition
-could be `Failed`. A `True` status for `Failed` would imply failure with no
-retry. An object that was still active would generally not have a `Failed`
-condition.
+An example of an oscillating condition type is `Ready`, which indicates the
+object was believed to be fully operational at the time it was last probed. A
+possible monotonic condition could be `Succeeded`. A `True` status for
+`Succeeded` would imply completion and that the resource was no longer
+active. An object that was still active would generally have a `Succeeded`
+condition with status `Unknown`.

 Some resources in the v1 API contain fields called **`phase`**, and associated
 `message`, `reason`, and other status fields. The pattern of using `phase` is
