WIP: controllers: "Unsupported", etc. condition messaging #61
wking wants to merge 2 commits into openshift:master
The test-cases have names. Using a sub-test will expose those names in the output for folks who are hunting down failures.
"Operator is non-functional" sounds a bit frightening. This commit
pivots to say "Operator has no role on platform {platform-name}" in
the cases where we determine we are not needed.
I'm also removing ReasonEmpty, because it's always nice to say
*something* about why we picked the state we picked. In the
ReasonInvalidConfiguration and ReasonDeployTimedOut cases, that makes
me wonder if we really intend to be Available then. Maybe we should
just echo whatever we're setting for Degraded in that case? FIXME
until we decide.
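A minimal sketch of the two ideas in this commit message: the friendlier "no role on platform" wording, and never leaving a condition reason empty. The `Reason` type, the string values of the constants, and the helper names `unsupportedMessage` and `reasonOrDefault` are assumptions made for illustration, not the operator's actual code.

```go
package main

import "fmt"

// Reason is a hypothetical stand-in for the operator's status-reason type.
type Reason string

const (
	// The string values below are assumptions, not the operator's real ones.
	ReasonAsExpected  Reason = "AsExpected"
	ReasonUnsupported Reason = "UnsupportedPlatform"
)

// unsupportedMessage builds the wording proposed in the commit message
// for platforms where the operator is not needed.
func unsupportedMessage(platform string) string {
	return fmt.Sprintf("Operator has no role on platform %s", platform)
}

// reasonOrDefault replaces an empty reason with ReasonAsExpected so that
// every condition says something about why its state was chosen.
func reasonOrDefault(r Reason) Reason {
	if r == "" {
		return ReasonAsExpected
	}
	return r
}

func main() {
	fmt.Println(unsupportedMessage("AWS"))
	fmt.Println(reasonOrDefault(""))
	fmt.Println(reasonOrDefault(ReasonUnsupported))
}
```

Falling back to a named reason rather than an empty string keeps the condition self-describing for anyone reading the ClusterOperator status.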
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: wking.
  case ReasonInvalidConfiguration, ReasonDeployTimedOut:
  	v1helpers.SetStatusCondition(&conds, setStatusCondition(osconfigv1.OperatorDegraded, osconfigv1.ConditionTrue, string(newReason), msg))
- 	v1helpers.SetStatusCondition(&conds, setStatusCondition(osconfigv1.OperatorAvailable, osconfigv1.ConditionTrue, string(ReasonEmpty), ""))
+ 	v1helpers.SetStatusCondition(&conds, setStatusCondition(osconfigv1.OperatorAvailable, osconfigv1.ConditionTrue, string(ReasonAsExpected), "FIXME: are we really available here?"))
I'm fine with Degraded=True here. I'm not clear on Available=True. What functionality is the operator available to fulfil when it has ReasonInvalidConfiguration or ReasonDeployTimedOut?
We looked into several options, detailed here: https://github.com/openshift/enhancements/blob/master/enhancements/baremetal/an-slo-for-baremetal.md#not-in-use-slo-behaviors
There seems to be no defined behavior for when a CO needs to convey that the Operator is running but non-functional, and that this is expected.
Neither ReasonInvalidConfiguration nor ReasonDeployTimedOut is "expected non-functional", is it? That describes ReasonUnsupported, which I also touch in this PR, but it is not what this thread/line is about.
Agreed. I was giving context around your question of whether Available=True is required.
Available is definitely required. In master, ReasonInvalidConfiguration and ReasonDeployTimedOut are Available=True Degraded=False. My naive reaction to the reason strings has me suspecting they might actually call for Available=False Degraded=False. Can you list any services that are available (but degraded) under these conditions?
> I'm fine with Degraded=True here. I'm not clear on Available=True. What functionality is the operator available to fulfil when it has ReasonInvalidConfiguration or ReasonDeployTimedOut?
The Operator is Available because it is still watching its resources. If the configuration changes to a valid value, for example, the operator can get itself out of the Degraded state. That is how I view Available=True in this scenario.
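The two positions in this thread can be sketched side by side. The `Condition` struct, the reason strings, the `stillReconciling` flag, and the `degradedConditions` helper below are hypothetical simplifications for illustration; the real code goes through `v1helpers.SetStatusCondition` with `osconfigv1` types.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for a ClusterOperator
// status condition.
type Condition struct {
	Type    string // "Available" or "Degraded"
	Status  bool
	Reason  string
	Message string
}

// Reason strings are assumptions for this sketch.
const (
	reasonInvalidConfiguration = "InvalidConfiguration"
	reasonDeployTimedOut       = "DeployTimedOut"
)

// degradedConditions returns the Degraded and Available conditions for a
// failing reason. With stillReconciling=true it models the view that the
// operator stays Available because it keeps watching its resources. With
// stillReconciling=false it models the alternative of echoing the Degraded
// reason and message with Available=False.
func degradedConditions(reason, msg string, stillReconciling bool) []Condition {
	degraded := Condition{Type: "Degraded", Status: true, Reason: reason, Message: msg}
	available := Condition{Type: "Available", Status: false, Reason: reason, Message: msg}
	if stillReconciling {
		available = Condition{
			Type:    "Available",
			Status:  true,
			Reason:  reason,
			Message: "Operator is still watching its resources and can recover once the problem is fixed",
		}
	}
	return []Condition{degraded, available}
}

func main() {
	for _, view := range []bool{true, false} {
		for _, c := range degradedConditions(reasonInvalidConfiguration, "provisioning configuration is invalid", view) {
			fmt.Printf("%s=%t reason=%s message=%q\n", c.Type, c.Status, c.Reason, c.Message)
		}
	}
}
```

Either way, the Available condition carries a reason and a message, which is the point of dropping ReasonEmpty.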
@wking: PR needs rebase.
@wking: The following tests failed.
Issues go stale after 90d of inactivity. /lifecycle stale
Stale issues rot after 30d of inactivity. /lifecycle rotten
/close
@sadasu: Closed this PR.