Skip to content

WIP: controllers: "Unsupported", etc. condition messaging#61

Closed
wking wants to merge 2 commits intoopenshift:masterfrom
wking:avoid-empty-condition-reasons
Closed

WIP: controllers: "Unsupported", etc. condition messaging#61
wking wants to merge 2 commits intoopenshift:masterfrom
wking:avoid-empty-condition-reasons

Conversation

@wking
Copy link
Member

@wking wking commented Oct 30, 2020

"Operator is non-functional" sounds a bit frightening. This commit pivots to say "Operator has no role on platform {platform-name}" in the cases where we determine we are not needed.

I'm also removing ReasonEmpty, because it's always nice to say something about why we picked the state we picked. In the ReasonInvalidConfiguration and ReasonDeployTimedOut cases, that makes me wonder if we really intend to be Available then. Maybe we should just echo whatever we're setting for Degraded in that case? FIXME/WIP until we decide.

wking added 2 commits October 30, 2020 15:26
The test-cases have names.  Using a sub-test will expose those names
in the output for folks who are hunting down failures.
"Operator is non-functional" sounds a bit frightening.  This commit
pivots to say "Operator has no role on platform {platform-name}" in
the cases where we determine we are not needed.

I'm also removing ReasonEmpty, because it's always nice to say
*something* about why we picked the state we picked.  In the
ReasonInvalidConfiguration and ReasonDeployTimedOut cases, that makes
me wonder if we really intend to be Available then.  Maybe we should
just echo whatever we're setting for Degraded in that case?  FIXME
until we decide.
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 30, 2020
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: wking
To complete the pull request process, please assign dhellmann after the PR has been reviewed.
You can assign the PR to them by writing /assign @dhellmann in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

case ReasonInvalidConfiguration, ReasonDeployTimedOut:
v1helpers.SetStatusCondition(&conds, setStatusCondition(osconfigv1.OperatorDegraded, osconfigv1.ConditionTrue, string(newReason), msg))
v1helpers.SetStatusCondition(&conds, setStatusCondition(osconfigv1.OperatorAvailable, osconfigv1.ConditionTrue, string(ReasonEmpty), ""))
v1helpers.SetStatusCondition(&conds, setStatusCondition(osconfigv1.OperatorAvailable, osconfigv1.ConditionTrue, string(ReasonAsExpected), "FIXME: are we really available here?"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Degraded makes sense

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with Degraded=True here. I'm not clear on Available=True. What functionality is the operator available to fulfil when it has ReasonInvalidConfiguration or ReasonDeployTimedOut?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We looked into several options detailed here ; https://github.com/openshift/enhancements/blob/master/enhancements/baremetal/an-slo-for-baremetal.md#not-in-use-slo-behaviors

There seems to be no defined behavior when a CO needs to convey that the Operator is running, is non-functional and that is expected.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither ReasonInvalidConfiguration nor ReasonDeployTimedOut are "expected non-functional", are they? That is ReasonUnsupported, which I also touch in this PR, but it is not what this thread/line is about.

Copy link
Contributor

@sadasu sadasu Nov 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Giving context around your question if Available=True is required.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Available is definitely require. In master, ReasonInvalidConfiguration and ReasonDeployTimedOut are Available=True Degraded=False. My naive reaction to the reason strings has me suspecting the might actually call for Available=False Degraded=False. Can you list and services available (but degraded) under these conditions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with Degraded=True here. I'm not clear on Available=True. What functionality is the operator available to fulfil when it has ReasonInvalidConfiguration or ReasonDeployTimedOut?

The Operator is Available because it is still watching its resources. If the configuration changes to a valid value for example, then the operator is able to get itself out of the Degraded state. That is how I view the Available=True in this scenario.

@openshift-ci-robot
Copy link
Contributor

@wking: PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 6, 2020
@openshift-merge-robot
Copy link
Contributor

@wking: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-metal-ipi 98e1c29 link /test e2e-metal-ipi
ci/prow/e2e-agnostic 98e1c29 link /test e2e-agnostic
ci/prow/e2e-metal-ipi-ovn-ipv6 98e1c29 link /test e2e-metal-ipi-ovn-ipv6

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Copy link

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 22, 2021
@openshift-bot
Copy link

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 24, 2021
@sadasu
Copy link
Contributor

sadasu commented Mar 25, 2021

/close
I believe this is already fixed.

@openshift-ci-robot
Copy link
Contributor

@sadasu: Closed this PR.

Details

In response to this:

/close
I believe this is already fixed.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants