-
Notifications
You must be signed in to change notification settings - Fork 4.8k
OCPBUGS-6503: upgrade/adminack: simplify polling and unblock "guaranteed" post-upgrade check #27678
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-6503: upgrade/adminack: simplify polling and unblock "guaranteed" post-upgrade check #27678
Conversation
…ade check openshift#27645 intended to add a guaranteed post-upgrade check but I have overlooked how exactly the polling is implemented and terminated, leading to the post-upgrade check never actually execute. Previously the test used `PollImmediateWithContext` for the each-10-minutes check. The `ConditionFunc` never actually returned `true` or non-nil `err`, so the `PollImmediateWithContext` never terminated by the means of `ConditionFunc`: it was always terminated by the `ctx.Done()` that the framework does on finished upgrade (or a test timeout). This means that `PollImmediateWithContext` always terminated with `err=wait.ErrWaitTimeout` and the `Test` method immediately returned, so the "guaranteed" check code is never reached. Given our `ConditionFunc` never terminates the polling, we can simplify and use the `wait.UntilWithContext` instead, which is a simpler version that precisely implements the desired loop (poll until context is done).
|
@petr-muller: This pull request references Jira Issue OCPBUGS-6503, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/jira refresh |
|
@petr-muller: This pull request references Jira Issue OCPBUGS-6503, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/retest |
|
/retest CI registry is unhappy: |
|
/retest |
During testing of OCPBUGS-5505, it was discovered that even with shortening the CVO cache TTL, CVO may still only update `Upgradeable` in its sync interval, which may be as high as 4 minutes. Hence the tests needs to wait for that time (I added 5 second buffer on top of that).
|
@petr-muller: This pull request references Jira Issue OCPBUGS-6503, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
@petr-muller: This pull request references Jira Issue OCPBUGS-6503, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/retest |
|
GCP jobs affected by OHSS-18195, hope the AWS ones give me something to work with |
|
|
|
/retest |
|
/retest |
wking
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
|
/test e2e-aws-ovn-fips |
|
/retest-required |
|
/skip |
|
/test e2e-aws-ovn-fips |
1 similar comment
|
/test e2e-aws-ovn-fips |
|
/test e2e-aws-ovn-fips |
|
/test e2e-aws-ovn-fips |
|
/test ci/prow/e2e-aws-ovn-fips |
|
@petr-muller: The specified target(s) for
The following commands are available to trigger optional jobs:
Use
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/test e2e-aws-ovn-fips |
|
/override |
|
/override ci/prow/e2e-aws-ovn-fips |
|
@petr-muller: petr-muller unauthorized: /override is restricted to Repo administrators, approvers in top level OWNERS file, and the following github teams:openshift: openshift-release-oversight. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/test e2e-aws-ovn-fips |
|
/hold Revision c3184c4 was retested 3 times: holding |
|
/hold cancel |
1 similar comment
|
/retest |
|
@petr-muller: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
/retest |
|
@petr-muller: Some pull requests linked via external trackers have merged: The following pull requests linked via external trackers have not merged:
These pull request must merge or be unlinked from the Jira bug in order for it to move to the next state. Once unlinked, request a bug refresh with Jira Issue OCPBUGS-6503 has not been moved to the MODIFIED state. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/jira refresh |
|
@petr-muller: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-6503 has been moved to the MODIFIED state. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
#27645 intended to add a guaranteed post-upgrade check but I have overlooked how exactly the polling is implemented and terminated, leading to the post-upgrade check never actually execute.
Previously the test used
PollImmediateWithContextfor the each-10-minutes check. TheConditionFuncnever actually returnedtrueor non-nilerr, so thePollImmediateWithContextnever terminated by the means ofConditionFunc: it was always terminated by thectx.Done()that the framework does on finished upgrade (or a test timeout). This means thatPollImmediateWithContextalways terminated witherr=wait.ErrWaitTimeoutand theTestmethod immediately returned, so the "guaranteed" check code is never reached.Given our
ConditionFuncnever terminates the polling, we can simplify and use thewait.UntilWithContextinstead, which is a simpler version that precisely implements the desired loop (poll until context is done).During testing of OCPBUGS-5505, it was discovered that even with shortening the CVO cache TTL, CVO may still only update
Upgradeablein its sync interval, which may be as high as 4 minutes. Hence the tests needs to wait for that time (I added 5 second buffer on top of that).