-
Notifications
You must be signed in to change notification settings - Fork 4.8k
Bug 2009879: Add admin ack Upgradeable condition gate test #26525
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@jottofar: This pull request references Bugzilla bug 2009879, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
Requesting review from QA contact: DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
21fa931 to
4b34b5d
Compare
|
/retest |
|
/retest |
test/extended/util/adminack.go
Outdated
| return | ||
| } | ||
| var msg string | ||
| if msg = gateCm.Data["ack-4.8-kube-1.22-api-removals-in-4.9"]; msg == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we will not have this gate in born-in-4.9+ clusters. Can we replace this with something generic that walks keys which happen to be present in gateCm.Data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This gets into whether the purpose of this is to test the code or to simply set the gate. The latter really being a pre-test step. Since this is a test I thought explicitly having to name the gate(s) was okay. As gates are added the test has to be updated in a very explicit and purposeful way. Otherwise I will also start to duplicate the code under test in the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latter really being a pre-test step.
It's hard to make setting the gate a pre-test step for steps that install 4.y and perform a chained update through 4.(y+1) to 4.(y+2) and beyond, if gate is new in 4.(y+1). So I'm hoping that this can be both "test the functionality" and "set the gate". But I can see some concerns about setting the gate in non-update conformance tests, so perhaps just extend openshift/release#22368 and actively create that ConfigMap when it doesn't exist, and populate it with all the acks we ever have?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As gates are added the test has to be updated in a very explicit and purposeful way.
This isn't clear to me. Can't the "test the code" bit just ensure that Upgradeable responds as expected when all configured gates are acked (it goes False) and when any configured gate is not acked (it goes True)? I guess you'd need to include something to pick out gates that applied to the current 4.y. But I don't understand how this would be tied to a specific gate beyond the minor-version association.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just went ahead an looped through the configured gates to keep must of the logic here instead of adding to release. Having the test assume all the configured gates are valid, i.e. should be there, is reasonable enough.
test/extended/util/adminack.go
Outdated
| if !upgradeable(ctx, t.Config) { | ||
| if adminAckRequiredWithMessage(ctx, t.Config, msg) { | ||
| framework.Failf(fmt.Sprintf("Gate ack-4.8-kube-1.22-api-removals-in-4.9 has been ack'ed but Upgradeable is "+ | ||
| "false with reason AdminAckRequired and message %q.", msg)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!upgradeable(...) is looking for "Upgradeable is not True". That will often be "because it's False", but might be "because it's unset". Can we use "not True" here? Or teach upgradeable to return a status pointer or something so we can distinguish between the various not-True cases in the failure message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it ... returns true if the Upgradeable condition is nil or is set to true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so still, !upgradeable(...) could be "false" or could be "unknown" or whatever. So can we pull the whole Upgradeable condition back and do something like:
framework.Failf(fmt.Sprintf("Gate ack-4.8-kube-1.22-api-removals-in-4.9 has been ack'ed but Upgradeable is %s with reason %s and message: %s", cond.Status, cond.Reason, cond.Message))There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so still, !upgradeable(...) could be "false" or could be "unknown" or whatever
No. Put another way upgradeable only returns false when cond != nil && cond.Status == configv1.ConditionFalse so that's why I hard-coded status and reason. Perhaps changing the method name would help. Maybe upgradeableNotFalse, but then we can have a double negative, so perhaps upgradeableNilOrTrue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, it's that the condition's Status is not set (unknown ) not that cond is nil. Changed to reflect this understanding.
be5c058 to
17c7f3e
Compare
|
/retest |
wking
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
🎉 Now we just need to track down an approver...
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: deads2k, jottofar, wking The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
|
@jottofar: All pull requests linked via external trackers have merged: Bugzilla bug 2009879 has been moved to the MODIFIED state. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/cherrypick release-4.9 |
|
@wking: new pull request created: #26629 DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
…tForAdminAck(Not)Required message We'd had an empty-string msg since this test-case landed in 4f5be5a (test: Add admin ack Upgradeable condition gate test, 2021-10-13, openshift#26525), as seen in [1]: Dec 9 19:12:51.005: INFO: Waiting for Upgradeable to be AdminAckRequired for "" ... Dec 9 19:12:51.279: INFO: Waiting for Upgradeable to not be AdminAckRequired for "" ... That's fine as long as we only have one ack gate, but this commit passes through the gate value to match the adminAckRequiredWithMessage call slightly earlier in AdminAckTest.test. [1]: https://bugzilla.redhat.com/show_bug.cgi?id=2027929#c2
…MultipleReasons
When there's another reason like TechPreviewNoUpgrade, the CVO will be
complaining about both and set MultipleReasons. We want the test
suite to understand that that's still an AdminAckRequired complaint,
and not fail like [1]:
: [sig-cluster-lifecycle] TestAdminAck should succeed [Suite:openshift/conformance/parallel]
Run #0: 3m3s
{ fail [github.com/openshift/origin/test/extended/util/openshift/clusterversionoperator/adminack.go:117]: May 6 07:24:05.028: Error while waiting for Upgradeable to go AdminAckRequired with message "Kubernetes 1.25 and therefore OpenShift 4.12 remove several APIs which require admin consideration. Please see\nthe knowledge article https://access.redhat.com/articles/6955381 for details and instructions.\n": timed out waiting for the condition
Upgradeable: Status=False, Reason=MultipleReasons, Message="Cluster should not be upgraded between minor versions for multiple reasons: AdminAckRequired,FeatureGates_RestrictedFeatureGates_TechPreviewNoUpgrade\n* Kubernetes 1.25 and therefore OpenShift 4.12 remove several APIs which require admin consideration. Please see\nthe knowledge article https://access.redhat.com/articles/6955381 for details and instructions.\n\n* Cluster operator kube-apiserver should not be upgraded between minor versions: FeatureGatesUpgradeable: \"TechPreviewNoUpgrade\" does not allow updates".}
The Contains logic that I'm removing is from the original test-case
implementation in 4f5be5a (test: Add admin ack Upgradeable
condition gate test, 2021-10-13, openshift#26525), and seems to have been
assuming that we would concatenate the constituent reasons, instead of
replacing all of them with MultipleReasons.
[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-e2e-azure-techpreview/1522457624627384320
Create test to verify Admin Ack functionality and include in both
extendedande2e/upgradetests.