ci-operator/config/openshift/release: Drop cross-minor rollback jobs #39897
|
/lgtm |
|
/retest-required Remaining retests: 0 against base HEAD e120d02 and 2 for PR HEAD 1663941fa8587aa6534872dcf73f3fdf13d8fc4d in total |
|
Some wires are getting crossed because someone just made these informers, but I had my doubts about the usefulness of doing that. |
|
Ah, thanks for the pointer. I think folks are conflating the various jobs with |
|
/retest-required Remaining retests: 0 against base HEAD 09d4cc8 and 1 for PR HEAD 1663941fa8587aa6534872dcf73f3fdf13d8fc4d in total |
|
/retest-required Remaining retests: 0 against base HEAD 8f4925f and 0 for PR HEAD 1663941fa8587aa6534872dcf73f3fdf13d8fc4d in total |
|
/lgtm cancel need to remove these jobs in release-controller. here's the 4.12 example I think |
The job flavor was originally added in 0837634 (Add ovn-upgrade-rollback job for 4.7->4.8, 2021-02-24, openshift#16260). The jobs have subsequently been cloned forward to new minors as part of the branching process. And as older jobs started failing, I'd been dropping them gradually like 856aab2 (ci-operator/config/openshift/release/openshift-release-master__ci-4.11-upgrade-from-stable-4.10: Drop failing rollback jobs, 2022-10-11, openshift#33005). But rounding with Jamo, the jobs no longer serve a useful role, and as 856aab2 points out, rollbacks between minor releases are not supported. Drop the likely-to-fail and not-useful-even-when-it-passes jobs in their entirety, so they stop getting cloned forward during branching.

I'm also adjusting the release controller changes from 421c921 (Introducing Rollback informing jobs, 2023-05-19, openshift#39488). I'm dropping 4.12 and earlier rollback informers, so we can focus on 4.13 while we feel out the new process. And I'm pivoting 4.13 away from the cross-minor job that this pull request drops, and towards the rollback-oldest-supported job that will help back [1].

[1]: https://issues.redhat.com/browse/OTA-455
1663941 to c5c89bf
vrutkovs
left a comment
/approve
/lgtm
|
Seems it needs another |
Because [1]:

  ERROR: The following differences were found:

  3a4
  > 03c544e5d55a55ae9f19d0de7d786341 .//core-services/release-controller/_releases/priv/release-ocp-4.12.json
  35d35
  < 1826a1b520574b66f152f814811c19f6 .//core-services/release-controller/_releases/priv/release-ocp-4.13.json
  42a43
  ...

tells me what files need changing, but not what changes to make to them.

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_release/39897/pull-ci-openshift-release-master-release-controller-config/1664331471080394752
|
[REHEARSALNOTIFIER] Interacting with pj-rehearseComment: Once you are satisfied with the results of the rehearsals, comment: |
|
@wking: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
vrutkovs
left a comment
/lgtm
|
/approve |
|
/aprpove |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: bear-redhat, vrutkovs, wking. The full list of commands accepted by this bot can be found here. The pull request process is described here. Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@wking: Updated the following 2 configmaps:
…penshift#39897)

* ci-operator/config/openshift/release: Drop cross-minor rollback jobs

The job flavor was originally added in 0837634 (Add ovn-upgrade-rollback job for 4.7->4.8, 2021-02-24, openshift#16260). The jobs have subsequently been cloned forward to new minors as part of the branching process. And as older jobs started failing, I'd been dropping them gradually like 856aab2 (ci-operator/config/openshift/release/openshift-release-master__ci-4.11-upgrade-from-stable-4.10: Drop failing rollback jobs, 2022-10-11, openshift#33005). But rounding with Jamo, the jobs no longer serve a useful role, and as 856aab2 points out, rollbacks between minor releases are not supported. Drop the likely-to-fail and not-useful-even-when-it-passes jobs in their entirety, so they stop getting cloned forward during branching.

I'm also adjusting the release controller changes from 421c921 (Introducing Rollback informing jobs, 2023-05-19, openshift#39488). I'm dropping 4.12 and earlier rollback informers, so we can focus on 4.13 while we feel out the new process. And I'm pivoting 4.13 away from the cross-minor job that this pull request drops, and towards the rollback-oldest-supported job that will help back [1].

[1]: https://issues.redhat.com/browse/OTA-455

* hack/validate-release-controller-config: Supplemental Git diff

Because [1]:

  ERROR: The following differences were found:

  3a4
  > 03c544e5d55a55ae9f19d0de7d786341 .//core-services/release-controller/_releases/priv/release-ocp-4.12.json
  35d35
  < 1826a1b520574b66f152f814811c19f6 .//core-services/release-controller/_releases/priv/release-ocp-4.13.json
  42a43
  ...

tells me what files need changing, but not what changes to make to them.

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_release/39897/pull-ci-openshift-release-master-release-controller-config/1664331471080394752

---------

Co-authored-by: wking <wking@penguin>
openshift#39897 didn't run `make release-controllers`.
…y-4.14-upgrade-from-stable-4.13: Restore cross-minor rollbacks

We'd dropped the last of these in 856aab2 (ci-operator/config/openshift/release/openshift-release-master__ci-4.11-upgrade-from-stable-4.10: Drop failing rollback jobs, 2022-10-11, openshift#33005) and 5e746a7 (ci-operator/config/openshift/release: Drop cross-minor rollback jobs, 2023-06-07, openshift#39897). There's now renewed interest in how these sorts of rollbacks look, so I'm reviving them for recent releases.

I expect the issues with these rollbacks will at least include the cluster-version operator losing the ability to write to ClusterVersion, as the older CRD's enum rejects the capabilities added in the new release:

  openshift/api $ git diff origin/release-4.13..origin/release-4.14 -- config/v1/types_cluster_version.go | grep kubebuilder:validation:Enum
  -// +kubebuilder:validation:Enum=openshift-samples;baremetal;marketplace;Console;Insights;Storage;CSISnapshot;NodeTuning
  +// +kubebuilder:validation:Enum=openshift-samples;baremetal;marketplace;Console;Insights;Storage;CSISnapshot;NodeTuning;MachineAPI;Build;DeploymentConfig;ImageRegistry
  -// +kubebuilder:validation:Enum=None;v4.11;v4.12;v4.13;vCurrent
  +// +kubebuilder:validation:Enum=None;v4.11;v4.12;v4.13;v4.14;vCurrent

So a cluster updating from 4.13 to 4.14 will enable (possibly implicitly) MachineAPI and other newly-labeled-in-4.14 capabilities. And then when the 4.13 ClusterVersion CRD is pushed during the rollback, those values become illegal, and the Kubernetes API server will reject the cluster-version operator's attempts to write ClusterVersion status with errors complaining about the unrecognised MachineAPI and other capability strings [1]:

  $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-version-operator/941/pull-ci-openshift-cluster-version-operator-master-e2e-agnostic-ovn-upgrade-out-of-change/1671502401497993216/artifacts/e2e-agnostic-ovn-upgrade-out-of-change/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-7fd84b7b99-8b2qk_cluster-version-operator.log | grep 'ClusterVersion.config.openshift.io "version" is invalid' | tail -n1
  I0621 16:45:41.154360 1 cvo.go:601] Error handling openshift-cluster-version/version: ClusterVersion.config.openshift.io "version" is invalid: status.capabilities.enabledCapabilities[3]: Unsupported value: "MachineAPI": supported values: "openshift-samples", "baremetal", "marketplace", "Console", "Insights", "Storage", "CSISnapshot", "NodeTuning"

[1]: openshift/cluster-version-operator#941 (review)
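To make the enum mismatch concrete, here is a minimal Python sketch of the check the API server is effectively applying after the rollback. The two enum lists are copied from the kubebuilder markers quoted in the commit message; the `rejected_values` helper and the assumption that the cluster has every 4.14 capability enabled are illustrative only, not how the cluster-version operator or the API server is actually implemented.

```python
# Capability enum values copied from the kubebuilder:validation:Enum
# markers quoted in the commit message (release-4.13 vs. release-4.14).
ENUM_4_13 = (
    "openshift-samples;baremetal;marketplace;Console;"
    "Insights;Storage;CSISnapshot;NodeTuning"
).split(";")
ENUM_4_14 = ENUM_4_13 + "MachineAPI;Build;DeploymentConfig;ImageRegistry".split(";")


def rejected_values(enabled, allowed):
    """Return the entries of `enabled` that are not in the `allowed` enum.

    Hypothetical helper: it mimics an enum validation by simple membership
    testing and preserves the order of `enabled`.
    """
    allowed_set = set(allowed)
    return [cap for cap in enabled if cap not in allowed_set]


# Assumption for illustration: the 4.14 cluster enabled every capability
# known to 4.14, and the 4.13 CRD is then restored by the rollback.
rejected = rejected_values(ENUM_4_14, ENUM_4_13)
print(rejected)
```

Any non-empty result means the status write would fail validation against the 4.13 CRD, which matches the `Unsupported value: "MachineAPI"` error in the log excerpt.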
The job flavor was originally added in 0837634 (#16260). The jobs have subsequently been cloned forward to new minors as part of the branching process. And as older jobs started failing, I'd been dropping them gradually like 856aab2 (#33005). But rounding with @jluhrsen, the jobs no longer serve a useful role, and as 856aab2 points out, rollbacks between minor releases are not supported. Drop the likely-to-fail and not-useful-even-when-it-passes jobs in their entirety, so they stop getting cloned forward during branching.