ci-operator/config/openshift/release: Drop failing minor rollback tests #26629
In 4.11, [1], [2], and [3] all have recent passes, so I'm leaving them in.

In 4.10, [4] and [5] have recent passes, so I'm leaving them in. Checking [6], both [7] and [8] update from 4.9 to 4.10 and start heading back towards 4.9, but they hang a control-plane node on drain. Same for the OVN flavor [9,10,11]. Since we don't support minor rollbacks, or really rollbacks of any sort [12], I'm dropping these jobs instead of root-causing the hang.

In 4.9, [13] and [14] have recent passes, so I'm leaving them in. We already dropped the other 4.8 -> 4.9 -> 4.8 rollback jobs back in b3d04e5 (ci-operator/config/openshift/release: Drop 4.8 -> 4.9 -> 4.8 rollback jobs, 2021-09-27, openshift#22287).

In 4.8, [15] and [16] have recent passes, so I'm leaving them in. 4.7 -> 4.8 -> 4.7 rollback tests time out [17,18,19,20,21,22], without the pretty e2e-interval chart to make identifying the stuck thing easier. But again, not supported, so dropping instead of sinking time into root-causing.

In 4.7, [23] and [24] have recent passes, so I'm leaving them in. 4.6 -> 4.7 -> 4.6 rollback tests time out [25,26], so dropping them.

In 4.6, [27] has recent passes, so I'm leaving it in. 4.5 -> 4.6 -> 4.5 rollback tests [28,29] failed to build, but I've been dropping all the 4.y minor rollback jobs since 4.10, so keeping these around to see if subsequent runs will build and pass seems unlikely to be worth the effort. Dropping them too.

4.5 is end-of-life [30], so I'm dropping 4.4 -> 4.5 -> 4.4 rollback jobs without even looking to see if they're passing.
[1]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.11-informing#periodic-ci-openshift-release-master-ci-4.11-e2e-aws-upgrade-rollback
[2]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.11-informing#periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-aws-upgrade-rollback
[3]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.11-informing#periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-aws-ovn-upgrade-rollback
[4]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.10-informing#periodic-ci-openshift-release-master-ci-4.10-e2e-aws-upgrade-rollback
[5]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.10-informing#periodic-ci-openshift-release-master-nightly-4.10-e2e-aws-upgrade-rollback-oldest-supported
[6]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.10-informing#periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-aws-upgrade-rollback
[7]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-aws-upgrade-rollback/1497288333930270720
[8]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-aws-upgrade-rollback/1498013325101895680
[9]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.10-informing#periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-aws-ovn-upgrade-rollback
[10]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-aws-ovn-upgrade-rollback/1497671569042837504
[11]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-aws-ovn-upgrade-rollback/1498033961199210496
[12]: https://github.com/openshift/openshift-docs/blame/d4762f0f626a4dddb9d7330e63a3bb6cb73f5bb5/modules/update-upgrading-cli.adoc#L160-L162
[13]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.9-informing#periodic-ci-openshift-release-master-ci-4.9-e2e-aws-upgrade-rollback
[14]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.9-informing#periodic-ci-openshift-release-master-nightly-4.9-e2e-aws-upgrade-rollback-oldest-supported
[15]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.8-informing#periodic-ci-openshift-release-master-ci-4.8-e2e-aws-upgrade-rollback
[16]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.8-informing#periodic-ci-openshift-release-master-nightly-4.8-e2e-aws-upgrade-rollback-oldest-supported
[17]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.8-informing#periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-upgrade-rollback
[18]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-upgrade-rollback/1496996649593999360
[19]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-upgrade-rollback/1497721649917595648
[20]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.8-informing#periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-ovn-upgrade-rollback
[21]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-ovn-upgrade-rollback/1497620733604401152
[22]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-ovn-upgrade-rollback/1497983125802717184
[23]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.7-informing#periodic-ci-openshift-release-master-ci-4.7-e2e-aws-upgrade-rollback
[24]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.7-informing#periodic-ci-openshift-release-master-nightly-4.7-e2e-aws-upgrade-rollback-oldest-supported
[25]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.7-informing#periodic-ci-openshift-release-master-ci-4.7-upgrade-from-stable-4.6-e2e-aws-upgrade-rollback
[26]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.7-upgrade-from-stable-4.6-e2e-aws-upgrade-rollback/1497650430434349056
[27]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.6-informing#periodic-ci-openshift-release-master-ci-4.6-e2e-aws-upgrade-rollback
[28]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.6-informing#periodic-ci-openshift-release-master-ci-4.6-upgrade-from-stable-4.5-e2e-aws-upgrade-rollback
[29]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.6-upgrade-from-stable-4.5-e2e-aws-upgrade-rollback/1494388672508727296
[30]: https://access.redhat.com/support/policy/updates/openshift#dates
0a658b5 to 2d73374
LalatenduMohanty
left a comment
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: LalatenduMohanty, wking
@wking: all tests passed!
@wking: Updated the following 2 configmaps:
…1-upgrade-from-stable-4.10: Drop failing rollback jobs

Like 2d73374 (ci-operator/config/openshift/release: Drop failing minor rollback tests, 2022-02-28, openshift#26629), but for the 4.10-to-4.11-to-4.10 rollbacks. This time both the OVN and SDN rollback jobs are perma-failing [1,2], and in both cases the issue is sticking on [3,4]:

  INFO: cluster upgrade is Progressing: Working towards 4.10.35: 614 of 773 done (79% complete), waiting on openshift-controller-manager

with that operator crash-looping on [5,6]:

  F1010 09:51:56.918590 1 cmd.go:138] open /var/run/configmaps/config/config.yaml: permission denied

I haven't dug in more deeply to try to understand that failure, but as 2d73374 points out:

> Since we don't support minor rollbacks, or really rollbacks of any
> sort [12], I'm dropping these jobs instead of root-causing the hang.
> ...
> [12]: https://github.com/openshift/openshift-docs/blame/d4762f0f626a4dddb9d7330e63a3bb6cb73f5bb5/modules/update-upgrading-cli.adoc#L160-L162

Since then, those docs have moved to [7], but the lack of rollback support still stands.

[1]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.11-informing#periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-aws-ovn-upgrade-rollback
[2]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.11-informing#periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-aws-upgrade-rollback
[3]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-aws-ovn-upgrade-rollback/1579338022623645696
[4]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-aws-upgrade-rollback/1578454440359235584
[5]: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-aws-ovn-upgrade-rollback/1579338022623645696/artifacts/e2e-aws-ovn-upgrade-rollback/gather-extra/artifacts/pods/openshift-controller-manager-operator_openshift-controller-manager-operator-7fbc8cc67d-zbrv4_openshift-controller-manager-operator_previous.log
[6]: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-aws-upgrade-rollback/1578454440359235584/artifacts/e2e-aws-upgrade-rollback/gather-extra/artifacts/pods/openshift-controller-manager-operator_openshift-controller-manager-operator-7fbc8cc67d-s5pwz_openshift-controller-manager-operator_previous.log
[7]: https://github.com/openshift/openshift-docs/blob/7f87267bc69d65abd96e6b783100195c6b78549f/updating/updating-troubleshooting.adoc
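For anyone scanning many runs for the same stall, the progress line quoted in the commit message can be pulled apart mechanically. The following is only a rough sketch: the regular expression is inferred from that single quoted line, and `parse_progress` is a hypothetical helper, not part of any OpenShift or Prow tooling.

```python
import re

# Assumed format, inferred from the one progress line quoted in the commit
# message; other releases may word this message differently.
PROGRESS = re.compile(
    r"Working towards (?P<version>\S+): (?P<done>\d+) of (?P<total>\d+) done "
    r"\((?P<percent>\d+)% complete\), waiting on (?P<operators>.+)$"
)


def parse_progress(line):
    """Return the progress fields from a log line, or None if it does not match."""
    match = PROGRESS.search(line)
    if match is None:
        return None
    return {
        "version": match.group("version"),
        "done": int(match.group("done")),
        "total": int(match.group("total")),
        "percent": int(match.group("percent")),
        # The message may name several operators; split on commas to be safe.
        "waiting_on": [name.strip() for name in match.group("operators").split(",")],
    }


line = (
    "INFO: cluster upgrade is Progressing: Working towards 4.10.35: "
    "614 of 773 done (79% complete), waiting on openshift-controller-manager"
)
print(parse_progress(line))
```

Grouping the parsed `waiting_on` values across runs would show whether every failure stalls on the same operator, which is what the commit message reports for both the OVN and SDN jobs.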
…1-upgrade-from-stable-4.10: Drop failing rollback jobs (#33005)
We don't support rollbacks of any kind, let alone minor-version rollbacks. 4.(y-1) -> 4.y -> 4.(y-1) rollback jobs are failing across the board for 4.10 and before. Root-causing and potentially fixing the failures might be interesting, but because the behavior is not supported, and we have limited time to investigate and fix, just drop the jobs. If, in the future, we gain more time for investigation, we can restore these jobs. Details in the commit message with links to example runs for each job I'm dropping.
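To make the 4.(y-1) -> 4.y -> 4.(y-1) distinction concrete, here is a minimal sketch that separates minor rollback jobs from the same-minor rollback jobs that were kept. The naming pattern is an assumption inferred from the job names linked in the commit message, and `is_minor_rollback` is a hypothetical helper rather than anything in the openshift/release repository.

```python
import re

# Assumed naming convention, inferred from the linked job names:
#   ...-<4.y>-upgrade-from-stable-<4.(y-1)>-...upgrade-rollback
MINOR_ROLLBACK = re.compile(
    r"-(?P<to>4\.\d+)-upgrade-from-stable-(?P<from>4\.\d+)-.*upgrade-rollback$"
)


def is_minor_rollback(job_name):
    """Return True if the job name looks like a 4.(y-1) -> 4.y -> 4.(y-1) rollback."""
    match = MINOR_ROLLBACK.search(job_name)
    if match is None:
        return False
    to_minor = int(match.group("to").split(".")[1])
    from_minor = int(match.group("from").split(".")[1])
    return to_minor == from_minor + 1


jobs = [
    "periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-aws-upgrade-rollback",
    "periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-aws-ovn-upgrade-rollback",
    "periodic-ci-openshift-release-master-ci-4.10-e2e-aws-upgrade-rollback",
    "periodic-ci-openshift-release-master-nightly-4.10-e2e-aws-upgrade-rollback-oldest-supported",
]
for job in jobs:
    print(is_minor_rollback(job), job)
```

With the four 4.10 job names above, the first two (dropped in this PR) are flagged and the last two (kept because they have recent passes) are not.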