-
Notifications
You must be signed in to change notification settings - Fork 150
Bug 1980531: Add release.openshift.io/delete annotation to consoleLink CRDs #565
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Looks like we'll need to update the console e2e tests which expect the items we're removing to be there. cc @kdoberst |
@spadgett I'm assuming the console e2e tests are tests that cover both the console-operator and the console code via the browser. With the above changes, the tests are going to fail because the links are missing. But once the code for openshift/console#9495 is merged in, the links will display because the console code will be providing the links and the tests (in theory) will start passing. The changes proposed in openshift/console#9495 aren't dependent on these console-operator changes. There is code to ignore these two specific help links if they happen to be passed from the console-operator. So here is what I'm thinking ... once openshift/console#9495 is merged in, re-run the tests for this PR and see if they all pass. Thoughts? |
|
@jhadvig: This pull request references Bugzilla bug 1980531, which is valid. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
Requesting review from QA contact: DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
👍 |
My only hesitation is that the checks to ignore the help links won't be needed with this change, so we'd want a follow up PR to remove those. It's going to be 3 PRs either way. It might be simpler to
|
Not 100% sure if CVO will remove the ConsoleLink CRs on update. But I thing that the annotation is for the upgrade case, so the when updating cluster from 4.8. to 4.9. it will remove the manifest. |
|
@jhadvig @spadgett My concern is that we don't set up a situation where the tests start failing for other PRs and that we don't accidentally set up a situation where the links are showing up as duplicates (once from the operator and once from the new console code). So how about this approach:
The down side is that there is an issue getting step 3 in, then there is a risk that 4.9 will be released without the help links or having to reverse steps 1 and 2. |
|
Opened a new PR and modified an existing PR to complete the fix for Bug 1980531. This PR should be merged in after openshift/console#9702 which removes a test and allow tests on this PR to pass. As soon as this PR is merged, then |
|
/retest |
spadgett
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jhadvig, spadgett The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
|
@jhadvig: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
2 similar comments
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
|
@jhadvig: An error was encountered searching for external tracker bugs for bug 1980531 on the Bugzilla server at https://bugzilla.redhat.com. No known errors were detected, please see the full error message for details. Full error message.
could not parse external identifier "openshift/console/pull/9385#pullrequestreview-702386658" as pull: invalid pull identifier: could not parse 9385#pullrequestreview-702386658 as number: strconv.Atoi: parsing "9385#pullrequestreview-702386658": invalid syntax
Please contact an administrator to resolve this issue, then request a bug refresh with DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
To sort after 90_consoleplugin.crd.yaml, which this commit series moved to 90. This avoids install-time deadlocks like [1]: level=error msg=Cluster operator monitoring Available is False with UpdatingConsolePluginComponentsFailed: reconciling Console Plugin failed: creating ConsolePlugin object failed: the server could not find the requested resource (post consoleplugins.console.openshift.io) level=error msg=Cluster operator monitoring Degraded is True with UpdatingConsolePluginComponentsFailed: reconciling Console Plugin failed: creating ConsolePlugin object failed: the server could not find the requested resource (post consoleplugins.console.openshift.io) ... level=error msg=Cluster initialization failed because one or more operators are not functioning properly. where we were deadlocking with: * Console operator was blocked by the missing ConsolePlugin CRD and not proceeding to mark its ClusterOperator Available=True. * Cluster-version operator was blocked by the sad ClusterOperator and not proceeding to push the later-manifest-name ConsolePlugin CRD. Backing up the CVO side of that claim with the manifests in the assembled release image: $ w3m -dump -cols 200 https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_console-operator/782/pull-ci-openshift-console-operator-master-e2e-gcp-ovn/1687173467130040320/artifacts/release/artifacts/release-payload-latest/ | grep 'console-operator.*\(operator[.]yaml\|consoleplugin\)' • [file] 0000_50_console-operator_03-rbac-role-ns-operator.yaml • [file] 0000_50_console-operator_07-operator.yaml • [file] 0000_50_console-operator_08-clusteroperator.yaml • [file] 0000_50_console-operator_90_consoleplugin.crd.yaml and gathered logs: $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_console-operator/782/pull-ci-openshift-console-operator-master-e2e-gcp-ovn/1687173467130040320/artifacts/e2e-gcp-ovn/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-c7fd575cc-xmc8h_cluster-version-operator.log | grep -B100000 'Result of work' | grep console | tail I0803 20:05:26.397553 1 sync_worker.go:990] Done syncing for serviceaccount "openshift-console-operator/console-operator" (576 of 858) I0803 20:05:26.397758 1 sync_worker.go:975] Running sync for serviceaccount "openshift-console/console" (577 of 858) I0803 20:05:26.448647 1 sync_worker.go:990] Done syncing for serviceaccount "openshift-console/console" (577 of 858) I0803 20:05:26.448740 1 sync_worker.go:975] Running sync for consoleclidownload "helm-download-links" (578 of 858) I0803 20:05:26.498688 1 sync_worker.go:990] Done syncing for consoleclidownload "helm-download-links" (578 of 858) I0803 20:05:26.498752 1 sync_worker.go:975] Running sync for deployment "openshift-console-operator/console-operator" (579 of 858) I0803 20:05:26.574316 1 sync_worker.go:990] Done syncing for deployment "openshift-console-operator/console-operator" (579 of 858) I0803 20:05:26.574463 1 sync_worker.go:975] Running sync for clusteroperator "console" (580 of 858) E0803 20:05:26.574829 1 task.go:117] error running apply for clusteroperator "console" (580 of 858): Cluster operator console is not available I0803 20:05:26.575435 1 task_graph.go:550] Result of work: [Cluster operator monitoring is not available Cluster operator console is not available] Showing that the last 'console' manifest reconciliation attempt in the most recently completed reconciliation round was the ClusterOperator. By shifting the ClusterOperator out to 95, it will come in after 90's ConsolePlugin CRD, and ensure that the CVO pushes that CRD before wondering about the ClusterOperator health. I'm not bothering to move the two ConsoleLink manifests, which have been deletion manifests since 6c6c5ce (Add release.openshift.io/delete annotation to consoleLink CRDs, 2021-07-15, openshift#565, 4.9 [2]), so we are well past the days when the position of these release manifests relative to the other console manifests was significant. [1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_console-operator/782/pull-ci-openshift-console-operator-master-e2e-gcp-ovn/1687173467130040320 [2]: https://bugzilla.redhat.com/show_bug.cgi?id=1980531#c8
Since these manifests became deletion manifests in 6c6c5ce (Add release.openshift.io/delete annotation to consoleLink CRDs, 2021-07-15, openshift#565, 4.9 [2]), the only change has been 0f860bb (Add console capability to all manifests in manifests/ and quickstarts/, 2022-07-26, openshift#665). We can remove these deletion references now, because a 4.8 cluster that might have included these resources should have completed an update to 4.9 which would have removed them. And if that failed, they should have completed an update to one of the later 4.y and removed the resources. By removing the resource, we save the cluster-version operator some time checking to ensure deletion, and only expose ourselves to leaking the resources on clusters that updated from 4.8 through to 4.14 without ever having completed an update before reaching 4.14. [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1980531#c8
Since these manifests became deletion manifests in 6c6c5ce (Add release.openshift.io/delete annotation to consoleLink CRDs, 2021-07-15, openshift#565, 4.9 [1]), the only change has been 0f860bb (Add console capability to all manifests in manifests/ and quickstarts/, 2022-07-26, openshift#665). We can remove these deletion references now, because a 4.8 cluster that might have included these resources should have completed an update to 4.9 which would have removed them. And if that failed, they should have completed an update to one of the later 4.y and removed the resources. By removing the resource, we save the cluster-version operator some time checking to ensure deletion, and only expose ourselves to leaking the resources on clusters that updated from 4.8 through to 4.14 without ever having completed an update before reaching 4.14. [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1980531#c8
To sort after 90_consoleplugin.crd.yaml, which this commit series moved to 90. This avoids install-time deadlocks like [1]: level=error msg=Cluster operator monitoring Available is False with UpdatingConsolePluginComponentsFailed: reconciling Console Plugin failed: creating ConsolePlugin object failed: the server could not find the requested resource (post consoleplugins.console.openshift.io) level=error msg=Cluster operator monitoring Degraded is True with UpdatingConsolePluginComponentsFailed: reconciling Console Plugin failed: creating ConsolePlugin object failed: the server could not find the requested resource (post consoleplugins.console.openshift.io) ... level=error msg=Cluster initialization failed because one or more operators are not functioning properly. where we were deadlocking with: * Console operator was blocked by the missing ConsolePlugin CRD and not proceeding to mark its ClusterOperator Available=True. * Cluster-version operator was blocked by the sad ClusterOperator and not proceeding to push the later-manifest-name ConsolePlugin CRD. Backing up the CVO side of that claim with the manifests in the assembled release image: $ w3m -dump -cols 200 https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_console-operator/782/pull-ci-openshift-console-operator-master-e2e-gcp-ovn/1687173467130040320/artifacts/release/artifacts/release-payload-latest/ | grep 'console-operator.*\(operator[.]yaml\|consoleplugin\)' • [file] 0000_50_console-operator_03-rbac-role-ns-operator.yaml • [file] 0000_50_console-operator_07-operator.yaml • [file] 0000_50_console-operator_08-clusteroperator.yaml • [file] 0000_50_console-operator_90_consoleplugin.crd.yaml and gathered logs: $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_console-operator/782/pull-ci-openshift-console-operator-master-e2e-gcp-ovn/1687173467130040320/artifacts/e2e-gcp-ovn/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-c7fd575cc-xmc8h_cluster-version-operator.log | grep -B100000 'Result of work' | grep console | tail I0803 20:05:26.397553 1 sync_worker.go:990] Done syncing for serviceaccount "openshift-console-operator/console-operator" (576 of 858) I0803 20:05:26.397758 1 sync_worker.go:975] Running sync for serviceaccount "openshift-console/console" (577 of 858) I0803 20:05:26.448647 1 sync_worker.go:990] Done syncing for serviceaccount "openshift-console/console" (577 of 858) I0803 20:05:26.448740 1 sync_worker.go:975] Running sync for consoleclidownload "helm-download-links" (578 of 858) I0803 20:05:26.498688 1 sync_worker.go:990] Done syncing for consoleclidownload "helm-download-links" (578 of 858) I0803 20:05:26.498752 1 sync_worker.go:975] Running sync for deployment "openshift-console-operator/console-operator" (579 of 858) I0803 20:05:26.574316 1 sync_worker.go:990] Done syncing for deployment "openshift-console-operator/console-operator" (579 of 858) I0803 20:05:26.574463 1 sync_worker.go:975] Running sync for clusteroperator "console" (580 of 858) E0803 20:05:26.574829 1 task.go:117] error running apply for clusteroperator "console" (580 of 858): Cluster operator console is not available I0803 20:05:26.575435 1 task_graph.go:550] Result of work: [Cluster operator monitoring is not available Cluster operator console is not available] Showing that the last 'console' manifest reconciliation attempt in the most recently completed reconciliation round was the ClusterOperator. By shifting the ClusterOperator out to 95, it will come in after 90's ConsolePlugin CRD, and ensure that the CVO pushes that CRD before wondering about the ClusterOperator health. I'm not bothering to move the two ConsoleLink manifests, which have been deletion manifests since 6c6c5ce (Add release.openshift.io/delete annotation to consoleLink CRDs, 2021-07-15, openshift#565, 4.9 [2]), so we are well past the days when the position of these release manifests relative to the other console manifests was significant. [1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_console-operator/782/pull-ci-openshift-console-operator-master-e2e-gcp-ovn/1687173467130040320 [2]: https://bugzilla.redhat.com/show_bug.cgi?id=1980531#c8
To sort after 90_consoleplugin.crd.yaml, which this commit series moved to 90. This avoids install-time deadlocks like [1]: level=error msg=Cluster operator monitoring Available is False with UpdatingConsolePluginComponentsFailed: reconciling Console Plugin failed: creating ConsolePlugin object failed: the server could not find the requested resource (post consoleplugins.console.openshift.io) level=error msg=Cluster operator monitoring Degraded is True with UpdatingConsolePluginComponentsFailed: reconciling Console Plugin failed: creating ConsolePlugin object failed: the server could not find the requested resource (post consoleplugins.console.openshift.io) ... level=error msg=Cluster initialization failed because one or more operators are not functioning properly. where we were deadlocking with: * Console operator was blocked by the missing ConsolePlugin CRD and not proceeding to mark its ClusterOperator Available=True. * Cluster-version operator was blocked by the sad ClusterOperator and not proceeding to push the later-manifest-name ConsolePlugin CRD. Backing up the CVO side of that claim with the manifests in the assembled release image: $ w3m -dump -cols 200 https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_console-operator/782/pull-ci-openshift-console-operator-master-e2e-gcp-ovn/1687173467130040320/artifacts/release/artifacts/release-payload-latest/ | grep 'console-operator.*\(operator[.]yaml\|consoleplugin\)' • [file] 0000_50_console-operator_03-rbac-role-ns-operator.yaml • [file] 0000_50_console-operator_07-operator.yaml • [file] 0000_50_console-operator_08-clusteroperator.yaml • [file] 0000_50_console-operator_90_consoleplugin.crd.yaml and gathered logs: $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_console-operator/782/pull-ci-openshift-console-operator-master-e2e-gcp-ovn/1687173467130040320/artifacts/e2e-gcp-ovn/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-c7fd575cc-xmc8h_cluster-version-operator.log | grep -B100000 'Result of work' | grep console | tail I0803 20:05:26.397553 1 sync_worker.go:990] Done syncing for serviceaccount "openshift-console-operator/console-operator" (576 of 858) I0803 20:05:26.397758 1 sync_worker.go:975] Running sync for serviceaccount "openshift-console/console" (577 of 858) I0803 20:05:26.448647 1 sync_worker.go:990] Done syncing for serviceaccount "openshift-console/console" (577 of 858) I0803 20:05:26.448740 1 sync_worker.go:975] Running sync for consoleclidownload "helm-download-links" (578 of 858) I0803 20:05:26.498688 1 sync_worker.go:990] Done syncing for consoleclidownload "helm-download-links" (578 of 858) I0803 20:05:26.498752 1 sync_worker.go:975] Running sync for deployment "openshift-console-operator/console-operator" (579 of 858) I0803 20:05:26.574316 1 sync_worker.go:990] Done syncing for deployment "openshift-console-operator/console-operator" (579 of 858) I0803 20:05:26.574463 1 sync_worker.go:975] Running sync for clusteroperator "console" (580 of 858) E0803 20:05:26.574829 1 task.go:117] error running apply for clusteroperator "console" (580 of 858): Cluster operator console is not available I0803 20:05:26.575435 1 task_graph.go:550] Result of work: [Cluster operator monitoring is not available Cluster operator console is not available] Showing that the last 'console' manifest reconciliation attempt in the most recently completed reconciliation round was the ClusterOperator. By shifting the ClusterOperator out to 95, it will come in after 90's ConsolePlugin CRD, and ensure that the CVO pushes that CRD before wondering about the ClusterOperator health. I'm not bothering to move the two ConsoleLink manifests, which have been deletion manifests since 6c6c5ce (Add release.openshift.io/delete annotation to consoleLink CRDs, 2021-07-15, openshift#565, 4.9 [2]), so we are well past the days when the position of these release manifests relative to the other console manifests was significant. [1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_console-operator/782/pull-ci-openshift-console-operator-master-e2e-gcp-ovn/1687173467130040320 [2]: https://bugzilla.redhat.com/show_bug.cgi?id=1980531#c8
No description provided.