Skip to content

Conversation

@jhadvig
Copy link
Member

@jhadvig jhadvig commented Jul 15, 2021

No description provided.

@openshift-ci openshift-ci bot requested review from bparees and spadgett July 15, 2021 14:44
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 15, 2021
@spadgett
Copy link
Member

Looks like we'll need to update the console e2e tests which expect the items we're removing to be there. cc @kdoberst

@kdoberst
Copy link

Looks like we'll need to update the console e2e tests which expect the items we're removing to be there. cc @kdoberst

@spadgett I'm assuming the console e2e tests are tests that cover both the console-operator and the console code via the browser. With the above changes, the tests are going to fail because the links are missing. But once the code for openshift/console#9495 is merged in, the links will display because the console code will be providing the links and the tests (in theory) will start passing.

The changes proposed in openshift/console#9495 aren't dependent on these console-operator changes. There is code to ignore these two specific help links if they happen to be passed from the console-operator. So here is what I'm thinking ... once openshift/console#9495 is merged in, re-run the tests for this PR and see if they all pass. Thoughts?

@jhadvig jhadvig changed the title [DO NOT MERGE] Add release.openshift.io/delete annotation to consoleLink CRDs Add release.openshift.io/delete annotation to consoleLink CRDs Jul 16, 2021
@jhadvig jhadvig changed the title Add release.openshift.io/delete annotation to consoleLink CRDs Bug 1980531: Add release.openshift.io/delete annotation to consoleLink CRDs Jul 16, 2021
@openshift-ci openshift-ci bot added the bugzilla/severity-low Referenced Bugzilla bug's severity is low for the branch this PR is targeting. label Jul 16, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jul 16, 2021

@jhadvig: This pull request references Bugzilla bug 1980531, which is valid. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.9.0) matches configured target release for branch (4.9.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @yapei

Details

In response to this:

Bug 1980531: Add release.openshift.io/delete annotation to consoleLink CRDs

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Jul 16, 2021
@openshift-ci openshift-ci bot requested a review from yapei July 16, 2021 08:16
@jhadvig
Copy link
Member Author

jhadvig commented Jul 16, 2021

once openshift/console#9495 is merged in, re-run the tests for this PR and see if they all pass. Thoughts?

👍

@spadgett
Copy link
Member

The changes proposed in openshift/console#9495 aren't dependent on these console-operator changes. There is code to ignore these two specific help links if they happen to be passed from the console-operator. So here is what I'm thinking ... once openshift/console#9495 is merged in, re-run the tests for this PR and see if they all pass. Thoughts?

My only hesitation is that the checks to ignore the help links won't be needed with this change, so we'd want a follow up PR to remove those. It's going to be 3 PRs either way. It might be simpler to

  1. Update the console tests in one PR
  2. Merge the console-operator PR that removes the ConsoleLink resources
  3. Merge the console PR that adds the links back

@jhadvig
Copy link
Member Author

jhadvig commented Jul 16, 2021

  1. Merge the console-operator PR that removes the ConsoleLink resources

Not 100% sure if CVO will remove the ConsoleLink CRs on update. But I thing that the annotation is for the upgrade case, so the when updating cluster from 4.8. to 4.9. it will remove the manifest.

@kdoberst
Copy link

@jhadvig @spadgett My concern is that we don't set up a situation where the tests start failing for other PRs and that we don't accidentally set up a situation where the links are showing up as duplicates (once from the operator and once from the new console code). So how about this approach:

  1. Close the current PR for adding the translated links in the console code. A new PR is created to modify the console to not show any help links coming from the console. In this same PR the tests are modified to verify that there are no help links.
  2. Merge the changes for this PR.
  3. Put in the code to add the translation link into the console code, add that ability for the operator to add links and finally modify the tests to verify there are the help links without duplicates. It is at this point that we verify the manifests were removed.

The down side is that there is an issue getting step 3 in, then there is a risk that 4.9 will be released without the help links or having to reverse steps 1 and 2.

@jhadvig
Copy link
Member Author

jhadvig commented Jul 22, 2021

@kdoberst I think that if we go with the actions as @spadgett suggested in his comment we should be OK :)

@kdoberst
Copy link

kdoberst commented Aug 2, 2021

Opened a new PR and modified an existing PR to complete the fix for Bug 1980531.

This PR should be merged in after openshift/console#9702 which removes a test and allow tests on this PR to pass.

As soon as this PR is merged, then openshift/console#9495 openshift/console#9811 should be merged.

@kdoberst
Copy link

kdoberst commented Aug 4, 2021

/retest

Copy link
Member

@spadgett spadgett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 4, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 4, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jhadvig, spadgett

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 4, 2021

@jhadvig: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-aws-single-node 6c6c5ce link /test e2e-aws-single-node

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

2 similar comments
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci openshift-ci bot merged commit 4d133ea into openshift:master Aug 5, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 5, 2021

@jhadvig: An error was encountered searching for external tracker bugs for bug 1980531 on the Bugzilla server at https://bugzilla.redhat.com. No known errors were detected, please see the full error message for details.

Full error message. could not parse external identifier "openshift/console/pull/9385#pullrequestreview-702386658" as pull: invalid pull identifier: could not parse 9385#pullrequestreview-702386658 as number: strconv.Atoi: parsing "9385#pullrequestreview-702386658": invalid syntax

Please contact an administrator to resolve this issue, then request a bug refresh with /bugzilla refresh.

Details

In response to this:

Bug 1980531: Add release.openshift.io/delete annotation to consoleLink CRDs

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

wking added a commit to wking/console-operator that referenced this pull request Aug 3, 2023
To sort after 90_consoleplugin.crd.yaml, which this commit series
moved to 90.  This avoids install-time deadlocks like [1]:

  level=error msg=Cluster operator monitoring Available is False with UpdatingConsolePluginComponentsFailed: reconciling Console Plugin failed: creating ConsolePlugin object failed: the server could not find the requested resource (post consoleplugins.console.openshift.io)
  level=error msg=Cluster operator monitoring Degraded is True with UpdatingConsolePluginComponentsFailed: reconciling Console Plugin failed: creating ConsolePlugin object failed: the server could not find the requested resource (post consoleplugins.console.openshift.io)
  ...
  level=error msg=Cluster initialization failed because one or more operators are not functioning properly.

where we were deadlocking with:

* Console operator was blocked by the missing ConsolePlugin CRD and
  not proceeding to mark its ClusterOperator Available=True.
* Cluster-version operator was blocked by the sad ClusterOperator and
  not proceeding to push the later-manifest-name ConsolePlugin CRD.

Backing up the CVO side of that claim with the manifests in the
assembled release image:

  $ w3m -dump -cols 200 https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_console-operator/782/pull-ci-openshift-console-operator-master-e2e-gcp-ovn/1687173467130040320/artifacts/release/artifacts/release-payload-latest/ | grep 'console-operator.*\(operator[.]yaml\|consoleplugin\)'
    • [file] 0000_50_console-operator_03-rbac-role-ns-operator.yaml
    • [file] 0000_50_console-operator_07-operator.yaml
    • [file] 0000_50_console-operator_08-clusteroperator.yaml
    • [file] 0000_50_console-operator_90_consoleplugin.crd.yaml

and gathered logs:

  $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_console-operator/782/pull-ci-openshift-console-operator-master-e2e-gcp-ovn/1687173467130040320/artifacts/e2e-gcp-ovn/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-c7fd575cc-xmc8h_cluster-version-operator.log | grep -B100000 'Result of work' | grep console | tail
  I0803 20:05:26.397553       1 sync_worker.go:990] Done syncing for serviceaccount "openshift-console-operator/console-operator" (576 of 858)
  I0803 20:05:26.397758       1 sync_worker.go:975] Running sync for serviceaccount "openshift-console/console" (577 of 858)
  I0803 20:05:26.448647       1 sync_worker.go:990] Done syncing for serviceaccount "openshift-console/console" (577 of 858)
  I0803 20:05:26.448740       1 sync_worker.go:975] Running sync for consoleclidownload "helm-download-links" (578 of 858)
  I0803 20:05:26.498688       1 sync_worker.go:990] Done syncing for consoleclidownload "helm-download-links" (578 of 858)
  I0803 20:05:26.498752       1 sync_worker.go:975] Running sync for deployment "openshift-console-operator/console-operator" (579 of 858)
  I0803 20:05:26.574316       1 sync_worker.go:990] Done syncing for deployment "openshift-console-operator/console-operator" (579 of 858)
  I0803 20:05:26.574463       1 sync_worker.go:975] Running sync for clusteroperator "console" (580 of 858)
  E0803 20:05:26.574829       1 task.go:117] error running apply for clusteroperator "console" (580 of 858): Cluster operator console is not available
  I0803 20:05:26.575435       1 task_graph.go:550] Result of work: [Cluster operator monitoring is not available Cluster operator console is not available]

Showing that the last 'console' manifest reconciliation attempt in the
most recently completed reconciliation round was the ClusterOperator.

By shifting the ClusterOperator out to 95, it will come in after 90's
ConsolePlugin CRD, and ensure that the CVO pushes that CRD before
wondering about the ClusterOperator health.

I'm not bothering to move the two ConsoleLink manifests, which have
been deletion manifests since 6c6c5ce (Add
release.openshift.io/delete annotation to consoleLink CRDs,
2021-07-15, openshift#565, 4.9 [2]), so we are well past the days when the
position of these release manifests relative to the other console
manifests was significant.

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_console-operator/782/pull-ci-openshift-console-operator-master-e2e-gcp-ovn/1687173467130040320
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=1980531#c8
wking added a commit to wking/console-operator that referenced this pull request Aug 3, 2023
Since these manifests became deletion manifests in 6c6c5ce (Add
release.openshift.io/delete annotation to consoleLink CRDs,
2021-07-15, openshift#565, 4.9 [2]), the only change has been 0f860bb (Add
console capability to all manifests in manifests/ and quickstarts/,
2022-07-26, openshift#665).  We can remove these deletion references now,
because a 4.8 cluster that might have included these resources should
have completed an update to 4.9 which would have removed them. And if
that failed, they should have completed an update to one of the later
4.y and removed the resources. By removing the resource, we save the
cluster-version operator some time checking to ensure deletion, and
only expose ourselves to leaking the resources on clusters that
updated from 4.8 through to 4.14 without ever having completed an
update before reaching 4.14.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1980531#c8
wking added a commit to wking/console-operator that referenced this pull request Aug 3, 2023
Since these manifests became deletion manifests in 6c6c5ce (Add
release.openshift.io/delete annotation to consoleLink CRDs,
2021-07-15, openshift#565, 4.9 [1]), the only change has been 0f860bb (Add
console capability to all manifests in manifests/ and quickstarts/,
2022-07-26, openshift#665).  We can remove these deletion references now,
because a 4.8 cluster that might have included these resources should
have completed an update to 4.9 which would have removed them. And if
that failed, they should have completed an update to one of the later
4.y and removed the resources. By removing the resource, we save the
cluster-version operator some time checking to ensure deletion, and
only expose ourselves to leaking the resources on clusters that
updated from 4.8 through to 4.14 without ever having completed an
update before reaching 4.14.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1980531#c8
wking added a commit to wking/console-operator that referenced this pull request Sep 27, 2023
To sort after 90_consoleplugin.crd.yaml, which this commit series
moved to 90.  This avoids install-time deadlocks like [1]:

  level=error msg=Cluster operator monitoring Available is False with UpdatingConsolePluginComponentsFailed: reconciling Console Plugin failed: creating ConsolePlugin object failed: the server could not find the requested resource (post consoleplugins.console.openshift.io)
  level=error msg=Cluster operator monitoring Degraded is True with UpdatingConsolePluginComponentsFailed: reconciling Console Plugin failed: creating ConsolePlugin object failed: the server could not find the requested resource (post consoleplugins.console.openshift.io)
  ...
  level=error msg=Cluster initialization failed because one or more operators are not functioning properly.

where we were deadlocking with:

* Console operator was blocked by the missing ConsolePlugin CRD and
  not proceeding to mark its ClusterOperator Available=True.
* Cluster-version operator was blocked by the sad ClusterOperator and
  not proceeding to push the later-manifest-name ConsolePlugin CRD.

Backing up the CVO side of that claim with the manifests in the
assembled release image:

  $ w3m -dump -cols 200 https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_console-operator/782/pull-ci-openshift-console-operator-master-e2e-gcp-ovn/1687173467130040320/artifacts/release/artifacts/release-payload-latest/ | grep 'console-operator.*\(operator[.]yaml\|consoleplugin\)'
    • [file] 0000_50_console-operator_03-rbac-role-ns-operator.yaml
    • [file] 0000_50_console-operator_07-operator.yaml
    • [file] 0000_50_console-operator_08-clusteroperator.yaml
    • [file] 0000_50_console-operator_90_consoleplugin.crd.yaml

and gathered logs:

  $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_console-operator/782/pull-ci-openshift-console-operator-master-e2e-gcp-ovn/1687173467130040320/artifacts/e2e-gcp-ovn/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-c7fd575cc-xmc8h_cluster-version-operator.log | grep -B100000 'Result of work' | grep console | tail
  I0803 20:05:26.397553       1 sync_worker.go:990] Done syncing for serviceaccount "openshift-console-operator/console-operator" (576 of 858)
  I0803 20:05:26.397758       1 sync_worker.go:975] Running sync for serviceaccount "openshift-console/console" (577 of 858)
  I0803 20:05:26.448647       1 sync_worker.go:990] Done syncing for serviceaccount "openshift-console/console" (577 of 858)
  I0803 20:05:26.448740       1 sync_worker.go:975] Running sync for consoleclidownload "helm-download-links" (578 of 858)
  I0803 20:05:26.498688       1 sync_worker.go:990] Done syncing for consoleclidownload "helm-download-links" (578 of 858)
  I0803 20:05:26.498752       1 sync_worker.go:975] Running sync for deployment "openshift-console-operator/console-operator" (579 of 858)
  I0803 20:05:26.574316       1 sync_worker.go:990] Done syncing for deployment "openshift-console-operator/console-operator" (579 of 858)
  I0803 20:05:26.574463       1 sync_worker.go:975] Running sync for clusteroperator "console" (580 of 858)
  E0803 20:05:26.574829       1 task.go:117] error running apply for clusteroperator "console" (580 of 858): Cluster operator console is not available
  I0803 20:05:26.575435       1 task_graph.go:550] Result of work: [Cluster operator monitoring is not available Cluster operator console is not available]

Showing that the last 'console' manifest reconciliation attempt in the
most recently completed reconciliation round was the ClusterOperator.

By shifting the ClusterOperator out to 95, it will come in after 90's
ConsolePlugin CRD, and ensure that the CVO pushes that CRD before
wondering about the ClusterOperator health.

I'm not bothering to move the two ConsoleLink manifests, which have
been deletion manifests since 6c6c5ce (Add
release.openshift.io/delete annotation to consoleLink CRDs,
2021-07-15, openshift#565, 4.9 [2]), so we are well past the days when the
position of these release manifests relative to the other console
manifests was significant.

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_console-operator/782/pull-ci-openshift-console-operator-master-e2e-gcp-ovn/1687173467130040320
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=1980531#c8
wking added a commit to wking/console-operator that referenced this pull request Oct 3, 2023
To sort after 90_consoleplugin.crd.yaml, which this commit series
moved to 90.  This avoids install-time deadlocks like [1]:

  level=error msg=Cluster operator monitoring Available is False with UpdatingConsolePluginComponentsFailed: reconciling Console Plugin failed: creating ConsolePlugin object failed: the server could not find the requested resource (post consoleplugins.console.openshift.io)
  level=error msg=Cluster operator monitoring Degraded is True with UpdatingConsolePluginComponentsFailed: reconciling Console Plugin failed: creating ConsolePlugin object failed: the server could not find the requested resource (post consoleplugins.console.openshift.io)
  ...
  level=error msg=Cluster initialization failed because one or more operators are not functioning properly.

where we were deadlocking with:

* Console operator was blocked by the missing ConsolePlugin CRD and
  not proceeding to mark its ClusterOperator Available=True.
* Cluster-version operator was blocked by the sad ClusterOperator and
  not proceeding to push the later-manifest-name ConsolePlugin CRD.

Backing up the CVO side of that claim with the manifests in the
assembled release image:

  $ w3m -dump -cols 200 https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_console-operator/782/pull-ci-openshift-console-operator-master-e2e-gcp-ovn/1687173467130040320/artifacts/release/artifacts/release-payload-latest/ | grep 'console-operator.*\(operator[.]yaml\|consoleplugin\)'
    • [file] 0000_50_console-operator_03-rbac-role-ns-operator.yaml
    • [file] 0000_50_console-operator_07-operator.yaml
    • [file] 0000_50_console-operator_08-clusteroperator.yaml
    • [file] 0000_50_console-operator_90_consoleplugin.crd.yaml

and gathered logs:

  $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_console-operator/782/pull-ci-openshift-console-operator-master-e2e-gcp-ovn/1687173467130040320/artifacts/e2e-gcp-ovn/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-c7fd575cc-xmc8h_cluster-version-operator.log | grep -B100000 'Result of work' | grep console | tail
  I0803 20:05:26.397553       1 sync_worker.go:990] Done syncing for serviceaccount "openshift-console-operator/console-operator" (576 of 858)
  I0803 20:05:26.397758       1 sync_worker.go:975] Running sync for serviceaccount "openshift-console/console" (577 of 858)
  I0803 20:05:26.448647       1 sync_worker.go:990] Done syncing for serviceaccount "openshift-console/console" (577 of 858)
  I0803 20:05:26.448740       1 sync_worker.go:975] Running sync for consoleclidownload "helm-download-links" (578 of 858)
  I0803 20:05:26.498688       1 sync_worker.go:990] Done syncing for consoleclidownload "helm-download-links" (578 of 858)
  I0803 20:05:26.498752       1 sync_worker.go:975] Running sync for deployment "openshift-console-operator/console-operator" (579 of 858)
  I0803 20:05:26.574316       1 sync_worker.go:990] Done syncing for deployment "openshift-console-operator/console-operator" (579 of 858)
  I0803 20:05:26.574463       1 sync_worker.go:975] Running sync for clusteroperator "console" (580 of 858)
  E0803 20:05:26.574829       1 task.go:117] error running apply for clusteroperator "console" (580 of 858): Cluster operator console is not available
  I0803 20:05:26.575435       1 task_graph.go:550] Result of work: [Cluster operator monitoring is not available Cluster operator console is not available]

Showing that the last 'console' manifest reconciliation attempt in the
most recently completed reconciliation round was the ClusterOperator.

By shifting the ClusterOperator out to 95, it will come in after 90's
ConsolePlugin CRD, and ensure that the CVO pushes that CRD before
wondering about the ClusterOperator health.

I'm not bothering to move the two ConsoleLink manifests, which have
been deletion manifests since 6c6c5ce (Add
release.openshift.io/delete annotation to consoleLink CRDs,
2021-07-15, openshift#565, 4.9 [2]), so we are well past the days when the
position of these release manifests relative to the other console
manifests was significant.

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_console-operator/782/pull-ci-openshift-console-operator-master-e2e-gcp-ovn/1687173467130040320
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=1980531#c8
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-low Referenced Bugzilla bug's severity is low for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants