OCPNODE-3877: add normal grace period allow non-drain updates to complete by QiWang19 · Pull Request #30480 · openshift/origin

QiWang19 · 2025-11-11T21:59:13Z

The logic now waits up to 2 minutes (the normal rollout time for non-drain updates) before reporting an error when a pool requires an update but its nodes are not ready.
This lets non-drain updates complete successfully, for example shipping a default ClusterImagePolicy during an upgrade (openshift/cluster-update-keys#85).
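
As a rough illustration of the grace-period idea described above, the sketch below polls a check until it passes or a deadline expires. It is written in Python for brevity and is not the Go code changed by this PR. `wait_with_grace`, `FakeClock`, and `pool_done_at` are hypothetical names, and the 10-second poll interval is an assumption.

```python
import time
from typing import Callable


def wait_with_grace(
    pool_updated: Callable[[], bool],
    grace_seconds: float = 120.0,   # 2-minute grace period from the PR description
    poll_seconds: float = 10.0,     # assumed poll interval, not taken from the PR
    now: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if pool_updated() becomes true before the grace period ends."""
    deadline = now() + grace_seconds
    while True:
        if pool_updated():
            return True
        if now() >= deadline:
            # Grace period exhausted: the caller should report the error now.
            return False
        # Never sleep past the deadline.
        sleep(min(poll_seconds, deadline - now()))


class FakeClock:
    """Deterministic clock so the example runs instantly (hypothetical helper)."""

    def __init__(self) -> None:
        self.t = 0.0

    def now(self) -> float:
        return self.t

    def sleep(self, seconds: float) -> None:
        self.t += seconds


def pool_done_at(clock: FakeClock, finish_at: float) -> Callable[[], bool]:
    """Simulate a pool whose update completes at a given clock time."""
    return lambda: clock.now() >= finish_at


if __name__ == "__main__":
    for finish_at in (90.0, 135.0):
        clock = FakeClock()
        ok = wait_with_grace(pool_done_at(clock, finish_at),
                             now=clock.now, sleep=clock.sleep)
        print(f"rollout finishing at {finish_at:.0f}s -> accepted={ok}")
```

With the fake clock, a rollout that finishes after 90 seconds is accepted, while one that needs 135 seconds exhausts the 2-minute window and is reported as a failure.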

QiWang19 · 2025-11-11T22:04:51Z

/testwith openshift/cluster-update-keys/main/e2e-aws-upgrade openshift/cluster-update-keys#85 #30480

openshift-ci-robot · 2025-11-12T02:44:41Z

@QiWang19: This pull request references OCPNODE-3877 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

QiWang19 · 2025-11-12T02:46:08Z

/verified by @QiWang19

/testwith openshift/cluster-update-keys/main/e2e-aws-upgrade openshift/cluster-update-keys#85 #30480 passed

https://prow.ci.openshift.org/view/gs/test-platform-results/logs/multi-pr-openshift-origin-30480-openshift-cluster-update-keys-85-openshift-origin-30480-e2e-aws-upgrade/1988367407705493504

openshift-ci-robot · 2025-11-12T02:46:20Z

@QiWang19: This PR has been marked as verified by @QiWang19.

QiWang19 · 2025-11-12T20:47:21Z

/verified by @QiWang19

/testwith openshift/cluster-update-keys/main/e2e-aws-upgrade openshift/cluster-update-keys#85 #30480 passed

https://prow.ci.openshift.org/view/gs/test-platform-results/logs/multi-pr-openshift-origin-30480-openshift-cluster-update-keys-85-openshift-origin-30480-e2e-aws-upgrade/1988367407705493504

openshift-ci-robot · 2025-11-12T20:47:32Z

@QiWang19: This PR has been marked as verified by @QiWang19.

QiWang19 · 2025-11-12T20:56:14Z

/testwith openshift/cluster-update-keys/main/e2e-aws-upgrade openshift/cluster-update-keys#85 #30480

QiWang19 · 2025-11-13T01:25:08Z

/verified by @QiWang19

/testwith openshift/cluster-update-keys/main/e2e-aws-upgrade openshift/cluster-update-keys#85 #30480 passed

openshift-ci-robot · 2025-11-13T01:25:19Z

@QiWang19: This PR has been marked as verified by @QiWang19.

Details

In response to this:

/verified by @QiWang19

/testwith openshift/cluster-update-keys/main/e2e-aws-upgrade openshift/cluster-update-keys#85 #30480 passed

https://prow.ci.openshift.org/view/gs/test-platform-results/logs/multi-pr-openshift-origin-30480-openshift-cluster-update-keys-85-openshift-origin-30480-e2e-aws-upgrade/1988367407705493504

https://prow.ci.openshift.org/view/gs/test-platform-results/logs/multi-pr-openshift-origin-30480-openshift-cluster-update-keys-85-openshift-origin-30480-e2e-aws-upgrade/1988712562203561984

neisw · 2025-11-13T18:52:03Z

/approve

wking

/lgtm

openshift-ci · 2025-11-13T19:42:47Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: neisw, QiWang19, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [neisw]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci-robot · 2025-11-13T20:00:24Z

/retest-required

Remaining retests: 0 against base HEAD 782ff8b and 2 for PR HEAD 2fd0d8e in total

QiWang19 · 2025-11-14T16:48:10Z

/retest-required

openshift-ci-robot · 2025-11-14T22:17:54Z

/retest-required

Remaining retests: 0 against base HEAD b7d2a64 and 1 for PR HEAD 2fd0d8e in total

openshift-ci · 2025-11-15T01:35:00Z

@QiWang19: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn-upgrade-rollback | `2fd0d8e` | link | false | `/test e2e-aws-ovn-upgrade-rollback` |

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci-robot · 2025-11-15T10:17:27Z

/retest-required

Remaining retests: 0 against base HEAD 8aaf884 and 0 for PR HEAD 2fd0d8e in total

…-openshift-cip""

This reverts commit 7a5dcee.

This one has taken us some time:

* 2025-08-27, 94f7582, openshift#82 was our first attempt at enabling the ClusterImagePolicy.
* ...but it tripped up the origin test suite, so it was reverted in 2025-08-28, c40e7b9, openshift#83.
* Qi then hardened the test suite with openshift/origin@d3af51e4acb (not fail upgrade checks if all nodes are ready, 2025-09-29, openshift/origin#30318) and openshift/origin@2fd0d8e242 (Upgrade test add 2min grace period allow non-drain updates to complete, 2025-11-12, openshift/origin#30480).
* With the tougher CI in place, we tried a second time with 2025-11-17, 1f89a67, openshift#85.
* ...but still tripped up origin, with runs like [1] taking 2.25m (more than the 2m grace period):

      I1119 17:26:21.890667 1511 upgrade.go:629] Waiting on pools to be upgraded
      I1119 17:26:21.939178 1511 upgrade.go:792] Pool master is still reporting (Updated: false, Updating: true, Degraded: false)
      I1119 17:26:21.939259 1511 upgrade.go:666] Invariant violation detected: master pool requires update but nodes not ready. Waiting up to 2m0s for non-draining updates to complete
      I1119 17:26:31.984116 1511 upgrade.go:792] Pool master is still reporting (Updated: false, Updating: true, Degraded: false)
      ...
      I1119 17:28:21.981438 1511 upgrade.go:792] Pool master is still reporting (Updated: false, Updating: true, Degraded: false)
      I1119 17:28:21.981514 1511 upgrade.go:673] Invariant violation detected: the "master" pool should be updated before the CVO reports available at the new version

  and:

      $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade/1991158541779472384/artifacts/e2e-gcp-ovn-rt-upgrade/gather-extra/artifacts/inspect/cluster-scoped-resources/machineconfiguration.openshift.io/machineconfigpools/master.yaml | yaml2json | jq -r '.status.conditions[] | select(.type == "Updating") | .lastTransitionTime + " " + .status'
      2025-11-19T17:28:36Z False

  28:36 - 26:21 = 135s = 2.25m, which overshot the 2m grace period. The second attempt was reverted in 7a5dcee, openshift#87.
* Qi then hardened the test suite further with openshift/origin@c17e560263 (Update grace period for cluster upgrade to 10 minutes, 2025-11-19, #openshift/origin#30506).
* This commit is taking a third attempt at enabling the ClusterImagePolicy.

[1]: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade/1991158541779472384
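
The timing arithmetic in the commit message above (28:36 - 26:21 = 135s) can be checked with a few lines of Python. This is only a sketch: `elapsed_between` is a hypothetical helper, and it assumes the log timestamp 17:26:21 and the `lastTransitionTime` of 2025-11-19T17:28:36Z are on the same day and in the same timezone, as the commit message's own subtraction implies.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def elapsed_between(start: str, end: str) -> timedelta:
    """Return end - start for two timestamps in TIME_FORMAT."""
    return datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)


# Start of the grace-period wait, from the log line stamped I1119 17:26:21
# (year and timezone are assumptions; the log line does not carry them).
wait_started = "2025-11-19T17:26:21Z"
# When the master pool's Updating condition flipped to False.
updating_cleared = "2025-11-19T17:28:36Z"

elapsed = elapsed_between(wait_started, updating_cleared)

# Compare against the original 2-minute and the later 10-minute grace period.
for grace in (timedelta(minutes=2), timedelta(minutes=10)):
    verdict = "within" if elapsed <= grace else "overshoots"
    print(f"{elapsed.total_seconds():.0f}s {verdict} "
          f"a {grace.total_seconds() / 60:.0f}m grace period")
```

The rollout took 135 seconds, which is 15 seconds past the 2-minute window but well inside the 10-minute one.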

openshift-ci bot requested review from deads2k and p0lyn0mial November 11, 2025 22:00

QiWang19 mentioned this pull request Nov 12, 2025

OCPNODE-3611: promote openshift ClusterImagePolicy to default featureset openshift/cluster-update-keys#85

Merged

QiWang19 changed the title ~~Upgrade test add normal grace period allow non-drain updates to complete~~ OCPNODE-3877: Upgrade test add normal grace period allow non-drain updates to complete Nov 12, 2025

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Nov 12, 2025

openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Nov 12, 2025

QiWang19 mentioned this pull request Nov 12, 2025

Allow non-drained updates based on node schedulability and kubelet status #30477

Closed

QiWang19 changed the title ~~OCPNODE-3877: Upgrade test add normal grace period allow non-drain updates to complete~~ OCPNODE-3877: add normal grace period allow non-drain updates to complete Nov 12, 2025

QiWang19 force-pushed the grace-wait-pool branch from efe3830 to 70609fd Compare November 12, 2025 20:46

openshift-ci-robot removed the verified Signifies that the PR passed pre-merge verification criteria label Nov 12, 2025

openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Nov 12, 2025

Upgrade test add 2min grace period allow non-drain updates to complete

2fd0d8e

The logic now waits for 2min (normal rollout time for non-drain updates) before reporting error if the pool requires an update but nodes are not ready. Signed-off-by: Qi Wang <qiwan@redhat.com>

QiWang19 force-pushed the grace-wait-pool branch from 70609fd to 2fd0d8e Compare November 12, 2025 20:55

openshift-ci-robot removed the verified Signifies that the PR passed pre-merge verification criteria label Nov 12, 2025

openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Nov 13, 2025

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 13, 2025

wking approved these changes Nov 13, 2025

openshift-ci bot assigned wking Nov 13, 2025

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 13, 2025

openshift-merge-bot bot merged commit 111e203 into openshift:main Nov 15, 2025
21 of 22 checks passed

wking mentioned this pull request Nov 19, 2025

TRT-2426: Revert #85 "OCPNODE-3611: promote openshift ClusterImagePolicy to default featureset" openshift/cluster-update-keys#87

Merged

wking mentioned this pull request Nov 19, 2025

TRT-2426: Third run at default feature set cluster image policy openshift/cluster-update-keys#89

Merged

4 participants
