@ngopalak-redhat
Contributor

@ngopalak-redhat ngopalak-redhat commented Nov 7, 2025

Added test for the dev work in OCPNODE-3719
The e2e Jira is OCPNODE-3720
Adds a test to make sure that NODE_SIZING_ENABLED is set correctly on the node. The test follows the same approach as the 4.20 PR: #30467

This PR introduces a new E2E test case to verify the behavior of the NODE_SIZING_ENABLED feature flag following the changes in openshift/machine-config-operator#5390. The primary goal is to ensure that, while the patch introduces a new MachineConfig, it does not take away a user's ability to disable node auto-sizing for reserved resources.

NOTE: [Suite:openshift/machine-config-operator/disruptive] ensures that container restart failures are not recorded for this test. It's a common pattern used for machine config tests.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 7, 2025
@openshift-ci
Contributor

openshift-ci bot commented Nov 7, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ngopalak-redhat
Contributor Author

/test all

@ngopalak-redhat
Contributor Author

/hold wait for the dev prs to merge

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 7, 2025
@ngopalak-redhat ngopalak-redhat changed the title from "Add auto-sizing-reversed test to origin" to "OCPNODE-3719: Add auto-sizing-reversed test to origin" Nov 7, 2025
@openshift-ci-robot

openshift-ci-robot commented Nov 7, 2025

@ngopalak-redhat: This pull request references OCPNODE-3719 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

Added test for OCPNODE-3719
Adds a test to make sure that NODE_SIZING_ENABLED is set correctly on the node.

The test currently fails because the dev PRs are not yet merged, but the code can be reviewed:

I1107 13:48:39.570486 485199 client.go:465] Project "e2e-test-node-sizing-dkmgw" has been fully provisioned.
 STEP: Getting a worker node to test @ 11/07/25 13:48:39.57
I1107 13:48:39.583496 485199 node_sizing.go:36] Testing on node: ci-op-lw9pip4i-4cd5c-dskbw-worker-0-266px
 STEP: Setting privileged pod security labels on namespace @ 11/07/25 13:48:39.583
namespace/e2e-test-node-sizing-dkmgw labeled
 STEP: Creating a privileged pod with /etc mounted @ 11/07/25 13:48:39.749
 STEP: Waiting for pod to be running @ 11/07/25 13:48:39.804
 STEP: Verifying /etc/node-sizing-enabled.env file exists @ 11/07/25 13:48:44.825
 STEP: Reading /etc/node-sizing-enabled.env file contents @ 11/07/25 13:48:45.071
I1107 13:48:45.376484 485199 node_sizing.go:118] Contents of /etc/node-sizing-enabled.env:
NODE_SIZING_ENABLED=false
SYSTEM_RESERVED_MEMORY=1Gi
SYSTEM_RESERVED_CPU=500m
SYSTEM_RESERVED_ES=1Gi
 STEP: Verifying NODE_SIZING_ENABLED=true is set in the file @ 11/07/25 13:48:45.376
 STEP: Cleaning up test pod @ 11/07/25 13:48:45.376
 [FAILED] in [It] - github.com/openshift/origin/test/extended/node/node_sizing.go:121 @ 11/07/25 13:48:45.401
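
The check that fails in the log above can be illustrated with a small standalone sketch. It assumes the file holds plain `KEY=VALUE` lines, as in the contents printed by the test; the helper names `parseEnvFile` and `nodeSizingEnabled` are hypothetical and are not taken from `node_sizing.go`.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseEnvFile parses simple KEY=VALUE lines, skipping blank lines,
// "#" comments, and lines without an "=" separator.
// Hypothetical helper for illustration only.
func parseEnvFile(contents string) map[string]string {
	vars := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(contents))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, found := strings.Cut(line, "=")
		if !found {
			continue
		}
		vars[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return vars
}

// nodeSizingEnabled reports whether NODE_SIZING_ENABLED is exactly "true".
func nodeSizingEnabled(contents string) bool {
	return parseEnvFile(contents)["NODE_SIZING_ENABLED"] == "true"
}

func main() {
	// File contents copied from the failing run shown in the log.
	contents := "NODE_SIZING_ENABLED=false\n" +
		"SYSTEM_RESERVED_MEMORY=1Gi\n" +
		"SYSTEM_RESERVED_CPU=500m\n" +
		"SYSTEM_RESERVED_ES=1Gi\n"
	fmt.Println("node sizing enabled:", nodeSizingEnabled(contents))
}
```

With the contents from the failing run, the helper reports that node sizing is not enabled, which matches the `[FAILED]` step above.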

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Nov 7, 2025
@ngopalak-redhat
Contributor Author

cc: @asahay19

@ngopalak-redhat ngopalak-redhat marked this pull request as ready for review November 7, 2025 14:35
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 7, 2025
@openshift-ci openshift-ci bot requested review from deads2k and sjenning November 7, 2025 14:35
@ngopalak-redhat
Contributor Author

@haircommander @sairameshv Please review the test. I have kept the PR on hold so that the dev PRs can merge.

@openshift-trt

openshift-trt bot commented Nov 18, 2025

Job Failure Risk Analysis for sha: 9083a46

| Job Name | Failure Risk |
| --- | --- |
| pull-ci-openshift-origin-main-e2e-vsphere-ovn | High |

[Monitor:known-image-checker][sig-arch] Only known images used by tests
This test has passed 100.00% of 18 runs on release 4.21 [Architecture:amd64 FeatureSet:default Installer:ipi JobTier:standard Network:ovn NetworkStack:ipv4 Owner:eng Platform:vsphere Procedure:none SecurityMode:default Topology:ha Upgrade:none] in the last week.

Risk analysis has seen new tests most likely introduced by this PR.
Please ensure that new tests meet guidelines for naming and stability.

New Test Risks for sha: 9083a46

| Job Name | New Test Risk |
| --- | --- |
| pull-ci-openshift-origin-main-e2e-aws-ovn-fips | High - "[sig-node] Node sizing should have NODE_SIZING_ENABLED=true in /etc/node-sizing-enabled.env [Suite:openshift/conformance/parallel]" is a new test that failed 1 time(s) against the current commit |
| pull-ci-openshift-origin-main-e2e-metal-ipi-ovn-ipv6 | High - "[sig-node] Node sizing should have NODE_SIZING_ENABLED=true in /etc/node-sizing-enabled.env [Suite:openshift/conformance/parallel]" is a new test that was not present in all runs against the current commit, and also failed 1 time(s). |
| pull-ci-openshift-origin-main-e2e-vsphere-ovn | High - "[sig-node] Node sizing should have NODE_SIZING_ENABLED=true in /etc/node-sizing-enabled.env [Suite:openshift/conformance/parallel]" is a new test that failed 1 time(s) against the current commit |
| pull-ci-openshift-origin-main-e2e-vsphere-ovn-upi | High - "[sig-node] Node sizing should have NODE_SIZING_ENABLED=true in /etc/node-sizing-enabled.env [Suite:openshift/conformance/parallel]" is a new test that failed 1 time(s) against the current commit |

New tests seen in this PR at sha: 9083a46

  • "[sig-node] Node sizing should have NODE_SIZING_ENABLED=true in /etc/node-sizing-enabled.env [Suite:openshift/conformance/parallel]" [Total: 5, Pass: 1, Fail: 4, Flake: 0]

@openshift-ci-robot

openshift-ci-robot commented Dec 1, 2025

@ngopalak-redhat: This pull request references OCPNODE-3719 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

@ngopalak-redhat ngopalak-redhat changed the title from "OCPNODE-3719: Add auto-sizing-reversed test to origin" to "OCPNODE-3720: Add auto-sizing-reversed test to origin" Dec 1, 2025
@openshift-ci openshift-ci bot changed the title from "OCPNODE-3720: Add auto-sizing-reversed test to origin" to "OCPNODE-3719: Add auto-sizing-reversed test to origin" Dec 1, 2025
@ngopalak-redhat ngopalak-redhat changed the title from "OCPNODE-3719: Add auto-sizing-reversed test to origin" to "OCPNODE-3720: Add auto-sizing-reversed test to origin" Dec 1, 2025
@openshift-ci-robot

openshift-ci-robot commented Dec 1, 2025

@ngopalak-redhat: This pull request references OCPNODE-3720 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

@ngopalak-redhat ngopalak-redhat force-pushed the ngopalak/node_sizing_test branch from 9083a46 to 7ee723c Compare December 1, 2025 19:01
@ngopalak-redhat
Contributor Author

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 1, 2025
@ngopalak-redhat
Contributor Author

/retest-required

@ngopalak-redhat
Contributor Author

/test all

@ngopalak-redhat
Contributor Author

/jira-refresh

@ngopalak-redhat
Contributor Author

/jira refresh

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 2, 2025
@ngopalak-redhat
Contributor Author

/assign @stbenjam or @neisw Please review
This is similar to PR #30467. It targets the main branch and tests that AutoSizingReserved is true by default for new clusters.

@openshift-ci
Contributor

openshift-ci bot commented Dec 2, 2025

@ngopalak-redhat: GitHub didn't allow me to assign the following users: review, or, Please.

Note that only openshift members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@neisw
Contributor

neisw commented Dec 4, 2025

The only concern I have is adding 5m30s to the serial job

Does it need to be in both your disruptive job and main serial?

[Suite:openshift/machine-config-operator/disruptive][Suite:openshift/conformance/serial][Serial][sig-node] Node sizing should have NODE_SIZING_ENABLED=true by default and NODE_SIZING_ENABLED=false when KubeletConfig with autoSizingReserved=false is applied 5m30s

@ngopalak-redhat
Contributor Author

> The only concern I have is adding 5m30s to the serial job
>
> Does it need to be in both your disruptive job and main serial?
>
> [Suite:openshift/machine-config-operator/disruptive][Suite:openshift/conformance/serial][Serial][sig-node] Node sizing should have NODE_SIZING_ENABLED=true by default and NODE_SIZING_ENABLED=false when KubeletConfig with autoSizingReserved=false is applied 5m30s

@neisw I explored several other approaches, but here are the reasons for keeping it both disruptive and serial:

  • Serial: The test cannot run in parallel because it triggers a node restart and executes the kubelet-auto-sizing script during startup, which could interfere with other tests.

  • Time to run: To minimize execution time, we ensured that the test affects only a single node and reuses an existing node rather than provisioning a new one.

  • Disruptive: If not marked [Disruptive], the node restarts trigger 4 container restarts, which exceeds the suite's limit of 3. We considered increasing the allowed restart limit, but that risks hiding real failures in other tests.
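
To make the suite membership discussed above concrete, here is a minimal sketch that extracts the bracketed labels from a test name such as the one quoted in this thread. It only illustrates how the `[Suite:...]` and `[Serial]` labels can be read out of the name; it is not how origin itself selects tests, and the function names `tags` and `suites` are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// tagPattern matches one bracketed label such as [Serial] or [Suite:foo/bar].
var tagPattern = regexp.MustCompile(`\[([^\[\]]+)\]`)

// tags returns every bracketed label in a test name, without the brackets.
func tags(name string) []string {
	var out []string
	for _, m := range tagPattern.FindAllStringSubmatch(name, -1) {
		out = append(out, m[1])
	}
	return out
}

// suites returns only the labels of the form "Suite:<name>", with the
// "Suite:" prefix removed.
func suites(name string) []string {
	var out []string
	for _, t := range tags(name) {
		if strings.HasPrefix(t, "Suite:") {
			out = append(out, strings.TrimPrefix(t, "Suite:"))
		}
	}
	return out
}

func main() {
	// Test name as quoted in the review comment.
	name := "[Suite:openshift/machine-config-operator/disruptive]" +
		"[Suite:openshift/conformance/serial][Serial][sig-node] " +
		"Node sizing should have NODE_SIZING_ENABLED=true by default"
	for _, s := range suites(name) {
		fmt.Println("suite:", s)
	}
}
```

For the quoted name this lists both the MCO disruptive suite and the conformance serial suite, which is the double membership the reviewer asked about.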

@neisw
Contributor

neisw commented Dec 4, 2025

/approve

@openshift-ci
Contributor

openshift-ci bot commented Dec 4, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: haircommander, neisw, ngopalak-redhat, sairameshv

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 4, 2025
@ngopalak-redhat
Contributor Author

/label acknowledge-critical-fixes-only
The PR for the MCO repo has merged, and this test case is required to validate it.

@openshift-ci openshift-ci bot added the acknowledge-critical-fixes-only Indicates if the issuer of the label is OK with the policy. label Dec 5, 2025
@openshift-ci-robot

Scheduling required tests:
/test e2e-aws-csi
/test e2e-aws-ovn-fips
/test e2e-aws-ovn-microshift
/test e2e-aws-ovn-microshift-serial
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-gcp-csi
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upgrade
/test e2e-metal-ipi-ovn-ipv6
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-upi

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD fde4688 and 2 for PR HEAD 645fe40 in total

@ngopalak-redhat
Contributor Author

/test e2e-vsphere-ovn

@ngopalak-redhat
Contributor Author

/test e2e-metal-ipi-ovn-ipv6
/test e2e-aws-ovn-serial-1of2

@ngopalak-redhat
Contributor Author

/test e2e-metal-ipi-ovn-ipv6

@ngopalak-redhat
Contributor Author

/test e2e-vsphere-ovn
/test e2e-metal-ipi-ovn-ipv6

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD e76daa0 and 1 for PR HEAD 645fe40 in total

@ngopalak-redhat
Contributor Author

/test e2e-vsphere-ovn

@openshift-ci
Contributor

openshift-ci bot commented Dec 5, 2025

@ngopalak-redhat: all tests passed!

@openshift-merge-bot openshift-merge-bot bot merged commit 908502c into openshift:main Dec 5, 2025
20 checks passed
@djoshy
Contributor

djoshy commented Dec 8, 2025

/payload-job periodic-ci-openshift-machine-config-operator-release-4.21-periodics-e2e-gcp-mco-disruptive

Hello! I was adding another test to the MCO's disruptive suite when I noticed some failures which seem related to the tests added in this PR.

I1205 19:10:35.634472 19806 pinnedimages.go:610] Node 'ci-op-s0d9pbry-eefda-8smkd-worker-a-rphl7' has is not yet ready and has the current config `rendered-node-sizing-test-80b3c4b36ffd7618b95eb2f5e2ca7156`

This references the config of the custom MCP used in this test. Is it being cleaned up properly? Here is the actual test link: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/openshift-machine-config-operator-5361-periodics-e2e-gcp-mco-disruptive/1996971574372077568

When adding e2es to the MCO disruptive suite, please payload-test them against the MCO's disruptive jobs (the default presubmits in origin don't cover them) and have someone from the MCO team review them as well. This affects the MCO's component readiness.

@openshift-ci
Contributor

openshift-ci bot commented Dec 8, 2025

@djoshy: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-machine-config-operator-release-4.21-periodics-e2e-gcp-mco-disruptive

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/14c712d0-d448-11f0-81f3-8e907bd7726b-0

@isabella-janssen
Member

/payload-job periodic-ci-openshift-machine-config-operator-release-4.21-periodics-e2e-gcp-mco-disruptive

@openshift-ci
Contributor

openshift-ci bot commented Dec 8, 2025

@isabella-janssen: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-machine-config-operator-release-4.21-periodics-e2e-gcp-mco-disruptive

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/67a7df50-d481-11f0-82e7-cdf77737dc5e-0

Labels

  • acknowledge-critical-fixes-only: Indicates if the issuer of the label is OK with the policy.
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • verified: Signifies that the PR passed pre-merge verification criteria
