
e2e/loadbalancer: fix e2e by skipping unscheduled nodes on discovery#1340

Merged
k8s-ci-robot merged 2 commits into kubernetes:master from mtulio:e2e-fix-unscheduled-nodes
Feb 27, 2026

Conversation

@mtulio
Contributor

@mtulio mtulio commented Feb 23, 2026

What type of PR is this?

/kind bug
/kind failing-test
/kind flake

What this PR does / why we need it:

NOTE: labeled as kind/bug, but the fix applies to the e2e tests only; nothing in the controller changes.

Skip unsupported or unschedulable worker nodes when discovering candidates for load balancer scenarios.

Some tests, such as the hairpin traffic test, discover worker nodes using the node-role.kubernetes.io label. If a discovered node carries a NoSchedule or NoExecute taint, the test fails: the workload is implemented generically and defines no tolerations, so its pod cannot run on that node.

Filtering these nodes during discovery ensures the test selects a candidate capable of hosting the workload without requiring changes to the test's pod specification.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. kind/flake Categorizes issue or PR as related to a flaky test. labels Feb 23, 2026
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Feb 23, 2026
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If cloud-provider-aws contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Feb 23, 2026
@mtulio
Contributor Author

mtulio commented Feb 23, 2026

cc @nrb @damdo

@mtulio
Contributor Author

mtulio commented Feb 23, 2026

/test pull-cloud-provider-aws-e2e-kubetest2-quick

@mtulio mtulio force-pushed the e2e-fix-unscheduled-nodes branch 4 times, most recently from 508f0af to af4ed8d on February 23, 2026 16:10
@mtulio mtulio changed the title from "e2e/loadbalancer: skip unschedulable nodes during discovery" to "e2e/loadbalancer: fix e2e by skipping unscheduled nodes on discovery" Feb 23, 2026
@mtulio
Contributor Author

mtulio commented Feb 23, 2026

/test pull-cloud-provider-aws-e2e-kubetest2-quick

@mtulio
Contributor Author

mtulio commented Feb 23, 2026

The tests matching ".*internal should be reachable with hairpinning traffic" are passing in the e2e job.

The other issues are unrelated to this change.

@jcpowermac

/lgtm

@k8s-ci-robot
Contributor

@jcpowermac: changing LGTM is restricted to collaborators

Details

In response to this:

/lgtm


Contributor

@nrb nrb left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 23, 2026
Member

@damdo damdo left a comment

/lgtm

@damdo
Member

damdo commented Feb 24, 2026

Thanks @mtulio

/assign @kmala @yue9944882 @elmiko

@damdo
Member

damdo commented Feb 24, 2026

/test pull-cloud-provider-aws-e2e-kubetest2-quick

1 similar comment
@kmala
Member

kmala commented Feb 25, 2026

/test pull-cloud-provider-aws-e2e-kubetest2-quick

@damdo
Member

damdo commented Feb 25, 2026

@kmala it looks like the pull-cloud-provider-aws-e2e-kubetest2-quick job has been permafailing for a while now, so the failure is very probably unrelated to this PR.

Screenshot 2026-02-25 at 09 22 30

Given that the full test pull-cloud-provider-aws-e2e-kubetest2 passed, are we OK with ignoring the quick one and merging the PR? TY

Contributor

@elmiko elmiko left a comment

Makes sense to me; I just have one question.

@mtulio mtulio force-pushed the e2e-fix-unscheduled-nodes branch from af4ed8d to c560eee on February 25, 2026 17:08
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 25, 2026
@mtulio mtulio force-pushed the e2e-fix-unscheduled-nodes branch from c560eee to 5873ed8 on February 25, 2026 17:09
@mtulio mtulio force-pushed the e2e-fix-unscheduled-nodes branch from 5873ed8 to add3c4d on February 25, 2026 17:13
@mtulio
Contributor Author

mtulio commented Feb 25, 2026

/test pull-cloud-provider-aws-e2e-kubetest2-quick

@mtulio
Contributor Author

mtulio commented Feb 25, 2026

Yeah, following Dam's comment above, pull-cloud-provider-aws-e2e-kubetest2-quick is permafailing; it looks like this is caused by CI infra connectivity issues.

@elmiko, feedback addressed!

@damdo
Member

damdo commented Feb 27, 2026

/test pull-cloud-provider-aws-e2e-kubetest2-quick

@mtulio
Contributor Author

mtulio commented Feb 27, 2026

/test pull-cloud-provider-aws-e2e-kubetest2-quick

Hey @kmala, the test job has been fixed by PR kubernetes/test-infra#36535. Would you mind taking a look, please? Thanks!

@kmala
Member

kmala commented Feb 27, 2026

@mtulio thanks for the changes and also for fixing the CI.
I see that the dependency review is failing. Can you also update the Go dependency for otel as part of this PR to fix it? I had to do something similar for a release branch in #1342.

The otel SDK bump to v1.40.0 fixes the CVE GO-2026-4394 that was causing
the govulncheck to fail.

Otel packages bumped from v1.36.0 → v1.40.0:
  - go.opentelemetry.io/otel
  - go.opentelemetry.io/otel/metric
  - go.opentelemetry.io/otel/sdk (this fixes GO-2026-4394)
  - go.opentelemetry.io/otel/trace
  - go.opentelemetry.io/auto/sdk (v1.1.0 → v1.2.1)

Transitive dependency also updated:
  - golang.org/x/sys (v0.38.0 → v0.40.0)
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 27, 2026
@mtulio
Contributor Author

mtulio commented Feb 27, 2026

Hey @kmala, thanks! I just bumped the otel dependency in commit go.mod: otel SDK bump to v1.40.0 fixes the CVE GO-2026-4394; govulncheck is now passing.

Would you mind taking a look again, please?

@kmala
Member

kmala commented Feb 27, 2026

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 27, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kmala, nrb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 27, 2026
@k8s-ci-robot k8s-ci-robot merged commit ea961d6 into kubernetes:master Feb 27, 2026
14 checks passed

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. kind/flake Categorizes issue or PR as related to a flaky test. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

8 participants