Fix hairpinning traffic on internal NLB by introducing TG attribute reconciler #1214

Merged
k8s-ci-robot merged 4 commits into kubernetes:master from
mtulio:fix-hairpin-feat-tg-attrib
Aug 29, 2025

Conversation

@mtulio
Contributor

@mtulio mtulio commented Jul 18, 2025

What type of PR is this?

/kind bug
/kind documentation
/kind failing-test
/kind feature

What this PR does / why we need it:

This PR introduces an annotation that lets users configure target group attributes. This configuration is required to fix the hairpinning traffic issue affecting internal NLBs in the default configuration (target type instance).

The proposal adds target group configuration flexibility through the annotation service.beta.kubernetes.io/aws-load-balancer-target-group-attributes, limited to the following attributes, as those are required to fix the hairpinning traffic:

  • preserve_client_ip.enabled
  • proxy_protocol_v2.enabled

Done checklist:

  • Isolate the e2e refactoring from the changes required for this PR: switch the hairpin connection e2e test case so it no longer accepts a failing HTTP test. See: refact e2e test config enhancing setup, logging, steps and documenting hooks #1215
  • Resolve whether it is acceptable to always reconcile target groups with the expected/default values from AWS even when the annotation is not present. Re: discussed with @elmiko that we could enforce TG attributes since the controller manages them. Research will also be performed on whether we can save state in the status.

Which issue(s) this PR fixes:
Fixes #1160

Special notes for your reviewer:

The proposal is not disruptive and follows the user's explicit configuration: it will not change the target group attributes of existing services when the new annotation is not added.

Additional discussion in the Kubernetes Slack: https://kubernetes.slack.com/archives/C0LRMHZ1T/p1755530062752269

In conformance with the k8s review policy: this PR has been assisted by: Cursor AI (PR review, documentation strings, and unit tests)

Does this PR introduce a user-facing change?:

A new annotation, `service.beta.kubernetes.io/aws-load-balancer-target-group-attributes`, has been introduced for Kubernetes Service resources of type LoadBalancer to allow configuration of Network Load Balancer (NLB) target group attributes. This enables users to resolve hairpinning issues by setting `preserve_client_ip.enabled=false` and to track source IP addresses with `proxy_protocol_v2.enabled=true`, when the backend supports it. Using this feature requires the `DescribeTargetGroupAttributes` and `ModifyTargetGroupAttributes` IAM permissions.
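
To make the IAM requirement concrete, the following Go sketch builds a policy document granting the two actions named in the release note. The `elasticloadbalancing:` prefix is the standard IAM service prefix for ELBv2 actions. The `"*"` resource scope and the type and function names are assumptions for illustration, not something this PR prescribes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statement and policy model just enough of an IAM policy document for this example.
type statement struct {
	Effect   string   `json:"Effect"`
	Action   []string `json:"Action"`
	Resource string   `json:"Resource"`
}

type policy struct {
	Version   string      `json:"Version"`
	Statement []statement `json:"Statement"`
}

// targetGroupAttributesPolicy is a hypothetical helper returning a policy that
// allows reading and modifying target group attributes.
func targetGroupAttributesPolicy() policy {
	return policy{
		Version: "2012-10-17",
		Statement: []statement{{
			Effect: "Allow",
			Action: []string{
				"elasticloadbalancing:DescribeTargetGroupAttributes",
				"elasticloadbalancing:ModifyTargetGroupAttributes",
			},
			// Assumption: "*" keeps the example short; scope it down in practice.
			Resource: "*",
		}},
	}
}

func main() {
	out, err := json.MarshalIndent(targetGroupAttributesPolicy(), "", "  ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```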

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. kind/documentation Categorizes issue or PR as related to documentation. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 18, 2025
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jul 18, 2025
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If cloud-provider-aws contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jul 18, 2025
@k8s-ci-robot
Contributor

Hi @mtulio. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jul 18, 2025
@elmiko
Contributor

elmiko commented Jul 18, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 18, 2025
@mtulio mtulio force-pushed the fix-hairpin-feat-tg-attrib branch from 9b21dd2 to 8c18565 Compare July 18, 2025 18:08
@mtulio
Contributor Author

mtulio commented Jul 22, 2025

/test all

@mtulio
Contributor Author

mtulio commented Jul 23, 2025

timeouts

/test pull-cloud-provider-aws-e2e

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 23, 2025
@mtulio mtulio force-pushed the fix-hairpin-feat-tg-attrib branch from 8c18565 to bf23dea Compare July 23, 2025 13:57
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jul 23, 2025
@mtulio mtulio changed the title Fix hairpinning traffic on internal NLB by introducing TG attribute config Fix hairpinning traffic on internal NLB by introducing TG attribute reconciler Jul 23, 2025
@mtulio
Contributor Author

mtulio commented Jul 23, 2025

PR #1215 merged. Rebased. Re-testing.

/test pull-cloud-provider-aws-e2e

@mtulio mtulio force-pushed the fix-hairpin-feat-tg-attrib branch from 0e5e2d0 to 19ef6f3 Compare August 22, 2025 05:38
@mtulio
Contributor Author

mtulio commented Aug 25, 2025

Hi @elmiko and @kmala, the PR is updated with the suggestions to remove disruptions. I also updated the release notes. Would you mind taking a look?

Introduce the target group annotation[1] for all listeners on a Service
type-loadBalancer NLB.

[1] Annotation service.beta.kubernetes.io/aws-load-balancer-target-group-attributes

The annotation provides an interface for users to opt into non-default
configurations of a target group when creating or updating a Service.

This change also provides a fix for a critical hairpin bug impacting the
NLB default configuration (target type instance): users can disable the
client IP preservation attribute, which otherwise leads to timeouts in
that scenario.
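
The reconciliation idea can be sketched in Go as a comparison between the attributes currently set on a target group and the ones requested through the annotation. This is a simplified illustration, not the reconciler merged in this PR: `attributesToModify` is a hypothetical name, and the sample `current` values are assumed for the example.

```go
package main

import "fmt"

// attributesToModify returns the desired attributes whose values are missing
// from, or different in, the current target group attributes. Attributes that
// are not listed in desired are left untouched.
func attributesToModify(current, desired map[string]string) map[string]string {
	changes := map[string]string{}
	for key, want := range desired {
		if have, ok := current[key]; !ok || have != want {
			changes[key] = want
		}
	}
	return changes
}

func main() {
	// current stands in for the result of a DescribeTargetGroupAttributes call.
	current := map[string]string{
		"preserve_client_ip.enabled": "true",
		"proxy_protocol_v2.enabled":  "false",
	}
	// desired stands in for the attributes parsed from the Service annotation.
	desired := map[string]string{
		"preserve_client_ip.enabled": "false",
	}
	changes := attributesToModify(current, desired)
	if len(changes) == 0 {
		fmt.Println("target group already matches the annotation")
		return
	}
	// A real controller would pass these to ModifyTargetGroupAttributes.
	fmt.Println("attributes to modify:", changes)
}
```

Computing the difference first means the modify call is only needed when something actually changed, which fits the non-disruptive intent described in the PR.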
@mtulio mtulio force-pushed the fix-hairpin-feat-tg-attrib branch from 19ef6f3 to 73428cd Compare August 27, 2025 01:46
Contributor

@elmiko elmiko left a comment

i think this is looking good for me

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 29, 2025
@kmala
Member

kmala commented Aug 29, 2025

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kmala

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 29, 2025
@k8s-ci-robot k8s-ci-robot merged commit cd61887 into kubernetes:master Aug 29, 2025
11 checks passed
@mtulio mtulio deleted the fix-hairpin-feat-tg-attrib branch September 2, 2025 22:41
mtulio added a commit to mtulio/hypershift that referenced this pull request Dec 4, 2025
…ontroller

Added permission to IAM managed policy for control plane controllers
affecting a specific case where it is required to reconcile the NLB
target group.

The behavior is added after the fix on kubernetes/cloud-provider-aws#1214
mtulio added a commit to mtulio/hypershift that referenced this pull request Dec 9, 2025
Added permission to IAM managed policy for control plane controllers
affecting a specific case where it is required to reconcile the NLB
target group.

The behavior is added after the fix on kubernetes/cloud-provider-aws#1214

OCPBUGS-65885: regenerate delegating AWS client for new ELBv2 permissions
mtulio added a commit to mtulio/hypershift that referenced this pull request Dec 9, 2025
Added permission to IAM managed policy for control plane controllers
affecting a specific case where it is required to reconcile the NLB
target group.

The behavior is added after the fix on kubernetes/cloud-provider-aws#1214

OCPBUGS-65885
k8s-ci-robot added a commit that referenced this pull request Dec 10, 2025
…stream-release-1.34

Automated cherry pick of #1214: doc/service: describe supported target group attributes
mtulio added a commit to mtulio/hypershift that referenced this pull request Dec 18, 2025
Added permission to IAM managed policy for control plane controllers
affecting a specific case where it is required to reconcile the NLB
target group.

The behavior is added after the fix on kubernetes/cloud-provider-aws#1214

OCPBUGS-65885
mtulio added a commit to mtulio/hypershift that referenced this pull request Jan 8, 2026
Added permission to IAM managed policy for control plane controllers
affecting a specific case where it is required to reconcile the NLB
target group.

The behavior is added after the fix on kubernetes/cloud-provider-aws#1214

OCPBUGS-65885
k8s-ci-robot added a commit that referenced this pull request Jan 21, 2026
-#1215-#1217-#1214-upstream-release-1.33

Automated cherry pick of #1153: e2e/deps: enhance test scenarios with NLB
#1161: e2e/loadbalancer: implement hairpin connection cases
#1215: refact: e2e tests documenting hooks and enhance logging/steps
#1217: e2e/debug: increase data collection on e2e failures
#1214: doc/service: describe supported target group attributes
k8s-ci-robot added a commit that referenced this pull request Jan 21, 2026
-#1215-#1217-#1214-upstream-release-1.32

Automated cherry pick of #1153: e2e/deps: enhance test scenarios with NLB
#1161: e2e/loadbalancer: implement hairpin connection cases
#1215: refact: e2e tests documenting hooks and enhance logging/steps
#1217: e2e/debug: increase data collection on e2e failures
#1214: doc/service: describe supported target group attributes
k8s-ci-robot added a commit that referenced this pull request Jan 21, 2026
-#1215-#1217-#1214-upstream-release-1.31

Automated cherry pick of #1153: e2e/deps: enhance test scenarios with NLB
#1161: e2e/loadbalancer: implement hairpin connection cases
#1215: refact: e2e tests documenting hooks and enhance logging/steps
#1217: e2e/debug: increase data collection on e2e failures
#1214: doc/service: describe supported target group attributes
ameukam pushed a commit to ameukam/kops that referenced this pull request Feb 25, 2026
Added permission to read and write/modify Target Group Attributes on
clusters of cloud-provider-aws (CCM) project.

The modify permission is conditional for target clusters.

This permission is required to be able to test the new requirement,
modify target group attributes, through e2e CI clusters.

More information: kubernetes/cloud-provider-aws#1214
Example of CI job without this permission:
https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/pull/cloud-provider-aws/1214/pull-cloud-provider-aws-e2e/1948477553773645824
Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. kind/documentation Categorizes issue or PR as related to documentation. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

Hairpin connection failed on Service type-LoadBalancer NLB with internal scheme

5 participants