Conversation

phuhung273
Contributor

@phuhung273 phuhung273 commented Jun 28, 2025

What type of PR is this?
/kind cleanup
/kind deprecation

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #5405

Special notes for your reviewer:

Checklist:

  • squashed commits
  • includes documentation
  • includes emoji in title
  • adds unit tests
  • adds or updates e2e tests

Release note:

Migrate elb to AWS SDK v2

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. kind/deprecation Categorizes issue or PR as related to a feature/enhancement marked for deprecation. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 28, 2025
@k8s-ci-robot k8s-ci-robot requested review from cnmcavoy and damdo June 28, 2025 06:33
@k8s-ci-robot k8s-ci-robot added needs-priority size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 28, 2025
@k8s-ci-robot
Contributor

Hi @phuhung273. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 28, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 28, 2025
@damdo
Member

damdo commented Jun 30, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 30, 2025
@damdo
Member

damdo commented Jun 30, 2025

@phuhung273 it looks like the linter test is not happy, would you be able to fix it? Thanks!

@phuhung273 phuhung273 force-pushed the elb-sdk-v2 branch 2 times, most recently from 75dd933 to f32eb9c Compare June 30, 2025 11:24
@phuhung273
Contributor Author

Sure thanks @damdo for taking a look, lint issue fixed.

@phuhung273
Contributor Author

/test pull-cluster-api-provider-aws-e2e-eks pull-cluster-api-provider-aws-e2e

@damdo
Member

damdo commented Jun 30, 2025

/test pull-cluster-api-provider-aws-e2e

Member

@damdo damdo left a comment

Looks reasonable to me, left one question.

/assign @punkwalker @nrb @richardcase

@k8s-ci-robot
Contributor

@damdo: GitHub didn't allow me to assign the following users: punkwalker.

Note that only kubernetes-sigs members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

Looks reasonable to me, left one question.

/assign @punkwalker @nrb @richardcase

@punkwalker
Contributor

punkwalker commented Jun 30, 2025

@phuhung273
Thank you for working on this.

Update: I have created PR #5574, which will solve the ServiceLimiter problem. You can find more details in it.

On a side note: it looks like you implemented a new method, WithRequestMetricContextMiddleware, in the metricsv2 package during the SQS client migration. Is there a specific reason you did not use the existing awsmetricsv2.WithMiddleware option?

@phuhung273 phuhung273 force-pushed the elb-sdk-v2 branch 2 times, most recently from 8d549b6 to 257712d Compare July 1, 2025 05:26
@phuhung273
Contributor Author

Sure @damdo, I'm still not really used to what an actual error looks like. If this run doesn't work, I will try running it locally.

@damdo
Member

damdo commented Jul 3, 2025

/test pull-cluster-api-provider-aws-e2e-blocking

@damdo
Member

damdo commented Jul 3, 2025

@phuhung273 there are panics in the CAPA controller (here)

E0703 09:08:52.610094       1 signal_unix.go:917] "Observed a panic" controller="awscluster" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="AWSCluster" AWSCluster="quick-start-ee17bg/quick-start-m3dnl6-rd227" namespace="quick-start-ee17bg" name="quick-start-m3dnl6-rd227" reconcileID="5bf89cf8-313e-4ab9-9650-8102d92616f0" panic="runtime error: invalid memory address or nil pointer dereference" panicGoValue="\"invalid memory address or nil pointer dereference\"" stacktrace=<
	goroutine 382 [running]:
	k8s.io/apimachinery/pkg/util/runtime.logPanic({0x64b4800, 0xc002a386f0}, {0x52db8e0, 0x8d45210})
		/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:107 +0xbc
	sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile.func1()
		/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:108 +0x112
	panic({0x52db8e0?, 0x8d45210?})
		/usr/local/go/src/runtime/panic.go:791 +0x132
	sigs.k8s.io/cluster-api-provider-aws/v2/pkg/cloud/awserrors.(*SmithyError).ErrorCode(...)
		/workspace/pkg/cloud/awserrors/errors.go:268
	sigs.k8s.io/cluster-api-provider-aws/v2/pkg/cloud/throttle.getServiceLimiterMiddleware.func1({0x64b4800, 0xc002b28c90}, {{0x5a9d120?, 0xc002b28c00?}}, {0x64539e0, 0xc00242f540})
		/workspace/pkg/cloud/throttle/throttle.go:175 +0xd8
	github.com/aws/smithy-go/middleware.finalizeMiddlewareFunc.HandleFinalize(...)
		/go/pkg/mod/github.com/aws/[email protected]/middleware/step_finalize.go:67
	github.com/aws/smithy-go/middleware.decoratedFinalizeHandler.HandleFinalize(...)
		/go/pkg/mod/github.com/aws/[email protected]/middleware/step_finalize.go:200
	sigs.k8s.io/cluster-api-provider-aws/v2/pkg/cloud/metricsv2.getRequestMetricContextMiddleware.func1({0x64b4800, 0xc002b28c90}, {{0x5a9d120?, 0xc002b28c00?}}, {0x64539e0, 0xc00242f560})
		/workspace/pkg/cloud/metricsv2/metrics.go:149 +0x1b8
...

Would you please be able to check?

From a quick look, it might be starting from here and propagating down to awserrors/errors.go:268.

@phuhung273
Contributor Author

Thanks @damdo for pointing it out. Fixed by adding a nil check to ServiceLimiter. Let me know if we should start full e2e or wait for unresolved conversations.

Contributor

@punkwalker punkwalker left a comment

LGTM 🚀
@phuhung273 Make sure to make the remaining small changes.

Comment on lines 175 to 181

if smithyErr != nil {
	limiter.ReviewResponseV2(ctx, smithyErr.ErrorCode())
	return out, metadata, err
}

return out, metadata, nil
Contributor

@phuhung273
Thank you for fixing this 👍 The nil check is good on smithyErr.

However, can you just return the err from L173 instead of nil? If the error is not a Smithy error, we will not be able to unwrap it; in such cases the error should be passed to the next middleware, which can take care of it.

Contributor Author

@phuhung273 phuhung273 Jul 3, 2025

I did it. But after looking at the function, I can see we initialize smithyErr right at the start

func ParseSmithyError(err error) *SmithyError {
	smithyErr := &SmithyError{}

So the only way for ParseSmithyError to return nil is when the input err is nil. In that case, returning err or nil is the same.

But I think your explanation is good; it even covers the case of someone changing the logic of ParseSmithyError in the future. We should use your logic.

Contributor

Yeah, it's just the pattern of AWS SDK middleware, so didn't want to change it.

@phuhung273
Contributor Author

/test pull-cluster-api-provider-aws-e2e-eks pull-cluster-api-provider-aws-e2e

@phuhung273
Contributor Author

/test pull-cluster-api-provider-aws-e2e

@phuhung273 phuhung273 requested review from damdo and nrb July 4, 2025 07:48
@phuhung273
Contributor Author

Thank you for helping throughout the entire process, @damdo.

Member

@damdo damdo left a comment

Thanks

/lgtm

@punkwalker let us know if you are happy with these latest changes.

/assign @richardcase @nrb

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 4, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: c6ed3cdc90c7eaafbd2c7ae1b3deb5477eeb3a5b

@damdo damdo requested a review from punkwalker July 4, 2025 08:04
@damdo
Member

damdo commented Jul 4, 2025

/assign @AndiDog @dlipovetsky

@punkwalker
Contributor

> Thanks
>
> /lgtm
>
> @punkwalker let us know if you are happy with these latest changes.
>
> /assign @richardcase @nrb

Yes. LGTM.

@nrb
Contributor

nrb commented Jul 7, 2025

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nrb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 7, 2025
@k8s-ci-robot k8s-ci-robot merged commit dd099ec into kubernetes-sigs:main Jul 7, 2025
20 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v2.8 milestone Jul 7, 2025
@phuhung273 phuhung273 deleted the elb-sdk-v2 branch July 8, 2025 05:35
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. kind/deprecation Categorizes issue or PR as related to a feature/enhancement marked for deprecation. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Migrate elb code to AWS SDK v2
8 participants