Skip to content

feat(extension manager): dynamically reload CA certificates#5613

Merged
arkodg merged 18 commits intoenvoyproxy:mainfrom
sapirpol:certificate-reload
Apr 16, 2025
Merged

feat(extension manager): dynamically reload CA certificates#5613
arkodg merged 18 commits intoenvoyproxy:mainfrom
sapirpol:certificate-reload

Conversation

@sapirpol
Copy link
Contributor

What type of PR is this?
Dynamically reload CA certificates for GRPC connections on extension-manager.

What this PR does / why we need it:
This PR introduces dynamic reloading of CA certificates used by Envoy Gateway's extension manager when establishing GRPC connections. Previously, certificates were loaded once, making updates difficult without restarting the service. To resolve this, the PR uses grpc-go's advancedtls library, enabling dynamic certificate management via a custom callback that reloads the certificate.

Comparison of TLS Settings: Before vs. After Update

Setting Before (Old Approach) After (New Approach)
Min TLS Version TLS 1.2 TLS 1.2
Max TLS Version Highest available Highest available
Ciphers Used safe default cipher list with explicitly forbidden RFC 7540 ciphers grpc-go. Uses safe default cipher list grpc-go. RSA key exchange, 3DES, and CBC cipher suites removed in Go 1.22 and Go 1.23.
Dynamic Certificate Loading No Yes (Using GetRootCertificates)

This change was manually tested.

Which issue(s) this PR fixes:

Fixes #5396

Release Notes: Yes

Signed-off-by: Sapir Pol <sapir.pol@sap.com>
Signed-off-by: Sapir Pol <sapir.pol@sap.com>
@sapirpol sapirpol requested a review from a team as a code owner March 25, 2025 20:36
@guydc
Copy link
Contributor

guydc commented Mar 26, 2025

cc @nareddyt

@codecov
Copy link

codecov bot commented Mar 26, 2025

Codecov Report

Attention: Patch coverage is 60.00000% with 14 lines in your changes missing coverage. Please review.

Project coverage is 65.23%. Comparing base (096cb8d) to head (f9a3f16).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
internal/extension/registry/extension_manager.go 60.00% 10 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5613      +/-   ##
==========================================
+ Coverage   65.19%   65.23%   +0.03%     
==========================================
  Files         214      214              
  Lines       34321    34339      +18     
==========================================
+ Hits        22377    22401      +24     
+ Misses      10591    10578      -13     
- Partials     1353     1360       +7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

guydc
guydc previously approved these changes Mar 26, 2025
Copy link
Contributor

@guydc guydc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Signed-off-by: Sapir Pol <sapir.pol@sap.com>
Signed-off-by: Sapir Pol <30637290+sapirpol@users.noreply.github.com>
@sapirpol sapirpol requested a review from guydc April 2, 2025 08:56
@sapirpol
Copy link
Contributor Author

sapirpol commented Apr 2, 2025

@nareddyt Could you please review?

Copy link
Contributor

@nareddyt nareddyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making this PR! A few minor comments on the production code. Though the test may need another look.


caCertPEMBytes, ok := secret.Data[corev1.TLSCertKey]
if !ok {
return nil, errors.New("no TLS certificate found in CA secret")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return nil, errors.New("no TLS certificate found in CA secret")
return nil, fmt.Errorf("no CA certificate found in Kubernetes Secret %s in namespace %s", secret.GetName(), secret.GetNamespace())
  1. This is parsing a CA certificate, not TLS certificate. The pre-existing naming of key corev1.TLSCertKey is incorrect unfortunately, but we need to support it for backwards compatibility.
  2. Let's include secret name and namespace for debuggability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed the error message


func createGetRootCertificatesHandler(ctx context.Context, client k8scli.Client, ext *egv1a1.ExtensionManager, namespace string) func(*advancedtls.ConnectionInfo) (*advancedtls.RootCertificates, error) {
return func(params *advancedtls.ConnectionInfo) (*advancedtls.RootCertificates, error) {
cp, err := getCertPoolFromSecret(ctx, client, ext, namespace)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function will run in the background in a goroutine inside advancedtls, correct?

If so, I suggest you don't pass in the parent context to this function, as someone may change the parent context to cancel or timeout in the future.

Instead, create a new context in this function ctx := context.Background()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

},
Type: corev1.SecretTypeTLS,
Data: map[string][]byte{
corev1.TLSCertKey: certData,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately this doesn't seem right. extension_manager.go line 316 is reading CA cert from corev1.TLSCertKey (yes it is incorrectly named, but that is a legacy behavior we need to support).

If so, I'm not entirely sure how this test is working (?) Because the CA cert is not in the correct place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the test, and it worked because both the CA and the server certificate were accepted — see: https://stackoverflow.com/questions/77084841/server-certificate-used-as-ca-cert-why-does-it-work/77084918#77084918

I've now updated the test to use a self-signed certificate, so the certificate acts as both the server cert and the CA.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, did not know that. Thanks for the explanation!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend modifying the test to only pass in the exact necessary data. This makes it clearer to readers that the current naming is incorrect.

Specifically, change the secret to just this, the test should still pass:

Data: map[string][]byte{
  corev1.TLSCertKey:       caCert,
},

Otherwise the PR looks good.

Copy link
Contributor Author

@sapirpol sapirpol Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I updated the secret to include only the necessary data

sapirpol and others added 2 commits April 6, 2025 11:32
Signed-off-by: Sapir Pol <sapir.pol@sap.com>
@sapirpol
Copy link
Contributor Author

sapirpol commented Apr 6, 2025

Thanks for making this PR! A few minor comments on the production code. Though the test may need another look.

Thanks! I’ve made the changes and replied back.

@sapirpol sapirpol requested a review from nareddyt April 6, 2025 12:28
sapirpol and others added 4 commits April 7, 2025 13:13
Signed-off-by: Sapir Pol <sapir.pol@sap.com>
Signed-off-by: Sapir Pol <sapir.pol@sap.com>
guydc
guydc previously approved these changes Apr 10, 2025
@sapirpol
Copy link
Contributor Author

@nareddyt Could you please re-review the PR and trigger the tests?

@arkodg
Copy link
Contributor

arkodg commented Apr 12, 2025

Hey @sapirpol can you rebase the go.mod file

Signed-off-by: Sapir Pol <30637290+sapirpol@users.noreply.github.com>
@sapirpol
Copy link
Contributor Author

Hi @arkodg, I rebased the branch. Could you please review the PR and retrigger the tests?

@arkodg arkodg merged commit a4eeddc into envoyproxy:main Apr 16, 2025
24 of 25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Extension Manager: support CA certificate reload

4 participants