[Backport/v1.28] identity: reload CA root cert channel on file change #1801
Introduce RootCertManager (analogous to CrlManager) to watch the CA root cert file for changes using a notify debouncer. When the file changes, a dirty flag (AtomicBool) is set. On the next call to CaClient::fetch_certificate(), the flag is checked and, if set, the TLS gRPC channel is rebuilt using the updated cert before sending the CSR request. If the rebuild fails, the flag is re-armed and the existing channel is reused, so cert renewal continues working despite a transient error.
The gRPC channel is now wrapped in a RwLock inside CaClient to allow lazy replacement without blocking concurrent reads. CaClient::new() now accepts a RootCert directly instead of a pre-built cert provider, so it can start the file watcher and retain the data needed to rebuild the channel later.
Signed-off-by: Jose Luis Ojosnegros Manchón <jojosneg@redhat.com>
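The dirty-flag flow described in this commit can be sketched with the standard library alone. This is a minimal illustration, not the code in this PR: `Channel`, `LazyReloadClient`, `dirty_flag`, and `channel_for_request` are hypothetical names, the "channel" is a plain struct rather than a TLS gRPC channel, and the rebuild step is passed in as a closure so a failure can be simulated.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, RwLock};

/// Stand-in for the TLS gRPC channel (hypothetical); it only records
/// which cert it was built from.
#[derive(Clone, Debug, PartialEq)]
pub struct Channel {
    pub built_from: String,
}

/// Hypothetical client illustrating the pattern; not the real CaClient.
pub struct LazyReloadClient {
    channel: RwLock<Channel>,
    dirty: Arc<AtomicBool>,
}

impl LazyReloadClient {
    pub fn new(initial_cert: &str) -> Self {
        Self {
            channel: RwLock::new(Channel {
                built_from: initial_cert.to_string(),
            }),
            dirty: Arc::new(AtomicBool::new(false)),
        }
    }

    /// Handle a file watcher would use to signal "the cert file changed".
    pub fn dirty_flag(&self) -> Arc<AtomicBool> {
        Arc::clone(&self.dirty)
    }

    /// Returns the channel to use for the next request, rebuilding it
    /// first if the dirty flag is set.
    pub fn channel_for_request<F>(&self, rebuild: F) -> Channel
    where
        F: FnOnce() -> Result<Channel, String>,
    {
        // swap() clears the flag and reports whether it was set in one
        // atomic step, so only one caller attempts a given rebuild.
        if self.dirty.swap(false, Ordering::AcqRel) {
            match rebuild() {
                // Success: take the write lock only for the replacement.
                Ok(new_channel) => *self.channel.write().unwrap() = new_channel,
                // Failure: re-arm the flag so the next request retries,
                // and fall through to the existing channel.
                Err(_) => self.dirty.store(true, Ordering::Release),
            }
        }
        self.channel.read().unwrap().clone()
    }
}
```

In this sketch a watcher would call `dirty_flag().store(true, Ordering::Release)` on a file event. The `unwrap()` calls panic on a poisoned lock, which a production client would probably want to handle explicitly.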
Add notify and notify-debouncer-full, needed to watch the parent folder of the CA root cert file and hot-reload the cert. Add the tempfile dependency, needed for tests.
NOTE: Used the versions in main for all three dependencies, even though newer ones exist.
Signed-off-by: Jose Luis Ojosnegros Manchón <jojosneg@redhat.com>
Skipping CI for Draft Pull Request.
Hey @jlojosnegros, the backports are automatically created from PR on the master branch by using the label |
Signed-off-by: Jose Luis Ojosnegros Manchón <jojosneg@redhat.com>
Add write_lock_wait_ms to the debug log emitted after a successful root cert hot-reload, so contention on the RwLock is observable in logs without requiring additional instrumentation.
Signed-off-by: Jose Luis Ojosnegros Manchón <jojosneg@redhat.com>
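One way to obtain such a value is to time only the acquisition of the write lock. The sketch below assumes that approach; `replace_and_measure` is a hypothetical helper, and the actual logging call and field wiring in the PR are not shown here.

```rust
use std::sync::RwLock;
use std::time::Instant;

/// Replaces the value behind `lock` and returns how many milliseconds the
/// caller waited to acquire the write lock (hypothetical helper).
pub fn replace_and_measure<T>(lock: &RwLock<T>, new_value: T) -> u128 {
    let start = Instant::now();
    // Blocks until all readers and any other writer have released the lock.
    let mut guard = lock.write().unwrap();
    // Measured before the replacement so only the wait is counted.
    let write_lock_wait_ms = start.elapsed().as_millis();
    *guard = new_value;
    write_lock_wait_ms
}
```

The returned number could then be attached to the debug log line as the `write_lock_wait_ms` field.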
Introduces a boolean env var CA_CERT_WATCHER (default: true) that allows operators to disable the CA root cert file watcher at runtime. When set to false, no watcher thread is started and the gRPC channel is never rebuilt on cert rotation; the startup-time cert is retained permanently. The flag is only effective when ca_root_cert is a file path; Static and Default certs never start a watcher regardless.
Signed-off-by: Jose Luis Ojosnegros Manchón <jojosneg@redhat.com>
Hi @fjglira, has the backport been approved?
@jlojosnegros can you please check the errors?
@ilrudie, what do you think about the CA_CERT_WATCHER flag? From my point of view, the default should be false: for users who already have ca_root_cert set to a file path, won't a default of true change the current behaviour? Another question: should we add a release note in istio/istio to document the change? Note: @jlojosnegros, once we agree on all this, you will also need to create the backport for the 1.29 release branch.
Great, approving
/retest
@fjglira, the failure is a lint/clippy one, not a flake. It'll need a code fix.
Checking... clippy complains that CaClient::new exceeds the maximum of 7 arguments. If we cannot use '#[allow]' to disable the warning here, we should think about grouping some params to reduce the number of arguments. Maybe the boolean flags? Or something else with low impact on the code. WDYT @ilrudie: could we disable the warning here (as this is just a backport), or should we try to group some arguments into a new type?
I just suppressed the lint error. A refactor on a release branch to support a release-branch-only flag doesn't make sense to me.
CaClient::new has 8 parameters after adding enable_ca_cert_watcher in the previous commit, exceeding clippy's default limit of 7. Suppress the lint rather than refactoring.
Signed-off-by: Jose Luis Ojosnegros Manchón <jojosneg@redhat.com>
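For readers unfamiliar with this lint, suppression is a one-line attribute on the function. The sketch below is illustrative only: `ClientConfig`, `new_client`, and every parameter name except `enable_ca_cert_watcher` are hypothetical, and the real CaClient::new signature is not reproduced here.

```rust
/// Hypothetical result type, used only to make the example checkable.
pub struct ClientConfig {
    pub address: String,
    pub enable_ca_cert_watcher: bool,
    pub param_count: usize,
}

// Eight parameters trip clippy's `too_many_arguments` lint (default limit 7);
// the attribute silences it for this one function only.
#[allow(clippy::too_many_arguments)]
pub fn new_client(
    address: String,
    _root_cert_path: String,
    _auth_token: String,
    _enable_impersonation: bool,
    _secret_ttl_secs: u64,
    _request_timeout_secs: u64,
    _retry_limit: u32,
    enable_ca_cert_watcher: bool,
) -> ClientConfig {
    ClientConfig {
        address,
        enable_ca_cert_watcher,
        param_count: 8,
    }
}
```

The alternative raised in the thread, grouping the boolean flags into a small struct, would avoid the attribute at the cost of touching every call site.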
Signed-off-by: Jose Luis Ojosnegros Manchón <jojosneg@redhat.com>
Backport of #1775 to release-1.28
Almost everything is backported directly.
Note: some dependencies were added.
Introduces one modification relative to the master PR: a boolean env var CA_CERT_WATCHER (default: true) that allows operators to disable the CA root cert file watcher at runtime.
When set to false, no watcher thread is started and the gRPC channel is never rebuilt on cert rotation — the startup-time cert is retained permanently.
The flag is only effective when ca_root_cert is a file path; Static and Default certs never start a watcher regardless.
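The gating rules above can be sketched as two small functions. This is an illustration under stated assumptions: the `RootCert` enum shape, `parse_watcher_flag`, and `should_start_watcher` are hypothetical, and treating an unparsable value as the default (true) is an assumption rather than behaviour confirmed by this PR.

```rust
use std::path::PathBuf;

/// Assumed shape of the root cert source, based on the variant names
/// mentioned in the description (File path, Static, Default).
pub enum RootCert {
    File(PathBuf),
    Static(Vec<u8>),
    Default,
}

/// Interprets the raw CA_CERT_WATCHER value. Unset means true (the stated
/// default); falling back to true on an unparsable value is an assumption.
pub fn parse_watcher_flag(raw: Option<&str>) -> bool {
    match raw {
        None => true,
        Some(value) => value.trim().parse::<bool>().unwrap_or(true),
    }
}

/// A watcher is started only when the flag is enabled AND the cert comes
/// from a file; Static and Default never start one.
pub fn should_start_watcher(cert: &RootCert, watcher_enabled: bool) -> bool {
    watcher_enabled && matches!(cert, RootCert::File(_))
}
```

At startup the raw value could be read with `std::env::var("CA_CERT_WATCHER").ok()` and passed to `parse_watcher_flag` via `as_deref()`.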