tls: SDS support for private key providers#16737
Conversation
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
lambdai
left a comment
There was a problem hiding this comment.
tls_certificate_configs_.emplace_back(*secret, factory_context_, api_) in ContextConfigImpl::setSecretUpdateCallback requires an extended life of factory_context_. I feel like the desired life is longer than the one in listener. Could you revisit it? Thanks!
source/server/listener_impl.cc
Outdated
| ASSERT(workers_started_); | ||
| parent_.inPlaceFilterChainUpdate(*this); | ||
| }), | ||
| transport_factory_context_(parent_.server_.admin(), parent_.server_.sslContextManager(), |
There was a problem hiding this comment.
Sorry, this breaks intelligent listener update.
Previously the transport_factory_context was on the stack.
Now the transport_factory_context instance needs to be extended because this instance is referred by tls transport socket ContextConfigImpl's callback.
Suppose you have listener config with filter chain X, and new listener config with filter chain X+Y,
the listener update will keep the previous X (and its TLS transport socket!) valid,
but the old listener is destroyed so the old transport_factory_context_ is destroyed as well. The filter chain X ends up with a dangling reference to the old transport_factory_context_.
I suggest defining ListenerImpl::transport_factory_context_ as shared_ptr and share the transport_factory_context_ among generations of ListenerImpl.
See the shared member connection_balancer_(origin.connection_balancer_).
There was a problem hiding this comment.
Thanks for the feedback! I guess I didn't quite understand how the listener config update is done at first. I added now a patch to use a shared pointer as you suggested. Could you see if it resolves the issue?
There was a problem hiding this comment.
Thanks! This part LGTM. I will use a new thread as follow up
There was a problem hiding this comment.
Could you add an integration test for this? My experience is that any combination of 2 xDS is very like to introduce issues. I don't fully sure what's the lifetime of the transport socket in ContextConfigImpl::setSecretUpdateCallback. Neither downstream nor upstream. It would be good to have an integration test in test/integration/sds_dynamic_integration_test.cc
There was a problem hiding this comment.
Since you are the original author of the private key provider :)
Is there an example of PrivateKeyMethodProviderInstanceFactory for me to learn? Thanks!
There was a problem hiding this comment.
Ok, I'll add an integration test. So far I've only been trying this out with modified go-control-plane examples. I'll try to experiment with various xDS combinations -- do you have any combinations in mind which would be most likely to expose lifetime issues?
I have a few internal private key providers in use, but so far the only example in the Envoy tree is the test private key provider at https://github.com/envoyproxy/envoy/blob/main/test/extensions/transport_sockets/tls/test_private_key_method_provider.cc. However, I'm planning to introduce an AVX-512 private key provider "soon", see #15871.
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
ef123a3 to
7ef7172
Compare
|
Force pushed to add a forgotten Signed-off-by line from the last patch. |
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
|
Merged main branch to resolve a rebase conflict. |
There was a problem hiding this comment.
If you apply this patch:
index 594b9cd46..621f43253 100644
--- a/test/common/secret/secret_manager_impl_test.cc
+++ b/test/common/secret/secret_manager_impl_test.cc
@@ -32,6 +32,8 @@ namespace Envoy {
namespace Secret {
namespace {
+const ::test::common::secret::TestPrivateKeyMethodConfig _mock_test_private_key_method_config_dummy;
+
class SecretManagerImplTest : public testing::Test, public Logger::Loggable<Logger::Id::secret> {
protected:
SecretManagerImplTest()
The test should pass (I verified this locally).
Target //test/common/secret:secret_manager_impl_test up-to-date:
bazel-bin/test/common/secret/secret_manager_impl_test.exe
INFO: Elapsed time: 126.692s, Critical Path: 125.90s
INFO: 12 processes: 2 internal, 10 local.
INFO: Build completed successfully, 12 total actions
//test/common/secret:secret_manager_impl_test PASSED in 4.3s
I think that the reason this is happening is because the compiler\linker thinks that this dependency is not used and it does not link it with your test.
@wrowe do you have more context on why this diff is needed?
Co-authored-by: Sotiris Nanopoulos <sonanopo@microsoft.com> Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
|
Merged main branch to resolve a rebase conflict. |
|
Please pick up main to get #16643 which should resolve the rest of the Windows issues |
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
The test reuses the test private key provider for TLS tests. Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
Also move the new tests to the bottom of the file instead of having them in several places. Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
|
@htuch and @adisuissa for more xDS eyes :-) |
|
Thanks! I would need some guidance about which xDS+SDS sequences to test with the integration tests -- it's bit tricky work so I would like to concentrate on the most suspicious combinations first. |
htuch
left a comment
There was a problem hiding this comment.
Overall this makes sense. Something like HCM preserves context from when a factory is created until it is invoked to produce instances, so in a similar way I think it's reasonable to preserve context here from when the private key provider is created until it is invoked.
| * for the cluster lifetime in case SDS is used. | ||
| */ | ||
| void | ||
| setTransportFactoryContext(std::unique_ptr<Server::Configuration::TransportSocketFactoryContext> |
There was a problem hiding this comment.
Nit TransportSocketFactoryContextPtr type alias.
| } | ||
|
|
||
| void ClusterImplBase::setTransportFactoryContext( | ||
| std::unique_ptr<Server::Configuration::TransportSocketFactoryContext> |
There was a problem hiding this comment.
Main nits remaining are to fix up these unique_ptr type names everywhere to follow the FooPtr style in https://github.com/envoyproxy/envoy/blob/main/STYLE.md#deviations-from-google-c-style-guidelines.
| envoy::extensions::transport_sockets::tls::v3::Secret getCurrentServerPrivateKeyProviderSecret() { | ||
| envoy::extensions::transport_sockets::tls::v3::Secret secret; | ||
| ProtobufWkt::Struct val; | ||
| (*val.mutable_fields())["private_key_file"].set_string_value( |
There was a problem hiding this comment.
Prefer loading from YAML rather than Struct surgery.
|
|
||
| // check the key type and set the mode accordingly | ||
| std::string mode; | ||
| std::string private_key = TestEnvironment::readFileToStringForTest( |
|
|
||
| secret.set_name(server_cert_rsa_); | ||
| auto* tls_certificate = secret.mutable_tls_certificate(); | ||
| tls_certificate->mutable_certificate_chain()->set_filename( |
There was a problem hiding this comment.
Similarly, prefer to build from YAML if possible.
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
| // Test that private key provider definitions inside Secrets can be added dynamically. | ||
| TEST_F(SecretManagerImplTest, SdsDynamicSecretPrivateKeyProviderUpdateSuccess) { | ||
| Server::MockInstance server; | ||
| std::unique_ptr<SecretManager> secret_manager(new SecretManagerImpl(config_tracker_)); |
|
@ipuustin the |
|
@htuch Sorry for the delay with this! I'll do the fixes and merge main as soon as I get back to office (two weeks max). |
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
|
/retest Seems that CI has stalled... |
|
Retrying Azure Pipelines: |
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
* main: Fix for fuzz tests failing due to invalid corpus paths (envoyproxy#17767) kafka: fix integration test (envoyproxy#17764) Fix typo in cluster.proto (envoyproxy#17755) cluster manager: add drainConnections() API (envoyproxy#17747) kafka broker filter: move to contrib (envoyproxy#17750) quiche: switch external dependency to github (envoyproxy#17732) quiche: implement stream idle timeout in codec (envoyproxy#17674) Update c-ares to 1.17.2 (envoyproxy#17704) Fix dns resolve fuzz bug (envoyproxy#17107) Remove members that shadow members of the base class (envoyproxy#17713) thrift proxy: missing parts from the previous PR (envoyproxy#17668) thrift-proxy: cleanup ConnectionManager::ActiveRpc (envoyproxy#17734) listener: extra warning for deprecated use_proxy_proto field (envoyproxy#17736) kafka: add support for metadata request in mesh-filter (envoyproxy#17597) upstream: add all host map to priority set for fast host searching (envoyproxy#17290) Remove the support for `hidden_envoy_deprecated_per_filter_config` (envoyproxy#17725) tls: SDS support for private key providers (envoyproxy#16737) bazel: update rules_foreign_cc (envoyproxy#17445) Signed-off-by: Michael Puncel <mpuncel@squareup.com>
Envoy Secret messages can contain also private key provider definitions in place of plain private keys. However, the current SDS implementation doesn't support that for two reasons: 1. Envoy tries to resolve the private key whether or not that's part of the secret. This causes private key to be an empty string even when it's not set. 2. Envoy loses track of TransportSocketFactoryContext objects after server startup, and thus private key providers (who need the context as part of the API) can't be dynamically initialized. The first issue is fixed with a straightforward check, but the second requires keeping TransportSocketFactoryContexts around for the cluster / listener lifetime. Risk Level: Medium Testing: Unit test for the first case, the second manually tested with a custom SDS control plane created with go-control-plane. Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com> Co-authored-by: Sotiris Nanopoulos <sonanopo@microsoft.com>
Commit Message:
Fix SDS handling for private key providers.
Additional Description:
Envoy Secret messages can contain also private key provider definitions in place of plain private keys. However, the current SDS implementation doesn't support that for two reasons:
The first issue is fixed with a straightforward check, but the second requires keeping TransportSocketFactoryContexts around for the cluster / listener lifetime.
I was exploring the issue on my own and I didn't coordinate about this, so the approach I took might not be the best way of solving the issue though. Any feedback is welcome!
Risk Level: Medium
Testing: Unit test for the first case, the second manually tested with a custom SDS control plane created with go-control-plane.
Docs Changes: N/A
Release Notes: N/A
Platform Specific Features: N/A