tooling: restoring runtime fatal-by-defaults and changing to fully qualified #9591
alyssawilk merged 12 commits into envoyproxy:master
…alify Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
import "google/protobuf/descriptor.proto";

// Magic number in this file derived from top 28bit of SHA256 digest of "envoy.annotation.do_not_use"
extend google.protobuf.FieldOptions {
Do you mind adding some comments about what these annotations do? I'm a little confused about what the proposed flow is.
// Magic number in this file derived from top 28bit of SHA256 digest of "envoy.annotation.do_not_use_enum"
extend google.protobuf.EnumValueOptions {
  bool do_not_use_enum = 162744140;
I think you can also call this do_not_use
I don't think I can.
Extensions declared in an extend block aren't prefixed with the name of the type they extend. If you look at the utility file, they're
envoy::annotations::disallowed_by_default_enum
and envoy::annotations::disallowed_by_default
so the two would conflict if both were named disallowed_by_default.
import "google/protobuf/descriptor.proto";

// Magic number in this file derived from top 28bit of SHA256 digest of "envoy.annotation.do_not_use"
extend google.protobuf.FieldOptions {
One other interesting thing here is we're getting awfully close to being able to either encode the runtime key here as an annotation, or via reflection to being able to derive it. Not sure where the status quo there is today. Something worth thinking about, even if it's beyond the scope of this PR.
source/common/protobuf/utility.cc
Outdated
  bool warn_only = false;
  #else
- bool warn_only = true;
+ bool warn_only = field->options().GetExtension(envoy::annotations::do_not_use) == false;
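To make the flow in this hunk easier to follow, here is a minimal, self-contained sketch of the decision it implies: a deprecated field without the annotation only warns, while an annotated field is rejected unless a runtime override allows it. This is not the code in utility.cc; `DeprecationAction`, `RuntimeOverrides`, and `checkDeprecatedField` are hypothetical names, and the plain map stands in for Envoy's runtime layer. The override key shape and the "true" value follow the examples quoted elsewhere in this conversation.

```cpp
#include <cassert>
#include <map>
#include <string>

// Possible outcomes when a config uses a deprecated field.
enum class DeprecationAction { Warn, Reject };

// Hypothetical stand-in for the runtime layer: override keys mapped to string values.
using RuntimeOverrides = std::map<std::string, std::string>;

// Decides what happens when a deprecated field is seen in a config.
// `disallowed_by_default` stands for the value of the proto annotation on the field.
inline DeprecationAction checkDeprecatedField(const std::string& full_name,
                                              bool disallowed_by_default,
                                              const RuntimeOverrides& overrides) {
  assert(!full_name.empty());
  // As in the diff: a field without the annotation only produces a warning.
  const bool warn_only = !disallowed_by_default;
  if (warn_only) {
    return DeprecationAction::Warn;
  }
  // An annotated field is rejected unless runtime explicitly allows it.
  const auto it = overrides.find("envoy.deprecated_features:" + full_name);
  const bool allowed = it != overrides.end() && it->second == "true";
  return allowed ? DeprecationAction::Warn : DeprecationAction::Reject;
}
```

Under these assumptions, an annotated field such as envoy.api.v2.route.CorsPolicy.enabled is rejected with an empty override map and downgraded to a warning once its key is set to "true".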
htuch left a comment
No strong opinion on 1, 2 or 3, other than to not violate our existing guidance (so maybe 3 is out).
// Magic number in this file derived from top 28bit of SHA256 digest of "envoy.annotation.do_not_use"
extend google.protobuf.FieldOptions {
  bool do_not_use = 169129351;
Bike shedding: prefer to call this disallowed_by_default.
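For readers wondering how a "top 28 bits of a digest" number is formed, the following sketch shows one plausible reading: take the first four digest bytes in big-endian order and drop the low four bits. The helper names are hypothetical, computing the SHA256 digest itself is out of scope, and the example bytes are chosen only to reproduce the number in the diff; they are not claimed to be the real digest. A 28-bit value always stays below the protobuf field number limit of 536,870,911.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns the most significant 28 bits of a 32-byte
// digest, reading the first four bytes in big-endian order.
inline std::uint32_t top28Bits(const std::array<std::uint8_t, 32>& digest) {
  const std::uint32_t first32 = (static_cast<std::uint32_t>(digest[0]) << 24) |
                                (static_cast<std::uint32_t>(digest[1]) << 16) |
                                (static_cast<std::uint32_t>(digest[2]) << 8) |
                                static_cast<std::uint32_t>(digest[3]);
  const std::uint32_t result = first32 >> 4;  // Drop the low 4 bits, leaving 28.
  assert(result < (1u << 28));
  return result;
}

// Illustrative digest prefix (assumption, not real SHA256 output): the
// remaining 28 bytes are zero and do not affect the result.
inline std::array<std::uint8_t, 32> exampleDigest() {
  std::array<std::uint8_t, 32> digest{};
  digest[0] = 0xA1;
  digest[1] = 0x4B;
  digest[2] = 0x58;
  digest[3] = 0x70;
  return digest;
}
```

With these illustrative bytes, `top28Bits(exampleDigest())` evaluates to 0xA14B587, which is 169129351 in decimal.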
test/proto/deprecated.proto
Outdated
  string not_deprecated = 1;
  string is_deprecated = 2 [deprecated = true];
- string is_deprecated_fatal = 3 [deprecated = true];
+ string is_deprecated_fatal = 3 [deprecated = true, (envoy.annotations.do_not_use) = true];
So, IIUC, your tooling or manual deprecations will set this on v2. As long as protoxform doesn't delete them (which it might today, ping me tomorrow on how to avoid this), we're good for v3, since the field doesn't exist there.
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
I think I addressed reviewer comments.
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Thanks, this makes sense to me at a high level. Agree either (1) or (2) SGTM.
One more meta comment: if we're fully qualifying, I'm inclined to drop the filename from the name.
I think removing filename is reasonable.
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
UGH, we have yet another problem. Now that we're fully qualifying, the v2 test for the deprecated field "runtime key" complains that you're using the deprecated which is pretty terrible, because every time we API upgrade we'll break folks' runtime guards. Paging in @htuch: even if we dropped the bit where we fully qualify, the field rename from foo to hidden_envoy_foo is pretty crummy.
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Ok, a bit more context for @htuch. This PR is almost good to go.
@alyssawilk tracking the API oracle work at #9612
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
htuch left a comment
LGTM once the stars align. Thanks.
and use of that configuration field will cause the config to be rejected by default.
This fail-by-default mode can be overridden in runtime configuration by setting
envoy.deprecated_features.filename.proto:fieldname or envoy.deprecated_features.filename.proto:enum_value
In the second phase the field will be tagged as disallowed_by_default
Hm, so with @htuch's latest merged, I'm now getting runtime failures unless we flag both the v2 path and the v3alpha path as OK.
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Runtime::LoaderSingleton::getExisting()->mergeValues(
    {{"envoy.deprecated_features:envoy.extensions.filters.network.redis_proxy.v3alpha.RedisProxy."
      "PrefixRoutes.hidden_envoy_deprecated_cluster",
      "true"}});
This weirdness is because we're loading deprecated v2 config with a v3 proto. Arguably we should have left the deprecated feature test using a v2 proto rather than envoy::extensions::filters::network::redis_proxy::v3alpha::RedisProxy.
We can fix this if we care, but I'm pretty convinced we don't require both for v3, since the integration tests don't need it.
Probably fine for this unit test, we tend to make v3alpha everywhere for unit tests.
Runtime::RuntimeFeaturesPeer::setAllFeaturesAllowed();
// TODO(alyssawilk) improve this.
Runtime::LoaderSingleton::getExisting()->mergeValues(
    {{"envoy.deprecated_features:envoy.config.route.v3alpha.CorsPolicy."
I had wanted the PoC PR to show that an integration test using v2 config would succeed with a v2 override.
Unfortunately we switched the integration test framework to v3alpha so we don't have that.
Do we think it's worth doing? My concern is otherwise there might be somewhere in the upgrade protos + downgrade protos pipeline that loses the v2ness, and we'll end up needing double validation as we do with the redis test
@alyssawilk you could try and force the integration test config down to v2 with API_DOWNGRADE macro before passing to Envoy. That would give you a v2 path without changing any code (well, one line :)
FWIW, we have tended to try and force v2 for the xDS aspects of the integration tests with a similar trick, but didn't do this for bootstrap. I think it would be a good thing TBH to have the integration tests be focused on v2 external artifacts today.
Yeah, as discussed a forced downgrade before we write files to disk fails because the HCM is typed config and we don't have utils for that.
Manual testing looks good. Diff patch for trying bad cors config below, and it gets me whining about v2 CorsPolicy rather than v3. Forcing runtime to accept "envoy.deprecated_features:envoy.api.v2.route.CorsPolicy.enabled" results in the failure turning back to a warning
[2020-01-14 17:12:55.235][90650][warning][misc] [source/common/protobuf/utility.cc:423] Using deprecated option 'envoy.api.v2.route.CorsPolicy.allow_origin' from file route_components.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2020-01-14 17:12:55.235][90650][critical][main] [source/server/server.cc:94] error initializing configuration 'configs/google_com_proxy.v2.yaml': Proto constraint validation failed (Using deprecated option 'envoy.api.v2.route.CorsPolicy.enabled' from file route_components.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details. If continued use of this field is absolutely necessary, see https://www.envoyproxy.io/docs/envoy/latest/configuration/operations/runtime#using-runtime-overrides-for-deprecated-features for how to apply a temporary and highly discouraged override.):
--- a/configs/google_com_proxy.v2.yaml
+++ b/configs/google_com_proxy.v2.yaml
@@ -30,6 +30,13 @@ static_resources:
                 route:
                   host_rewrite: www.google.com
                   cluster: service_google
+                  cors:
+                    enabled: {}
+                    allow_origin: "test-origin-1"
+                    allow_origin: "test-host-2"
+                    allow_methods: "POST"
+                    allow_headers: "content-type"
+                    max_age: "100"
           http_filters:
           - name: envoy.router
   clusters:
diff --git a/source/common/runtime/runtime_impl.cc b/source/common/runtime/runtime_impl.cc
index f10d053b89..24818e60a4 100644
--- a/source/common/runtime/runtime_impl.cc
Ack, thanks for the verification. We have a reasonable test in protobuf/utility.cc that validates we are doing cross-version deprecation checks (added in my last PR), so I'm reasonably confident based on that this PR is solid.
/azp run
Azure Pipelines successfully started running 3 pipeline(s).
  // Ideally this would be 'reserved 0' but one can't reserve the default
  // value. Instead we throw an exception if this is ever used.
- UNSUPPORTED_REST_LEGACY = 0 [deprecated = true];
+ UNSUPPORTED_REST_LEGACY = 0
I'm kind of surprised this wasn't disabled earlier, we have zero support for this AFAIK.
It was disabled, just the old way via the runtime strings :-)
Changing from relative name to absolute name, and fixing the fatal-by-defaults that were broken by the v3 switch.
The old way to allow fatal-by-defaults was
envoy.deprecated_features:proto_file.proto:field_name
the new way is
envoy.deprecated_features:full.namespace.field_name
When we switched to v3, all the hard-coded v2 names stopped working. This reinstates them via a hopefully more permanent proto annotation.
The only remaining ugly bit is that, unfortunately, the full namespace and field name are the v3 versions even if the original config was v2. Between @htuch and me, we should fix that before merging.
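As a rough illustration of the two key formats described above, the sketch below builds both from their parts. The function names `legacyOverrideKey` and `qualifiedOverrideKey` are hypothetical and do not come from the Envoy codebase; only the key shapes are taken from this description.

```cpp
#include <cassert>
#include <string>

// Old scheme: keyed by the proto file name and the bare field name.
inline std::string legacyOverrideKey(const std::string& proto_file,
                                     const std::string& field_name) {
  assert(!proto_file.empty() && !field_name.empty());
  return "envoy.deprecated_features:" + proto_file + ":" + field_name;
}

// New scheme: keyed by the fully qualified field (or enum value) name.
inline std::string qualifiedOverrideKey(const std::string& full_name) {
  assert(!full_name.empty());
  return "envoy.deprecated_features:" + full_name;
}
```

For example, `qualifiedOverrideKey("envoy.api.v2.route.CorsPolicy.enabled")` yields the key used in the manual test earlier in this conversation, envoy.deprecated_features:envoy.api.v2.route.CorsPolicy.enabled.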
Risk Level: Medium
Testing: added new unit tests
Docs Changes: updated
Release Notes: n/a