release: flipping fatal-by-defaults #8847
alyssawilk wants to merge 6 commits into envoyproxy:master
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
I'm going to put this out here before the tests pass so @envoyproxy/maintainers can bikeshed about which things we don't want to make fatal. Also meta: while I love landing code and then turning features on, I think a quarterly cadence is too slow, because it makes for one massive high-risk PR. Should we go true-by-default and just leave things in the code for 6 months? True by default unless it's super high risk (like the buffer change)? Thoughts?
Also as long as I'm inviting bikeshedding, I'm inclined to split out fatal-by-defaults from feature flips, in case one causes more problems than the other. Thoughts? May check in tomorrow but I don't think it's urgent we land this all before Monday so we'll see :-P
Yeah, tests are unhappy locally. I'll look into it Monday unless someone else wants to take a look?
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
mattklein123 left a comment
Yeah definitely some stuff to discuss in here. Let's all chat as a group on Monday.
/wait
    // 1.12.0
    "envoy.deprecated_features.route.proto:regex_match",
    "envoy.deprecated_features.route.proto:regex",
    "envoy.deprecated_features.accesslog.proto:config",
We just did this 1-2 weeks ago so this seems pretty mean for those running on master. Defer to next release? Although aren't we moving to a 6 month cycle for this anyway? I can't remember. (Applies to all config in this list). Also I think this deprecation isn't even fully completed yet as @lizan identified some stragglers. cc @yanavlasov
Do we want to leave everything with config until the next release then? I don't think we can do the stragglers in this release
+1 let's just leave until next release.
    "envoy.deprecated_features.route.proto:regex",
    "envoy.deprecated_features.accesslog.proto:config",
    "envoy.deprecated_features.thrift_proxy.proto:config",
    "envoy.deprecated_features.cds.proto:tls_context",
Same, just done, seems not good.
Given this has been stale for a month, WDYT? :-P
My concern is still that this is a config option that almost everyone sets. With that said, after a month, it's probably fine. @lizan WDYT?
    "envoy.deprecated_features.route.proto:allow_origin",
    "envoy.deprecated_features.string.proto:regex",
    "envoy.deprecated_features.grpc_service.proto:config",
    "envoy.deprecated_features.tcp_proxy.proto:deprecated",
Is this right? I think this should be "deprecated_v1"
Yeah, the script doesn't deal well with
    // [#next-free-field: 11]
    message TcpProxy {
      // [#not-implemented-hide:] Deprecated.
      // TCP Proxy filter configuration using V1 format.
      message DeprecatedV1 {
        option deprecated = true;
I think this one is beyond Python so should manually tweak.
    "envoy.deprecated_features.grpc_service.proto:config",
    "envoy.deprecated_features.tcp_proxy.proto:deprecated",
    "envoy.deprecated_features.overload.proto:config",
    "envoy.deprecated_features.route.proto:value",
I think this is in a nested message. I'm a little worried this might accidentally conflict? Should the name actually include the full path?
mmm, I agree, but I think changing naming at this point is going to be ugly.
I'll see what I can do - I think we'll have to transition over to the new naming pretty carefully.
Oh yeah, I remember why I didn't do this - it's really non-trivial to do given the Python script is fairly stateless and doesn't know the message it's in. I think we'd have to fundamentally rewrite from scratch into something that did proto introspection.
Another alternative: we could have a not-yet-fatal section in runtime_features, and ASSERT in test that everything tagged as deprecated is either in disallowed_features or not_yet_disallowed_features. CI would then force folks to add their own fully qualified fields as they added code, and then we could just cut from not_yet_disallowed to disallowed as part of the release.
WDYT?
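A rough sketch of the invariant proposed above, for illustration only. The function names and the field names are placeholders rather than anything in the Envoy tree, and a real check would live in a C++ test against runtime_features; this only shows the set logic: every deprecated field must appear in exactly one of the two lists, and a release moves everything from not-yet-disallowed to disallowed.

```python
def check_deprecation_coverage(deprecated, disallowed, not_yet_disallowed):
    """Return a list of problems; an empty list means every deprecated
    field is tracked in exactly one of the two lists."""
    problems = []
    for name in sorted(disallowed & not_yet_disallowed):
        problems.append(f"{name}: in both lists")
    for name in sorted(deprecated - disallowed - not_yet_disallowed):
        problems.append(f"{name}: deprecated but in neither list")
    for name in sorted((disallowed | not_yet_disallowed) - deprecated):
        problems.append(f"{name}: listed but not marked deprecated")
    return problems


def cut_release(disallowed, not_yet_disallowed):
    """At release time, everything not-yet-disallowed becomes disallowed."""
    return disallowed | not_yet_disallowed, set()


# Placeholder names (assumptions), not real Envoy fields.
deprecated = {"pkg.Foo.old_field", "pkg.Bar.old_field", "pkg.Bar.legacy"}
disallowed = {"pkg.Foo.old_field"}
not_yet_disallowed = {"pkg.Bar.old_field"}

# pkg.Bar.legacy was deprecated without being added to either list,
# so this is the case CI would flag.
problems = check_deprecation_coverage(deprecated, disallowed, not_yet_disallowed)
print(problems)
```

With a check like this in CI, whoever deprecates a field has to supply its fully qualified name at that point, so the release cut no longer depends on a script guessing names.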
@alyssawilk do you mind briefly syncing up with @htuch about this? Some of the proto parsing stuff he is doing now is pretty incredible and I'm guessing it wouldn't be that difficult to change your script to actually run against the real proto tree.
Yeah, I think the approach you could take here is to leverage the API type database to pull out this information. There may be other approaches we could take, e.g. we could include deprecation date annotations which could allow us to infer the disallowed fields via reflection.
Sure. Do you have someone who could take a look some time this release (at which point I can revert this) and file a tracking bug for someone else? Alternately I could snag some time on your calendar for code pointers, given my Python-fu is pretty weak.
I think anyone who can do this is probably going to be busy landing v3 for this release, so I think we can TODO/file issue here.
OK, so I'll just land manual cleanups, and TODO script cleanups or script rewrites.
    "envoy.deprecated_features.listener.proto:tls_context",
    "envoy.deprecated_features.listener.proto:config",
    "envoy.deprecated_features.health_check.proto:config",
    "envoy.deprecated_features.http_connection_manager.proto:[(validate.rules).enum",
mmm, that's
    OperationName operation_name = 1
        [(validate.rules).enum = {defined_only: true}, deprecated = true];
Again I think there are some things that are beyond simple checking - we'd have to either do this programmatically, which would be awesome but I'm not sure how to do, or accept some cleanup
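For illustration, here is one way the two awkward cases quoted in this thread (an options list that wraps onto the next line, and a message-level `option deprecated = true;`) could be picked up with regular expressions. This is a sketch, not the script in the PR: `find_deprecated` and the pattern names are made up here, the sample proto is abridged (the `stat_prefix` line is added only as a non-deprecated control), and it still has no idea which enclosing message a field belongs to, which is the limitation raised above.

```python
import re

# A field declaration with an options list, possibly spanning lines:
#   <type> <name> = <number> [ ...options... ];
FIELD_WITH_OPTIONS = re.compile(
    r"(?P<name>\w+)\s*=\s*\d+\s*\[(?P<options>[^;]*)\]\s*;"
)
# A message whose first statement is `option deprecated = true;`.
DEPRECATED_MESSAGE = re.compile(
    r"message\s+(?P<name>\w+)\s*\{\s*option\s+deprecated\s*=\s*true\s*;"
)
DEPRECATED_OPTION = re.compile(r"\bdeprecated\s*=\s*true\b")


def find_deprecated(proto_text):
    """Return (deprecated field names, deprecated message names)."""
    # Drop // comments so words like "Deprecated." in them are ignored.
    text = re.sub(r"//[^\n]*", "", proto_text)
    fields = [
        m.group("name")
        for m in FIELD_WITH_OPTIONS.finditer(text)
        if DEPRECATED_OPTION.search(m.group("options"))
    ]
    messages = [m.group("name") for m in DEPRECATED_MESSAGE.finditer(text)]
    return fields, messages


SAMPLE = """
message HttpConnectionManager {
  OperationName operation_name = 1
      [(validate.rules).enum = {defined_only: true}, deprecated = true];
  string stat_prefix = 2 [(validate.rules).string = {min_bytes: 1}];
}
// [#next-free-field: 11]
message TcpProxy {
  // [#not-implemented-hide:] Deprecated.
  message DeprecatedV1 {
    option deprecated = true;
  }
}
"""

fields, messages = find_deprecated(SAMPLE)
print(fields, messages)
```

Matching on the whole declaration up to the semicolon, rather than line by line, is what keeps `[(validate.rules).enum` from being mistaken for a field name. Mapping a deprecated message such as `DeprecatedV1` back to the field that uses it would still need real proto introspection.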
+1
+1
This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
alyssawilk left a comment
Woo, I have time to work on Envoy this month!
Still not ready for review - I'm going to do some script fixes, but sending out comments for conversation
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Ok, got rid of stragglers. Running tests locally but figured I'd ping about protocol buffer names - do we want to do name.field for 1.12 or do the changeover consistently for all fields when we cut to the next API version? I can go either way
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
mattklein123 left a comment
Thanks for coming back to this. A few comments.
/wait
    tracing:
      operation_name: INGRESS
    idle_timeout: 840s
    http2_protocol_options: {}
Did you mean to delete this?
    common_http_protocol_options:
      idle_timeout: 840s
How did this pass tests in the first place? Shouldn't this have failed somehow on the compile time options build?
We had the example configs test tagged as excluded. I'll go and fix that as I think it's more trouble than it's worth.
    - safe_regex:
        google_re2: {}
        regex: .*
nit: do an exact match on "*" here. Same question though on how this passed tests before?
Sorry can you clarify? I'm not sure what we are choosing between?
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
At a high level we were discussing if we wanted to have
IMO I would just manually fix it now? The downside would be potentially breaking people who are setting these flags but IMO an envoy-announce@ email would be good enough for that. WDYT? /wait-any
Oh sorry, did you want to fix including all the ones that are overridden today, or just the new ones? I thought we'd have to support both for a bit to not be crummy. Also two questions. If we're going to change this
Fullname has the disadvantage of being a bit longer, but AFAIK can't have conflicts
IMO it's probably OK to just change it without supporting both as long as we email envoy-announce, since users should be able to set both variables in runtime if needed before the deploy. I think this will affect almost no one so it would be better to reduce our pain if we think that is ok.
Big +1 on full name. @htuch any thoughts here? Also see the comment above about auto conversion. /wait
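To make "set both variables in runtime" concrete, a small sketch of what an operator could generate ahead of the deploy. The old-style key is taken from this PR; the new-style key format is an assumption, since the thread has not settled on one, and `transitional_overrides` is a hypothetical helper. How the resulting mapping is fed into a runtime layer is left out.

```python
import json


def transitional_overrides(renames, allow=True):
    """Build runtime overrides that set both the old and the new key, so
    the override keeps working on either side of the rename."""
    overrides = {}
    for old_key, new_key in renames.items():
        overrides[old_key] = allow
        overrides[new_key] = allow
    return overrides


# Old key as listed in this PR; the new full-name key is hypothetical.
renames = {
    "envoy.deprecated_features.route.proto:regex":
        "envoy.deprecated_features:envoy.api.v2.route.RouteMatch.regex",
}

overrides = transitional_overrides(renames)
print(json.dumps(overrides, indent=2, sort_keys=True))
```

Once every instance is on a build that only reads the new key, the old entry can be dropped.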
Yeah, full name is the way to go for sure, way less danger of conflict. Regarding whether this gets replaced by something fancier; I think in terms of the technology we use, we have options after v3 lands, we now have some pretty deep proto and C++ analysis/rewrite capabilities (Envoy begins learning at a geometric rate and becomes self aware at 2:14am Eastern time, January 1 2020). OTOH, I do like the idea of fatal-by-default flips each quarter, so orthogonal to the technology choice, I think we want to maintain this capability. This encourages earlier migration to new API use best practices and provides an audit trail of where folks are lagging, within a single major API version.
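The conflict risk is easy to show with a toy comparison of the two naming schemes. The short-key format matches the strings in this PR; the full-name key format and the fully qualified field paths below are assumptions used only for illustration, and the helper names are made up.

```python
from collections import defaultdict


def short_key(proto_file, field_path):
    """Current scheme: file basename plus the bare field name."""
    field = field_path.rsplit(".", 1)[-1]
    return f"envoy.deprecated_features.{proto_file}:{field}"


def full_key(proto_file, field_path):
    """Assumed full-name scheme: keyed on the fully qualified field."""
    return f"envoy.deprecated_features:{field_path}"


def collisions(fields, key_fn):
    """Map each key that is shared by more than one field to those fields."""
    by_key = defaultdict(list)
    for proto_file, field_path in fields:
        by_key[key_fn(proto_file, field_path)].append(field_path)
    return {key: paths for key, paths in by_key.items() if len(paths) > 1}


# Illustrative field paths: two different messages in one file, each with
# a field called `regex`.
fields = [
    ("route.proto", "envoy.api.v2.route.RouteMatch.regex"),
    ("route.proto", "envoy.api.v2.route.QueryParameterMatcher.regex"),
    ("route.proto", "envoy.api.v2.route.QueryParameterMatcher.value"),
]

short_collisions = collisions(fields, short_key)
full_collisions = collisions(fields, full_key)
print(short_collisions)
print(full_collisions)
```

Under the short scheme both `regex` fields land on `envoy.deprecated_features.route.proto:regex`, so one runtime override would cover both; the full-name scheme keeps them apart at the cost of longer keys.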
This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
This pull request has been automatically closed because it has not had activity in the last 14 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
So I had this theory wherein I'd get this out today, and unexpectedly spent [longer than I will admit] figuring out why I couldn't get things to fail. Turns out with the v3 migration, we had deprecated and made fatal by default v2 is still supported AFAIK via proto conversion so I think for this release we mark the "fatal" ones illegal conversions, and then deal with new deprecations after the release?
Yeah, let's chat tomorrow. The deprecated field renaming is very mechanical, so I feel there must be a way to recover the information you are after here.
This PR makes the following fatal by default:
from cluster.proto: ORIGINAL_DST_LB, tls_context, extension_protocol_options
from health_check.proto: use_http2
from route_components.proto: allow_origin, regex, pattern, method, regex_match, value
from http_connection_manager.proto: operation_name
from trace.proto: HTTP_JSON_V1
from string.proto: regex
Risk Level: Medium (who knows who is using them)
Testing: test framework updates
Docs Changes: n/a
Release Notes: n/a
Originally #8847
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Making the following configs fatal by default:
"envoy.deprecated_features.route.proto:regex_match",
"envoy.deprecated_features.route.proto:regex",
"envoy.deprecated_features.cds.proto:tls_context",
"envoy.deprecated_features.route.proto:allow_origin",
"envoy.deprecated_features.string.proto:regex",
"envoy.deprecated_features.route.proto:value",
"envoy.deprecated_features.http_connection_manager.proto:idle_timeout",
"envoy.deprecated_features.lds.proto:use_original_dst",
"envoy.deprecated_features.config_source.proto:DEPRECATED_AND_UNAVAILABLE_DO_NOT_USE",
"envoy.deprecated_features.route.proto:pattern",
"envoy.deprecated_features.listener.proto:tls_context",
"envoy.deprecated_features.http_connection_manager.proto:operation_name",
"envoy.deprecated_features.health_check.proto:use_http2",
"envoy.deprecated_features.trace.proto:DEPRECATED_AND_UNAVAILABLE_DO_NOT_USE",
"envoy.deprecated_features.trace.proto:HTTP_JSON_V1",
"envoy.deprecated_features.cds.proto:ORIGINAL_DST_LB",
"envoy.deprecated_features.route.proto:regex",
"envoy.deprecated_features.cds.proto:extension_protocol_options",
"envoy.deprecated_features.route.proto:method",
Risk Level: High