Skip to content

config: improve temporary delta xDS choosing mechanism#8587

Merged
htuch merged 11 commits intoenvoyproxy:masterfrom
fredlas:FIX_delta_ads
Oct 22, 2019
Merged

config: improve temporary delta xDS choosing mechanism#8587
htuch merged 11 commits intoenvoyproxy:masterfrom
fredlas:FIX_delta_ads

Conversation

@fredlas
Copy link
Contributor

@fredlas fredlas commented Oct 11, 2019

The is_delta temporary param breaks things when ADS is configured as state-of-the-world, and then a subscription is called with is_delta true. This PR replaces is_delta with an (also temporary) isDelta() function on GrpcMux, which is the real source of truth on whether it's safe to use a DeltaSubscriptionImpl for a new ADS subscription. All of this will be removed by #8478.

Risk Level: low

Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
@fredlas
Copy link
Contributor Author

fredlas commented Oct 11, 2019

/assign stevenzzzz

@stevenzzzz
Copy link
Contributor

could you add some tests? Also since the is_delta is not used, maybe we can do a cleanup to remove that (temp) parameter from the function definition.

@stevenzzzz
Copy link
Contributor

loop in htuch@ for Senior maintainer approval.

@fredlas
Copy link
Contributor Author

fredlas commented Oct 11, 2019

Removal of is_delta in progress; this reduces the file changed count in #8478, so that's nice.

I understand the desire to add tests for something that was broken, but I think this one is an awkward situation. The brokenness is because #7293 got directly merged, rather than getting a sort of "checkpoint" approval and then merging in what is now #8478. It's brokenness due to multiple versions of the same code existing at once. With #8478 there will only be one version left. So, it won't just be "impossible" for this problem to recur, the problem will actually no longer be a meaningfully existing concept.

@stevenzzzz
Copy link
Contributor

Removal of is_delta in progress; this reduces the file changed count in #8478, so that's nice.

I understand the desire to add tests for something that was broken, but I think this one is an awkward situation. The brokenness is because #7293 got directly merged, rather than getting a sort of "checkpoint" approval and then merging in what is now #8478. It's brokenness due to multiple versions of the same code existing at once. With #8478 there will only be one version left. So, it won't just be "impossible" for this problem to recur, the problem will actually no longer be a meaningfully existing concept.

can we cleanup the is_delta parameter? it's not used, so probably just some quick deletion work.

Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
@fredlas
Copy link
Contributor Author

fredlas commented Oct 11, 2019

is_delta removed.

Signed-off-by: Fred Douglas <fredlas@google.com>
@stevenzzzz
Copy link
Contributor

LGTM.

@stevenzzzz
Copy link
Contributor

/retest

@repokitteh-read-only
Copy link

🤷‍♀️ nothing to rebuild.

🐱

Caused by: a #8587 (comment) was created by @stevenzzzz.

see: more, trace.

@fredlas
Copy link
Contributor Author

fredlas commented Oct 14, 2019

/retest

@repokitteh-read-only
Copy link

🤷‍♀️ nothing to rebuild.

🐱

Caused by: a #8587 (comment) was created by @fredlas.

see: more, trace.

@fredlas
Copy link
Contributor Author

fredlas commented Oct 14, 2019

The envoy-macos one appears to be a flake.

@fredlas fredlas changed the title config: improve delta xDS choosing temporary mechanism config: improve temporary delta xDS choosing mechanism Oct 14, 2019
Signed-off-by: Fred Douglas <fredlas@google.com>
@fredlas
Copy link
Contributor Author

fredlas commented Oct 15, 2019

Now envoy-macos passed, but envoy-linux failed, with a different test failing. So, I think it's safe to conclude those are flakes.

@stevenzzzz
Copy link
Contributor

ping on this PR.
@htuch could you please take a look?

lambdai
lambdai previously approved these changes Oct 17, 2019
Copy link
Contributor

@lambdai lambdai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For ADS: the singleton ads config determine the delta or not, not per ClusterConfig/ListenerConfig.
For non-ADS: config source determine if it's delta or not since it has its own mux

Signed-off-by: Fred Douglas <fredlas@google.com>
…equirement is really on a knife edge...

Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
@fredlas
Copy link
Contributor Author

fredlas commented Oct 18, 2019

I'm not sure how the source/common/config/api_type_db.generated.pb_text changes got into this PR. Should I remove them?

@stevenzzzz
Copy link
Contributor

/assign htuch
PTAL.

Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix. Please revert the API type database stuff, this is a bug that I need to fix, it shouldn't be doing that if there are no API changes. Just one comment; I hadn't really though much about mixed delta/SOTW until seeing this PR.
/wait


// TODO(fredlas) PR #8478 will remove this.
/**
* Whether this GrpcMux is delta.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the end goal for supporting a combination of SOTW and delta? I would imagine that some operators will want both, with let's say CDS delta but LDS SOTW. Can that be done over a single gRPC stream?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and no and maybe eventually yes, haha. Non-aggregated xDS can already mix and match delta and SotW no problem. When using ADS, because delta and SotW have different request/response protos, and different gRPC services corresponding to those protos, there can only be one type.

However, the change we briefly discussed a while back, where there would be a single top level carrier proto for request+response, and the various types of messages (delta vs SotW, non-ACK request vs ACK) were just different oneof options, would enable mixing within ADS. In practice that would take a bit of plumbing in this code (probably pretty straightforward), and would also need the bootstrap config format to be reworked a bit: rather than just configuring your CDS as "CDS, provided by ADS [implicitly referring to the ADS config]", you would also need to choose delta/SotW in the CDS config thing. And... actually, the choice would be removed from the ADS config thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack. OK, let's go ahead with this as is, thanks for the reminder of our earlier conversation on this.

Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@htuch htuch merged commit b261cd5 into envoyproxy:master Oct 22, 2019
@fredlas
Copy link
Contributor Author

fredlas commented Oct 22, 2019

Great, thanks very much, this is much cleaner, not to mention fixed. (And thanks to @stevenzzzz for bringing it to my attention)

derekargueta pushed a commit to derekargueta/envoy that referenced this pull request Oct 24, 2019
The is_delta temporary param breaks things when ADS is configured as state-of-the-world, and then a subscription is called with is_delta true. This PR replaces is_delta with an (also temporary) isDelta() function on GrpcMux, which is the real source of truth on whether it's safe to use a DeltaSubscriptionImpl for a new ADS subscription. All of this will be removed by envoyproxy#8478.

Risk Level: low

Signed-off-by: Fred Douglas <fredlas@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants