Incremental XDS proposal.#527
Incremental XDS proposal.#527nt wants to merge 1 commit intoenvoyproxy:masterfrom nt:incremental-xds
Conversation
Signed-off-by: Nicolas Thiebaud <nicothieb@google.com>
htuch
left a comment
There was a problem hiding this comment.
Thanks for kicking this off, this is a good start.
How does versioning work in the world of incremental updates? This is one of the key things that needs to be still addressed in this proposal.
|
|
||
| // When true the list of resource_names is added to the set of tracked resources | ||
| // by this node. | ||
| bool incremental = 4; |
There was a problem hiding this comment.
Would you want a way to remove tracked resources incrementally?
We should be clear that when the stream is dropped, this state disappears.
There was a problem hiding this comment.
I think incremental removal is handled below? More generally though, we need to think through reconnection. I think we need to build something into the incremental protocol where the first response on a stream is the full dump, and then it switches to incremental, or something like that. Otherwise in incremental mode Envoy will have to tell the management server everything it knows about when in incremental mode?
There was a problem hiding this comment.
The semantics of incremental are a bit different here. In this particular field, incremental refers to changing the list of tracked resources (or resource hints) maintained by the management server for a given xDS resource. Today, for CDS, this is the empty set, but in the future this will be the set of clusters that have been lazily loaded and the management server needs to know to update when changes come along at its end.
There was a problem hiding this comment.
I see, OK. Then yes we definitely need incremental removal also, in addition to the discussion about reconnection.
|
|
||
| // When true the resources list is a diff to the exiting resources. | ||
| // Incremental can only be set to true in DiscoveryResponse if this stream has | ||
| // seen incremental = true in a DiscorevyRequest. |
There was a problem hiding this comment.
I don't see why there needs to be a connection here. Logically, the resource names are a hint only, and we could do incremental resource names with or without incremental updates from the server.
We need to also discuss how incremental and full updates can be combined (e.g. sometimes it might make sense to do a full update rather than tons of tiny incremental ones or when establishing known state initially). The xDS protocol RFC is essentially https://github.com/envoyproxy/data-plane-api/blob/master/XDS_PROTOCOL.md, so we can cover some of this there.
Also nit: s/DiscorevyRequest/DiscoveryRequest/.
There was a problem hiding this comment.
+1 on incremental vs. full. I suppose it's possible w/ this API, but per my above comment should we consider a more specific sequence of events in which on initial connection we expect the server to send the full dump, then switch to incremental mode? Either way, given this flag, Envoy can either do the delta diff or not.
| // explicitly enumerated in resource_names. | ||
| repeated string resource_names = 3; | ||
|
|
||
| // When true the list of resource_names is added to the set of tracked resources |
There was a problem hiding this comment.
Another question to consider: how do we know if a cluster can't be obtained from the management server? E.g. if we get some header, the filter decides it wants a service referenced in the header, but the xDS server doesn't know about it.
There was a problem hiding this comment.
I suppose if the server doesn't respond with the requested cluster and a higher nonce, Envoy can assume it does not exist, but that seems very fragile. Perhaps we should be adding some ability to carry back error information from the management server? I could imagine the management server responding with the name of a requested resource and an associated error?
|
@nt thanks for starting this. I think what would be helpful here is potentially a full description of what the sequence of events is from Envoy <-> management server when doing incremental. Do you mind putting that together for discussion? I think it would help a lot. Ultimately that would go in the xDS docs anyway so it won't be throwaway work. |
|
Thank you for your comments, I am preparing a better doc/proposal for this. |
|
I'm going to close this out while we work on the doc. |
This is a strawman proposal for incremental XDS.
Its intention is to be backwards compatible with XDS jobs that have not updated to support incremental.