xds: change cdsbalancer to use update from dependency manager#8907
xds: change cdsbalancer to use update from dependency manager#8907eshitachandwani merged 18 commits intogrpc:masterfrom
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #8907 +/- ##
==========================================
+ Coverage 80.74% 83.38% +2.63%
==========================================
Files 416 410 -6
Lines 33434 32570 -864
==========================================
+ Hits 26996 27157 +161
+ Misses 4641 4038 -603
+ Partials 1797 1375 -422
🚀 New features to boost your workflow:
|
|
Regarding all the comments talking about eating the error , we were doing that since that was the current behaviour of cluster resolver i.e. it was eating errors instead of returning the error or putting the channel in TF. After an offline discussion we have decided to change the behaviour and any error in CDS LB policy i.e. any error relating to CDS not liking the update is returned from |
easwars
left a comment
There was a problem hiding this comment.
I see that you have replied saying "done" on a bunch of comments, but I don't see it being done. Are you missing some commits here?
I had pushed all the changes and can see the changes reflected in the PR. Can you please try again and let me know if the older and current changes are still not visibile. |
easwars
left a comment
There was a problem hiding this comment.
Mostly LGTM at this point. Will make another pass shortly.
| // Start an xDS management server that pushes the EDS resource names onto a | ||
| // channel when requested. |
There was a problem hiding this comment.
This comment needs to be updated.
| // Check if we have a request for both EDS resources. If so, fire | ||
| // the event to unblock the test. |
There was a problem hiding this comment.
Nit: What this event signifies should ideally be documented at the place where the event is defined.
| watchers map[string]*watcherState // Set of watchers and associated state, keyed by cluster name. | ||
| lbCfg *lbConfig // Current load balancing configuration. | ||
| childLB balancer.Balancer // Child policy, built upon resolution of the cluster graph. | ||
| xdsClient xdsclient.XDSClient // xDS client to watch Cluster resources. |
There was a problem hiding this comment.
Nit: Clarify in this comment that this is only for dynamic subscriptions.
There was a problem hiding this comment.
This is not needed for dynamic subscription but is passed so that cluster_impl can use it for Load reporting. Here it is mainly used to get nodeID. So removed the comment.
| lbCfg *lbConfig // Current load balancing configuration. | ||
| childLB balancer.Balancer // Child policy, built upon resolution of the cluster graph. | ||
| xdsClient xdsclient.XDSClient // xDS client to watch Cluster resources. | ||
| clusterConfigs map[string]*xdsresource.ClusterResult // Map of cluster name to the last received result for that cluster. |
There was a problem hiding this comment.
Nit: Keep these trialing comments short and sweet.
Something like:
- Cluster name to the last received result for that cluster
- Hostname to priority config for that leaf cluster
... - For dynamic cluster unsubscription
- True if a dynamic cluster has been subscribed to
etc .. etc
There was a problem hiding this comment.
Nit: Please remove mentions of the serializer as it does not exist anymore.
| @@ -168,16 +162,22 @@ type cdsBalancer struct { | |||
|
|
|||
| xdsHIPtr *unsafe.Pointer // Accessed atomically. | |||
There was a problem hiding this comment.
It would be nice to state why this needs to be accessed atomically.
There was a problem hiding this comment.
Since this will move to clusterimpl, will change the comment there.
| xdsClient xdsclient.XDSClient // xDS client to watch Cluster resources. | ||
| watchers map[string]*watcherState // Set of watchers and associated state, keyed by cluster name. | ||
| lbCfg *lbConfig // Current load balancing configuration. | ||
| childLB balancer.Balancer // Child policy, built upon resolution of the cluster graph. |
There was a problem hiding this comment.
Also, it would be nice to state that all fields below are accessed only from methods on the balancer.Balancer interface. And because grpc guarantees that those methods will never be called concurrently, we don't need any extra serialization for accessing these fields.
| sortedNames := make([]string, len(names)) | ||
| copy(sortedNames, names) |
There was a problem hiding this comment.
I think this can be replaced with a single line if you use slices.Clone.
| names := req.GetResourceNames() | ||
| sortedNames := make([]string, len(names)) | ||
| copy(sortedNames, names) | ||
| sort.Strings(sortedNames) |
There was a problem hiding this comment.
Nit: Use slices.Sort here as well? gemini says its better than sort.Strings(). I know it doesn't matter here. But whatever.
| const clusterName1 = clusterName + "-cluster-1" | ||
| const clusterName2 = clusterName + "-cluster-2" | ||
|
|
||
| gotEDSRequests := grpcsync.NewEvent() |
There was a problem hiding this comment.
This probably needs a better name, and a comment.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a significant and well-executed refactoring of the xDS implementation, primarily focusing on the CDS balancer. The core change involves centralizing resource watching logic within the dependency manager, which simplifies the CDS balancer by removing its responsibility for managing its own resource watchers. As part of this, the cluster_resolver balancer is completely removed, and the CDS balancer now uses the priority balancer as its child. The error handling for xDS resources has also been improved, making the system more robust. The tests have been thoroughly updated to reflect these architectural changes. My review includes one suggestion to improve the robustness of a test case.
This PR is part of A74 changes. This PR changes following:
RELEASE NOTES: