@Pranjali-2501 Pranjali-2501 commented Dec 1, 2025

Fixes: #8153

This PR implements the "Changes to Logical DNS Clusters" section of gRFC A61 (IPv4/IPv6 Dual-stack Backends).

Currently, LOGICAL_DNS clusters in the xDS cluster resolver have their Load Balancing policy hard-coded to pick_first. This ensures the semantics of connecting to only one address at a time.

This PR updates the xds_cluster_resolver logic in two ways. First, buildClusterImplConfigForDNS no longer hard-codes pick_first for LOGICAL_DNS clusters, so they use the LB policy configured in the cluster resource.
Second, the DNS update handler now groups all resolved addresses into a single resolver.Endpoint. This ensures that, regardless of the configured parent LB policy, the child policy sees a single "backend" endpoint containing all addresses.
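The grouping step described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: Address and Endpoint are simplified stand-ins for resolver.Address and resolver.Endpoint, groupIntoSingleEndpoint is a hypothetical helper name, and returning no endpoints for an empty DNS result is an assumption of the sketch.

```go
package main

import "fmt"

// Address and Endpoint are simplified stand-ins for resolver.Address and
// resolver.Endpoint; the real types carry additional fields.
type Address struct{ Addr string }

type Endpoint struct{ Addresses []Address }

// groupIntoSingleEndpoint (hypothetical name) places every resolved address
// into one endpoint, so the LB policy sees a single backend.
func groupIntoSingleEndpoint(addrs []Address) []Endpoint {
	if len(addrs) == 0 {
		// Assumption of this sketch: an empty DNS result yields no endpoints.
		return nil
	}
	// Copy the slice so the endpoint does not alias the resolver's slice.
	all := make([]Address, len(addrs))
	copy(all, addrs)
	return []Endpoint{{Addresses: all}}
}

func main() {
	eps := groupIntoSingleEndpoint([]Address{
		{Addr: "10.0.0.1:443"},
		{Addr: "[2001:db8::1]:443"},
	})
	fmt.Printf("endpoints=%d addresses=%d\n", len(eps), len(eps[0].Addresses))
}
```

Keeping the addresses in resolver order matters here, because a pick_first-style child tries them in that order.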

RELEASE NOTES:

  • xds: LOGICAL_DNS clusters now honor the LB policy configured in the cluster resource, rather than defaulting to a hardcoded pick_first policy.

Pranjali-2501 added this to the 1.78 Release milestone Dec 1, 2025
Pranjali-2501 added the labels "Type: Feature" (New features or improvements in behavior) and "Area: xDS" (Includes everything xDS related, including LB policies used with xDS) Dec 1, 2025
codecov bot commented Dec 1, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.31%. Comparing base (749af0c) to head (c68ffd6).
⚠️ Report is 4 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #8733   +/-   ##
=======================================
  Coverage   83.30%   83.31%           
=======================================
  Files         419      418    -1     
  Lines       32450    32365   -85     
=======================================
- Hits        27033    26964   -69     
+ Misses       4039     4028   -11     
+ Partials     1378     1373    -5     
Files with missing lines Coverage Δ
...rnal/xds/balancer/clusterresolver/configbuilder.go 94.16% <100.00%> (+0.04%) ⬆️
.../balancer/clusterresolver/resource_resolver_dns.go 74.19% <100.00%> (-1.20%) ⬇️

... and 45 files with indirect coverage changes

easwars commented Dec 1, 2025

This change probably needs a release note.

easwars assigned Pranjali-2501 and unassigned easwars and arjan-bal Dec 1, 2025
// Endpoint picking policy for DNS is hardcoded to pick_first.
const childPolicy = "pick_first"
func buildClusterImplConfigForDNS(g *nameGenerator, endpoints []resolver.Endpoint, mechanism DiscoveryMechanism, xdsLBPolicy *internalserviceconfig.BalancerConfig) (string, *clusterimpl.LBConfig, []resolver.Endpoint) {
retEndpoints := make([]resolver.Endpoint, len(endpoints))
We want to return a slice that contains a single endpoint.

Comment on lines +146 to +150
// Use the canonical string representation for the locality to match
// the keys expected by the parent Load Balancing policy.
localityStr := xdsinternal.LocalityString(clients.Locality{})
retEndpoints[i] = hierarchy.SetInEndpoint(e, []string{pName, localityStr})
retEndpoints[i] = wrrlocality.SetAddrInfoInEndpoint(retEndpoints[i], wrrlocality.AddrInfo{LocalityWeight: 1})
This is still not doing what we want.

The DNS resolver implementation returns addresses and not endpoints, because the DNS protocol does not know about endpoints. See:

state := resolver.State{Addresses: addrs}

We have code in the channel that converts these addresses to endpoints, with one address per endpoint. See:

func addressesToEndpoints(addrs []resolver.Address) []resolver.Endpoint {

But the above code executes only when the DNS resolver is used directly on the gRPC channel. Here in the cluster_resolver LB policy, we create a DNS resolver ourselves, so the address-to-endpoint conversion that lives in the channel does not run. The cluster_resolver therefore sees a set of addresses and no endpoints from the DNS resolver in resource_resolver_dns.go (https://github.com/grpc/grpc-go/pull/8733/files#diff-6249528a41f17a06cec41f598b885840f09ef05ee3740ab297fc5a905f2875ca), and will probably convert them into a single endpoint containing all the addresses (this is the change that you have in line 138 of resource_resolver_dns.go).

But, at some point, we will change the DNS resolver implementation to do what is being done today in the channel, i.e., return a set of endpoints, with one address per endpoint. That means that this code should handle multiple endpoints as input, but output a single endpoint that contains all the addresses from the input endpoints.
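A rough sketch of the requested behavior: take any number of input endpoints and return a slice holding one endpoint with every address. The types are simplified stand-ins for resolver.Address and resolver.Endpoint, and both function names are hypothetical. onePerAddress only models what a future DNS resolver might hand over. The real function also attaches hierarchy and locality attributes, which this sketch omits.

```go
package main

import "fmt"

// Simplified stand-ins for resolver.Address and resolver.Endpoint.
type Address struct{ Addr string }

type Endpoint struct{ Addresses []Address }

// onePerAddress (hypothetical) models a resolver that returns one endpoint
// per address, as the channel does today for DNS results.
func onePerAddress(addrs []Address) []Endpoint {
	eps := make([]Endpoint, 0, len(addrs))
	for _, a := range addrs {
		eps = append(eps, Endpoint{Addresses: []Address{a}})
	}
	return eps
}

// mergeIntoSingleEndpoint (hypothetical) collects the addresses of all input
// endpoints, in order, into a slice that contains exactly one endpoint.
func mergeIntoSingleEndpoint(endpoints []Endpoint) []Endpoint {
	var all []Address
	for _, ep := range endpoints {
		all = append(all, ep.Addresses...)
	}
	return []Endpoint{{Addresses: all}}
}

func main() {
	in := onePerAddress([]Address{{Addr: "10.0.0.1:443"}, {Addr: "10.0.0.2:443"}})
	out := mergeIntoSingleEndpoint(in)
	fmt.Printf("in=%d endpoints, out=%d endpoint with %d addresses\n",
		len(in), len(out), len(out[0].Addresses))
}
```

Because the merge iterates over endpoints rather than assuming one input shape, it behaves the same whether the resolver returns a single multi-address endpoint or many single-address endpoints.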

return endpoint
}

func TestBuildClusterImplConfigForDNS(t *testing.T) {
We want to test all possibilities here:

  • one endpoint with one address
  • one endpoint with multiple addresses
  • multiple endpoints, all with one address each
  • multiple endpoints, all with multiple addresses.
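The four cases above could be laid out as a table, sketched here as a standalone program. In the repository this would presumably be a table-driven test with t.Run inside TestBuildClusterImplConfigForDNS. The flatten function is a hypothetical stand-in for the logic under test, and the types are simplified versions of resolver.Address and resolver.Endpoint.

```go
package main

import "fmt"

// Simplified stand-ins for resolver.Address and resolver.Endpoint.
type Address struct{ Addr string }

type Endpoint struct{ Addresses []Address }

// flatten is a hypothetical stand-in for the logic under test: it returns a
// single endpoint holding every input address.
func flatten(endpoints []Endpoint) []Endpoint {
	var all []Address
	for _, ep := range endpoints {
		all = append(all, ep.Addresses...)
	}
	return []Endpoint{{Addresses: all}}
}

type testCase struct {
	name      string
	input     []Endpoint
	wantAddrs []string
}

// cases lists the four input shapes named in the review comment.
func cases() []testCase {
	return []testCase{
		{"one endpoint, one address",
			[]Endpoint{{Addresses: []Address{{"a"}}}}, []string{"a"}},
		{"one endpoint, multiple addresses",
			[]Endpoint{{Addresses: []Address{{"a"}, {"b"}}}}, []string{"a", "b"}},
		{"multiple endpoints, one address each",
			[]Endpoint{{Addresses: []Address{{"a"}}}, {Addresses: []Address{{"b"}}}}, []string{"a", "b"}},
		{"multiple endpoints, multiple addresses each",
			[]Endpoint{{Addresses: []Address{{"a"}, {"b"}}}, {Addresses: []Address{{"c"}, {"d"}}}},
			[]string{"a", "b", "c", "d"}},
	}
}

// check returns nil when flatten produces one endpoint with wantAddrs in order.
func check(tc testCase) error {
	got := flatten(tc.input)
	if len(got) != 1 {
		return fmt.Errorf("got %d endpoints, want 1", len(got))
	}
	if len(got[0].Addresses) != len(tc.wantAddrs) {
		return fmt.Errorf("got %d addresses, want %d", len(got[0].Addresses), len(tc.wantAddrs))
	}
	for i, a := range got[0].Addresses {
		if a.Addr != tc.wantAddrs[i] {
			return fmt.Errorf("address %d: got %q, want %q", i, a.Addr, tc.wantAddrs[i])
		}
	}
	return nil
}

func main() {
	for _, tc := range cases() {
		if err := check(tc); err != nil {
			fmt.Printf("FAIL %s: %v\n", tc.name, err)
			continue
		}
		fmt.Printf("ok   %s\n", tc.name)
	}
}
```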

We also want a more e2e-style test to cover this new logic: a LOGICAL_DNS cluster that specifies a round_robin LB policy, with the DNS resolver returning multiple addresses. In this case, because cluster_resolver will convert these addresses into a single endpoint, we should end up with all traffic going to the first address (because RR will delegate to PF).
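The expected outcome can be illustrated with a toy model. This is not gRPC code: roundRobin and pickFirst are simplified stand-ins that assume every address is connectable, so pick_first always settles on the first address of an endpoint.

```go
package main

import "fmt"

// Endpoint is a simplified stand-in for resolver.Endpoint.
type Endpoint struct{ Addresses []string }

// pickFirst models pick_first under the assumption that the first address
// is always connectable.
func pickFirst(ep Endpoint) string { return ep.Addresses[0] }

// roundRobin returns a picker that cycles across endpoints and delegates
// the choice of address within an endpoint to pickFirst.
func roundRobin(eps []Endpoint) func() string {
	next := 0
	return func() string {
		ep := eps[next%len(eps)]
		next++
		return pickFirst(ep)
	}
}

func main() {
	// LOGICAL_DNS after this change: one endpoint holding all addresses.
	single := roundRobin([]Endpoint{{Addresses: []string{"a", "b", "c"}}})
	counts := map[string]int{}
	for i := 0; i < 6; i++ {
		counts[single()]++
	}
	fmt.Println("single endpoint:", counts)

	// For contrast: one endpoint per address spreads the picks.
	spread := roundRobin([]Endpoint{
		{Addresses: []string{"a"}},
		{Addresses: []string{"b"}},
		{Addresses: []string{"c"}},
	})
	counts = map[string]int{}
	for i := 0; i < 6; i++ {
		counts[spread()]++
	}
	fmt.Println("one endpoint per address:", counts)
}
```

In the single-endpoint case round_robin has only one child to rotate over, which is why an e2e test should observe every RPC landing on the first address.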

easwars assigned Pranjali-2501 and unassigned easwars Dec 3, 2025
Development

Successfully merging this pull request may close these issues.

Behavior change for A61 update: remove special case for LOGICAL_DNS clusters
