Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cluster_resolver: wait until the configs have been propagated to all the children #7562

Closed

Conversation

aranjans
Copy link
Contributor

@aranjans aranjans commented Aug 26, 2024

What does this PR do?

This PR ensures that every time configuration update is triggered, it waits for child policy configuration update inline.

Approach:

Added a channel to ccUpdate of which that channel will be closed after child policies are updated. Same is verified in the test by having a ClientConnUpdateHook() which is called after the above mentioned channel is closed, which ensures child policy was updated synchronously.

RELEASE NOTES:

  • cluster_resolver: wait for child policy configuration to be updated inline

@aranjans aranjans force-pushed the aranjans_cluster_resolver_pckr_update branch from 626ab22 to a3da044 Compare August 26, 2024 14:46
@aranjans aranjans added this to the 1.67 Release milestone Aug 26, 2024
Copy link

codecov bot commented Aug 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.82%. Comparing base (3eb0145) to head (931ffd1).
Report is 84 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7562      +/-   ##
==========================================
+ Coverage   81.57%   81.82%   +0.25%     
==========================================
  Files         354      361       +7     
  Lines       27076    27819     +743     
==========================================
+ Hits        22088    22764     +676     
- Misses       3798     3854      +56     
- Partials     1190     1201      +11     
Files with missing lines Coverage Δ
...ternal/balancer/clusterresolver/clusterresolver.go 73.09% <100.00%> (+2.61%) ⬆️

... and 91 files with indirect coverage changes

@aranjans aranjans force-pushed the aranjans_cluster_resolver_pckr_update branch from a3da044 to fdcdc5b Compare August 26, 2024 15:13
@purnesh42H purnesh42H changed the title cluster_resolver: wait for child policy configuration update inline cluster_resolver: wait until the configs have been propagated to all the children Sep 5, 2024
@@ -306,6 +310,11 @@ func (b *clusterResolverBalancer) run() {
switch update := u.(type) {
case *ccUpdate:
b.handleClientConnUpdate(update)
if update.done != nil {
// Close the channel to convey to the consumers of updateCh
// that child policies config are updated inline.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Close the channel to signal to consumers of updateCh
// that the child policies' configuration has been updated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

@@ -445,3 +446,64 @@ func (s) TestOutlierDetectionConfigPropagationToChildPolicy(t *testing.T) {
t.Fatalf("Timeout when waiting for child policy to receive its configuration")
}
}

// Test verifies that LB waits for child policy configuration update
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// TestChildPolicyConfigUpdateBlocking ensures that the load balancer
// blocks and waits for the child policy configuration to be updated
// in response to configuration changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aranjans need to change the comment too along with test name


// Test verifies that LB waits for child policy configuration update
// inline on receipt of configuration update.
func (s) TestChildPoliciesConfigUpdatedInline(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name suggestion: TestChildPolicyConfigUpdateBlocking

// Test verifies that LB waits for child policy configuration update
// inline on receipt of configuration update.
func (s) TestChildPoliciesConfigUpdatedInline(t *testing.T) {
// Override the newConfigHook to ensure picker was updated.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Override the ClientConnUpdateHook to track when the picker is updated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if err := managementServer.Update(ctx, resources); err != nil {
t.Fatal(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "Failed to update resources on management server: %v", err

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

t.Fatalf("EmptyCall() failed: %v", err)
}

// Close the listener and ensure that the config is updated.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Ensure that the child policy config update is processed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

// Close the listener and ensure that the config is updated.
lis.Close()
select {
case <-clientConnUpdateDone:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: can add a comment in success block // Success: config update was processed within the expected time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

Copy link
Contributor

@purnesh42H purnesh42H left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@easwars
Copy link
Contributor

easwars commented Sep 18, 2024

I've started an internal discussion thread to discuss what would be the correct fix for this. Please bring your thoughts to that thread.

@purnesh42H
Copy link
Contributor

I've started an internal discussion thread to discuss what would be the correct fix for this. Please bring your thoughts to that thread.

@easwars Should we still keep this PR open or approach will change non trivially from here?

@easwars
Copy link
Contributor

easwars commented Sep 24, 2024

I think the approach will change non-trivially from here, so we can close this for now. Thanks.

@easwars easwars closed this Sep 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants