roundrobin: randomize starting address when rebuilding the picker #2579
Conversation
cc @ejona86 @markdroth FYI
The more I think about this, the less convinced I am that randomization is the right behavior. As we've discussed, there are two cases here: (1) the picker is rebuilt because an individual subchannel changed state, and (2) the picker is rebuilt because the resolver or balancer delivered a new address list.
In case 2, I think starting at zero is fine. Starting at a random index could be harmful if there is a small rate of very expensive RPCs and a resolver or balancer gave us addresses in a particular order and expected us to follow that order. Admittedly, that's probably a rare case, so we could probably get away with using a random index here, but since starting at zero is probably easier to implement than randomizing, I still think it's the better choice.

Case 1 is the more interesting case. Starting at a random index is strictly better than starting over at zero, for the reasons you note in this PR. However, I still think the right behavior is to keep going from the index we were at previously. Otherwise, any time an individual backend is flapping, we will be constantly skipping randomly throughout the list. Even worse, all clients will be doing that random skipping, which means that round_robin winds up degenerating into pick_random.
That doesn't sound that bad, as a degenerate case. Being the literal worst case, degenerating to pick_random... meh. Like, I have trouble caring. To change that behavior would require a larger change in how subchannels are passed to the picker. So it sounds like this PR should still be merged, even if it is expected to be replaced later.
```
@@ -47,12 +48,19 @@ type rrPickerBuilder struct{}

func (*rrPickerBuilder) Build(readySCs map[resolver.Address]balancer.SubConn) balancer.Picker {
	grpclog.Infof("roundrobinPicker: newPicker called with readySCs: %v", readySCs)
	if len(readySCs) == 0 {
		return base.NewErrPicker(balancer.ErrNoSubConnAvailable)
	}
	var scs []balancer.SubConn
	for _, sc := range readySCs {
```
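The excerpt above cuts off before the added lines. Conceptually, the change amounts to picking a random starting index whenever the picker is rebuilt; below is a rough sketch under that assumption (types simplified, and math/rand standing in for whatever random source the real code uses), not the exact diff:

```go
package roundrobin

import (
	"context"
	"math/rand"
	"sync"

	"google.golang.org/grpc/balancer"
	"google.golang.org/grpc/balancer/base"
	"google.golang.org/grpc/resolver"
)

// Simplified stand-ins for the types in the surrounding file.
type rrPickerBuilder struct{}

type rrPicker struct {
	mu       sync.Mutex
	subConns []balancer.SubConn
	next     int
}

// Build collects the ready subchannels and starts the round-robin position at
// a random index instead of 0, so frequent rebuilds (e.g. a flapping backend)
// do not funnel every client's first picks to the same backend.
func (*rrPickerBuilder) Build(readySCs map[resolver.Address]balancer.SubConn) balancer.Picker {
	if len(readySCs) == 0 {
		return base.NewErrPicker(balancer.ErrNoSubConnAvailable)
	}
	var scs []balancer.SubConn
	for _, sc := range readySCs {
		scs = append(scs, sc)
	}
	return &rrPicker{
		subConns: scs,
		next:     rand.Intn(len(scs)), // randomize the starting address
	}
}

// Pick cycles through the subchannels from the randomized starting point.
func (p *rrPicker) Pick(ctx context.Context, opts balancer.PickOptions) (balancer.SubConn, func(balancer.DoneInfo), error) {
	p.mu.Lock()
	sc := p.subConns[p.next]
	p.next = (p.next + 1) % len(p.subConns)
	p.mu.Unlock()
	return sc, nil, nil
}
```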
So Go is iterating through the subchannels in a shuffled order?
Effectively, yes. They are ordered by how they happen to be in the readySCs map, keyed by the address (as a string, plus a "type" and a pointer to optional user metadata). This is also not right, but I do think randomizing here is better than not randomizing, so we should merge this PR for now and file an issue to improve further.
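For reference, Go deliberately leaves map iteration order unspecified, which is why ranging over readySCs effectively shuffles the subchannels on every rebuild; a toy illustration (the addresses are made up):

```go
package main

import "fmt"

func main() {
	// Iteration order over a map is unspecified and varies between runs
	// (and often between iterations), so nothing here is stable.
	readyAddrs := map[string]int{
		"10.0.0.1:50051": 0,
		"10.0.0.2:50051": 1,
		"10.0.0.3:50051": 2,
	}
	for i := 0; i < 3; i++ {
		for addr := range readyAddrs {
			fmt.Print(addr, " ")
		}
		fmt.Println()
	}
}
```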
Yes, I didn't want to block this PR.
A task flapping is not that unusual of a case. I think this is common enough that it's something we should handle gracefully.
I'm not sure exactly how Go (or Java) passes the subchannels to the picker, but it's not obvious to me why that has to change. Presumably, the picker stores the index into its list. So couldn't we have the client channel let the new picker get some state from the old picker before swapping the new picker in? That would give us the ability to retain the last index used in the previous picker.
If this PR only adds randomization for case 1, not case 2, then I agree that it can go forward, since it's a strict improvement over resetting to 0 every time a subchannel's state changes. But we should follow up with a better fix in a separate PR.
The problem is that the lists will be of different sizes. Go and Java pass a list of size equal to the number of ready subchannels. In addition, because there is no locking around uses of the Picker the old and new picker are used simultaneously (or rather, old calls to the old picker may still be in progress when new calls to the new picker begin). So "get state from the old picker" is probably not a solution like it is in C. (Instead, we would probably share data between the two, but we have a GC so that doesn't cause lifetime issues.)
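A rough sketch of what "share data between the two pickers" could look like; sharedRRPicker and newSharedRRPicker are hypothetical names, not existing grpc-go API, and this assumes the counter is the only state worth carrying over:

```go
package roundrobin

import (
	"sync/atomic"

	"google.golang.org/grpc/balancer"
)

// sharedRRPicker keeps its position in a counter that is shared across picker
// generations, so rebuilding the picker does not reset the position even
// though each generation has its own (possibly differently sized) list.
type sharedRRPicker struct {
	subConns []balancer.SubConn
	next     *uint64 // shared with the previous and next picker generations
}

func newSharedRRPicker(scs []balancer.SubConn, prev *sharedRRPicker) *sharedRRPicker {
	next := new(uint64)
	if prev != nil {
		next = prev.next // keep the old position instead of starting over
	}
	return &sharedRRPicker{subConns: scs, next: next}
}

// pick advances the counter atomically; the old and new pickers may be in use
// concurrently, and the GC keeps the shared counter alive for both.
func (p *sharedRRPicker) pick() balancer.SubConn {
	n := atomic.AddUint64(p.next, 1) - 1
	return p.subConns[n%uint64(len(p.subConns))]
}
```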
It would randomize in both cases. Case 2 is rare; while I can agree it'd be good to start at the beginning, I don't think it practically matters. The current implementation for case 1 is dire; it degrades to all RPCs being sent to one backend. It seems weird to me to block an obvious "this could cause an outage" fix because of an "occasionally we will do things in a slightly different order, but nobody will probably even notice".
Correct. If maintaining the position in the list is a requirement, probably what I'd do is make it so there actually is only one picker permanently, and update its state atomically/with a lock when a change needs to be made. Either way, I think this is a real improvement and the full fix is going to be lower priority for us. I will file an issue for the follow-up changes. In that, it would be nice if we could define the required/desired/acceptable behaviors for a RR balancer in gRPC in case we can find a user that would like to contribute a fix. Also note that in case 2 we aren't even starting at the top today - as @ejona86 observed, we are shuffling our list of addresses due to the map from addresses to subchannels.
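In the same spirit, a hypothetical sketch of the "only one picker, updated in place" approach; stableRRPicker and its methods are illustrative, not part of grpc-go:

```go
package roundrobin

import (
	"sync"

	"google.golang.org/grpc/balancer"
)

// stableRRPicker is never replaced: the subchannel list is swapped under a
// lock while the round-robin index is retained, so position is not lost when
// the set of ready subchannels changes.
type stableRRPicker struct {
	mu       sync.Mutex
	subConns []balancer.SubConn
	next     int
}

// update installs a new list of ready subchannels without resetting the index.
func (p *stableRRPicker) update(scs []balancer.SubConn) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.subConns = scs
	if len(scs) > 0 {
		p.next %= len(scs) // clamp the old position to the new length
	}
}

// pick returns the next subchannel in round-robin order.
func (p *stableRRPicker) pick() (balancer.SubConn, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if len(p.subConns) == 0 {
		return nil, balancer.ErrNoSubConnAvailable
	}
	sc := p.subConns[p.next]
	p.next = (p.next + 1) % len(p.subConns)
	return sc, nil
}
```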
And per my observation in #2580, starting at the top doesn't even seem like a reliable thing that can be tested or relied upon by users, given that the first RPC will go to the first-connected backend and not unconditionally to the first address in the list. |