Conversation
api/cluster.go (outdated)
// metric api.cluster.speculative.requests is how many peer queries resulted in speculation
speculativeAttempts = stats.NewCounter32("api.cluster.speculative.attempts")

// metric api.cluster.speculative.requests is how many peer queries were improved due to speculation
the metric name in the comment is not right
api/cluster.go (outdated)
// peerQuerySpeculative takes a request and the path to request it on, then fans it out
// across the cluster, except to the local peer. If any peer fails, requests to
// other peers are aborted. If 95% of peers have been heard from, and we are missing
speculationThreshold is configurable now
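A minimal sketch of what a configurable speculation cutoff looks like. The function name and shape here are illustrative, not metrictank's actual code; the only grounded details are the 95% default and the fact that the threshold is now configurable:

```go
package main

import "fmt"

// shouldSpeculate reports whether enough peers have responded that the
// remaining stragglers should be speculatively re-requested from their
// replica peers. threshold is the configurable fraction (the old
// hard-coded behavior corresponds to 0.95).
func shouldSpeculate(totalPeers, responded int, threshold float64) bool {
	return responded < totalPeers && float64(responded) >= float64(totalPeers)*threshold
}

func main() {
	fmt.Println(shouldSpeculate(20, 19, 0.95)) // one straggler left at the 95% mark: true
	fmt.Println(shouldSpeculate(20, 15, 0.95)) // too early: false
}
```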
}
memberStartPartition := member.GetPartitions()[0]

if _, ok := membersMap[memberStartPartition]; !ok {
This seems to determine whether a member is already in the map based on its first partition. But I can't see where we are sorting the partitions; if two MTs of a shard are configured to have the same partitions, but specified in a different order, wouldn't this break?
That's true. IMO, it seems like the partitions should be sorted at startup. I could add a sort to the SetPartitions function.
I agree that sorting at startup makes the most sense, that way it only needs to be done once
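The fix discussed here can be sketched as follows. `Node` is a hypothetical stand-in for metrictank's cluster member type (only `SetPartitions`/`GetPartitions` are taken from the discussion above); sorting once in the setter makes first-partition comparisons order-independent:

```go
package main

import (
	"fmt"
	"sort"
)

// Node is a simplified stand-in for a cluster member; only the
// partition bookkeeping relevant to this discussion is sketched.
type Node struct {
	partitions []int32
}

// SetPartitions stores the node's partition IDs in sorted order, so that
// two replicas configured with the same partitions in a different order
// still report the same first partition.
func (n *Node) SetPartitions(parts []int32) {
	sort.Slice(parts, func(i, j int) bool { return parts[i] < parts[j] })
	n.partitions = parts
}

func (n *Node) GetPartitions() []int32 {
	return n.partitions
}

func main() {
	a, b := &Node{}, &Node{}
	a.SetPartitions([]int32{4, 0, 2})
	b.SetPartitions([]int32{0, 2, 4})
	// both replicas now report partition 0 first, despite the config order
	fmt.Println(a.GetPartitions()[0] == b.GetPartitions()[0]) // true
}
```

Sorting in the setter means it happens exactly once per config load, rather than on every membership comparison.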
I just noticed that in cases where the partition ids are not sorted in the config, it is important to first update to this version of MT without activating speculative queries, and only enable speculative queries once all MTs are on this version. Otherwise querying might temporarily be broken until all MTs are updated, because the older ones would still return their partition IDs unsorted. But I guess there is nothing we can do about that.
Well, we could sort the partitions of the cluster peers when we get them. As it stands right now, with or without speculation enabled, the partitions will need to be sorted. That is, unless I change the code as I mentioned in #956 (comment)
Force-pushed from fc5823a to ca4439c.
Status update: I'm using this on our prod setup. We get about 67 render requests/sec. We have 120 shard groups * 2 replicas (240 total peers), which results in about 8000 peer requests/sec. Speculation kicks in on darn near all requests: with that many peers, it's highly likely that some peer is undergoing a GC at any given moment. Here's a snapshot with some of the data. Of note:
You can see in this snapshot when we upped the load. Here's a comparison of our render response times before and after rollout (under very light load, 4 render reqs/sec). You can see how much smoother and improved the median and p90 response times are.
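The fan-out arithmetic in the status update above can be made explicit. This assumes each render request issues one query per shard group, with replicas only hit when speculation kicks in (a reading of the reported numbers, not confirmed code behavior):

```go
package main

import "fmt"

// peerRequestRate estimates the cluster-internal request rate: each render
// request fans out one query per shard group. 67 render req/s across 120
// shard groups lands close to the ~8000 peer req/s reported above.
func peerRequestRate(renderPerSec float64, shardGroups int) float64 {
	return renderPerSec * float64(shardGroups)
}

func main() {
	fmt.Println(peerRequestRate(67, 120)) // 8040, i.e. "about 8000 peer requests/sec"
}
```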
looks great, but could you fix the test please? this might be what you want: afc60d2
I haven't verified it, but I'm pretty sure that the existing peer query mechanism required peers to have the same partitions as their comrades. Speculative peer queries absolutely require it. Is it worth supporting something like
So far I've not heard of any cases where people configured their MT partitions in a way where the partitions are not the same for all instances in a shard (that's what you mean, right?). I think it's probably not worth the additional complexity just to support this very rare edge case. What do you think @woodsaj
the docs state that when you have multiple replicas, partitions must be assigned in groups.
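For illustration, "assigned in groups" means every replica of a shard group consumes the identical partition set. The instance names and config key below are schematic, not metrictank's exact settings:

```ini
# shard group 0: both replicas get the same partitions
mt-shard0-replica0  partitions = 0,1,2,3
mt-shard0-replica1  partitions = 0,1,2,3
# shard group 1
mt-shard1-replica0  partitions = 4,5,6,7
mt-shard1-replica1  partitions = 4,5,6,7
```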
LGTM
interestingly:
otherwise, all these numbers look great :)
Correct. Here's a snapshot under heavier load (with real user queries)
Yes, this was still during the rollout.
Of note, this optimization is likely of more value the larger the cluster. At our 120 shard group cluster, it's been great.
For #954
I made speculation configurable, but one thing still changes even with speculation disabled: local requests now go over HTTP rather than through a special-cased in-process path. This is good and bad (good: they happen in parallel with peer requests, and tracing just works; bad: a bit of overhead).
TODO: some requests (findSeries) don't use peerQuery; fix that