Throttle courier queues when the channel has rate limit redis key not… #382
Codecov Report
@@ Coverage Diff @@
## main #382 +/- ##
==========================================
+ Coverage 71.56% 71.59% +0.02%
==========================================
Files 94 94
Lines 8269 8277 +8
==========================================
+ Hits 5918 5926 +8
Misses 1754 1754
Partials 597 597
0a35e6a to 0a5a2c9
        redis.call("zrem", KEYS[2] .. ":active", queue)
        return {"retry", ""}
    end
end
@nicpottier if you are willing, would love your review on this
This feels slightly odd: we are basically using a redis key as a communication channel to move a channel into being throttled the next time we try to pop a message off. Why do it that way instead of moving the channel to the throttled list explicitly when we decide we are throttled? I guess the advantage of this method is that we can set how long we throttle the channel for by tweaking the expiration, but the downside is that every call to pop a message off a queue now has an extra lookup. I think I'd rather see our throttled logic expanded to allow throttles longer than a second, if that's what is needed. That could work by storing both the queue and, say, a counter in the throttle list; the dethrottler decrements the counter and moves anything that now has a count <= 0 to the active queue set.
Maybe an easier way to do this would be to move throttling to a key-based system instead of using a set. I.e., we insert a
Oh, looking at this more closely, I see why it is the way it is. We still use throttled queues to track the number of workers so we can do fair queuing. And we don't want throttled to just expire entirely, because we need to actively re-add those throttled channels back to the active set, so it can't just be a passive expiration. I guess the proposed solution makes sense then. The extra lookup per pop probably isn't a big deal.
if rr.StatusCode == 429 || rr.StatusCode == 503 {
	rateLimitKey := fmt.Sprintf("rate_limit:%s", msg.Channel().UUID().String())
	rc.Do("set", rateLimitKey, "engaged")
	rc.Do("expire", rateLimitKey, 2)
Doesn't the 429 response tell us when we can retry?
No, we get
{"meta":{"api_status":"stable","version":"2.35.4"},"errors":[{"code":1015,"title":"Too many requests","details":"Rate limiting engaged - Too Many Requests."}]}
So we're backing off for 2 seconds, but do we have any information on whether that's a reasonable back-off?
We pause sending from the queue for 2 seconds and then try again; if we get the 429 status again, we pause for another 2 seconds.
https://developers.facebook.com/docs/whatsapp/api/rate-limits#capacity
The rate limit we are hitting is that we sent more than 50 requests per second, so stopping for 2 seconds seems like it will help reset the limit.
OK, can you add a little comment to the code here explaining the rationale for 2, and maybe mention that in future it should use header values?
and then let's get this out there!
… expired