-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle Paraswap rate limiting #168
Conversation
Codecov Report
@@ Coverage Diff @@
## main #168 +/- ##
==========================================
- Coverage 64.81% 64.64% -0.17%
==========================================
Files 185 185
Lines 38398 38577 +179
==========================================
+ Hits 24889 24940 +51
- Misses 13509 13637 +128 |
crates/shared/src/http_client.rs
Outdated
let increased_back_off = self.next_back_off.mul_f64(self.back_off_growth_factor); | ||
self.next_back_off = std::cmp::min(increased_back_off, self.max_back_off); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be cleaner to store a rate_limited_responses_count: u64
and calculate next_back_off from scratch every time? I'm not sure.
pub fn response_rate_limited(&mut self, previous_rate_limits: u64) -> Option<Duration> { | ||
if self.times_rate_limited != previous_rate_limits { | ||
// Don't increase back off if somebody else already updated it in the meantime. | ||
return None; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of returning None
, should we return the current backoff here, even if we don't increment the counter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In line with the comment above, we need some way to avoid printing duplicated messages.
In the first version of this implementation I logged messages while holding the Mutex
lock. The code for that was way cleaner and easier to understand but also made the critical section super long.
Now I only return Some(Duration)
in case we actually increased the back off and therefore need a log message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Can we add some instrumentation to count how many times we get rate limited and how many times we don't send a request because of back-off? Would be a good way to measure just how much we get rate limited this way.
617db15
to
67e496f
Compare
Makes sense. 👍 |
crates/shared/src/http_client.rs
Outdated
let mut back_off = self.min_back_off; | ||
for _ in 0..self.times_rate_limited { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should use f64::powi or f64::powf instead of a loop. You can't get overflow panics with f64 because worst case it will stay at +INF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to avoid an overflow with Duration::mul_f64().
Being able to compute the correct factor to call mul_f64()
only once wouldn't help us much if that would cause mul_f64()
to panic because the result would overflow Duration
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a checked_mul_f64
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be like
let factor = back_off_growth_factor.pow(times_rate_limited);
back_off *= factor;
I see now that Duration for some reason doesn't have a non panicking version of mul_f64 which is weird. In this case I feel it is reasonable to extract the f64 out of the duration, do the math, put it back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately there is only checked_mul and saturating_mul which both require a u32 as the argument.
I could have made the growth_factor
argument an u32
but that felt overly restrictive.
But I don't feel strongly about that so I could change it if you'd like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case I feel it is reasonable to extract the f64 out of the duration, do the math, put it back.
I hadn't considered that before. Good idea. 👍
Fixes #114
Because we sometimes get rate limited by the paraswap API we need a mechanism to automatically back off until we are able to get proper results again.
While we are being rate limited every request to the paraswap API will return immediately with the error "rate limited".
Successive 429 results increase the time we drop requests exponentially.
If we receive a non 429 result we no longer drop any requests.
Note that once we run into the rate limit we usually take ~5min to recover from that. That is already way better than the status quo because there were already times where we got rate limited by paraswap for hours on end.
CLI help:
Test Plan
Manual test for parsing CLI argument
Ignored unit test to see exponential back off in action
Logs