
new token bucket impl #6893

Merged
alexpyattaev merged 5 commits into anza-xyz:master from alexpyattaev:ratelimiter
Sep 22, 2025
Conversation

@alexpyattaev

@alexpyattaev commented Jul 9, 2025

Problem

  • We use an external crate for a token bucket, which is overkill
  • The way we implement keyed rate limiting is somewhat inefficient

Summary of Changes

  • Make our own glorious token bucket
  • Also make a concurrent-hashmap-based version of the same that uses LazyLRU logic for cleanup instead of a flat loop
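The core idea can be sketched in a few lines. Below is a hypothetical single-threaded token bucket (names and structure are illustrative only; the PR's actual TokenBucket is lock-free and built on atomics):

```rust
use std::time::Instant;

// Hypothetical sketch of the token-bucket idea, NOT the PR's implementation.
struct SimpleBucket {
    tokens: f64,         // current token balance
    max_tokens: f64,     // bucket capacity
    refill_per_sec: f64, // token refill rate
    last_update: Instant,
}

impl SimpleBucket {
    fn new(initial: f64, max: f64, rate: f64) -> Self {
        Self {
            tokens: initial,
            max_tokens: max,
            refill_per_sec: rate,
            last_update: Instant::now(),
        }
    }

    /// Refill based on elapsed time, then try to spend `n` tokens.
    fn try_acquire(&mut self, n: f64) -> bool {
        let now = Instant::now();
        let elapsed = now.duration_since(self.last_update).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.max_tokens);
        self.last_update = now;
        if self.tokens >= n {
            self.tokens -= n;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut b = SimpleBucket::new(2.0, 2.0, 1.0);
    assert!(b.try_acquire(1.0));
    assert!(b.try_acquire(1.0));
    // Bucket drained; an immediate third request must fail.
    assert!(!b.try_acquire(1.0));
    println!("ok");
}
```

The keyed variant in the PR maps keys (e.g. peer addresses) to per-key buckets in a concurrent hashmap and evicts stale entries lazily.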

@alexpyattaev added the noCI Suppress CI on this Pull Request label Jul 9, 2025
@alexpyattaev requested a review from KirillLykov July 9, 2025 11:10
@alexpyattaev added CI Pull Request is ready to enter CI and removed noCI Suppress CI on this Pull Request labels Jul 9, 2025
@anza-team removed the CI Pull Request is ready to enter CI label Jul 9, 2025
@codecov-commenter

codecov-commenter commented Jul 9, 2025

Codecov Report

❌ Patch coverage is 90.78498% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.1%. Comparing base (302ff5e) to head (eec664d).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##           master    #6893     +/-   ##
=========================================
- Coverage    83.1%    83.1%   -0.1%     
=========================================
  Files         815      816      +1     
  Lines      358629   358922    +293     
=========================================
+ Hits       298071   298309    +238     
- Misses      60558    60613     +55     

@alexpyattaev force-pushed the ratelimiter branch 3 times, most recently from aa91a89 to 58d47a2 on July 13, 2025 18:12
@alexpyattaev marked this pull request as ready for review July 20, 2025 06:01
@alexpyattaev requested a review from lijunwangs August 1, 2025 13:57
#[allow(clippy::arithmetic_side_effects)]
fn maybe_shrink(&self) {
let mut actual_len = 0;
let target_shard_size = self.target_capacity / self.data.shards().len();

What if self.target_capacity < shards?


We should document it and assert probably.

Author

This is a good catch, thank you! target_size becomes zero and the data structure wipes all records every time :( Will patch.
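The failure mode under discussion is easy to reproduce. This is a hypothetical illustration (the function names are made up, not the PR's code): with integer division, `target_capacity / shards` is 0 whenever the capacity is below the shard count, so every shard evicts down to nothing.

```rust
// Naive per-shard target, mirroring the division discussed above:
// integer division truncates to 0 when capacity < shard count.
fn per_shard_target(target_capacity: usize, num_shards: usize) -> usize {
    target_capacity / num_shards
}

// One possible guard (illustrative only; the PR's actual patch may differ):
// enforce at least one slot per shard.
fn per_shard_target_guarded(target_capacity: usize, num_shards: usize) -> usize {
    (target_capacity / num_shards).max(1)
}

fn main() {
    // 16 entries spread over 64 shards: the naive target is 0,
    // so maybe_shrink would evict every record on every call.
    assert_eq!(per_shard_target(16, 64), 0);
    assert_eq!(per_shard_target_guarded(16, 64), 1);
    println!("ok");
}
```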

@alexpyattaev requested a review from a team as a code owner August 18, 2025 12:45
@@ -0,0 +1,106 @@
#![allow(clippy::arithmetic_side_effects)]

@KirillLykov Sep 12, 2025


Why haven't you used some benchmarking framework? And what are the results of the current benchmarking?


@alexpyattaev (Author) Sep 12, 2025


Benchmarking frameworks are not well suited here, since the bench is multithreaded and requires a peculiar setup to run. Running it in a loop 10,000 times does not really show meaningful perf, since you need thread contention.
Here are results:

Running bench_token_bucket...
Run complete over 5 seconds
Accepted 16667, Rejected: 39887821
processed 39904488 requests, 7980897.5 per second
==========
Running bench_token_bucket_eviction...
Run complete over 5 seconds
Max observed size was 406
processed 17113044 requests, 3422608.8 per second
Rejected: 95951
==========
Running bench_keyed_rate_limiter...
Run complete over 5 seconds
Accepted: 1024000 (target 1024000)
Rejected: 37008846
processed 38032846 requests, 7606569.5 per second

TL;DR: we can process about 7M requests per second per bucket; the KeyedRateLimiter may slow things down to 3M requests per second if there is a lot of churn.


Sounds like much more than we'll ever need

Author

Well, we'd want real code to do things other than token buckets, but I do not know how to make this substantially faster; I'm quite certain we are close to hitting HW limits here.

}

impl Clone for TokenBucket {
fn clone(&self) -> Self {

This clone looks a bit suspicious to me because it is a deep copy, while I typically expect atomics to be shared Arc-style. Do we really need it somewhere?

Author

Yes, this is used in the KeyedRateLimiter to clone buckets. It is generally nice to have around so you can mass-produce buckets from a prototype of some kind. There is no state in them that would become invalid if we access the atomics one at a time, so there is no reason not to implement Clone.
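The pattern being discussed can be sketched roughly like this (hypothetical struct, not the PR's actual TokenBucket): Clone loads each atomic individually, producing an independent snapshot rather than a shared handle, which is what makes prototype-style mass production work.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical struct holding atomics, to illustrate snapshot-style Clone.
struct BucketState {
    tokens: AtomicU64,
    last_update: AtomicU64,
}

impl Clone for BucketState {
    fn clone(&self) -> Self {
        // Each field is loaded separately: the clone is a deep copy
        // (a snapshot), unlike Arc<BucketState>, where clones share state.
        Self {
            tokens: AtomicU64::new(self.tokens.load(Ordering::Acquire)),
            last_update: AtomicU64::new(self.last_update.load(Ordering::Acquire)),
        }
    }
}

fn main() {
    let proto = BucketState {
        tokens: AtomicU64::new(5),
        last_update: AtomicU64::new(0),
    };
    let copy = proto.clone();
    // Mutating the clone leaves the prototype untouched.
    copy.tokens.store(99, Ordering::Release);
    assert_eq!(proto.tokens.load(Ordering::Acquire), 5);
    assert_eq!(copy.tokens.load(Ordering::Acquire), 99);
    println!("ok");
}
```

Note that the snapshot is not atomic across fields; as the author says, that is fine here because no cross-field invariant is violated by reading the atomics one at a time.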

self.update_state(now);
match self
.tokens
.fetch_update(Ordering::SeqCst, Ordering::SeqCst, |tokens| {

SeqCst, although the safest option, might be less performant than a Release/Acquire combination. Have you seen any difference?

Author

Yeah, I was hunting a nasty concurrency bug here for a while; now I have found it and dropped to AcqRel and Acquire where applicable.

@alexpyattaev

@vadorovsky I believe I have finally tracked down the concurrency bug that was eating my brain here, so this should be good to go. Now the system that credits tokens is super braindead and relies on simple CAS logic to gate which thread credits for which time interval.

@KirillLykov previously approved these changes Sep 12, 2025

@KirillLykov left a comment


My main concern here was atomics ordering; if @vadorovsky double-checks that it is ok, this looks good to me now.

Adds TokenBucket and KeyedRateLimiter to replace governor crate
and in general allow for better control over rate-limiting options
@alexpyattaev

My main concern here was atomics ordering, if @vadorovsky double checks that it is ok, to me looks good now.

Added a shuttle test based on @vadorovsky's design, the coast is clear =)

/// depositing new tokens (if appropriate)
fn update_state(&self, now: u64) {
// fetch last update time
let last = self.last_update.load(Ordering::SeqCst);
Member


I think Ordering::Acquire would be sufficient here. But SeqCst is not incorrect. In case you don't see any perf difference, I guess it's fine to leave it as it is.

Author

No perf difference :(

.fetch_add(time_to_return, Ordering::Relaxed);
}
Err(_) => {
// Another thread advanced last_update first → nothing we can do now.
Member

I'm wondering if we should actually do something in that case. There is a slight chance that the current thread's now is still higher than the new value set by another thread. To handle that case, we could retry the update.

    fn update_state(&self, now: u64) {
        // `last` is mutable so the CAS can be retried with a fresh value.
        let mut last = self.last_update.load(Ordering::SeqCst);

        // If time has not advanced, nothing to do.
        while now > last {
            match self.last_update.compare_exchange(
                last,
                now,
                Ordering::AcqRel,  // winner publishes new timestamp
                Ordering::Acquire, // loser observes updates
            ) {
                Ok(_) => {
                    // success case: credit tokens for (now - last), then stop.
                    break;
                }
                // THE DIFFERENCE: if the CAS failed, retry with the
                // freshly observed value.
                Err(observed) => {
                    last = observed;
                }
            }
        }
    }

Author

Not having a loop here was a deliberate choice: it reduces the time spent per request, which we want far more than accuracy (this code gets called per packet, and we have on the order of millions of packets).

In the current version we will just mint the tokens on the next call, which is good enough for the intended use, since the probability that the time another thread failed to credit is enough to mint many tokens is low.
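The loop-free crediting scheme described above can be sketched as follows (a hypothetical simplification; the field names and nanosecond units are assumptions, not the PR's exact code). A single CAS elects the crediting thread for an interval; losers simply skip, and any missed time is credited on a later call:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical sketch of loop-free token crediting.
struct Minter {
    last_update: AtomicU64, // timestamp, assumed to be in nanoseconds
    tokens: AtomicU64,
}

impl Minter {
    /// Returns how many tokens this call minted (0 if time did not advance
    /// or another thread won the CAS).
    fn update_state(&self, now: u64, tokens_per_ns: f64) -> u64 {
        let last = self.last_update.load(Ordering::Acquire);
        if now <= last {
            return 0;
        }
        match self
            .last_update
            .compare_exchange(last, now, Ordering::AcqRel, Ordering::Acquire)
        {
            Ok(_) => {
                // This thread won: credit tokens for the elapsed interval.
                let minted = ((now - last) as f64 * tokens_per_ns) as u64;
                self.tokens.fetch_add(minted, Ordering::Relaxed);
                minted
            }
            // Lost the race: do nothing now; a later call credits the rest.
            Err(_) => 0,
        }
    }
}

fn main() {
    let m = Minter {
        last_update: AtomicU64::new(0),
        tokens: AtomicU64::new(0),
    };
    // 1000 ns elapsed at 0.001 tokens/ns -> 1 token minted.
    assert_eq!(m.update_state(1_000, 0.001), 1);
    // Time did not advance -> nothing minted.
    assert_eq!(m.update_state(1_000, 0.001), 0);
    println!("ok");
}
```

The trade-off matches the author's point: at worst a sliver of elapsed time is credited one call late, which is negligible at millions of calls per second.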

Member

Fair enough


@vadorovsky left a comment


:shipit:


@gregcusack left a comment


Can you explain more why we need this new token bucket, please? As we've previously discussed, it sounds like we use the governor crate, but that is bloated and has some bugs; I think one of those bugs was letting in too much traffic. How much of a problem is this? I am just wary of reimplementing something from scratch, especially in a core part of the validator. It seems like we need a lot of testing for this. shuttle_test_token_bucket_race is great!

// much of the testing is impossible outside of real multithreading in release mode.
impl TokenBucket {
/// Allocate a new TokenBucket
pub fn new(initial_tokens: u64, max_tokens: u64, new_tokens_per_second: f64) -> Self {

Can we switch the floating point math to fixed point? Over time, it looks like these small rounding errors could add up and create some inconsistent behavior.

Author

Over a billion requests these would add up to a few milliseconds; it is not really a concern. Switching to fixed point would not eliminate them, just reduce them a bit (since we would still have finite precision), and the code would get really ugly (I have tried it already; it becomes very hard to follow). The perf difference is non-existent.
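The scale of the rounding error is easy to check directly. A hypothetical back-of-the-envelope (numbers chosen for illustration, unrelated to the PR's constants): accumulating a non-representable step a million times stays within a tiny relative drift of the exact product.

```rust
fn main() {
    // 0.1 is not exactly representable in binary floating point,
    // so each addition introduces a small rounding error.
    let step = 0.1_f64;
    let n = 1_000_000_u64;
    let mut acc = 0.0_f64;
    for _ in 0..n {
        acc += step;
    }
    let exact = step * n as f64;
    let relative_drift = ((acc - exact) / exact).abs();
    // Accumulated error remains far below one part per million.
    assert!(relative_drift < 1e-6);
    println!("relative drift: {relative_drift:e}");
}
```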

@lijunwangs

I would like to see some data on the difference this makes, like the variation of input requests and the requests passed through, given some limiting configuration, over a period of time, between this and governor.
Do we know the CPU usage difference?

@alexpyattaev


I have done benchmarks before and Governor compares as follows:

8 threads poking at KeyedRateLimiter:

  START             solana-streamer nonblocking::connection_rate_limiter::test::bench_token_bucket
Run complete over 5 seconds
Accepted: 1024000 (target 1024000)  
Rejected:  33741839 

Same setup with Governor crate:

  START             solana-streamer nonblocking::connection_rate_limiter::test::bench_governor
running 1 test
Run complete over 5 seconds
Accepted: 1330995 (target 1024000) // ~30 % off!
Rejected:  25997438

So the atomic token bucket has better perf and is a fair bit more accurate.

The benches for TokenBucket also ensure that it is limiting consistently over each 100ms interval, not just over a long time interval.


@gregcusack left a comment


lgtm!

@alexpyattaev merged commit cc3f387 into anza-xyz:master Sep 22, 2025
54 checks passed
@alexpyattaev deleted the ratelimiter branch September 22, 2025 20:28