fix: reject ratelimit changes #4087

Flo4604 merged 4 commits into 10-07-test_rejected_requests_should_not_modify_ratelimit_state
Conversation
📝 Walkthrough

Refactors rate limiting from a slice-based to a single-request API, adds a new atomic batch method, updates the service interface, and adapts callers. The API handler now uses the single-request path, while key validation chooses between the single and batch paths. The service implements deterministic locking, batch evaluation, and helper routines for atomic multi-request checks.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant C as Client
    participant H as API Handler
    participant K as Key Validation
    participant RL as Ratelimit Service
    participant L as Bucket Locks
    participant R as Redis
    rect rgb(245,248,255)
        note over C,H: Single-limit request
        C->>H: POST /v2/ratelimit
        H->>RL: Ratelimit(req)
        RL->>RL: Validate window, compute key
        alt Local decision possible
            RL->>RL: Sliding window check
        else Needs backfill
            RL->>R: Fetch window state
            R-->>RL: State
            RL->>RL: Decide
        end
        RL->>R: Buffer/increment on success
        RL-->>H: RatelimitResponse
        H-->>C: 200/429 with Remaining/Reset
    end
    rect rgb(245,255,245)
        note over C,K: Multi-limit (atomic)
        C->>K: Validate API key (N limits)
        alt N == 1
            K->>RL: Ratelimit(req)
            RL-->>K: Response
        else N > 1
            K->>RL: RatelimitMany(reqs[])
            RL->>RL: Build keys, sort reqs
            RL->>L: Lock buckets in order
            RL->>RL: Check all with locks
            alt All pass
                RL->>R: Batch increments/buffer
            else Any fail
                RL->>RL: Mark denied, clamp remaining
            end
            L-->>RL: Unlock
            RL-->>K: Responses[]
        end
        K-->>C: Validation outcome
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers
Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
@coderabbitai review

✅ Actions performed: Review triggered.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- go/apps/api/routes/v2_ratelimit_limit/handler.go (1 hunks)
- go/internal/services/keys/validation.go (2 hunks)
- go/internal/services/ratelimit/interface.go (1 hunks)
- go/internal/services/ratelimit/service.go (4 hunks)
```go
for _, rwk := range reqsWithKeys {
	bucket := bucketMap[rwk.key]

	// Check limit with lock already held
	res, err := s.checkBucketWithLockHeld(ctx, rwk.req, bucket, rwk.key)
	if err != nil {
		return nil, err
	}
	responses[rwk.index] = res

	if !res.Success {
		allPassed = false
		// Don't break - check all limits to return complete status
	}
}

// If all passed, increment all counters (still holding locks!)
if allPassed {
	span.SetAttributes(attribute.Bool("passed", true))
	for _, rwk := range reqsWithKeys {
		bucket := bucketMap[rwk.key]
		currentWindow, _ := bucket.getCurrentWindow(rwk.req.Time)
		currentWindow.counter += rwk.req.Cost

		// Buffer for async replay to Redis
		s.replayBuffer.Buffer(rwk.req)
	}
} else {
	span.SetAttributes(attribute.Bool("denied", true))

	// At least one failed - adjust remaining values
	for i, res := range responses {
		if res.Success {
			responses[i].Remaining += reqs[i].Cost
		}
	}
}

// Clamp all remaining values
for i := range responses {
	responses[i].Remaining = max(0, responses[i].Remaining)
}

return responses, nil
```
Ensure Current reflects actual bucket state after batch processing
When one or more requests pass in `RatelimitMany`, the `Current` value we return is inconsistent with the real bucket state:

- In the "all passed" path, `checkBucketWithLockHeld` returns success responses whose `Current` either comes from `effectiveCount` (local branch) or from the pre-increment counter (origin branch). We increment the buckets later if every limit passes, but the response still contains the stale count (missing the just-applied cost).
- In the rollback path (`allPassed == false`), we add the cost back to `Remaining`, yet the success entries keep the inflated `Current` that included the tentative cost even though we never incremented the bucket, so clients see tokens "consumed" that were actually rolled back.

`RatelimitResponse.Current` is documented as "how many tokens have been consumed in this window"; callers rely on that number to report usage. Returning stale or rolled-back values breaks that contract.

A straightforward fix is to normalize the responses after we know the final outcome: for every successful entry, read the bucket's current window (while we still hold the locks) and set `Current` to `currentWindow.counter`, which now reflects the true persisted state (with the cost applied only when `allPassed` is true). Example:
```diff
@@
 	if allPassed {
 		span.SetAttributes(attribute.Bool("passed", true))
 		for _, rwk := range reqsWithKeys {
 			bucket := bucketMap[rwk.key]
 			currentWindow, _ := bucket.getCurrentWindow(rwk.req.Time)
 			currentWindow.counter += rwk.req.Cost
 			// Buffer for async replay to Redis
 			s.replayBuffer.Buffer(rwk.req)
 		}
 	} else {
 		span.SetAttributes(attribute.Bool("denied", true))
 		// At least one failed - adjust remaining values
 		for i, res := range responses {
 			if res.Success {
 				responses[i].Remaining += reqs[i].Cost
 			}
 		}
 	}
+	// Ensure successful responses report the actual bucket count after commit/rollback.
+	for _, rwk := range reqsWithKeys {
+		if responses[rwk.index].Success {
+			currentWindow, _ := bucketMap[rwk.key].getCurrentWindow(rwk.req.Time)
+			responses[rwk.index].Current = currentWindow.counter
+		}
+	}
+
 	// Clamp all remaining values
 	for i := range responses {
 		responses[i].Remaining = max(0, responses[i].Remaining)
 	}
```

This keeps the observable state consistent for both the fast path and the batch path. Please adjust accordingly.
Also applies to: 415-496
* refactor: deduplicate and make more maintainable
* test: correct remaining counts
* fix: span attributes
Merged commit 7af2b69 into 10-07-test_rejected_requests_should_not_modify_ratelimit_state
What does this PR do?
Fixes # (issue)
If there is not an issue for this, please create one first. This is used for tracking purposes and also helps us understand why this PR exists.
Type of change
How should this be tested?
Checklist
Required
- pnpm build
- pnpm fmt
- console.logs
- git pull origin main

Appreciated