Conversation
WalkthroughThis change introduces a more flexible and configurable rate limiting system. The configuration schema and codebase are refactored to support per-key rate limiting via a new Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Router image scan passed✅ No security vulnerabilities found in image: |
Router-nonroot image scan passed✅ No security vulnerabilities found in image: |
There was a problem hiding this comment.
Actionable comments posted: 3
🔭 Outside diff range comments (3)
router/core/ratelimiter.go (3)
37-56: Remove dead code for rejectStatusCode initialization.The check
if rl.rejectStatusCode == 0at lines 46-48 is redundant becauserejectStatusCodeis never initialized before this check, so it will always be 0. This appears to be leftover code from the refactoring.Apply this diff to remove the dead code:
rl = &CosmoRateLimiter{ client: opts.RedisClient, limiter: limiter, debug: opts.Debug, baseRateLimitKey: opts.BaseRateLimitKey, rateLimitConfig: opts.RateLimitConfig, } - if rl.rejectStatusCode == 0 { - rl.rejectStatusCode = 200 - } if opts.KeySuffixExpression != "" {
58-70: Remove unusedrejectStatusCodefield.The
rejectStatusCodefield at line 63 appears to be unused after the refactoring, as this is now part of the per-key configuration inRateLimitSimpleStrategyEntry.Apply this diff to remove the unused field:
type CosmoRateLimiter struct { client rd.RDCloser limiter *redis_rate.Limiter debug bool - rejectStatusCode int - keySuffixProgram *vm.Program // TODO: To decouple from the config rateLimitConfig config.RateLimitSimpleStrategy baseRateLimitKey string }
137-139: RejectStatusCode() is still referenced in graphql_handler.go
TheRejectStatusCode()method isn’t dead code—it's invoked here:
- router/core/graphql_handler.go (around the call to
httpWriter.WriteHeader(h.rateLimiter.RejectStatusCode()))If you’ve moved
rejectStatusCodeinto a per-key config API, update this call to use the new per-key accessor (e.g.h.rateLimiter.ConfigFor(key).RejectStatusCode()) before deleting the method and field.
🧹 Nitpick comments (1)
router/pkg/config/config.schema.json (1)
1832-1835: Misleading description forenabledfieldThe description for
enabledcurrently repeats the text meant forrate, which is confusing.- "enabled": { - "type": "boolean", - "description": "The rate at which the requests are allowed. The rate is specified as a number of requests per second." + "enabled": { + "type": "boolean", + "description": "Enable / disable this specific key mapping entry. When omitted, the global `enabled` setting applies." },
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
router/core/graph_server.go(1 hunks)router/core/graphql_handler.go(2 hunks)router/core/ratelimiter.go(5 hunks)router/core/ratelimiter_test.go(8 hunks)router/pkg/authentication/jwks_token_decoder.go(1 hunks)router/pkg/config/config.go(2 hunks)router/pkg/config/config.schema.json(1 hunks)
🧰 Additional context used
🧠 Learnings (3)
router/pkg/authentication/jwks_token_decoder.go (1)
Learnt from: SkArchon
PR: #2067
File: router/pkg/authentication/jwks_token_decoder.go:80-106
Timestamp: 2025-07-21T14:46:34.879Z
Learning: In the Cosmo router project, required field validation for JWKS configuration (Secret, Algorithm, KeyId) is handled at the JSON schema level in config.schema.json rather than through runtime validation in the Go code at router/pkg/authentication/jwks_token_decoder.go.
router/pkg/config/config.schema.json (4)
Learnt from: SkArchon
PR: #2067
File: router/pkg/config/config.schema.json:1637-1644
Timestamp: 2025-07-21T15:06:36.664Z
Learning: In the Cosmo router project, when extending JSON schema validation for security-sensitive fields like JWKS secrets, backwards compatibility is maintained by implementing warnings in the Go code rather than hard validation constraints in the schema. This allows existing configurations to continue working while alerting users to potential security issues.
Learnt from: SkArchon
PR: #2067
File: router/pkg/authentication/jwks_token_decoder.go:80-106
Timestamp: 2025-07-21T14:46:34.879Z
Learning: In the Cosmo router project, required field validation for JWKS configuration (Secret, Algorithm, KeyId) is handled at the JSON schema level in config.schema.json rather than through runtime validation in the Go code at router/pkg/authentication/jwks_token_decoder.go.
Learnt from: endigma
PR: #2009
File: router/pkg/config/config.go:0-0
Timestamp: 2025-07-03T10:33:25.778Z
Learning: The CardinalityLimit field in the Metrics struct (router/pkg/config/config.go) is validated at the JSON schema level in config.schema.json with a minimum value constraint of 1, preventing zero or negative values without requiring runtime validation.
Learnt from: SkArchon
PR: #1929
File: router/internal/circuit/manager.go:16-25
Timestamp: 2025-06-30T20:39:02.387Z
Learning: In the Cosmo router project, parameter validation for circuit breaker configuration is handled at the JSON schema level rather than through runtime validation methods on structs. The config.schema.json file contains comprehensive validation constraints for circuit breaker parameters.
router/pkg/config/config.go (1)
Learnt from: endigma
PR: #2009
File: router/pkg/config/config.go:0-0
Timestamp: 2025-07-03T10:33:25.778Z
Learning: The CardinalityLimit field in the Metrics struct (router/pkg/config/config.go) is validated at the JSON schema level in config.schema.json with a minimum value constraint of 1, preventing zero or negative values without requiring runtime validation.
🔇 Additional comments (9)
router/core/ratelimiter_test.go (1)
43-43: LGTM! Test updates correctly match the new method signature.The addition of
nilas the second parameter togenerateKeycalls properly reflects the updated method signature that now accepts both*resolve.Contextand*resolve.FetchInfo. The tests continue to verify the key generation logic under various scenarios (headers, claims, fallback logic) and the nilFetchInfoparameter is appropriate for these test cases.Also applies to: 54-54, 65-65, 76-76, 87-87, 98-98, 109-109, 120-120
router/core/graph_server.go (1)
1318-1325: LGTM! Clean transition to the new rate limiting configuration.The changes properly update the rate limiter options construction to support the new per-key rate limiting system:
- The commented-out
RejectStatusCodeindicates this field was removed from the options structRateLimitConfigandBaseRateLimitKeyreplace the previous static parameters with a more flexible configuration approach- The new structure supports dynamic key resolution and per-key rate limit mappings as outlined in the PR objectives
router/core/graphql_handler.go (2)
272-273: LGTM! Proper simplification of rate limiting configuration access.The changes correctly remove the nested
SimpleStrategyreferences and access configuration fields directly fromrateLimitConfig. The commented-outRateLimitKeyassignment aligns with the new per-key rate limiting system where keys are generated dynamically rather than set statically.
316-316: LGTM! Consistent configuration access pattern.The update to check
rateLimitConfig.HideStatsFromResponseExtensiondirectly maintains the same functionality while aligning with the simplified configuration structure that removes nestedSimpleStrategyreferences.router/pkg/config/config.go (2)
510-514: LGTM! Good refactoring to make stats visibility a global setting.Moving
HideStatsFromResponseExtensionto the top-levelRateLimitConfigurationmakes sense as it's a global setting that should apply across all rate limit strategies.
527-545: Good modular design for per-key rate limiting configuration.The refactoring effectively supports per-key rate limiting by:
- Embedding
RateLimitSimpleStrategyEntryfor default/global configuration- Adding
KeyMappingfor per-key overridesThis design maintains backward compatibility while enabling the new per-subgraph rate limiting feature.
router/core/ratelimiter.go (3)
25-35: LGTM! Options struct properly updated for the new configuration model.The addition of
BaseRateLimitKeyandRateLimitConfigaligns well with the new per-key rate limiting feature.
72-113: Well-implemented dynamic rate limit resolution logic.The implementation correctly handles all scenarios:
- Key-specific configurations take precedence
- Falls back to default configuration when appropriate
- Properly checks enabled flags at both levels
The code is clear and maintainable.
115-135: Excellent implementation of dynamic key generation.The method properly:
- Returns both the raw suffix key (for config lookup) and the full key (for Redis)
- Clones the expression context to avoid side effects
- Augments the context with subgraph information
This enables the per-subgraph rate limiting feature as designed.
| }, | ||
| PrioritizeHTTP: true, | ||
| RefreshUnknownKID: rate.NewLimiter(rate.Every(5*time.Minute), 1), | ||
| RefreshUnknownKID: rate.NewLimiter(rate.Every(2*time.Second), 1), |
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
Verify the dramatic increase in JWKS refresh frequency.
The refresh rate for unknown Key IDs has been increased from once every 5 minutes to once every 2 seconds - a 150x increase in frequency. This could significantly impact performance and potentially overwhelm JWKS providers with requests.
Please verify this change is intentional and consider the following:
- Will external JWKS providers handle this increased request frequency?
- Has this been tested with production-like traffic patterns?
- Is there documentation explaining why this aggressive refresh rate is necessary?
🏁 Script executed:
#!/bin/bash
# Description: Search for any configuration or documentation explaining the JWKS refresh rate change
# Expected: Find configuration options, comments, or documentation explaining the rationale
# Search for JWKS refresh rate configurations or comments
rg -A 5 -B 5 "RefreshUnknownKID|JWKS.*refresh|refresh.*JWKS" --type goLength of output: 1639
Document and validate JWKS unknown‐KID refresh rate change
The hard-coded RefreshUnknownKID limiter was lowered from 5 minutes to 2 seconds (150× increase), but there’s no accompanying comment or configuration entry explaining why. This could lead to a surge of HTTP requests against JWKS endpoints.
Please address the following in router/pkg/authentication/jwks_token_decoder.go:99:
- Confirm that external JWKS providers can sustain refresh attempts every 2 seconds under production load.
- Add a code comment or expose this interval via configuration (e.g., in
authentication.JWKSConfig) with a clear rationale. - Update any related documentation (README or design docs) to explain why such an aggressive refresh rate is necessary.
🤖 Prompt for AI Agents
In router/pkg/authentication/jwks_token_decoder.go at line 99, the
RefreshUnknownKID rate limiter was changed from 5 minutes to 2 seconds without
explanation. Confirm with the team or documentation that external JWKS providers
can handle refreshes every 2 seconds in production. Add a clear code comment
near this line explaining the reason for this aggressive refresh rate.
Additionally, expose this interval as a configurable parameter in
authentication.JWKSConfig instead of hardcoding it. Finally, update any relevant
documentation or README files to describe the rationale behind this change and
its impact.
| }, | ||
| "key_mapping": { | ||
| "type": "object", | ||
| "description": "The configuration per key entry.", | ||
| "additionalProperties": { |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Add basic constraints to key_mapping to avoid silent configuration errors
key_mapping is currently declared as a totally free-form object. A typo such as "raet": 50 or an empty object {} will happily pass validation but later explode at runtime.
Consider tightening validation with e.g. patternProperties, propertyNames, or at least a minProperties: 1 to ensure that every entry contains something meaningful.
"key_mapping": {
"type": "object",
"description": "The configuration per key entry.",
+ "minProperties": 1,
"additionalProperties": {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| }, | |
| "key_mapping": { | |
| "type": "object", | |
| "description": "The configuration per key entry.", | |
| "additionalProperties": { | |
| }, | |
| "key_mapping": { | |
| "type": "object", | |
| "description": "The configuration per key entry.", | |
| "minProperties": 1, | |
| "additionalProperties": { |
🤖 Prompt for AI Agents
In router/pkg/config/config.schema.json around lines 1823 to 1827, the
key_mapping object lacks constraints, allowing empty or misspelled entries to
pass validation and cause runtime errors. Add validation rules such as
patternProperties or propertyNames to restrict allowed keys, and include
minProperties: 1 to ensure the object is not empty, thereby preventing silent
configuration errors.
| "rate": { | ||
| "type": "integer", | ||
| "description": "The rate at which the requests are allowed. The rate is specified as a number of requests per second.", | ||
| "minimum": 1 | ||
| }, | ||
| "burst": { | ||
| "type": "integer", | ||
| "description": "The maximum number of requests that are allowed to exceed the rate. The burst is specified as a number of requests.", | ||
| "minimum": 1 | ||
| }, | ||
| "period": { | ||
| "type": "string", | ||
| "description": "The period of time over which the rate limit is enforced. The period is specified as a string with a number and a unit, e.g. 10ms, 1s, 1m, 1h. The supported units are 'ms', 's', 'm', 'h'.", | ||
| "duration": { | ||
| "minimum": "1s" | ||
| } | ||
| }, | ||
| "reject_exceeding_requests": { | ||
| "type": "boolean", | ||
| "description": "Reject the requests that exceed the rate limit. If the value is true, the requests that exceed the rate limit are rejected." | ||
| }, | ||
| "reject_status_code": { | ||
| "type": "integer", | ||
| "description": "The status code to return when the request is rejected. The default value is 200 (OK) as we're returning a well formed GraphQL response.", | ||
| "default": 200 | ||
| } | ||
| } |
There was a problem hiding this comment.
Require at least one rate-limiting parameter inside each mapping entry
All properties inside a single entry are optional, so a definition like:
rate_limit:
simple_strategy:
key_mapping:
employees: {}would pass schema validation yet leave the router with undefined values at runtime.
At minimum, enforce the same trio as the global defaults:
"additionalProperties": false,
...
-"description": "The configuration for all subgraphs. The configuration is used to configure the traffic shaping for all subgraphs.",
+"description": "Per-key rate-limit settings. At least one of rate | burst | period must be supplied.",
...
- "properties": {
+ "properties": {
...
},
+ "anyOf": [
+ { "required": ["rate"] },
+ { "required": ["burst"] },
+ { "required": ["period"] }
+ ]Alternatively, replicate the global required: ["rate","burst","period"] if you don’t want to inherit omitted values.
🤖 Prompt for AI Agents
In router/pkg/config/config.schema.json around lines 1836 to 1862, the
rate-limiting entries currently have all properties optional, allowing empty
objects that cause undefined runtime values. To fix this, add a "required" field
to each rate-limiting entry schema that enforces at least the "rate", "burst",
and "period" properties to be present, matching the global defaults. This
ensures each entry has the necessary parameters defined and prevents invalid
configurations.
|
Closing till requirements are reclarified |
We currently have the ability to specify dynamic rate limiting keys, we want to be able to specify rate limits per subgraphs. Since the expression context can contain the subgraph name, we utilize that information to allow the user to specify a rate limit key using expressions, and specify configurations per keys.
Example Configuration
Summary by CodeRabbit
New Features
Improvements
Bug Fixes
Checklist