-
Notifications
You must be signed in to change notification settings - Fork 716
feat: skip router when worker id is pre-determined #2117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThe changes introduce an optional Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Preprocessor
participant Router
participant Chooser
Client->>Preprocessor: Send request (may include nvext.backend_instance_id)
Preprocessor->>Preprocessor: Extract backend_instance_id from nvext (if present)
Preprocessor->>Router: Pass PreprocessedRequest (with or without backend_instance_id)
alt backend_instance_id is present
Router->>Router: Use backend_instance_id for routing (overlap_amount = 0)
else
Router->>Chooser: find_best_match(token_ids)
Chooser-->>Router: Return best matching instance and overlap_amount
end
Router-->>Client: Return response
Estimated code review effort🎯 2 (Simple) | ⏱️ ~7 minutes Poem
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
lib/llm/src/kv_router.rs (1)
347-347: Undefined variable:context_idnot in scope.The variable
context_idis referenced but doesn't appear to be defined in the current scope. This should likely be extracted from the request context or derived from another available variable.#!/bin/bash # Description: Check for context_id definition and usage in the method # Expected: Should find where context_id is defined or how it should be obtained echo "=== Context ID definitions ===" rg -B 5 -A 5 'context_id.*=' echo -e "\n=== Available context variables ===" ast-grep --pattern 'let ($_, context) = request.into_parts();'
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
lib/llm/src/kv_router.rs(1 hunks)lib/llm/src/preprocessor.rs(1 hunks)lib/llm/src/protocols/common/preprocessor.rs(1 hunks)lib/llm/src/protocols/openai/nvext.rs(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: pre-merge-rust (lib/runtime/examples)
- GitHub Check: Build and Test - vllm
- GitHub Check: pre-merge-rust (lib/bindings/python)
- GitHub Check: pre-merge-rust (.)
🔇 Additional comments (3)
lib/llm/src/protocols/common/preprocessor.rs (1)
59-61: Well-implemented addition of backend instance ID field.The new
backend_instance_idfield is correctly added with appropriate builder annotations and follows the existing pattern in the struct. TheOption<i64>type is suitable for an optional backend instance identifier.lib/llm/src/protocols/openai/nvext.rs (1)
65-70: Excellent implementation with clear documentation.The new
backend_instance_idfield is properly implemented with:
- Appropriate serde annotations for optional serialization
- Consistent builder pattern with other fields
- Clear documentation explaining the routing behavior
- Proper typing with
Option<i64>lib/llm/src/preprocessor.rs (1)
254-257: Clean implementation of backend instance ID extraction.The logic correctly extracts the
backend_instance_idfrom thenvextextension when present and safely sets it on the request builder. The implementation follows good practices with proper null checking.
030b063 to
4e5e264
Compare
4e5e264 to
4bef6b2
Compare
|
Changes are merged as part of #2450 |
Overview:
When
nvext.backend_instance_idis set, forward request directly the pre-scheduled worker instance.This by-passes the routing logic.
Details:
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit
New Features
Bug Fixes