Signed-off-by: Anna Tchernych <atchernych@nvidia.com>
Walkthrough
The changes introduce dynamic batching data-parallel rank (DpRank) propagation throughout the inference gateway pipeline. DpRank is extracted from routing results, propagated through the decode and prefill scorers, set in request headers, and handled in the C FFI bindings to support downstream scheduling logic.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/kubernetes/inference-gateway.md`:
- Line 23: Typo: replace the incorrect phrase "kGateways Inference Gateway" with
the correct wording "kGateway Inference Gateway" in the compatibility bullet;
update the string in docs/kubernetes/inference-gateway.md where the phrase
appears (search for "kGateways Inference Gateway" or the exact sentence
containing it) so the document reads "Currently, these setups are only tested
with the kGateway Inference Gateway."
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3a91cbc8-a691-4782-a8f9-5053745c1b27
📒 Files selected for processing (5)
- deploy/inference-gateway/epp/pkg/plugins/disagg/decode_scorer.go
- deploy/inference-gateway/epp/pkg/plugins/disagg/prefill_scorer.go
- deploy/inference-gateway/epp/pkg/plugins/dynamo_kv_scorer/plugin.go
- docs/kubernetes/inference-gateway.md
- lib/bindings/c/src/lib.rs
Overview:
feat: enable DP (data parallelism) for GAIE
Details:
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit
New Features
Documentation