Add kimi-k2.5 nvfp4 GB200 vllm-disagg configs for 8k1k #234
kyleliang-nv wants to merge 2 commits into main
📝 Walkthrough
Adds four new disaggregated vLLM deployment recipes for Kimi-K2.5 targeting GB200 GPUs. Each recipe defines a distinct prefill/decode node and worker split, vLLM backend and frontend settings, KV-transfer and FP8/FP4 options, CUDA graph capture and async decode parameters, and sa-bench benchmark blocks.
Sequence Diagram(s)
sequenceDiagram
participant Client as Client
participant Frontend as Dynamo Frontend
participant Prefill as Prefill Nodes (vLLM)
participant Decode as Decode Nodes (vLLM)
participant Store as Model Artifact Store
rect rgba(200,230,255,0.5)
Client->>Frontend: send request
end
rect rgba(200,255,200,0.5)
Frontend->>Prefill: route prefill work (KV transfer)
Prefill->>Store: load/model shard & KV cache
Prefill-->>Frontend: prefill responses / KV state
end
rect rgba(255,230,200,0.5)
Frontend->>Decode: route decode tasks (use KV cache)
Decode->>Store: fetch model shards if needed
Decode-->>Frontend: decoded tokens/response
end
Frontend->>Client: return response
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@recipes/vllm/kimi-k2.5/disagg-gb200-1p4d-tep4.yaml`:
- Line 1: The recipe name string
"coreai_devtech_all-sa.kimi-vllm-disagg-gb200-1p4d-tep" is missing the trailing
"4" and should match the filename; update the name field in the YAML to
"coreai_devtech_all-sa.kimi-vllm-disagg-gb200-1p4d-tep4" so sweep/result IDs
align (edit the name value in the top-level YAML entry).
In `@recipes/vllm/kimi-k2.5/disagg-gb200-6p1d-dep16.yaml`:
- Line 12: The YAMLs (e.g., recipes/vllm/kimi-k2.5/disagg-gb200-6p1d-dep16.yaml)
reference setup_script: install-deps.sh but that script is missing; either add a
new executable script named install-deps.sh at the repository root (or the
expected scripts/ location) containing the dependency installation steps used by
your launcher, or update the setup_script value in all four affected YAMLs to
point to an existing script name/path in the repo (ensure the referenced script
is executable and contains the required install commands).
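The second finding can be checked mechanically before merge. The following is a minimal sketch, not the repository's actual tooling: the function name `find_missing_setup_scripts` and the idea of passing a list of candidate script directories are assumptions made for illustration, and the line-based regex stands in for a real YAML parser.

```python
import re
from pathlib import Path

# Matches a `setup_script: <name>` line, with or without quotes around the value.
SETUP_RE = re.compile(r"^\s*setup_script:\s*[\"']?([^\"'\s#]+)")


def find_missing_setup_scripts(recipe_dir, search_dirs):
    """Return {recipe_path: script_name} for scripts absent from every search dir.

    `search_dirs` is a hypothetical list of places the launcher might look
    (for example the repository root or a scripts/ directory).
    """
    missing = {}
    for recipe in sorted(Path(recipe_dir).rglob("*.yaml")):
        for line in recipe.read_text().splitlines():
            match = SETUP_RE.match(line)
            if match is None:
                continue
            script = match.group(1)
            if not any((Path(d) / script).is_file() for d in search_dirs):
                missing[str(recipe)] = script
    return missing
```

Calling it with the recipe directory and the candidate script locations returns an empty dict when every referenced script resolves, which makes it easy to wire into a pre-merge check.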
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b486cd50-24fc-4aee-91a3-7071c0ccb647
📒 Files selected for processing (4)
- recipes/vllm/kimi-k2.5/disagg-gb200-1p4d-tep4.yaml
- recipes/vllm/kimi-k2.5/disagg-gb200-3p1d.yaml
- recipes/vllm/kimi-k2.5/disagg-gb200-5p1d.yaml
- recipes/vllm/kimi-k2.5/disagg-gb200-6p1d-dep16.yaml
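The four filenames appear to encode the prefill/decode split (for example `1p4d`, `6p1d`) plus an optional parallelism suffix (`tep4`, `dep16`). The sketch below parses that pattern under the assumption that this reading of the naming scheme is correct; the PR does not document the convention, and `parse_topology` and its dictionary keys are hypothetical names.

```python
import re

# <P>p<D>d, optionally followed by -<mode><size>, at the end of the file stem.
TOPOLOGY_RE = re.compile(r"(\d+)p(\d+)d(?:-([a-z]+)(\d+))?$")


def parse_topology(stem):
    """Split a recipe file stem into its assumed prefill/decode counts and suffix."""
    match = TOPOLOGY_RE.search(stem)
    if match is None:
        return None
    prefill, decode, mode, size = match.groups()
    return {
        "prefill": int(prefill),
        "decode": int(decode),
        "parallel_mode": mode,  # e.g. "tep" or "dep"; None when there is no suffix
        "parallel_size": int(size) if size else None,
    }
```

Whether the counts refer to nodes or workers is not stated in the PR, so the keys are deliberately neutral.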
@@ -0,0 +1,98 @@
+name: "coreai_devtech_all-sa.kimi-vllm-disagg-gb200-1p4d-tep"
Recipe name looks truncated (-tep vs -tep4).
This creates an avoidable mismatch with the filename and can confuse sweep/result identification.
Suggested fix
-name: "coreai_devtech_all-sa.kimi-vllm-disagg-gb200-1p4d-tep"
+name: "coreai_devtech_all-sa.kimi-vllm-disagg-gb200-1p4d-tep4"
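A name/filename mismatch like this one is easy to catch automatically. The sketch below assumes the convention implied by the review comment, namely that the top-level `name` value ends with the recipe's file stem; that rule is inferred from this single example rather than documented, and `name_matches_filename` is a hypothetical helper.

```python
import re
from pathlib import Path

# Captures the value of a top-level `name:` key, with or without quotes.
NAME_RE = re.compile(r"""^name:\s*["']?([^"'\s]+)""", re.MULTILINE)


def name_matches_filename(recipe_path, recipe_text):
    """Return True if the recipe's name ends with the file stem.

    Example stem: 'disagg-gb200-1p4d-tep4' for
    recipes/vllm/kimi-k2.5/disagg-gb200-1p4d-tep4.yaml.
    """
    match = NAME_RE.search(recipe_text)
    if match is None:
        return False
    return match.group(1).endswith(Path(recipe_path).stem)
```

Run against the truncated name in this PR, the check fails for `disagg-gb200-1p4d-tep4.yaml` and passes once the trailing `4` is restored.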
🧹 Nitpick comments (1)
recipes/vllm/kimi-k2.5/disagg-gb200-3p1d.yaml (1)
32-45: Consider reducing repeated YAML with anchors/aliases.
prefill_environment/decode_environment and several vLLM keys are duplicated; anchors would make edits safer.
♻️ Optional YAML dedup pattern
+ _common_environment: &common_environment
+   VLLM_USE_FLASHINFER_MOE_FP4: "1"
+   VLLM_USE_NCCL_SYMM_MEM: "1"
+   NCCL_CUMEM_ENABLE: "1"
+   NCCL_MNNVL_ENABLE: "1"
+   NCCL_NVLS_ENABLE: "1"
+
- prefill_environment:
-   VLLM_USE_FLASHINFER_MOE_FP4: "1"
-   VLLM_USE_NCCL_SYMM_MEM: "1"
-   NCCL_CUMEM_ENABLE: "1"
-   NCCL_MNNVL_ENABLE: "1"
-   NCCL_NVLS_ENABLE: "1"
+ prefill_environment: *common_environment
- decode_environment:
-   VLLM_USE_FLASHINFER_MOE_FP4: "1"
-   VLLM_USE_NCCL_SYMM_MEM: "1"
-   NCCL_CUMEM_ENABLE: "1"
-   NCCL_MNNVL_ENABLE: "1"
-   NCCL_NVLS_ENABLE: "1"
+ decode_environment: *common_environment
Also applies to: 48-95
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@recipes/vllm/kimi-k2.5/disagg-gb200-3p1d.yaml` around lines 32 - 45, The YAML repeats the same environment entries under prefill_environment and decode_environment (and again in the later block); refactor by extracting the shared map into a YAML anchor (e.g., &vllm_env) containing the VLLM_USE_FLASHINFER_MOE_FP4, VLLM_USE_NCCL_SYMM_MEM, NCCL_CUMEM_ENABLE, NCCL_MNNVL_ENABLE, NCCL_NVLS_ENABLE keys and then reference it with aliases (*vllm_env) under prefill_environment and decode_environment (and the corresponding later section) so edits to the vllm env only need to be done once.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a40cd91f-9f0c-42e9-a416-7ebe7328f147
📒 Files selected for processing (4)
- recipes/vllm/kimi-k2.5/disagg-gb200-1p4d-tep4.yaml
- recipes/vllm/kimi-k2.5/disagg-gb200-3p1d.yaml
- recipes/vllm/kimi-k2.5/disagg-gb200-5p1d.yaml
- recipes/vllm/kimi-k2.5/disagg-gb200-6p1d-dep16.yaml
✅ Files skipped from review due to trivial changes (1)
- recipes/vllm/kimi-k2.5/disagg-gb200-1p4d-tep4.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
- recipes/vllm/kimi-k2.5/disagg-gb200-6p1d-dep16.yaml
- recipes/vllm/kimi-k2.5/disagg-gb200-5p1d.yaml
Summary by CodeRabbit