[TRTLLM-7030][fix] BREAKING CHANGE: Mismatch between docs and actual commands #7191
📝 Walkthrough
This PR standardizes CacheTransceiverConfig.backend values to uppercase ("DEFAULT", "UCX", "NIXL", "MPI") across code, tests, and config files. Documentation and examples are updated to reference UCX-based KV cache (TRTLLM_USE_UCX_KVCACHE=1) and remove MPI-specific FAQ content. Numerous YAML/test configs change backend strings to uppercase.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Signed-off-by: Shixiaowei02 <[email protected]>
6087101 to e8da23a
/bot --help
GitHub Bot Help
Provide a user friendly way for developers to interact with a Jenkins server.
See details below for each supported subcommand.
run
Launch build/test pipelines. All previously running jobs will be killed.
kill
Kill all running builds associated with pull request.
skip
Skip testing for latest commit on pull request.
reuse-pipeline
Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.
/bot run --add-multi-gpu-test
PR_Github #16350 [ run ] triggered by Bot
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
examples/disaggregated/slurm/gen_yaml.py (1)
1-1: Missing NVIDIA copyright header.
Per coding guidelines, prepend the current year header to all Python sources.
Apply at file top:
+# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
examples/disaggregated/README.md (1)
127-131: Fix typo in metadata config: refersh_interval -> refresh_interval
The user-facing YAML has a misspelled key: refersh_interval should be refresh_interval.
Apply this one-line change in the snippet:
- refersh_interval: 10.0
+ refresh_interval: 10.0
🧹 Nitpick comments (42)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml (1)
14-14: Nit: Consider quoting backend values for consistency across configs.
Some files use quoted strings (e.g., "UCX"), while others use bare scalars (DEFAULT). YAML treats both as strings, but a single convention improves grepability and reduces style churn.
Suggested tweak:
- backend: DEFAULT
+ backend: "DEFAULT"
(repeat for both occurrences)
Also applies to: 24-24
tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only.yaml (1)
17-17: Nit: Align quoting style with the rest of the suite.
If standardizing on quoted strings, update as:
- backend: DEFAULT
+ backend: "DEFAULT"
tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse.yaml (1)
18-18: Nit: Prefer consistent YAML quoting.
To match files that quote backend values:
- backend: DEFAULT
+ backend: "DEFAULT"
(apply to both occurrences)
Also applies to: 33-33
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_two_mtp.yaml (1)
17-17: Nit: Unify quoting with other configs for readability/tooling.
If adopting the quoted-string convention:
- backend: DEFAULT
+ backend: "DEFAULT"
(apply to both occurrences)
Also applies to: 28-28
tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse_deepseek_v3.yaml (1)
18-18: Good standardization; keep config loader lenient if feasible.
- The uppercase transition looks good here.
- If the loader doesn’t already normalize case, consider adding normalization upstream so legacy configs with lowercase continue to work (reduces churn in downstream forks/tests). Not blocking this PR.
Optional style tweak:
- backend: DEFAULT
+ backend: "DEFAULT"
If you want, I can open a follow-up to add a small normalizer (e.g., backend = backend.upper()) in the config parsing path.
Also applies to: 33-33
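For reference, the normalizer mentioned above could look roughly like the sketch below. It assumes the four tokens listed in the walkthrough are the full allowed set; `normalize_backend` and `ALLOWED_BACKENDS` are hypothetical names for illustration and are not part of tensorrt_llm.

```python
from typing import Optional

# Allowed tokens as listed in the PR walkthrough (assumed to be the full set).
ALLOWED_BACKENDS = ("DEFAULT", "UCX", "NIXL", "MPI")


def normalize_backend(value: Optional[str]) -> Optional[str]:
    """Upper-case a backend token and reject unknown values.

    Hypothetical helper for illustration only.
    """
    if value is None:
        return None
    token = value.strip().upper()
    if token not in ALLOWED_BACKENDS:
        raise ValueError(
            f"Unknown backend {value!r}; expected one of {ALLOWED_BACKENDS}")
    return token
```

Running a helper like this before validation would let legacy lowercase configs keep loading while uppercase stays the documented form.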
tests/integration/defs/disaggregated/test_configs/disagg_config_overlap.yaml (1)
19-19: LGTM on value normalization; consider quoting for consistency.
- Change matches the uppercase token policy.
- Quoting keeps style consistent with other string fields in the same file.
- backend: DEFAULT
+ backend: "DEFAULT"
Also applies to: 34-34
tests/integration/defs/disaggregated/test_configs/disagg_config_trt_backend.yaml (1)
13-13: Aligned with TRT backend config; minor style polish available.
- Using "DEFAULT" is consistent with the updated accepted literals.
- Optional: quote for stylistic uniformity and safer YAML parsing across tools.
- backend: DEFAULT
+ backend: "DEFAULT"
Also applies to: 21-21
tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_trt_backend.yaml (1)
16-16: Optional: quote the string for consistency across configs
In other files, backend is quoted (e.g., "MPI"). Quoting here avoids accidental future YAML tooling treating bare words specially and keeps style uniform.
Apply this minimal adjustment:
- backend: DEFAULT
+ backend: "DEFAULT"
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_mpi.yaml (1)
12-12: Support case-insensitive backend parsing for backward compatibility
It looks like there’s currently no normalization of the backend argument (e.g. in tensorrt_llm/llmapi/llm_args.py), so configs using "mpi" (lowercase) would be treated differently than "MPI". To avoid breaking legacy configs, I recommend normalizing the backend string during argument parsing while still documenting uppercase as the preferred form.
• Add immediately after you extract backend in tensorrt_llm/llmapi/llm_args.py:
class TrtLlmArgs(…):
    def __post_init__(self):
-        backend = kwargs.get("backend", None)
+        # normalize backend for case-insensitive matching (legacy configs)
+        backend = kwargs.get("backend", None)
+        if isinstance(backend, str):
+            backend = backend.upper()
+            kwargs["backend"] = backend
        if backend == "PYTORCH":
            …  # existing logic
• Location: tensorrt_llm/llmapi/llm_args.py (inside the initialization block where backend is first read)
• Rationale: ensures "mpi", "MPI", or "Mpi" all map to the same backend and prevents surprise breakage in release/1.0.
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap_cuda_graph.yaml (1)
13-13: Nit: unify quoting and boolean style within the file
You have mixed styles in nearby fields across configs (e.g., true vs True in other files, quoted vs unquoted strings). For this file, consider quoting DEFAULT for stylistic consistency and to reduce diff noise later from linters/formatters.
Suggested tweak:
- backend: DEFAULT
+ backend: "DEFAULT"
Also applies to: 25-25
tests/integration/defs/disaggregated/test_configs/disagg_config_cuda_graph_padding.yaml (1)
19-19: Minor consistency: quote string enums; consider standardizing booleans across YAMLs
- Quote DEFAULT for parity with places using "MPI".
- Optional: align boolean literals consistently (true/false vs True/False) per repo style guide to prevent churn from formatters.
Patch for quoting:
- backend: DEFAULT
+ backend: "DEFAULT"
(and the same for Line 37)
Also applies to: 37-37
tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_bs1.yaml (1)
20-20: Nit: quote the backend string for uniformity
Keep the representation consistent with other configs that use quotes around enum-like strings.
Proposed change:
- backend: DEFAULT
+ backend: "DEFAULT"
(repeat for Line 35)
Also applies to: 35-35
tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml (1)
21-21: Normalize quoting of backend across disaggregated test configs
Our scan confirms that every cache_transceiver_config.backend value is one of the allowed enums (DEFAULT, UCX, NIXL, MPI) and there are no lowercase or non-enum entries. However, many of these enum values, most notably DEFAULT, are currently unquoted, whereas other string literals (e.g. "pytorch", URLs) are consistently quoted. To prevent future YAML-typing pitfalls and improve readability, it’s best to quote all backend values uniformly.
• In tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml
- Line 21:
cache_transceiver_config:
- backend: DEFAULT
+ backend: "DEFAULT"
- Line 36:
cache_transceiver_config:
- backend: DEFAULT
+ backend: "DEFAULT"
• Similar unquoted backend: DEFAULT entries exist in many other *.yaml under tests/integration/defs/disaggregated/test_configs/ (e.g. disagg_config_cache_reuse_deepseek_v3.yaml, disagg_config_overlap.yaml, etc.).
Consider a one-liner to update them all:
find tests/integration/defs/disaggregated/test_configs -type f -name '*.yaml' \
  -exec sed -Ei 's@^( *)(backend: )DEFAULT$@\1\2"DEFAULT"@' {} +
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_gentp2.yaml (1)
19-19: Consider quoting backend for stylistic consistency
This file quotes other string values (e.g., "pytorch", URLs). To keep a consistent style, quote the DEFAULT token.
cache_transceiver_config:
- backend: DEFAULT
+ backend: "DEFAULT"
...
cache_transceiver_config:
- backend: DEFAULT
+ backend: "DEFAULT"
Also applies to: 34-34
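The bulk-quoting idea suggested for the disaggregated test configs (the sed one-liner) could also be done in Python. The sketch below is a text-level rewrite that assumes bare backend tokens sit alone on their line; `quote_backend_values` is a hypothetical helper, and it works on strings so it can be tried without touching the repo.

```python
import re

# Bare (unquoted) uppercase backend token on its own line; indentation is kept.
# [ \t]* is used instead of \s* so the match never swallows a newline.
_BARE_BACKEND = re.compile(
    r'^([ \t]*backend:[ \t]*)(DEFAULT|UCX|NIXL|MPI)[ \t]*$', re.MULTILINE)


def quote_backend_values(yaml_text: str) -> str:
    """Return yaml_text with bare backend tokens wrapped in double quotes."""
    return _BARE_BACKEND.sub(r'\1"\2"', yaml_text)
```

Values that are already quoted, and other keys such as backend: "pytorch", are left unchanged because the pattern only matches the four bare uppercase tokens.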
tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance_deepseek_v3.yaml (1)
20-20: Optional: add a guard in config parsing to accept legacy lowercase values
If older configs or user overrides still use lowercase tokens, consider normalizing in the loader (e.g., value.upper()) before validation to remain backward compatible. Not required for this PR but helps reduce friction.
Also applies to: 36-36
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp.yaml (1)
17-17: Quote backend for consistency with other string fields
To match the surrounding style and avoid future YAML typing surprises, quote the backend values.
cache_transceiver_config:
- backend: DEFAULT
+ backend: "DEFAULT"
...
cache_transceiver_config:
- backend: DEFAULT
+ backend: "DEFAULT"
Also applies to: 26-26
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2pp2_gentp2pp2.yaml (1)
19-19: Unify quoting for backend values in disaggregated test configs
Several YAML configs under tests/integration/defs/disaggregated/test_configs/ use unquoted uppercase tokens for backend. For consistency with other string fields, please wrap these values in quotes.
Affected locations:
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2pp2_gentp2pp2.yaml
- Line 19
- Line 34
Suggested diff for each occurrence:
cache_transceiver_config:
- backend: DEFAULT
+ backend: "DEFAULT"
Please apply the same change across all disaggregated test configs (you can use the helper script below to locate remaining unquoted tokens):
#!/bin/bash
set -euo pipefail
# Show unquoted uppercase backend values
rg -nP --glob 'tests/integration/defs/disaggregated/test_configs/**/*.yaml' \
  -C2 '^\s*backend:\s*(DEFAULT|UCX|NIXL|MPI)\s*$'
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_genpp2.yaml (1)
19-19: Optional: normalize quoting style for consistency.
Some configs quote the string ("DEFAULT"), others leave it unquoted. YAML treats both the same; if you want uniformity across test configs, consider quoting here as well.
Apply this minimal diff if you prefer quoting:
- backend: DEFAULT
+ backend: "DEFAULT"
Also applies to: 34-34
examples/disaggregated/slurm/gen_yaml.py (1)
200-201: Set DEFAULT via a constant (or CLI option) to avoid magic strings and ease future changes.
Two occurrences hardcode 'DEFAULT'. Suggest either:
- Option A (minimal): introduce a module constant and use it here.
- Option B (flexible): add a CLI arg (choices: DEFAULT/UCX/NIXL/MPI) and plumb it into gen_config_file.
Minimal in-place diffs for the changed lines:
- 'backend': 'DEFAULT',
+ 'backend': DEFAULT_CACHE_TRANSCEIVER_BACKEND,
- 'backend': 'DEFAULT',
+ 'backend': DEFAULT_CACHE_TRANSCEIVER_BACKEND,
Add this near the imports (outside the changed ranges):
# Allowed: "DEFAULT", "UCX", "NIXL", "MPI"
DEFAULT_CACHE_TRANSCEIVER_BACKEND = "DEFAULT"
If you want the CLI option instead, I can provide a follow-up patch that adds:
- parser.add_argument("--cache_transceiver_backend", choices=["DEFAULT","UCX","NIXL","MPI"], default="DEFAULT")
- a new gen_config_file parameter (defaulting to "DEFAULT")
- wiring that value into both cache_transceiver_config blocks.
Also applies to: 228-229
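As a rough illustration of Option B, the CLI wiring might look like the sketch below. Only the flag name and its choices come from the comment above; `build_parser` and `make_cache_transceiver_config` are hypothetical stand-ins, since the real structure of gen_yaml.py is not shown here.

```python
import argparse

# Choices taken from the review comment; assumed to be the full allowed set.
BACKEND_CHOICES = ["DEFAULT", "UCX", "NIXL", "MPI"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the cache transceiver backend as a flag."""
    parser = argparse.ArgumentParser(
        description="Sketch: backend selection for generated configs")
    parser.add_argument(
        "--cache_transceiver_backend",
        choices=BACKEND_CHOICES,
        default="DEFAULT",
        help="Token written into both cache_transceiver_config blocks.")
    return parser


def make_cache_transceiver_config(backend: str) -> dict:
    """Hypothetical stand-in for the blocks built inside gen_config_file."""
    return {"backend": backend}
```

With `choices` set, argparse rejects lowercase or unknown tokens at the command line, so the generated YAML should only ever contain one of the four uppercase values.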
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite.yaml (1)
13-13: Nit: quote backend values for YAML style consistency
Other configs in this PR use quotes (e.g., "NIXL"). Consider quoting here as well for uniformity.
- backend: DEFAULT
+ backend: "DEFAULT"
...
- backend: DEFAULT
+ backend: "DEFAULT"
Also applies to: 21-21
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp.yaml (1)
14-14: Nit: adopt quoted style for consistency with other files
Prefer quoting to match files that use "NIXL"/"UCX".
- backend: DEFAULT
+ backend: "DEFAULT"
...
- backend: DEFAULT
+ backend: "DEFAULT"
Also applies to: 23-23
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp4_genpp4.yaml (1)
19-19: Nit: quote string literals for uniform YAML
Align with other configs that quote backend values.
- backend: DEFAULT
+ backend: "DEFAULT"
...
- backend: DEFAULT
+ backend: "DEFAULT"
Also applies to: 34-34
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite.yaml (1)
13-13: Nit: consider quoting for consistency
Recommend quoting backend values across all YAMLs.
- backend: DEFAULT
+ backend: "DEFAULT"
...
- backend: DEFAULT
+ backend: "DEFAULT"
Also applies to: 21-21
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_nixl.yaml (1)
12-12: LGTM: NIXL backend uppercased and quoted
Values are correctly set to "NIXL". Thanks for keeping quoting consistent here. Ensure the rest of the suite also uses quoted string literals for parity.
Also applies to: 20-20
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1_trt_backend.yaml (1)
11-11: Nit: quote YAML string values for consistency and to avoid future churn.
Other configs in this PR quote the same value; keep a single style across files.
Apply this diff:
- backend: DEFAULT
+ backend: "DEFAULT"
@@
- backend: DEFAULT
+ backend: "DEFAULT"
Also applies to: 19-19
tests/integration/defs/disaggregated/test_configs/disagg_config_load_balance.yaml (1)
22-22: Unify quoting style within this file.
One is unquoted (Line 22), the other quoted (Line 41). Prefer a single style.
Apply this diff to quote Line 22 for consistency with Line 41:
- backend: DEFAULT
+ backend: "DEFAULT"
If the repo prefers unquoted, then unquote Line 41 instead:
- backend: "DEFAULT"
+ backend: DEFAULT
Pick one style and apply it repo-wide to minimize diffs.
Also applies to: 41-41
tests/integration/defs/disaggregated/test_configs/disagg_config_conditional.yaml (1)
21-21: Optional: adopt a single quoting convention across configs.
These are unquoted here but quoted in other files. Aligning the style reduces noise in future changes.
Example (if preferring quotes):
- backend: DEFAULT
+ backend: "DEFAULT"
@@
- backend: DEFAULT
+ backend: "DEFAULT"
Also applies to: 36-36
tensorrt_llm/llmapi/llm_args.py (3)
1042-1045: Uppercase Literal switch is correct; consider BC and default semantics
The change to Literal["DEFAULT", "UCX", "NIXL", "MPI"] matches the repo-wide uppercase convention. Two small improvements to lower risk:
- Backward compatibility: accept lowercase inputs by normalizing to uppercase before validation.
- Default semantics: docs imply a default backend; setting the Python-side default to "DEFAULT" would make behavior explicit and align configs/tests.
Suggested in-place tweak (optional):
- backend: Optional[Literal["DEFAULT", "UCX", "NIXL", "MPI"]] = Field(
-     default=None,
+ backend: Optional[Literal["DEFAULT", "UCX", "NIXL", "MPI"]] = Field(
+     default="DEFAULT",
      description=
      "The communication backend type to use for the cache transceiver.")
And add a validator to preserve compatibility with legacy configs that still pass lowercase tokens (added outside this hunk):
@field_validator('backend', mode='before')
@classmethod
def normalize_backend(cls, v):
    if v is None:
        return v
    if isinstance(v, str):
        return v.upper()
    return v
Please confirm that setting default to "DEFAULT" won’t collide with any downstream logic that previously relied on None to signal “auto” behavior.
1051-1055: Guard against None to avoid runtime errors in pybind conversion
If backend is None at runtime (e.g., programmatic construction), _CacheTransceiverBackendType.from_string(self.backend) could raise. Defensive mapping to "DEFAULT" here is cheap and safe:
def _to_pybind(self):
    backend = self.backend or "DEFAULT"
    return _CacheTransceiverConfig(
        backend=_CacheTransceiverBackendType.from_string(backend),
        max_tokens_in_buffer=self.max_tokens_in_buffer)
1037-1056: Consider using a StrEnum mirror for consistency
Elsewhere (e.g., BatchingType, CapacitySchedulerPolicy) we mirror pybind enums with StrEnum. Doing the same for CacheTransceiverBackendType would:
- Centralize validation (schema shows enum values)
- Remove direct string handling in _to_pybind
- Improve IDE completion
Optional, non-blocking for this PR.
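A string-valued enum mirror along these lines might look like the sketch below. It uses the `str` + `Enum` mix-in rather than `enum.StrEnum`, since the coding guidelines target Python 3.8+; the class name `CacheTransceiverBackend` and its `parse` method are hypothetical and do not reflect the repo's own StrEnum helper.

```python
from enum import Enum


class CacheTransceiverBackend(str, Enum):
    """Hypothetical string-valued mirror of the backend tokens."""

    DEFAULT = "DEFAULT"
    UCX = "UCX"
    NIXL = "NIXL"
    MPI = "MPI"

    @classmethod
    def parse(cls, value):
        """Accept an enum member or a string in any letter case."""
        if isinstance(value, cls):
            return value
        # Enum lookup by value raises ValueError for unknown tokens.
        return cls(str(value).upper())
```

Because members are also `str` instances, comparisons such as `CacheTransceiverBackend.UCX == "UCX"` hold, which keeps existing string-based call sites working during a migration.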
examples/disaggregated/README.md (2)
15-18: Clarify “default backend” vs the DEFAULT token and casing requirements
- This sentence says the default is UCX. Elsewhere configs use DEFAULT. Please add one line clarifying that DEFAULT resolves to UCX (if that’s intended), or revise the statement to match actual behavior.
- Explicitly note that backend values are case-sensitive and must be uppercase to match API validation.
Example addendum:
“Backend values are case-sensitive and must be one of: DEFAULT, UCX, NIXL, MPI. DEFAULT currently resolves to UCX.”
198-198: Minor heading grammar: “Know Issues” → “Known Issues”
Polish the section heading for correctness.
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_overlap_cuda_graph.yaml (1)
12-12: Nit: quote string scalars for consistency across YAMLs
You used unquoted DEFAULT here, while other configs (e.g., disagg_config_ngram.yaml) quote the value. Quoting avoids style drift and keeps diffs smaller long-term.
Apply to this file:
cache_transceiver_config:
- backend: DEFAULT
+ backend: "DEFAULT"
@@
cache_transceiver_config:
- backend: DEFAULT
+ backend: "DEFAULT"
Also applies to: 23-23
tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance.yaml (1)
24-24: Nit: align quoting with the rest of the config set
Same as above: consider quoting DEFAULT for consistency with other YAMLs touched in this PR.
cache_transceiver_config:
- backend: DEFAULT
+ backend: "DEFAULT"
@@
cache_transceiver_config:
- backend: DEFAULT
+ backend: "DEFAULT"
Also applies to: 38-38
tests/integration/defs/disaggregated/test_disaggregated_etcd.py (1)
241-259: Confirm desired behavior: config uses DEFAULT while env forces UCX
create_config_files writes backend: "DEFAULT" in both extra-llm-api configs, but start_context_server/start_generation_server set TRTLLM_USE_UCX_KVCACHE="1". If the intent is to always exercise the UCX KV cache in this test, consider declaring it explicitly in the config to avoid reliance on env precedence and make intent clearer.
If explicit UCX is preferred, tweak only the two lines below:
context_config_content = """pytorch_backend_config:
  disable_overlap_scheduler: True
cache_transceiver_config:
- backend: "DEFAULT"
+ backend: "UCX"
  max_tokens_in_buffer: 2048"""
@@
generation_config_content = """cache_transceiver_config:
- backend: "DEFAULT"
+ backend: "UCX"
  max_tokens_in_buffer: 2048"""
Follow-up check: please confirm that llm_args backend selection logic treats "DEFAULT"+TRTLLM_USE_UCX_KVCACHE=1 equivalently to "UCX", or opt into the explicit config change above.
tests/unittest/llmapi/test_llm_args.py (1)
1-1: Missing NVIDIA copyright header.
Per coding guidelines, prepend the 2025 NVIDIA header.
+# Copyright (c) 2025, NVIDIA CORPORATION.
+#
""" Existing content… """
tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py (2)
134-134: Optional: avoid repetition with a single toggle for backend.
If you expect to switch backends in CI (e.g., A/B UCX vs NIXL), centralize the token via env or module-level const to reduce diffs.
Within this file:
@@
-import pytest
+import os
+import pytest
@@
- cache_transceiver_configs = [
-     CacheTransceiverConfig(backend="DEFAULT") for _ in range(2)
- ]
+ backend_token = os.getenv("TRTLLM_CACHE_TRANSCEIVER_BACKEND", "DEFAULT")
+ cache_transceiver_configs = [
+     CacheTransceiverConfig(backend=backend_token) for _ in range(2)
+ ]
Also applies to: 277-277, 380-380
1-1: Missing NVIDIA copyright header.
Add the 2025 header per guidelines.
+# Copyright (c) 2025, NVIDIA CORPORATION.
+#
""" Existing content… """
tests/integration/defs/accuracy/test_disaggregated_serving.py (4)
263-264: Optional: single source of truth for backend token.
Minimize edit hotspots by threading a backend token from env/test param.
Inside run_parallel_test():
@@ def run_parallel_test(...):
- ctx_server_config = {
+ backend_token = os.getenv("TRTLLM_CACHE_TRANSCEIVER_BACKEND", "DEFAULT")
+ ctx_server_config = {
@@
- "cache_transceiver_config": {"backend": "DEFAULT"}
+ "cache_transceiver_config": {"backend": backend_token}
@@
- "cache_transceiver_config": {"backend": "DEFAULT"}
+ "cache_transceiver_config": {"backend": backend_token}
Also applies to: 272-273
516-523: LGTM: NIXL backend paths for DeepSeekV3 Lite tests.
Uppercased and consistent. If CI nodes may lack NIXL support, consider a skip marker keyed off capability detection.
651-652: LGTM: NIXL in Qwen3-8B backend tests.
Looks correct; same optional capability-guarding suggestion applies.
Also applies to: 657-658
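The capability-guarding suggestion for the NIXL tests could be sketched with the standard library as below. How NIXL support is actually detected on CI nodes is not stated in this review, so the check for an importable module named "nixl" is purely an assumption, and `module_available` and `NixlBackendTokenTest` are hypothetical names.

```python
import importlib.util
import unittest


def module_available(name: str) -> bool:
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(name) is not None


# ASSUMPTION: NIXL support is signalled by an importable "nixl" module.
NIXL_AVAILABLE = module_available("nixl")


class NixlBackendTokenTest(unittest.TestCase):
    """Placeholder test that is skipped when NIXL is not detected."""

    @unittest.skipUnless(NIXL_AVAILABLE, "NIXL support not detected")
    def test_backend_token_is_uppercase(self):
        config = {"cache_transceiver_config": {"backend": "NIXL"}}
        self.assertEqual(config["cache_transceiver_config"]["backend"], "NIXL")
```

In a pytest-based suite the same condition could feed `pytest.mark.skipif`; the detection function is the part that would need to match the real environment.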
1-1: Missing NVIDIA copyright header.
Please prepend the standard 2025 header.
+# Copyright (c) 2025, NVIDIA CORPORATION.
+#
""" Existing content… """
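Several comments above flag missing copyright headers one file at a time. A small checker like the sketch below could list such files in one pass. The matching rule (a line containing both "Copyright (c)" and "NVIDIA" within the first few lines) is an assumption about the header format, and both function names are hypothetical.

```python
from pathlib import Path


def has_copyright_header(source: str, max_lines: int = 5) -> bool:
    """Check whether the first few lines carry an NVIDIA copyright notice.

    Assumes the notice contains both "Copyright (c)" and "NVIDIA".
    """
    head = source.splitlines()[:max_lines]
    return any("Copyright (c)" in line and "NVIDIA" in line for line in head)


def files_missing_header(root: str) -> list:
    """Return sorted paths of *.py files under root that lack the notice."""
    missing = []
    for path in Path(root).rglob("*.py"):
        if not has_copyright_header(path.read_text(encoding="utf-8")):
            missing.append(str(path))
    return sorted(missing)
```

For example, `files_missing_header("tests")` would walk the tests tree and return the Python files that still need the header.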
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (50)
- benchmarks/cpp/README.md (1 hunks)
- docs/source/advanced/disaggregated-service.md (0 hunks)
- examples/cpp/executor/README.md (1 hunks)
- examples/disaggregated/README.md (3 hunks)
- examples/disaggregated/disagg_config.yaml (1 hunks)
- examples/disaggregated/slurm/gen_yaml.py (2 hunks)
- tensorrt_llm/llmapi/llm_args.py (1 hunks)
- tests/integration/defs/accuracy/test_disaggregated_serving.py (12 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance_deepseek_v3.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse_deepseek_v3.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_conditional.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_genpp2.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_gentp2.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp4_genpp4.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp_attention_dp_overlap.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_two_mtp.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_genpp2.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1_trt_backend.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one_mtp.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap_cuda_graph.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_mpi.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_nixl.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_overlap_cuda_graph.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_ucx.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2pp2_gentp2pp2.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_cuda_graph_padding.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_diff_max_tokens.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_bs1.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_trt_backend.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_load_balance.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_mixed.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ngram.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_overlap.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_trt_backend.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_trtllm_sampler.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_disaggregated.py (1 hunks)
- tests/integration/defs/disaggregated/test_disaggregated_etcd.py (1 hunks)
- tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py (3 hunks)
- tests/unittest/llmapi/test_llm_args.py (1 hunks)
💤 Files with no reviewable changes (1)
- docs/source/advanced/disaggregated-service.md
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
**/*.py: Python code must target Python 3.8+
Python indentation: 4 spaces, no tabs
Maintain module namespace in imports (from package.subpackage import foo; then use foo.SomeClass())
Python file names use snake_case
Python class names use PascalCase
Python functions/methods and local variables use snake_case; variables starting with a number get k_ prefix (e.g., k_99th_percentile)
Global variables use G_ prefixed UPPER_SNAKE_CASE (e.g., G_MY_GLOBAL)
Constants use UPPER_SNAKE_CASE in Python
Avoid shadowing variables from outer scopes in Python
Initialize all externally visible members of a Python class in __init__
Prefer docstrings for interfaces used outside a file; comments for local code
Use Google-style docstrings for classes and functions (Sphinx-parsable)
Document attributes/variables inline with short docstrings
Avoid reflection when simple alternatives exist (e.g., prefer explicit parameters over dict(**locals()))
In try/except, catch the narrowest exceptions possible
For duck-typing with try/except, keep try body minimal and put logic in else
Files:
- tensorrt_llm/llmapi/llm_args.py
- tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py
- tests/unittest/llmapi/test_llm_args.py
- tests/integration/defs/accuracy/test_disaggregated_serving.py
- examples/disaggregated/slurm/gen_yaml.py
- tests/integration/defs/disaggregated/test_disaggregated.py
- tests/integration/defs/disaggregated/test_disaggregated_etcd.py
**/*.{cpp,cxx,cc,cu,h,hpp,hxx,hh,cuh,py}
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
Prepend NVIDIA copyright header (current year) to all source files
Files:
- tensorrt_llm/llmapi/llm_args.py
- tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py
- tests/unittest/llmapi/test_llm_args.py
- tests/integration/defs/accuracy/test_disaggregated_serving.py
- examples/disaggregated/slurm/gen_yaml.py
- tests/integration/defs/disaggregated/test_disaggregated.py
- tests/integration/defs/disaggregated/test_disaggregated_etcd.py
🧬 Code graph analysis (3)
tensorrt_llm/llmapi/llm_args.py (1)
tests/unittest/llmapi/apps/_test_openai_reasoning.py (1)
backend(18-19)
tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py (1)
tensorrt_llm/llmapi/llm_args.py (1)
CacheTransceiverConfig(1037-1054)
tests/unittest/llmapi/test_llm_args.py (1)
tensorrt_llm/llmapi/llm_args.py (1)
CacheTransceiverConfig(1037-1054)
🪛 LanguageTool
examples/cpp/executor/README.md
[grammar] ~127-~127: There might be a mistake here.
Context: ...RTLLM_USE_UCX_KVCACHE=1` is required to run disaggregated executor. For example, yo...
(QB_NEW_EN)
[grammar] ~127-~127: There might be a mistake here.
Context: ... required to run disaggregated executor. For example, you can run : ``` export TR...
(QB_NEW_EN)
🪛 Ruff (0.12.2)
tests/unittest/llmapi/test_llm_args.py
664-664: CacheTransceiverConfig may be undefined, or defined from star imports
(F405)
672-672: CacheTransceiverConfig may be undefined, or defined from star imports
(F405)
🔇 Additional comments (61)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml (1)
14-14: Uppercasing backend tokens matches the updated API contract. LGTM.
Adopts the new uppercase literals and should satisfy the Literal["DEFAULT","UCX","NIXL","MPI"] validation downstream.
Also applies to: 24-24
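The Literal-based validation mentioned in the comment above can be sketched with the standard library alone. This is an illustration, not the project's implementation: `check_backend` is a hypothetical helper, and accepting `None` is an assumption based on the `default=None` remark later in this review.

```python
from typing import Literal, Optional, get_args

# The four uppercase tokens quoted in the review comment.
Backend = Literal["DEFAULT", "UCX", "NIXL", "MPI"]
ALLOWED_BACKENDS = get_args(Backend)


def check_backend(value: Optional[str]) -> Optional[str]:
    """Hypothetical helper: accept None or an exact, case-sensitive token."""
    if value is None:
        # Assumption: an omitted backend is allowed and resolved elsewhere.
        return None
    if value not in ALLOWED_BACKENDS:
        raise ValueError(
            f"backend must be one of {ALLOWED_BACKENDS}, got {value!r}")
    return value
```

Under this sketch `check_backend("UCX")` passes, while the legacy spelling `check_backend("ucx")` raises `ValueError`, which is the breaking behavior the YAML updates in this PR are meant to avoid.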
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_ucx.yaml (1)
12-12: UCX backend uppercased and already quoted; looks good.
Matches the new accepted enum and is consistent if we adopt an “always quote strings” YAML style.
Also applies to: 20-20
tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only.yaml (1)
17-17: Uppercase DEFAULT aligns with the new enum.
All good from a parsing/validation standpoint.
tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse.yaml (1)
18-18: Backend tokens uppercased, consistent with the new Literal set.
Change is correct and low risk.
Also applies to: 33-33
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_two_mtp.yaml (1)
17-17: Uppercased DEFAULT is correct for the new validation.
Looks good.
Also applies to: 28-28
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one.yaml (1)
14-14: No remaining lowercase backends; add quotes for consistency
The repo-wide regex check returned no lowercase backend: values, confirming all tokens have been standardized to uppercase.
Nitpick: quoting improves YAML style consistency and guards against parser quirks.

- backend: DEFAULT
+ backend: "DEFAULT"

Apply the same change at line 23.
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one_mtp.yaml (1)
17-17: Quote DEFAULT for YAML consistency
- Using backend: "DEFAULT" aligns with the updated allowed values and will unblock CI.
- Quoting the value maintains symmetry with other string fields and avoids potential YAML tooling edge-cases.

- backend: DEFAULT
+ backend: "DEFAULT"

Also applies to line 26.
tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_trt_backend.yaml (2)
16-16: Backend token normalization to DEFAULT looks correct
Matches the repo-wide move to uppercase backend enums and should interoperate with updated llm_args validation.
16-16: No lowercase backend values detected in YAML configs
A ripgrep scan over all .yaml/.yml files in tests/, docs/, examples/, and benchmarks/ returned zero matches for lowercase backends (default, mpi, ucx, nixl). All backend: entries are correctly uppercase.
No further action is needed here.
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_mpi.yaml (1)
12-12: MPI backend uppercased correctly in both blocks
Both context and generation servers now use "MPI", aligning with the new enum set. Looks good.
Also applies to: 20-20
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap_cuda_graph.yaml (1)
13-13: DEFAULT backend normalization aligns with enum update
Both sections correctly reflect the uppercase token.
Also applies to: 25-25
tests/integration/defs/disaggregated/test_configs/disagg_config_cuda_graph_padding.yaml (1)
19-19: DEFAULT backend updates look good
Both cache_transceiver_config blocks now match the expected enum.
Also applies to: 37-37
tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_bs1.yaml (2)
20-20: DEFAULT backend change is correct in both sections
Consistent with the repository-wide enum update.
Also applies to: 35-35
20-20: The script to enumerate all backend values under cache_transceiver_config has been dispatched; once it completes, I’ll use the results to confirm which backends are currently in use and update the review comment accordingly.
tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml (1)
21-21: Uppercasing backend is aligned with the new enum set
Changing cache_transceiver_config.backend to DEFAULT matches the new uppercase literals (DEFAULT/UCX/NIXL/MPI). Looks good.
Also applies to: 36-36
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_gentp2.yaml (1)
19-19: Uppercase backend tokens are correct
DEFAULT matches the updated, uppercase-only accepted values. Change is appropriate.
Also applies to: 34-34
tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance_deepseek_v3.yaml (1)
20-20: LGTM: backend normalized to uppercase and already quoted
Both occurrences use "DEFAULT" (quoted) which is consistent and aligns with the new enum. No further changes needed.
Also applies to: 36-36
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp.yaml (1)
17-17: Correct enum usage
DEFAULT is the expected uppercase literal. Change aligns with the broader standardization.
Also applies to: 26-26
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2pp2_gentp2pp2.yaml (1)
19-19: Uppercase normalization looks good
DEFAULT is in the accepted set and matches the PR’s intent.
Also applies to: 34-34
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_genpp2.yaml (2)
19-19: Uppercasing backend to DEFAULT aligns with updated API. LGTM.
Matches the repository-wide shift to uppercase tokens and should parse identically in YAML.
Also applies to: 34-34
19-19: Fix verification script for cache_transceiver_config backend casing
Your current regex is hitting PCRE2 match limits (many “PCRE2: error matching: match limit exceeded” messages) across all disaggregated YAML files, causing the “OK” messages to be misleading. Please update this check to reliably detect any lowercase backend values under cache_transceiver_config. For example:
Simplified grep pipeline:
    # List all backend entries and keep only the lowercase ones (case-sensitive match)
    rg -n 'backend:' -g 'tests/integration/defs/disaggregated/**/*.yaml' \
      | grep -E 'backend:\s*"?(default|ucx|nixl|mpi)"?' \
      || echo "No lowercase cache_transceiver backends found"
YAML-aware validation with yq:
    # Extract every cache_transceiver_config.backend value and flag any not in the allowed set
    yq e '.. | select(.cache_transceiver_config) | .cache_transceiver_config.backend' -d'*' \
      tests/integration/defs/disaggregated/**/*.yaml \
      | grep -vE '^(DEFAULT|UCX|NIXL|MPI)$' \
      || echo "All cache_transceiver backends are uppercase and valid"
Run one of these updated checks and confirm it reports no offending values. Apply the same verification at line 34 of tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_genpp2.yaml.
tests/integration/defs/disaggregated/test_configs/disagg_config_trtllm_sampler.yaml (1)
19-19: Uppercase "DEFAULT" is consistent with the new schema. LGTM.
Change is scoped and safe; quoting is fine.
Also applies to: 35-35
examples/disaggregated/disagg_config.yaml (1)
14-14: Backend standardized to "DEFAULT". Looks good.
Matches the updated accepted literals and remains backward-compatible in YAML parsing.
Also applies to: 22-22
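For user configs that still carry the old lowercase spelling, a migration along the following lines could be sketched. It is not part of this PR: `uppercase_backends` is a hypothetical helper, and it assumes the legacy values are exactly `default`, `ucx`, `nixl`, or `mpi`, optionally quoted, on a `backend:` line of their own.

```python
import re

# Assumption: legacy configs spell the backend in lowercase, optionally quoted,
# possibly followed by a trailing YAML comment.
_LEGACY_BACKEND = re.compile(
    r'^(?P<head>[ \t]*backend:[ \t]*)'
    r'(?P<q>["\']?)(?P<name>default|ucx|nixl|mpi)(?P=q)'
    r'(?P<tail>[ \t]*(?:#.*)?)$',
    re.MULTILINE,
)


def uppercase_backends(yaml_text: str) -> str:
    """Hypothetical helper: rewrite legacy lowercase backend values in place."""
    def _upper(m: re.Match) -> str:
        # Preserve indentation, quoting style, and any trailing comment.
        return f"{m['head']}{m['q']}{m['name'].upper()}{m['q']}{m['tail']}"

    return _LEGACY_BACKEND.sub(_upper, yaml_text)
```

The rewrite is purely textual, so it touches any `backend:` key holding one of those four values, not only the one under `cache_transceiver_config`; values such as `pytorch` are left alone.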
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1.yaml (1)
13-13: DEFAULT backend update acknowledged.
Consistent with the rest of the PR; no functional risk detected.
Also applies to: 21-21
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite.yaml (2)
13-13: Uppercased backend tokens match new schema: LGTM
Switching to DEFAULT aligns these test configs with the updated allowed literals. No functional concerns here.
Also applies to: 21-21
13-13: cache_transceiver_config backends are all uppercase
I ran a repo-wide search for any lowercase cache_transceiver_config.backend values (default | ucx | nixl | mpi) and found none; every entry is already uppercase as expected. No further changes needed here.
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp.yaml (1)
14-14: Uppercase normalization looks correct
DEFAULT matches the updated allowed set and keeps context/gen configs in sync.
Also applies to: 23-23
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp4_genpp4.yaml (1)
19-19: Backend token casing standardized: good
Both occurrences updated to DEFAULT. No issues spotted.
Also applies to: 34-34
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite.yaml (1)
13-13: Casing update aligns with API expectations
DEFAULT is consistent with the new Literal set; looks good.
Also applies to: 21-21
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1_trt_backend.yaml (2)
11-11: Uppercased backend tokens align with new accepted literals. LGTM.
These updates match the repo-wide move to uppercase backends and should keep these tests consistent with the parser.
Also applies to: 19-19
11-11: No lowercase backend values detected; no changes required
Searches across tests, examples, and docs returned zero matches for lowercase backend values (default, ucx, nixl, mpi). All backend: entries remain correctly uppercased (e.g. DEFAULT, UCX, NIXL, MPI). The environment variables TRTLLM_USE_MPI_KVCACHE and TRTLLM_USE_UCX_KVCACHE are still intentionally present and documented for their respective backends.
tests/integration/defs/disaggregated/test_configs/disagg_config_load_balance.yaml (1)
22-22: Uppercase backend standardization looks correct.
These values should parse under the updated Literal set.
Also applies to: 41-41
benchmarks/cpp/README.md (1)
339-339: Switch to UCX KV cache env var is consistent with the PR direction.
Updating to TRTLLM_USE_UCX_KVCACHE=1 matches the broader migration away from the MPI-based toggle.
Also applies to: 347-347
tests/integration/defs/disaggregated/test_configs/disagg_config_conditional.yaml (1)
21-21: Backend tokens uppercased, consistent with parser expectations.
No functional concerns spotted in this config snippet.
Also applies to: 36-36
examples/cpp/executor/README.md (2)
127-127: Update to UCX KV cache toggle is correct and consistent with the rest of the PR.
This keeps the disaggregated executor docs in sync with the new backend guidance.
Also applies to: 130-130
127-127: All disaggregated executor docs now consistently reference only the UCX toggle. No stale MPI flags remain.
– Verified across docs/, examples/, and benchmarks/ that no occurrences of TRTLLM_USE_MPI_KVCACHE appear in “disagg” or “disaggregated” contexts.
– Confirmed only TRTLLM_USE_UCX_KVCACHE is mentioned in those contexts.
– The README.md snippet at lines 127 and 130 is correct and requires no changes.
tests/integration/defs/disaggregated/test_configs/disagg_config_mixed.yaml (2)
13-13: Backend token normalization to uppercase looks good
Updating backend from "default" to "DEFAULT" aligns this test config with the updated Literal in code and reduces ambiguity. No functional issues spotted.
Also applies to: 21-21
13-13: No lowercase backend tokens remain
Ran a smart-case search across all YAML and test files for default|ucx|nixl|mpi; every match is in uppercase (DEFAULT, UCX, NIXL, MPI), and no lowercase occurrences were found. Resolving this comment.
tensorrt_llm/llmapi/llm_args.py (1)
1042-1045: Doc/code alignment check: what is the actual default?
The README states “default backend is UCX,” while tests use DEFAULT. Please ensure:
- Either DEFAULT resolves to UCX,
- Or docs clearly state “omit backend or set to DEFAULT (resolves to UCX by default).”
If code keeps default=None, confirm that the downstream from_string maps None to UCX (or adjust as suggested above).
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp_attention_dp_overlap.yaml (1)
17-17: Consistent uppercase backend tokens
Both occurrences updated to DEFAULT. Matches the new typing contract and should keep the integration test aligned with parser expectations.
Also applies to: 27-27
examples/disaggregated/README.md (3)
23-40: Config snippets look correct and consistent
The example YAML uses UCX and shows max_tokens_in_buffer; this matches the new API. No issues.
42-51: Launch commands and ports align with the sample disagg_config
The ports (8001/8002 for context, 8003 for generation) match the later disagg_config block. Good consistency.
112-115: ETCD + UCX dynamic scaling commands look good
The TRTLLM_USE_UCX_KVCACHE=1 export is a helpful hint. Commands are consistent with file names introduced above.
tests/integration/defs/disaggregated/test_configs/disagg_config_diff_max_tokens.yaml (1)
13-13: Uppercase backend tokens are correct
Matches the updated Literal values and keeps tests aligned. No other behavioral changes observed.
Also applies to: 21-21
tests/integration/defs/disaggregated/test_configs/disagg_config_ngram.yaml (1)
12-12: Backend tokens correctly uppercased to match new literals
Both cache_transceiver_config.backend entries now use "DEFAULT" and align with the updated accepted values. No functional concerns.
Also applies to: 20-20
tests/integration/defs/disaggregated/test_disaggregated.py (2)
1279-1281: Good: benchmark configs now pass uppercase backend identifiers
Switching to "NIXL" and "UCX" matches the updated Literal values and avoids case-mismatch issues during validation.
1279-1281: Lowercase backend tokens audit passed
All tests and YAML configs have been scanned for lowercase backend values (default, ucx, nixl, mpi) and no occurrences were found. The migration to uppercase (DEFAULT, UCX, NIXL, MPI) is complete.
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_genpp2.yaml (2)
19-19: Uppercase backend tokens look correct and consistent with the updated Literal type.
Both context_servers and generation_servers now use backend: DEFAULT, matching Literal["DEFAULT","UCX","NIXL","MPI"] in CacheTransceiverConfig. No functional issues spotted.
Also applies to: 34-34
19-19: Sanity check passed: no lowercase backend values found in cache_transceiver_config blocks.
No further action required.
tests/unittest/llmapi/test_llm_args.py (3)
664-667: LGTM: switched to uppercase "UCX" in positive-path test.
Aligns with the new Literal and assert matches.
672-672: LGTM: invalid-args path updated to "UCX".
Keeps the negative test aligned with the new accepted tokens.
664-664: Insert explicit import for CacheTransceiverConfig to silence F405
Verified that adding the following line immediately after the star import in tests/unittest/llmapi/test_llm_args.py removes the F405 errors for CacheTransceiverConfig:

 from tensorrt_llm.llmapi.llm_args import *
 # Explicit to silence F405 where referenced directly below
+from tensorrt_llm.llmapi.llm_args import CacheTransceiverConfig

• Targets F405 at lines 663 and 671; both errors no longer appear once the import is in place.
• The remaining F403 on the star import and other undefined-name errors are outside the scope of this change, per the original suggestion to keep * for now.
tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py (1)
134-134: LGTM: standardized to backend="DEFAULT" across contexts.
Matches new type constraints; no behavioral changes expected.
Also applies to: 277-277, 380-380
tests/integration/defs/accuracy/test_disaggregated_serving.py (8)
263-264: LGTM: run_parallel_test uses uppercase DEFAULT in both ctx/gen server configs.
Consistent with new Literal; no functional concerns.
Also applies to: 272-273
312-314: LGTM: DEFAULT applied in Llama3.1 auto_dtype test.
Matches the standardized tokens.
354-355: LGTM: DEFAULT in NGram speculative decoding configs.
Consistent change; no issues.
Also applies to: 362-363
407-408: LGTM: DEFAULT for Eagle3 configs.
Aligned with backend type changes; OK.
Also applies to: 421-422
475-476: LGTM: DEFAULT in Llama4-Scout auto_dtype.
Consistent with the new enum values.
600-601: LGTM: DEFAULT in Gemma-3 configs.
No concerns.
Also applies to: 607-608
689-690: LGTM: DEFAULT in Qwen3-8B auto_dtype.
Matches the standardized tokens.
Also applies to: 696-697
263-264: No legacy lowercase backend tokens found
Ran a repo-wide scan of Python and YAML files for backend: "default|ucx|nixl|mpi" and there were zero matches; no lowercase backends remain. Feel free to mark this as resolved.
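The lowercase-backend audits reported throughout this review could also be reproduced without ripgrep. The sketch below is an assumption-laden stand-in, not the script the bot ran: `find_lowercase_backends` and `scan_tree` are hypothetical names, and the check is line-based rather than YAML-aware.

```python
import re
from pathlib import Path
from typing import Dict, List, Tuple

# Case-sensitive on purpose: only the legacy lowercase spellings should match.
LOWERCASE_BACKEND = re.compile(
    r'^[ \t]*backend:[ \t]*["\']?(default|ucx|nixl|mpi)["\']?[ \t]*(#.*)?$')


def find_lowercase_backends(text: str) -> List[Tuple[int, str]]:
    """Return (line number, stripped line) for each legacy backend value."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if LOWERCASE_BACKEND.match(line):
            hits.append((lineno, line.strip()))
    return hits


def scan_tree(root: str) -> Dict[str, List[Tuple[int, str]]]:
    """Scan every *.yaml file under root; map file path to offending lines."""
    results = {}
    for path in sorted(Path(root).rglob("*.yaml")):
        hits = find_lowercase_backends(path.read_text(encoding="utf-8"))
        if hits:
            results[str(path)] = hits
    return results
```

An empty dictionary from `scan_tree("tests/integration/defs/disaggregated")` would correspond to the "zero matches" result described above. Because the match is per line, it avoids the backtracking-heavy multi-line patterns that triggered the PCRE2 match-limit errors mentioned earlier.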
...gregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml
PR_Github #16350 [ run ] completed with state |
…commands (#7191) Signed-off-by: Shixiaowei02 <[email protected]>