[TRTLLM-7030][fix] BREAKING CHANGE: Mismatch between docs and actual commands #7191

Shixiaowei02 · 2025-08-25T02:15:33Z

Summary by CodeRabbit

Documentation
- Updated disaggregated serving guides and examples to use UCX-based KV cache and standardized “KV cache” terminology.
- Replaced inline commands with clearer YAML snippets and clarified server start guidance.
- Removed outdated MPI-specific FAQ content.
Chores
- Backend identifiers in configurations now require uppercase values: DEFAULT, UCX, NIXL, MPI; examples and templates adjusted.
Tests
- Updated test configs and cases to use uppercase backend values across scenarios.

coderabbitai · 2025-08-25T02:15:43Z

Walkthrough

This PR standardizes CacheTransceiverConfig.backend values to uppercase ("DEFAULT", "UCX", "NIXL", "MPI") across code, tests, and config files. Documentation and examples are updated to reference UCX-based KV cache (TRTLLM_USE_UCX_KVCACHE=1) and remove MPI-specific FAQ content. Numerous YAML/test configs change backend strings to uppercase.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| API: CacheTransceiverConfig backend literals `tensorrt_llm/llmapi/llm_args.py` | Change Literal accepted values from lowercase to uppercase: Optional[Literal["DEFAULT","UCX","NIXL","MPI"]]. |
| Integration tests: Python `tests/integration/defs/accuracy/test_disaggregated_serving.py`, `tests/integration/defs/disaggregated/test_disaggregated.py`, `tests/integration/defs/disaggregated/test_disaggregated_etcd.py`, `tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py` | Update backend identifiers passed in configs/constructors to uppercase (e.g., "DEFAULT", "UCX", "NIXL"). |
| Integration tests: YAML configs `tests/integration/defs/disaggregated/test_configs/*` | Uppercase cache_transceiver_config.backend values across many configs: default→DEFAULT, ucx→UCX, nixl→NIXL, mpi→MPI. No structural changes. |
| Examples: disaggregated configs/generation `examples/disaggregated/disagg_config.yaml`, `examples/disaggregated/slurm/gen_yaml.py` | Change backend string values from "default" to "DEFAULT" in example YAML and Slurm YAML generator. |
| Docs and READMEs: UCX env var and terminology `benchmarks/cpp/README.md`, `examples/cpp/executor/README.md`, `docs/source/advanced/disaggregated-service.md`, `examples/disaggregated/README.md` | Switch docs from MPI KV cache env var to UCX (`TRTLLM_USE_UCX_KVCACHE=1`), remove MPI-specific FAQ, standardize “KV cache” capitalization, add explicit YAML config examples. |
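
To illustrate what the uppercase-only Literal means for callers, here is a minimal standard-library sketch. `BackendLiteral` and `check_backend` are hypothetical names used only for this example; the real field is defined in `tensorrt_llm/llmapi/llm_args.py` and is validated by the library itself, not by this helper.

```python
from typing import Literal, Optional, get_args

# Stand-in for the uppercase literals listed in the table above.
# This alias is an assumption for illustration, not the library's definition.
BackendLiteral = Literal["DEFAULT", "UCX", "NIXL", "MPI"]
ALLOWED_BACKENDS = get_args(BackendLiteral)


def check_backend(value: Optional[str]) -> Optional[str]:
    """Return the value unchanged if it is None or an allowed uppercase token."""
    if value is None or value in ALLOWED_BACKENDS:
        return value
    raise ValueError(
        f"backend must be one of {ALLOWED_BACKENDS}, got {value!r}")


if __name__ == "__main__":
    print(check_backend("UCX"))  # accepted
    try:
        check_backend("ucx")  # lowercase no longer matches any literal
    except ValueError as err:
        print(err)
```

Under this sketch, any config that still passes a lowercase token fails validation, which is why the PR is flagged as a breaking change.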

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

[TRTLLM-7030][fix] BREAKING CHANGE: Mismatch between docs and actual commands #6323 — Also switches CacheTransceiverConfig.backend literals to uppercase and aligns configs/tests.
[None][fix] Update tests to use standardized uppercase backend identifiers #6921 — Normalizes backend identifiers to uppercase across tests/config generation, overlapping the same areas.
[TRTLLM-6471] Infra: unwaive nixl tests and some disagg-serve tests #6095 — Touches disaggregated backend identifiers/tests (NIXL), intersecting with this PR’s backend token changes.

Suggested labels

Documentation

Suggested reviewers

chuangz0
nv-guomingz
kaiyux

Signed-off-by: Shixiaowei02 <[email protected]>

Shixiaowei02 · 2025-08-25T02:16:52Z

/bot --help

github-actions · 2025-08-25T02:17:00Z

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental)]

Launch build/test pipelines. All previously running jobs will be killed.

--reuse-test (optional)pipeline-id (OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline or the last pipeline if no pipeline-id is indicated. If the Git commit ID has changed, this option will always be ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.

--disable-reuse-test (OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-PyTorch-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-PyTorch-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--test-backend "pytorch, cpp" (OPTIONAL) : Skip test stages which don't match the specified backends. Only supports [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx".

--detailed-log (OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.

--debug (OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purposes. Note: Specify exactly one stage in the stage-list parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.

kill

kill

Kill all running builds associated with the pull request.

skip

skip --comment COMMENT

Skip testing for the latest commit on the pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

Shixiaowei02 · 2025-08-25T02:17:37Z

/bot run --add-multi-gpu-test

tensorrt-cicd · 2025-08-25T02:23:20Z

PR_Github #16350 [ run ] triggered by Bot

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

examples/disaggregated/slurm/gen_yaml.py (1)
1-1: Missing NVIDIA copyright header.

Per coding guidelines, prepend the current year header to all Python sources.

Apply at file top:
+# Copyright (c) 2025, NVIDIA CORPORATION.  All rights reserved.
examples/disaggregated/README.md (1)
127-131: Fix typo in metadata config: refersh_interval -> refresh_interval

User-facing YAML has a key misspelled:

refersh_interval → refresh_interval

Apply this one-liner change in the snippet:
- refersh_interval: 10.0
+ refresh_interval: 10.0

🧹 Nitpick comments (42)

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml (1)
14-14: Nit: Consider quoting backend values for consistency across configs.

Some files use quoted strings (e.g., "UCX"), while others use bare scalars (DEFAULT). YAML treats both as strings, but a single convention improves grepability and reduces style churn.

Suggested tweak:
-    backend: DEFAULT
+    backend: "DEFAULT"
(repeat for both occurrences)

Also applies to: 24-24
tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only.yaml (1)
17-17: Nit: Align quoting style with the rest of the suite.

If standardizing on quoted strings, update as:
-    backend: DEFAULT
+    backend: "DEFAULT"
tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse.yaml (1)
18-18: Nit: Prefer consistent YAML quoting.

To match files that quote backend values:
-    backend: DEFAULT
+    backend: "DEFAULT"
(apply to both occurrences)

Also applies to: 33-33
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_two_mtp.yaml (1)
17-17: Nit: Unify quoting with other configs for readability/tooling.

If adopting the quoted-string convention:
-    backend: DEFAULT
+    backend: "DEFAULT"
(apply to both occurrences)

Also applies to: 28-28
tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse_deepseek_v3.yaml (1)
18-18: Good standardization; keep config loader lenient if feasible.

The uppercase transition looks good here.

If the loader doesn’t already normalize case, consider adding normalization upstream so legacy configs with lowercase continue to work (reduces churn in downstream forks/tests). Not blocking this PR.

Optional style tweak:
-    backend: DEFAULT
+    backend: "DEFAULT"
If you want, I can open a follow-up to add a small normalizer (e.g., backend = backend.upper()) in the config parsing path.

Also applies to: 33-33
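
For forks that cannot wait for a loader-side normalizer, legacy configs could be upgraded before they reach validation. The following is a rough sketch over an already-parsed config (plain dicts and lists); `upgrade_backend_tokens` and `LEGACY_TO_CURRENT` are hypothetical names, and the top-level keys in the usage note are placeholders rather than a statement about the real schema.

```python
from typing import Any

# Lowercase tokens used before this PR, mapped to their uppercase replacements.
LEGACY_TO_CURRENT = {
    "default": "DEFAULT",
    "ucx": "UCX",
    "nixl": "NIXL",
    "mpi": "MPI",
}


def upgrade_backend_tokens(node: Any) -> Any:
    """Return a copy of a parsed config with legacy backend tokens uppercased.

    Only the `backend` key directly under `cache_transceiver_config` is
    rewritten; unknown values are left alone so validation can report them.
    """
    if isinstance(node, dict):
        upgraded = {}
        for key, value in node.items():
            if key == "cache_transceiver_config" and isinstance(value, dict):
                value = dict(value)  # copy so the caller's dict is not mutated
                backend = value.get("backend")
                if isinstance(backend, str):
                    value["backend"] = LEGACY_TO_CURRENT.get(backend, backend)
            upgraded[key] = upgrade_backend_tokens(value)
        return upgraded
    if isinstance(node, list):
        return [upgrade_backend_tokens(item) for item in node]
    return node
```

Calling `upgrade_backend_tokens({"ctx": {"cache_transceiver_config": {"backend": "ucx"}}})` would yield a new dict whose backend is "UCX", leaving the input untouched. An explicit mapping is used instead of a blanket `.upper()` so that typos are not silently turned into something that looks valid.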
tests/integration/defs/disaggregated/test_configs/disagg_config_overlap.yaml (1)
19-19: LGTM on value normalization; consider quoting for consistency.

Change matches the uppercase token policy.

Quoting keeps style consistent with other string fields in the same file.
-    backend: DEFAULT
+    backend: "DEFAULT"
Also applies to: 34-34
tests/integration/defs/disaggregated/test_configs/disagg_config_trt_backend.yaml (1)
13-13: Aligned with TRT backend config; minor style polish available.

Using "DEFAULT" is consistent with the updated accepted literals.

Optional: quote for stylistic uniformity and safer YAML parsing across tools.
-    backend: DEFAULT
+    backend: "DEFAULT"
Also applies to: 21-21
tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_trt_backend.yaml (1)
16-16: Optional: quote the string for consistency across configs

In other files, backend is quoted (e.g., "MPI"). Quoting here avoids accidental future YAML tooling treating bare words specially and keeps style uniform.

Apply this minimal adjustment:
-    backend: DEFAULT
+    backend: "DEFAULT"
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_mpi.yaml (1)
12-12: Support case-insensitive backend parsing for backward compatibility

It looks like there’s currently no normalization of the backend argument (e.g. in tensorrt_llm/llmapi/llm_args.py), so configs using "mpi" (lowercase) would be treated differently than "MPI". To avoid breaking legacy configs, I recommend normalizing the backend string during argument parsing while still documenting uppercase as the preferred form.

— Add immediately after you extract backend in tensorrt_llm/llmapi/llm_args.py:
 class TrtLlmArgs(…):
     def __post_init__(self):
-        backend = kwargs.get("backend", None)
+        # normalize backend for case-insensitive matching (legacy configs)
+        backend = kwargs.get("backend", None)
+        if isinstance(backend, str):
+            backend = backend.upper()
+            kwargs["backend"] = backend
         if backend == "PYTORCH":
             …  # existing logic
• Location: tensorrt_llm/llmapi/llm_args.py (inside the initialization block where backend is first read)
• Rationale: ensures "mpi", "MPI", or "Mpi" all map to the same backend and prevents surprise breakage in release/1.0.
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap_cuda_graph.yaml (1)
13-13: Nit: unify quoting and boolean style within the file

You have mixed styles in nearby fields across configs (e.g., true vs True in other files, quoted vs unquoted strings). For this file, consider quoting DEFAULT for stylistic consistency and to reduce diff noise later from linters/formatters.

Suggested tweak:
-    backend: DEFAULT
+    backend: "DEFAULT"
Also applies to: 25-25
tests/integration/defs/disaggregated/test_configs/disagg_config_cuda_graph_padding.yaml (1)
19-19: Minor consistency: quote string enums; consider standardizing booleans across YAMLs

Quote DEFAULT for parity with places using "MPI".

Optional: align boolean literals consistently (true/false vs True/False) per repo style guide to prevent churn from formatters.

Patch for quoting:
-    backend: DEFAULT
+    backend: "DEFAULT"
(and the same for Line 37)

Also applies to: 37-37
tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_bs1.yaml (1)
20-20: Nit: quote the backend string for uniformity

Keep the representation consistent with other configs that use quotes around enum-like strings.

Proposed change:
-    backend: DEFAULT
+    backend: "DEFAULT"
(repeat for Line 35)

Also applies to: 35-35
tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml (1)
21-21: Normalize quoting of backend across disaggregated test configs

Our scan confirms that every cache_transceiver_config.backend value is one of the allowed enums (DEFAULT, UCX, NIXL, MPI) and there are no lowercase or non-enum entries. However, many of these enum values—most notably DEFAULT—are currently unquoted, whereas other string literals (e.g. "pytorch", URLs) are consistently quoted. To prevent future YAML-typing pitfalls and improve readability, it’s best to quote all backend values uniformly.

• In tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml
Line 21:
cache_transceiver_config:
-    backend: DEFAULT
+    backend: "DEFAULT"
Line 36:
cache_transceiver_config:
-    backend: DEFAULT
+    backend: "DEFAULT"
• Similar unquoted backend: DEFAULT entries exist in many other *.yaml under tests/integration/defs/disaggregated/test_configs/ (e.g. disagg_config_cache_reuse_deepseek_v3.yaml, disagg_config_overlap.yaml, etc.).
Consider a one-liner to update them all:
find tests/integration/defs/disaggregated/test_configs -type f -name '*.yaml' \
  -exec sed -Ei 's@^( *)(backend: )DEFAULT$@\1\2"DEFAULT"@' {} +
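
If a Python equivalent of the sed one-liner is easier to review or to reuse in tooling, the same substitution can be sketched with the standard `re` module. The function name `quote_default_backend` is made up for this example; it works on text only, so reading and writing the files is left to the caller.

```python
import re

# Same pattern as the sed expression: an indented, unquoted `backend: DEFAULT`
# that occupies the whole line. MULTILINE makes ^ and $ match per line.
_UNQUOTED_DEFAULT = re.compile(r'^( *)(backend: )DEFAULT$', re.MULTILINE)


def quote_default_backend(yaml_text: str) -> str:
    """Quote unquoted `backend: DEFAULT` values, preserving indentation."""
    return _UNQUOTED_DEFAULT.sub(r'\1\2"DEFAULT"', yaml_text)


if __name__ == "__main__":
    sample = "cache_transceiver_config:\n    backend: DEFAULT\n"
    print(quote_default_backend(sample))
```

Lines that are already quoted do not match the pattern, so running the function twice gives the same result as running it once.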
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_gentp2.yaml (1)
19-19: Consider quoting backend for stylistic consistency

This file quotes other string values (e.g., "pytorch", URLs). To keep a consistent style, quote the DEFAULT token.
   cache_transceiver_config:
-    backend: DEFAULT
+    backend: "DEFAULT"
 ...
   cache_transceiver_config:
-    backend: DEFAULT
+    backend: "DEFAULT"
Also applies to: 34-34
tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance_deepseek_v3.yaml (1)

20-20: Optional: add a guard in config parsing to accept legacy lowercase values

If older configs or user overrides still use lowercase tokens, consider normalizing in the loader (e.g., value.upper()) before validation to remain backward compatible. Not required for this PR but helps reduce friction.

Also applies to: 36-36
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp.yaml (1)
17-17: Quote backend for consistency with other string fields

To match the surrounding style and avoid future YAML typing surprises, quote the backend values.
   cache_transceiver_config:
-    backend: DEFAULT
+    backend: "DEFAULT"
 ...
   cache_transceiver_config:
-    backend: DEFAULT
+    backend: "DEFAULT"
Also applies to: 26-26
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2pp2_gentp2pp2.yaml (1)
19-19: Unify quoting for backend values in disaggregated test configs

Several YAML configs under tests/integration/defs/disaggregated/test_configs/ use unquoted uppercase tokens for backend. For consistency with other string fields, please wrap these values in quotes.

Affected locations:
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2pp2_gentp2pp2.yaml
  - Line 19
  - Line 34

Suggested diff for each occurrence:
   cache_transceiver_config:
-    backend: DEFAULT
+    backend: "DEFAULT"
Please apply the same change across all disaggregated test configs (you can use the helper script below to locate remaining unquoted tokens):
#!/bin/bash
set -euo pipefail

# Show unquoted uppercase backend values
rg -nP --glob 'tests/integration/defs/disaggregated/test_configs/**/*.yaml' \
   -C2 '^\s*backend:\s*(DEFAULT|UCX|NIXL|MPI)\s*$'
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_genpp2.yaml (1)
19-19: Optional: normalize quoting style for consistency.

Some configs quote the string ("DEFAULT"), others leave it unquoted. YAML treats both the same; if you want uniformity across test configs, consider quoting here as well.

Apply this minimal diff if you prefer quoting:
-    backend: DEFAULT
+    backend: "DEFAULT"
Also applies to: 34-34
examples/disaggregated/slurm/gen_yaml.py (1)
200-201: Set DEFAULT via a constant (or CLI option) to avoid magic strings and ease future changes.

Two occurrences hardcode 'DEFAULT'. Suggest either:

Option A (minimal): introduce a module constant and use it here.

Option B (flexible): add a CLI arg (choices: DEFAULT/UCX/NIXL/MPI) and plumb it into gen_config_file.

Minimal in-place diffs for the changed lines:
-                'backend': 'DEFAULT',
+                'backend': DEFAULT_CACHE_TRANSCEIVER_BACKEND,
-                'backend': 'DEFAULT',
+                'backend': DEFAULT_CACHE_TRANSCEIVER_BACKEND,
Add this near the imports (outside the changed ranges):
# Allowed: "DEFAULT", "UCX", "NIXL", "MPI"
DEFAULT_CACHE_TRANSCEIVER_BACKEND = "DEFAULT"
If you want the CLI option instead, I can provide a follow-up patch that adds:
- parser.add_argument("--cache_transceiver_backend", choices=["DEFAULT","UCX","NIXL","MPI"], default="DEFAULT")
- a new gen_config_file parameter (defaulting to "DEFAULT")
- wiring that value into both cache_transceiver_config blocks.

Also applies to: 228-229
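
Option B could look roughly like the sketch below. The `--cache_transceiver_backend` flag and its choices come from the suggestion above; `build_parser` and `build_worker_configs` are hypothetical stand-ins, since the actual structure of `gen_config_file` is not shown in this review.

```python
import argparse

ALLOWED_BACKENDS = ["DEFAULT", "UCX", "NIXL", "MPI"]


def build_parser() -> argparse.ArgumentParser:
    """Parser with the single option proposed in the review comment."""
    parser = argparse.ArgumentParser(description="Generate disaggregated YAML")
    parser.add_argument(
        "--cache_transceiver_backend",
        choices=ALLOWED_BACKENDS,
        default="DEFAULT",
        help="Backend written to both cache_transceiver_config blocks.")
    return parser


def build_worker_configs(backend: str) -> dict:
    """Hypothetical stand-in for the part of gen_config_file that emits the
    two cache_transceiver_config blocks (context and generation)."""
    return {
        "context": {"cache_transceiver_config": {"backend": backend}},
        "generation": {"cache_transceiver_config": {"backend": backend}},
    }


if __name__ == "__main__":
    args = build_parser().parse_args(["--cache_transceiver_backend", "UCX"])
    print(build_worker_configs(args.cache_transceiver_backend))
```

Because `choices` is enforced by argparse, a lowercase value such as `ucx` is rejected on the command line before any YAML is written.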
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite.yaml (1)
13-13: Nit: quote backend values for YAML style consistency

Other configs in this PR use quotes (e.g., "NIXL"). Consider quoting here as well for uniformity.
-    backend: DEFAULT
+    backend: "DEFAULT"
...
-    backend: DEFAULT
+    backend: "DEFAULT"
Also applies to: 21-21
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp.yaml (1)
14-14: Nit: adopt quoted style for consistency with other files

Prefer quoting to match files that use "NIXL"/"UCX".
-    backend: DEFAULT
+    backend: "DEFAULT"
...
-    backend: DEFAULT
+    backend: "DEFAULT"
Also applies to: 23-23
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp4_genpp4.yaml (1)
19-19: Nit: quote string literals for uniform YAML

Align with other configs that quote backend values.
-    backend: DEFAULT
+    backend: "DEFAULT"
...
-    backend: DEFAULT
+    backend: "DEFAULT"
Also applies to: 34-34
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite.yaml (1)
13-13: Nit: consider quoting for consistency

Recommend quoting backend values across all YAMLs.
-    backend: DEFAULT
+    backend: "DEFAULT"
...
-    backend: DEFAULT
+    backend: "DEFAULT"
Also applies to: 21-21
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_nixl.yaml (1)

12-12: LGTM: NIXL backend uppercased and quoted

Values are correctly set to "NIXL". Thanks for keeping quoting consistent here. Ensure the rest of the suite also uses quoted string literals for parity.

Also applies to: 20-20
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1_trt_backend.yaml (1)
11-11: Nit: quote YAML string values for consistency and to avoid future churn.
Other configs in this PR quote the same value; keep a single style across files.

Apply this diff:
-    backend: DEFAULT
+    backend: "DEFAULT"
@@
-    backend: DEFAULT
+    backend: "DEFAULT"
Also applies to: 19-19
tests/integration/defs/disaggregated/test_configs/disagg_config_load_balance.yaml (1)
22-22: Unify quoting style within this file.
One is unquoted (Line 22), the other quoted (Line 41). Prefer a single style.

Apply this diff to quote Line 22 for consistency with Line 41:
-    backend: DEFAULT
+    backend: "DEFAULT"
If the repo prefers unquoted, then unquote Line 41 instead:
-    backend: "DEFAULT"
+    backend: DEFAULT
Pick one style and apply it repo-wide to minimize diffs.

Also applies to: 41-41
tests/integration/defs/disaggregated/test_configs/disagg_config_conditional.yaml (1)
21-21: Optional: adopt a single quoting convention across configs.
These are unquoted here but quoted in other files. Aligning the style reduces noise in future changes.

Example (if preferring quotes):
-    backend: DEFAULT
+    backend: "DEFAULT"
@@
-    backend: DEFAULT
+    backend: "DEFAULT"
Also applies to: 36-36
tensorrt_llm/llmapi/llm_args.py (3)
1042-1045: Uppercase Literal switch is correct; consider BC and default semantics

The change to Literal["DEFAULT", "UCX", "NIXL", "MPI"] matches the repo-wide uppercase convention. Two small improvements to lower risk:

- Backward compatibility: accept lowercase inputs by normalizing to uppercase before validation.
- Default semantics: docs imply a default backend; setting the Python-side default to "DEFAULT" would make behavior explicit and align configs/tests.

Suggested in-place tweak (optional):
-    backend: Optional[Literal["DEFAULT", "UCX", "NIXL", "MPI"]] = Field(
-        default=None,
+    backend: Optional[Literal["DEFAULT", "UCX", "NIXL", "MPI"]] = Field(
+        default="DEFAULT",
         description=
         "The communication backend type to use for the cache transceiver.")
And add a validator to preserve compatibility with legacy configs that still pass lowercase tokens (added outside this hunk):
@field_validator('backend', mode='before')
@classmethod
def normalize_backend(cls, v):
    if v is None:
        return v
    if isinstance(v, str):
        return v.upper()
    return v
Please confirm that setting default to "DEFAULT" won’t collide with any downstream logic that previously relied on None to signal “auto” behavior.

1051-1055: Guard against None to avoid runtime errors in pybind conversion

If backend is None at runtime (e.g., programmatic construction), _CacheTransceiverBackendType.from_string(self.backend) could raise. Defensive mapping to "DEFAULT" here is cheap and safe:
def _to_pybind(self):
    backend = self.backend or "DEFAULT"
    return _CacheTransceiverConfig(
        backend=_CacheTransceiverBackendType.from_string(backend),
        max_tokens_in_buffer=self.max_tokens_in_buffer)
1037-1056: Consider using a StrEnum mirror for consistency

Elsewhere (e.g., BatchingType, CapacitySchedulerPolicy) we mirror pybind enums with StrEnum. Doing the same for CacheTransceiverBackendType would:

- Centralize validation (schema shows enum values)
- Remove direct string handling in _to_pybind
- Improve IDE completion

Optional, non-blocking for this PR.
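
A string-valued enum mirror might be sketched as follows. The class name `CacheTransceiverBackend` and its `from_string` helper are hypothetical; the sketch uses `class ...(str, Enum)` from the standard library rather than whichever StrEnum the repository uses, and it does not touch the pybind type.

```python
from enum import Enum


class CacheTransceiverBackend(str, Enum):
    """Hypothetical Python-side mirror of the backend tokens."""
    DEFAULT = "DEFAULT"
    UCX = "UCX"
    NIXL = "NIXL"
    MPI = "MPI"

    @classmethod
    def from_string(cls, value: str) -> "CacheTransceiverBackend":
        """Look up a member by its exact (uppercase) token."""
        try:
            return cls(value)
        except ValueError:
            allowed = ", ".join(member.value for member in cls)
            raise ValueError(
                f"Invalid backend {value!r}; expected one of: {allowed}"
            ) from None


if __name__ == "__main__":
    backend = CacheTransceiverBackend.from_string("NIXL")
    print(backend.value, backend == "NIXL")
```

Since members are also `str` instances, existing comparisons against plain strings would keep working while the set of valid tokens lives in one place.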
examples/disaggregated/README.md (2)

15-18: Clarify “default backend” vs the DEFAULT token and casing requirements

This sentence says the default is UCX. Elsewhere configs use DEFAULT. Please add one line clarifying that DEFAULT resolves to UCX (if that’s intended), or revise the statement to match actual behavior.

Explicitly note that backend values are case-sensitive and must be uppercase to match API validation.

Example addendum:
“Backend values are case-sensitive and must be one of: DEFAULT, UCX, NIXL, MPI. DEFAULT currently resolves to UCX.”

198-198: Minor heading grammar: “Know Issues” → “Known Issues”

Polish the section heading for correctness.
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_overlap_cuda_graph.yaml (1)
12-12: Nit: quote string scalars for consistency across YAMLs

You used unquoted DEFAULT here, while other configs (e.g., disagg_config_ngram.yaml) quote the value. Quoting avoids style drift and keeps diffs smaller long-term.

Apply to this file:
   cache_transceiver_config:
-    backend: DEFAULT
+    backend: "DEFAULT"
@@
   cache_transceiver_config:
-    backend: DEFAULT
+    backend: "DEFAULT"
Also applies to: 23-23
tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance.yaml (1)
24-24: Nit: align quoting with the rest of the config set

Same as above: consider quoting DEFAULT for consistency with other YAMLs touched in this PR.
   cache_transceiver_config:
-    backend: DEFAULT
+    backend: "DEFAULT"
@@
   cache_transceiver_config:
-    backend: DEFAULT
+    backend: "DEFAULT"
Also applies to: 38-38
tests/integration/defs/disaggregated/test_disaggregated_etcd.py (1)
241-259: Confirm desired behavior: config uses DEFAULT while env forces UCX

create_config_files writes backend: "DEFAULT" in both extra-llm-api configs, but start_context_server/start_generation_server set TRTLLM_USE_UCX_KVCACHE="1". If the intent is to always exercise the UCX KV cache in this test, consider declaring it explicitly in the config to avoid reliance on env precedence and make intent clearer.

If explicit UCX is preferred, tweak only the two lines below:
 context_config_content = """pytorch_backend_config:
   disable_overlap_scheduler: True
 cache_transceiver_config:
-  backend: "DEFAULT"
+  backend: "UCX"
   max_tokens_in_buffer: 2048"""
@@
 generation_config_content = """cache_transceiver_config:
-  backend: "DEFAULT"
+  backend: "UCX"
   max_tokens_in_buffer: 2048"""
Follow-up check: please confirm that llm_args backend selection logic treats "DEFAULT"+TRTLLM_USE_UCX_KVCACHE=1 equivalently to "UCX", or opt into the explicit config change above.
tests/unittest/llmapi/test_llm_args.py (1)
1-1: Missing NVIDIA copyright header.

Per coding guidelines, prepend the 2025 NVIDIA header.
+# Copyright (c) 2025, NVIDIA CORPORATION.
+#
 """
 Existing content…
 """
tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py (2)
134-134: Optional: avoid repetition with a single toggle for backend.

If you expect to switch backends in CI (e.g., A/B UCX vs NIXL), centralize the token via env or module-level const to reduce diffs.

Within this file:
@@
-import pytest
+import os
+import pytest
@@
-    cache_transceiver_configs = [
-        CacheTransceiverConfig(backend="DEFAULT") for _ in range(2)
-    ]
+    backend_token = os.getenv("TRTLLM_CACHE_TRANSCEIVER_BACKEND", "DEFAULT")
+    cache_transceiver_configs = [
+        CacheTransceiverConfig(backend=backend_token) for _ in range(2)
+    ]
Also applies to: 277-277, 380-380

1-1: Missing NVIDIA copyright header.

Add the 2025 header per guidelines.
+# Copyright (c) 2025, NVIDIA CORPORATION.
+#
 """
 Existing content…
 """
tests/integration/defs/accuracy/test_disaggregated_serving.py (4)
263-264: Optional: single source of truth for backend token.

Minimize edit hotspots by threading a backend token from env/test param.

Inside run_parallel_test():
@@ def run_parallel_test(...):
-    ctx_server_config = {
+    backend_token = os.getenv("TRTLLM_CACHE_TRANSCEIVER_BACKEND", "DEFAULT")
+    ctx_server_config = {
@@
-        "cache_transceiver_config": {"backend": "DEFAULT"}
+        "cache_transceiver_config": {"backend": backend_token}
@@
-        "cache_transceiver_config": {"backend": "DEFAULT"}
+        "cache_transceiver_config": {"backend": backend_token}
Also applies to: 272-273

516-523: LGTM: NIXL backend paths for DeepSeekV3 Lite tests.

Uppercased and consistent. If CI nodes may lack NIXL support, consider a skip marker keyed off capability detection.

651-652: LGTM: NIXL in Qwen3-8B backend tests.

Looks correct; same optional capability-guarding suggestion applies.

Also applies to: 657-658

1-1: Missing NVIDIA copyright header.

Please prepend the standard 2025 header.
+# Copyright (c) 2025, NVIDIA CORPORATION.
+#
 """
 Existing content…
 """

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 030598a and e8da23a.

📒 Files selected for processing (50)

benchmarks/cpp/README.md (1 hunks)
docs/source/advanced/disaggregated-service.md (0 hunks)
examples/cpp/executor/README.md (1 hunks)
examples/disaggregated/README.md (3 hunks)
examples/disaggregated/disagg_config.yaml (1 hunks)
examples/disaggregated/slurm/gen_yaml.py (2 hunks)
tensorrt_llm/llmapi/llm_args.py (1 hunks)
tests/integration/defs/accuracy/test_disaggregated_serving.py (12 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance_deepseek_v3.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse_deepseek_v3.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_conditional.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_genpp2.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_gentp2.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp4_genpp4.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite.yaml (1 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp_attention_dp_overlap.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_two_mtp.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_genpp2.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1.yaml (1 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1_trt_backend.yaml (1 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite.yaml (1 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one_mtp.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap_cuda_graph.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_mpi.yaml (1 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_nixl.yaml (1 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_overlap_cuda_graph.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_ucx.yaml (1 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2pp2_gentp2pp2.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_cuda_graph_padding.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_diff_max_tokens.yaml (1 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only.yaml (1 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_bs1.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_trt_backend.yaml (1 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_load_balance.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_mixed.yaml (1 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_ngram.yaml (1 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_overlap.yaml (2 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_trt_backend.yaml (1 hunks)
tests/integration/defs/disaggregated/test_configs/disagg_config_trtllm_sampler.yaml (2 hunks)
tests/integration/defs/disaggregated/test_disaggregated.py (1 hunks)
tests/integration/defs/disaggregated/test_disaggregated_etcd.py (1 hunks)
tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py (3 hunks)
tests/unittest/llmapi/test_llm_args.py (1 hunks)

💤 Files with no reviewable changes (1)

docs/source/advanced/disaggregated-service.md

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.py: Python code must target Python 3.8+
Python indentation: 4 spaces, no tabs
Maintain module namespace in imports (from package.subpackage import foo; then use foo.SomeClass())
Python file names use snake_case
Python class names use PascalCase
Python functions/methods and local variables use snake_case; variables starting with a number get k_ prefix (e.g., k_99th_percentile)
Global variables use G_ prefixed UPPER_SNAKE_CASE (e.g., G_MY_GLOBAL)
Constants use UPPER_SNAKE_CASE in Python
Avoid shadowing variables from outer scopes in Python
Initialize all externally visible members of a Python class in __init__
Prefer docstrings for interfaces used outside a file; comments for local code
Use Google-style docstrings for classes and functions (Sphinx-parsable)
Document attributes/variables inline with short docstrings
Avoid reflection when simple alternatives exist (e.g., prefer explicit parameters over dict(**locals()))
In try/except, catch the narrowest exceptions possible
For duck-typing with try/except, keep try body minimal and put logic in else

Files:

tensorrt_llm/llmapi/llm_args.py
tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py
tests/unittest/llmapi/test_llm_args.py
tests/integration/defs/accuracy/test_disaggregated_serving.py
examples/disaggregated/slurm/gen_yaml.py
tests/integration/defs/disaggregated/test_disaggregated.py
tests/integration/defs/disaggregated/test_disaggregated_etcd.py

**/*.{cpp,cxx,cc,cu,h,hpp,hxx,hh,cuh,py}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Prepend NVIDIA copyright header (current year) to all source files

Files:

tensorrt_llm/llmapi/llm_args.py
tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py
tests/unittest/llmapi/test_llm_args.py
tests/integration/defs/accuracy/test_disaggregated_serving.py
examples/disaggregated/slurm/gen_yaml.py
tests/integration/defs/disaggregated/test_disaggregated.py
tests/integration/defs/disaggregated/test_disaggregated_etcd.py

🧬 Code graph analysis (3)

tensorrt_llm/llmapi/llm_args.py (1)

tests/unittest/llmapi/apps/_test_openai_reasoning.py (1)

backend (18-19)

tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py (1)

tensorrt_llm/llmapi/llm_args.py (1)

CacheTransceiverConfig (1037-1054)

tests/unittest/llmapi/test_llm_args.py (1)

tensorrt_llm/llmapi/llm_args.py (1)

CacheTransceiverConfig (1037-1054)

🪛 LanguageTool

examples/cpp/executor/README.md

[grammar] ~127-~127: There might be a mistake here.
Context: ...RTLLM_USE_UCX_KVCACHE=1` is required to run disaggregated executor. For example, yo...

(QB_NEW_EN)

[grammar] ~127-~127: There might be a mistake here.
Context: ... required to run disaggregated executor. For example, you can run : ``` export TR...

(QB_NEW_EN)

🪛 Ruff (0.12.2)

tests/unittest/llmapi/test_llm_args.py

664-664: CacheTransceiverConfig may be undefined, or defined from star imports

(F405)

672-672: CacheTransceiverConfig may be undefined, or defined from star imports

(F405)

🔇 Additional comments (61)

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml (1)

14-14: Uppercasing backend tokens matches the updated API contract. LGTM.

Adopts the new uppercase literals and should satisfy the Literal["DEFAULT","UCX","NIXL","MPI"] validation downstream.

Also applies to: 24-24
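
To make the validation concrete, here is a minimal stand-in that mirrors the uppercase-only `Literal`. It is not the library's `CacheTransceiverConfig`; `BackendLiteral` and `validate_backend` are names introduced for illustration.

```python
from typing import Literal, Optional, get_args

# Same token set as the Literal quoted in the comment above.
BackendLiteral = Literal["DEFAULT", "UCX", "NIXL", "MPI"]


def validate_backend(value: Optional[str]) -> Optional[str]:
    """Accept None or one of the uppercase tokens; reject anything else.

    This only imitates the effect of the type constraint; the real
    config class is not reproduced here.
    """
    if value is None or value in get_args(BackendLiteral):
        return value
    raise ValueError(
        f"backend must be one of {get_args(BackendLiteral)}, got {value!r}")
```

Under this constraint a config that still says `backend: default` is rejected rather than silently normalized, which is why every YAML file had to change.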

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_ucx.yaml (1)

12-12: UCX backend uppercased and already quoted — looks good.

Matches the new accepted enum and is consistent if we adopt an “always quote strings” YAML style.

Also applies to: 20-20

tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only.yaml (1)

17-17: Uppercase DEFAULT aligns with the new enum.

All good from a parsing/validation standpoint.

tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse.yaml (1)

18-18: Backend tokens uppercased — consistent with the new Literal set.

Change is correct and low risk.

Also applies to: 33-33

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_two_mtp.yaml (1)

17-17: Uppercased DEFAULT is correct for the new validation.

Looks good.

Also applies to: 28-28
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one.yaml (1)
14-14: No remaining lowercase backends; add quotes for consistency

The repo-wide regex check returned no lowercase backend: values, confirming all tokens have been standardized to uppercase.

Nitpick: quoting improves YAML style consistency and guards against parser quirks.
-    backend: DEFAULT
+    backend: "DEFAULT"
Apply the same change at line 23.
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one_mtp.yaml (1)
17-17: Quote DEFAULT for YAML consistency

Using backend: "DEFAULT" aligns with the updated allowed values and will unblock CI.

Quoting the value maintains symmetry with other string fields and avoids potential YAML tooling edge-cases.
-    backend: DEFAULT
+    backend: "DEFAULT"
Also applies to line 26.
tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_trt_backend.yaml (2)

16-16: Backend token normalization to DEFAULT looks correct

Matches the repo-wide move to uppercase backend enums and should interoperate with updated llm_args validation.

16-16: No lowercase backend values detected in YAML configs

A ripgrep scan over all .yaml/.yml files in tests/, docs/, examples/, and benchmarks/ returned zero matches for lowercase backends (default, mpi, ucx, nixl). All backend: entries are correctly uppercase.

No further action is needed here.

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_mpi.yaml (1)

12-12: MPI backend uppercased correctly in both blocks

Both context and generation servers now use "MPI", aligning with the new enum set. Looks good.

Also applies to: 20-20

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap_cuda_graph.yaml (1)

13-13: DEFAULT backend normalization aligns with enum update

Both sections correctly reflect the uppercase token.

Also applies to: 25-25

tests/integration/defs/disaggregated/test_configs/disagg_config_cuda_graph_padding.yaml (1)

19-19: DEFAULT backend updates look good

Both cache_transceiver_config blocks now match the expected enum.

Also applies to: 37-37

tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_bs1.yaml (2)

20-20: DEFAULT backend change is correct in both sections

Consistent with the repository-wide enum update.

Also applies to: 35-35

20-20: The script to enumerate all backend values under cache_transceiver_config has been dispatched—once it completes, I’ll use the results to confirm which backends are currently in use and update the review comment accordingly.

tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml (1)

21-21: Uppercasing backend is aligned with the new enum set

Changing cache_transceiver_config.backend to DEFAULT matches the new uppercase literals (DEFAULT/UCX/NIXL/MPI). Looks good.

Also applies to: 36-36

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_gentp2.yaml (1)

19-19: Uppercase backend tokens are correct

DEFAULT matches the updated, uppercase-only accepted values. Change is appropriate.

Also applies to: 34-34

tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance_deepseek_v3.yaml (1)

20-20: LGTM: backend normalized to uppercase and already quoted

Both occurrences use "DEFAULT" (quoted) which is consistent and aligns with the new enum. No further changes needed.

Also applies to: 36-36

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp.yaml (1)

17-17: Correct enum usage

DEFAULT is the expected uppercase literal. Change aligns with the broader standardization.

Also applies to: 26-26

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2pp2_gentp2pp2.yaml (1)

19-19: Uppercase normalization looks good

DEFAULT is in the accepted set and matches the PR’s intent.

Also applies to: 34-34
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_genpp2.yaml (2)
19-19: Uppercasing backend to DEFAULT aligns with updated API. LGTM.

Matches the repository-wide shift to uppercase tokens and should parse identically in YAML.

Also applies to: 34-34

19-19: Fix verification script for cache_transceiver_config backend casing

Your current regex is hitting PCRE2 match limits (many “PCRE2: error matching: match limit exceeded” messages) across all disaggregated YAML files, so its “OK” messages are misleading. Please update this check to reliably detect any lowercase backend values under cache_transceiver_config. For example:
Simplified grep pipeline:
# List all backend entries and keep only the lowercase ones (case-sensitive match);
# the message is printed only when grep finds nothing
rg -n 'backend:' -g 'tests/integration/defs/disaggregated/**/*.yaml' \
  | grep -E 'backend:[[:space:]]*"?(default|ucx|nixl|mpi)"?' \
  || echo "No lowercase cache_transceiver backends found"
YAML-aware validation with yq:
# Extract every cache_transceiver_config.backend value and flag any not in the allowed set;
# the message is printed only when grep -v finds no disallowed value
yq e '.. | select(.cache_transceiver_config) | .cache_transceiver_config.backend' -d'*' \
  tests/integration/defs/disaggregated/**/*.yaml \
  | grep -vE '^(DEFAULT|UCX|NIXL|MPI)$' \
  || echo "All cache_transceiver backends are uppercase and valid"
Run one of these updated checks and confirm it reports no lowercase or invalid values. Apply the same verification at line 34 of tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_genpp2.yaml.
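
The same check can also be written in Python, which avoids the regex-engine limits entirely. The sketch below is line-based and does not parse YAML, so it flags any `backend:` key with a lowercase token, not only those nested under cache_transceiver_config; `find_lowercase_backends` is a hypothetical helper name.

```python
import re

# Matches e.g. `backend: default` or `backend: "ucx"`, case-sensitively,
# optionally followed by a trailing comment.
LOWERCASE_BACKEND = re.compile(
    r"""^\s*backend:\s*["']?(default|ucx|nixl|mpi)["']?\s*(#.*)?$""")


def find_lowercase_backends(yaml_text):
    """Return (line_number, stripped_line) pairs with a lowercase backend."""
    hits = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        if LOWERCASE_BACKEND.match(line):
            hits.append((number, line.strip()))
    return hits
```

An empty result for every config file corresponds to the "no lowercase backends remain" outcome reported elsewhere in this review.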
tests/integration/defs/disaggregated/test_configs/disagg_config_trtllm_sampler.yaml (1)

19-19: Uppercase "DEFAULT" is consistent with the new schema. LGTM.

Change is scoped and safe; quoting is fine.

Also applies to: 35-35

examples/disaggregated/disagg_config.yaml (1)

14-14: Backend standardized to "DEFAULT". Looks good.

Matches the updated accepted literals and remains backward-compatible in YAML parsing.

Also applies to: 22-22

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1.yaml (1)

13-13: DEFAULT backend update acknowledged.

Consistent with the rest of the PR; no functional risk detected.

Also applies to: 21-21

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite.yaml (2)

13-13: Uppercased backend tokens match new schema — LGTM

Switching to DEFAULT aligns these test configs with the updated allowed literals. No functional concerns here.

Also applies to: 21-21

13-13: cache_transceiver_config backends are all uppercase

I ran a repo-wide search for any lowercase cache_transceiver_config.backend values (default | ucx | nixl | mpi) and found none—every entry is already uppercase as expected. No further changes needed here.

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp.yaml (1)

14-14: Uppercase normalization looks correct

DEFAULT matches the updated allowed set and keeps context/gen configs in sync.

Also applies to: 23-23

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp4_genpp4.yaml (1)

19-19: Backend token casing standardized — good

Both occurrences updated to DEFAULT. No issues spotted.

Also applies to: 34-34

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite.yaml (1)

13-13: Casing update aligns with API expectations

DEFAULT is consistent with the new Literal set; looks good.

Also applies to: 21-21

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1_trt_backend.yaml (2)

11-11: Uppercased backend tokens align with new accepted literals. LGTM.
These updates match the repo-wide move to uppercase backends and should keep these tests consistent with the parser.

Also applies to: 19-19

11-11: No lowercase backend values detected; no changes required
Searches across tests, examples, and docs returned zero matches for lowercase backend values (default, ucx, nixl, mpi). All backend: entries remain correctly uppercased (e.g. DEFAULT, UCX, NIXL, MPI). The environment variables TRTLLM_USE_MPI_KVCACHE and TRTLLM_USE_UCX_KVCACHE are still intentionally present and documented for their respective backends.

tests/integration/defs/disaggregated/test_configs/disagg_config_load_balance.yaml (1)

22-22: Uppercase backend standardization looks correct.
These values should parse under the updated Literal set.

Also applies to: 41-41

benchmarks/cpp/README.md (1)

339-339: Switch to UCX KV cache env var is consistent with the PR direction.
Updating to TRTLLM_USE_UCX_KVCACHE=1 matches the broader migration away from the MPI-based toggle.

Also applies to: 347-347
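
For scripted runs, the variable can be set for a single child process instead of being exported globally. A minimal sketch follows; the child command is a placeholder Python one-liner that echoes the variable back, not the actual benchmark binary, and `run_with_ucx_kv_cache` is a hypothetical helper name.

```python
import os
import subprocess
import sys


def run_with_ucx_kv_cache(command):
    """Run `command` with TRTLLM_USE_UCX_KVCACHE=1 added to its environment."""
    # Copy the parent environment so the parent process stays unchanged.
    env = dict(os.environ, TRTLLM_USE_UCX_KVCACHE="1")
    return subprocess.run(command, env=env, check=True,
                          capture_output=True, text=True)


# Placeholder child process: prints the variable it received.
probe = [sys.executable, "-c",
         "import os; print(os.environ['TRTLLM_USE_UCX_KVCACHE'])"]
result = run_with_ucx_kv_cache(probe)
```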

tests/integration/defs/disaggregated/test_configs/disagg_config_conditional.yaml (1)

21-21: Backend tokens uppercased — consistent with parser expectations.
No functional concerns spotted in this config snippet.

Also applies to: 36-36

examples/cpp/executor/README.md (2)

127-127: Update to UCX KV cache toggle is correct and consistent with the rest of the PR.
This keeps the disaggregated executor docs in sync with the new backend guidance.

Also applies to: 130-130

127-127: All disaggregated executor docs now consistently reference only the UCX toggle. No stale MPI flags remain.

– Verified across docs/, examples/, and benchmarks/ that no occurrences of TRTLLM_USE_MPI_KVCACHE appear in “disagg” or “disaggregated” contexts.
– Confirmed only TRTLLM_USE_UCX_KVCACHE is mentioned in those contexts.
– The README.md snippet at lines 127 and 130 is correct and requires no changes.

tests/integration/defs/disaggregated/test_configs/disagg_config_mixed.yaml (2)

13-13: Backend token normalization to uppercase looks good

Updating backend from "default" to "DEFAULT" aligns this test config with the updated Literal in code and reduces ambiguity. No functional issues spotted.

Also applies to: 21-21

13-13: No lowercase backend tokens remain
Ran a smart-case search across all YAML and test files for default|ucx|nixl|mpi; every match is in uppercase (DEFAULT, UCX, NIXL, MPI), and no lowercase occurrences were found. Resolving this comment.

tensorrt_llm/llmapi/llm_args.py (1)

1042-1045: Doc/code alignment check: what is the actual default?

The README states “default backend is UCX,” while tests use DEFAULT. Please ensure:

Either DEFAULT resolves to UCX,

Or docs clearly state “omit backend or set to DEFAULT (resolves to UCX by default).”

If code keeps default=None, confirm that the downstream from_string maps None to UCX (or adjust as suggested above).
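
The first option can be illustrated with a small resolution step. Everything here is hypothetical: it assumes, for the sake of the example, that an omitted backend and "DEFAULT" should both end up selecting UCX, which is exactly the behavior this comment asks the author to confirm. It is not the library's `from_string`.

```python
from typing import Optional

# Assumption for illustration only: the README's "default backend is UCX"
# means both an omitted backend and "DEFAULT" select UCX.
ASSUMED_DEFAULT_BACKEND = "UCX"


def resolve_backend(backend: Optional[str]) -> str:
    """Hypothetical resolution step mapping None and "DEFAULT" to UCX."""
    if backend is None or backend == "DEFAULT":
        return ASSUMED_DEFAULT_BACKEND
    return backend
```

If the real mapping behaves differently, the docs wording in the second option is the safer fix.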

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp_attention_dp_overlap.yaml (1)

17-17: Consistent uppercase backend tokens

Both occurrences updated to DEFAULT. Matches the new typing contract and should keep the integration test aligned with parser expectations.

Also applies to: 27-27

examples/disaggregated/README.md (3)

23-40: Config snippets look correct and consistent

The example YAML uses UCX and shows max_tokens_in_buffer; this matches the new API. No issues.

42-51: Launch commands and ports align with the sample disagg_config

The ports (8001/8002 for context, 8003 for generation) match the later disagg_config block. Good consistency.

112-115: ETCD + UCX dynamic scaling commands look good

The TRTLLM_USE_UCX_KVCACHE=1 export is a helpful hint. Commands are consistent with file names introduced above.

tests/integration/defs/disaggregated/test_configs/disagg_config_diff_max_tokens.yaml (1)

13-13: Uppercase backend tokens are correct

Matches the updated Literal values and keeps tests aligned. No other behavioral changes observed.

Also applies to: 21-21

tests/integration/defs/disaggregated/test_configs/disagg_config_ngram.yaml (1)

12-12: Backend tokens correctly uppercased to match new literals

Both cache_transceiver_config.backend entries now use "DEFAULT" and align with the updated accepted values. No functional concerns.

Also applies to: 20-20

tests/integration/defs/disaggregated/test_disaggregated.py (2)

1279-1281: Good: benchmark configs now pass uppercase backend identifiers

Switching to "NIXL" and "UCX" matches the updated Literal values and avoids case-mismatch issues during validation.

1279-1281: Lowercase backend tokens audit passed

All tests and YAML configs have been scanned for lowercase backend values (default, ucx, nixl, mpi) and no occurrences were found. The migration to uppercase (DEFAULT, UCX, NIXL, MPI) is complete.

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_genpp2.yaml (2)

19-19: Uppercase backend tokens look correct and consistent with the updated Literal type.

Both context_servers and generation_servers now use backend: DEFAULT, matching Literal["DEFAULT","UCX","NIXL","MPI"] in CacheTransceiverConfig. No functional issues spotted.

Also applies to: 34-34

19-19: Sanity check passed: no lowercase backend values found in cache_transceiver_config blocks.

No further action required.
tests/unittest/llmapi/test_llm_args.py (3)
664-667: LGTM: switched to uppercase "UCX" in positive-path test.

Aligns with the new Literal and assert matches.

672-672: LGTM: invalid-args path updated to "UCX".

Keeps the negative test aligned with the new accepted tokens.

664-664: Insert explicit import for CacheTransceiverConfig to silence F405

Verified that adding the following line immediately after the star import in tests/unittest/llmapi/test_llm_args.py removes the F405 errors for CacheTransceiverConfig:
 from tensorrt_llm.llmapi.llm_args import *
 # Explicit to silence F405 where referenced directly below
+from tensorrt_llm.llmapi.llm_args import CacheTransceiverConfig
• Targets F405 at lines 663 and 671—both errors no longer appear once the import is in place.
• The remaining F403 on the star import and other undefined-name errors are outside the scope of this change, per the original suggestion to keep * for now.
tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py (1)

134-134: LGTM: standardized to backend="DEFAULT" across contexts.

Matches new type constraints; no behavioral changes expected.

Also applies to: 277-277, 380-380

tests/integration/defs/accuracy/test_disaggregated_serving.py (8)

263-264: LGTM: run_parallel_test uses uppercase DEFAULT in both ctx/gen server configs.

Consistent with new Literal; no functional concerns.

Also applies to: 272-273

312-314: LGTM: DEFAULT applied in Llama3.1 auto_dtype test.

Matches the standardized tokens.

354-355: LGTM: DEFAULT in NGram speculative decoding configs.

Consistent change; no issues.

Also applies to: 362-363

407-408: LGTM: DEFAULT for Eagle3 configs.

Aligned with backend type changes; OK.

Also applies to: 421-422

475-476: LGTM: DEFAULT in Llama4-Scout auto_dtype.

Consistent with the new enum values.

600-601: LGTM: DEFAULT in Gemma-3 configs.

No concerns.

Also applies to: 607-608

689-690: LGTM: DEFAULT in Qwen3-8B auto_dtype.

Matches the standardized tokens.

Also applies to: 696-697

263-264: No legacy lowercase backend tokens found
Ran a repo-wide scan of Python and YAML files for backend: "default|ucx|nixl|mpi" and there were zero matches—no lowercase backends remain. Feel free to mark this as resolved.

tensorrt-cicd · 2025-08-25T12:06:49Z

PR_Github #16350 [ run ] completed with state SUCCESS
/LLM/release-1.0/L0_MergeRequest_PR pipeline #286 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

Shixiaowei02 requested a review from a team as a code owner August 25, 2025 02:15

Shixiaowei02 added 3 commits August 25, 2025 10:15

update

3e69a3f

Signed-off-by: Shixiaowei02 <[email protected]>

update

c1e176d

Signed-off-by: Shixiaowei02 <[email protected]>

update

e8da23a

Signed-off-by: Shixiaowei02 <[email protected]>

Shixiaowei02 force-pushed the release/1.0-fix-2 branch from 6087101 to e8da23a Compare August 25, 2025 02:15

Shixiaowei02 changed the title ~~[none][fix] CI Test~~ [None][fix] CI Test Aug 25, 2025

coderabbitai bot reviewed Aug 25, 2025

Shixiaowei02 changed the title ~~[None][fix] CI Test~~ [TRTLLM-7030][fix] BREAKING CHANGE: Mismatch between docs and actual commands Aug 25, 2025

Shixiaowei02 requested a review from kaiyux August 25, 2025 12:20

kaiyux approved these changes Aug 25, 2025

Shixiaowei02 merged commit d010b20 into NVIDIA:release/1.0 Aug 25, 2025
6 of 7 checks passed

Shixiaowei02 deleted the release/1.0-fix-2 branch August 25, 2025 12:21

Shixiaowei02 mentioned this pull request Aug 25, 2025

[TRTLLM-7030][fix] BREAKING CHANGE: Mismatch between docs and actual commands #6780

Closed

yuanjingx87 pushed a commit that referenced this pull request Aug 28, 2025

[TRTLLM-7030][fix] BREAKING CHANGE: Mismatch between docs and actual …

88b1446

…commands (#7191) Signed-off-by: Shixiaowei02 <[email protected]>

dominicshanshan mentioned this pull request Sep 4, 2025

[None][chore] Mass integration of release/1.0 - 3rd #7519

Merged

1 task
