chore: Fix cuda lock in trtllm dockerfile by indrajit96 · Pull Request #3684 · ai-dynamo/dynamo

indrajit96 · 2025-10-16T22:11:11Z

Overview:

Fix the cuda-python version lock in the TRT-LLM Dockerfile by restoring the original pin.

Details:

Add `RUN uv pip install "cuda-python>=12,<13"` before the TensorRT-LLM wheel installation.

Where should the reviewer start?

container/Dockerfile.trtllm

Summary by CodeRabbit

Bug Fixes
- Improved container build stability by pinning CUDA Python dependencies to prevent compatibility issues with the TensorRT-LLM runtime environment.

Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>

coderabbitai · 2025-10-16T22:11:24Z

Walkthrough

Adds a version constraint for cuda-python (>=12, <13) to the Dockerfile.trtllm before TensorRT-LLM installation in both build paths. This pins cuda-python to ensure compatibility with tensorrt-llm 1.0.0rc6 without modifying existing conditional logic or error handling.
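As a rough illustration of what the `>=12,<13` constraint admits, the sketch below checks a version string by its major component. The function name `cuda_python_in_range` is hypothetical and not part of this PR, and the check is a simplification that only handles plain numeric majors; it is not a full PEP 440 implementation.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): succeeds when a version string
# falls inside the >=12,<13 range used for cuda-python.
# Simplification: only the major component is inspected.
cuda_python_in_range() {
    # Keep the text before the first dot, e.g. "12.9.0" -> "12".
    major="${1%%.*}"
    case "$major" in
        # Empty or non-numeric majors (e.g. "12rc1") are rejected as unparseable.
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$major" -ge 12 ] && [ "$major" -lt 13 ]
}

# Example: report whether a candidate version would satisfy the pin.
for v in 12.9.0 13.0.0; do
    if cuda_python_in_range "$v"; then
        echo "$v satisfies >=12,<13"
    else
        echo "$v is outside >=12,<13"
    fi
done
```

A check like this could run after the install step to fail the build early if a later dependency resolution replaces the pinned package, though whether that is needed here is an assumption.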

Changes

| Cohort / File(s) | Summary |
|---|---|
| CUDA Python version pinning `container/Dockerfile.trtllm` | Introduces cuda-python version constraint (>=12, <13) in two trtllm build contexts, placed immediately before TensorRT-LLM wheel installation to lock the dependency |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐰 A hop, a skip, to CUDA's shore,
Version twelve we now ensure,
TensorRT-LLM shall run so free,
When pinned at twelve, not thirteen! ✨

Pre-merge checks

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title Check | ✅ Passed | The title "chore: Fix cuda lock in trtllm dockerfile" is directly related to the main change in the changeset, which adds a CUDA Python version constraint (>=12,<13) to the trtllm Dockerfile. The title is concise, clear, and follows conventional commit style with a meaningful prefix that helps categorize the change. A teammate scanning the git history would immediately understand that this PR addresses a CUDA lock-related fix in the trtllm Dockerfile build configuration. |
| Description Check | ✅ Passed | The PR description follows the required template structure with three of four sections properly filled out: Overview section clearly states the purpose ("Fix cuda lock in trtllm dockerfile, by bringing back initial lock"), Details section specifies the exact change (pip install command with version constraint), and "Where should the reviewer start?" section provides a direct link to the affected file. The only missing section is "Related Issues," which appears to be a non-critical section that may not apply when a PR does not address a specific GitHub issue. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |


coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

container/Dockerfile.trtllm (1)
198-199: Optional: Consider consolidating RUN commands for Docker build efficiency.

The cuda-python installation is currently a separate RUN layer. If docker build layer caching is a concern, consider merging it with the subsequent large RUN block (lines 201–235) to reduce the final image size and rebuild time:
-# NOTE: locking cuda-python version to <13 to avoid breaks with tensorrt-llm 1.0.0rc6.
-RUN uv pip install "cuda-python>=12,<13"
-
 # Note: TensorRT needs to be uninstalled before installing the TRTLLM wheel
 # because there might be mismatched versions of TensorRT between the NGC PyTorch
 # and the TRTLLM wheel.
 RUN [ -f /etc/pip/constraint.txt ] && : > /etc/pip/constraint.txt || true && \
     # Clean up any existing conflicting CUDA repository configurations and GPG keys
     rm -f /etc/apt/sources.list.d/cuda*.list && \
     rm -f /usr/share/keyrings/cuda-archive-keyring.gpg && \
     rm -f /etc/apt/trusted.gpg.d/cuda*.gpg && \
+    # Install cuda-python with version lock to avoid breaks with tensorrt-llm 1.0.0rc6
+    uv pip install "cuda-python>=12,<13" && \
     if [ "$HAS_TRTLLM_CONTEXT" = "1" ]; then \
This is optional and depends on your layer-caching strategy, but keeping related pip installs together can improve build performance and readability.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f4cd71f and 7f7c634.

📒 Files selected for processing (1)

container/Dockerfile.trtllm (1 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: trtllm (amd64)
GitHub Check: vllm (amd64)
GitHub Check: vllm (arm64)
GitHub Check: sglang
GitHub Check: trtllm (arm64)
GitHub Check: Build and Test - dynamo

🔇 Additional comments (2)

container/Dockerfile.trtllm (2)

198-199: Clear and well-placed cuda-python version lock.

The constraint >=12,<13 aligns with the CUDA 12.9.1 runtime image (line 9) and is positioned correctly before TensorRT-LLM installation. The comment adequately explains the motivation (compatibility with tensorrt-llm 1.0.0rc6).

198-235: Verify that unconditional cuda-python installation is intentional.

Line 199 installs cuda-python outside the conditional block (line 209), meaning it executes regardless of whether HAS_TRTLLM_CONTEXT is set to "1" or "0". If TensorRT-LLM installation is optional based on this flag, confirm whether cuda-python should also be conditional. If it's required by other dependencies in both branches (lines 217–224 and 233–234), this unconditional placement is correct.
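The placement question can be pictured with a small dry run. The sketch below only prints the steps it would take for each value of `HAS_TRTLLM_CONTEXT`; the function name `plan_install` and the two branch descriptions are hypothetical stand-ins, since the actual branch contents (lines 217–224 and 233–234) are not shown in this review.

```shell
#!/bin/sh
# Hypothetical dry run (not the Dockerfile itself): prints the order of steps
# for a given HAS_TRTLLM_CONTEXT value without installing anything.
plan_install() {
    # The pin sits outside the conditional, so it is emitted in both cases.
    echo 'uv pip install "cuda-python>=12,<13"'
    if [ "$1" = "1" ]; then
        # Placeholder text: the real branch body is not shown in the review.
        echo "branch A: TensorRT-LLM install when HAS_TRTLLM_CONTEXT=1"
    else
        # Placeholder text: the real branch body is not shown in the review.
        echo "branch B: TensorRT-LLM install when HAS_TRTLLM_CONTEXT is not 1"
    fi
}

plan_install 1
plan_install 0
```

Both calls print the cuda-python line first, which is the behaviour the comment asks the author to confirm as intentional.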

Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>

indrajit96 added 2 commits October 16, 2025 10:08

Fix cuda locak

881f604

Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>

Fix cuda locak

7f7c634

Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>

indrajit96 requested review from a team, nv-anants and rmccorm4 October 16, 2025 22:11

indrajit96 requested review from a team as code owners October 16, 2025 22:11

pull-request-size Bot added the size/XS label Oct 16, 2025

indrajit96 changed the title ~~Fix cuda lock in trtllm dockerfile~~ chore: Fix cuda lock in trtllm dockerfile Oct 16, 2025

github-actions Bot added the chore label Oct 16, 2025

coderabbitai Bot reviewed Oct 16, 2025

rmccorm4 approved these changes Oct 17, 2025

rmccorm4 reviewed Oct 17, 2025

Comment thread container/Dockerfile.trtllm

rmccorm4 merged commit 2c2f7c7 into main Oct 17, 2025
20 of 22 checks passed

rmccorm4 deleted the ibhosale_cuda_lock branch October 17, 2025 17:37

indrajit96 added a commit that referenced this pull request Oct 17, 2025

chore: Fix cuda lock in trtllm dockerfile (#3684)

f922c2c

Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>

saturley-hall pushed a commit that referenced this pull request Oct 17, 2025

chore: Fix cuda lock in trtllm dockerfile (#3684) (#3704)

c77b5dd

Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>

ziqifan617 pushed a commit that referenced this pull request Oct 20, 2025

chore: Fix cuda lock in trtllm dockerfile (#3684)

2fde12c

Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>

nv-kmcgill53 pushed a commit that referenced this pull request Oct 23, 2025

chore: Fix cuda lock in trtllm dockerfile (#3684)

f031e48

Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>

yao531441 pushed a commit to yao531441/dynamo that referenced this pull request May 13, 2026

chore: Fix cuda lock in trtllm dockerfile (ai-dynamo#3684)

fb8e722

Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>
