chore: Fix cuda lock in trtllm dockerfile#3684

Merged
rmccorm4 merged 2 commits into main from ibhosale_cuda_lock
Oct 17, 2025

Conversation

@indrajit96 indrajit96 commented Oct 16, 2025

Contributor

Overview:

Fix the cuda-python version lock in the trtllm Dockerfile by bringing back the initial lock.

Details:

Add uv pip install "cuda-python>=12,<13" to container/Dockerfile.trtllm, ahead of the TensorRT-LLM wheel installation.

Where should the reviewer start?

container/Dockerfile.trtllm

Summary by CodeRabbit

  • Bug Fixes
    • Improved container build stability by pinning CUDA Python dependencies to prevent compatibility issues with the TensorRT-LLM runtime environment.

Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>
Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>
@indrajit96 indrajit96 requested review from a team, nv-anants and rmccorm4 October 16, 2025 22:11
@indrajit96 indrajit96 requested review from a team as code owners October 16, 2025 22:11
coderabbitai Bot commented Oct 16, 2025

Contributor

Walkthrough

Adds a version constraint for cuda-python (>=12,<13) to container/Dockerfile.trtllm immediately before the TensorRT-LLM installation, so it takes effect in both build paths. The constraint keeps cuda-python on the 12.x series for compatibility with tensorrt-llm 1.0.0rc6 and leaves the existing conditional logic and error handling unchanged.
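
To make the range concrete, here is a minimal sketch of which release versions the constraint admits. The helper name `in_cuda_python_range` is hypothetical and not part of the PR. It only compares the major version, so it approximates the `>=12,<13` specifier for final releases and does not model pre-release handling.

```shell
# Hypothetical helper (not part of the PR): approximate ">=12,<13" by
# comparing only the major component of a release version string.
in_cuda_python_range() {
    major="${1%%.*}"                 # text before the first dot, e.g. "12" from "12.9.1"
    case "$major" in
        ''|*[!0-9]*) return 2 ;;     # empty or non-numeric major: cannot decide
    esac
    [ "$major" -ge 12 ] && [ "$major" -lt 13 ]
}

# Usage: succeeds for 12.x releases, fails for 11.x and 13.x.
in_cuda_python_range 12.9.1 && echo "12.9.1 is inside the range"
in_cuda_python_range 13.0.0 || echo "13.0.0 is outside the range"
```

In the actual build the resolver used by `uv pip install` evaluates the specifier; the sketch only illustrates the intent of the bound.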

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| CUDA Python version pinning: container/Dockerfile.trtllm | Introduces cuda-python version constraint (>=12, <13) in two trtllm build contexts, placed immediately before TensorRT-LLM wheel installation to lock the dependency |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐰 A hop, a skip, to CUDA's shore,
Version twelve we now ensure,
TensorRT-LLM shall run so free,
When pinned at twelve, not thirteen!

Pre-merge checks

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The title "chore: Fix cuda lock in trtllm dockerfile" is directly related to the main change in the changeset, which adds a CUDA Python version constraint (>=12,<13) to the trtllm Dockerfile. The title is concise, clear, and follows conventional commit style with a meaningful prefix that helps categorize the change. A teammate scanning the git history would immediately understand that this PR addresses a CUDA lock-related fix in the trtllm Dockerfile build configuration. |
| Description Check | ✅ Passed | The PR description follows the required template structure with three of four sections properly filled out: Overview section clearly states the purpose ("Fix cuda lock in trtllm dockerfile, by bringing back initial lock"), Details section specifies the exact change (pip install command with version constraint), and "Where should the reviewer start?" section provides a direct link to the affected file. The only missing section is "Related Issues," which appears to be a non-critical section that may not apply when a PR does not address a specific GitHub issue. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

@indrajit96 indrajit96 changed the title Fix cuda lock in trtllm dockerfile chore: Fix cuda lock in trtllm dockerfile Oct 16, 2025
@github-actions github-actions Bot added the chore label Oct 16, 2025

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
container/Dockerfile.trtllm (1)

198-199: Optional: Consider consolidating RUN commands for Docker build efficiency.

The cuda-python installation is currently a separate RUN layer. If docker build layer caching is a concern, consider merging it with the subsequent large RUN block (lines 201–235) to reduce the final image size and rebuild time:

-# NOTE: locking cuda-python version to <13 to avoid breaks with tensorrt-llm 1.0.0rc6.
-RUN uv pip install "cuda-python>=12,<13"
-
 # Note: TensorRT needs to be uninstalled before installing the TRTLLM wheel
 # because there might be mismatched versions of TensorRT between the NGC PyTorch
 # and the TRTLLM wheel.
 RUN [ -f /etc/pip/constraint.txt ] && : > /etc/pip/constraint.txt || true && \
     # Clean up any existing conflicting CUDA repository configurations and GPG keys
     rm -f /etc/apt/sources.list.d/cuda*.list && \
     rm -f /usr/share/keyrings/cuda-archive-keyring.gpg && \
     rm -f /etc/apt/trusted.gpg.d/cuda*.gpg && \
+    # Install cuda-python with version lock to avoid breaks with tensorrt-llm 1.0.0rc6
+    uv pip install "cuda-python>=12,<13" && \
     if [ "$HAS_TRTLLM_CONTEXT" = "1" ]; then \

This is optional and depends on your layer-caching strategy, but keeping related pip installs together can improve build performance and readability.
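
The merged `RUN` block in the suggestion depends on how the shell evaluates a mixed `&&` / `||` chain: the operators have equal precedence and associate left to right, so `A && B || true && C` runs as `((A && B) || true) && C`. The sketch below isolates that first line of the block. The function name `chain_demo` and its path argument are hypothetical stand-ins for `/etc/pip/constraint.txt`.

```shell
# Hypothetical demo (not part of the PR) of the chain used in the RUN block.
chain_demo() {
    constraint_file="$1"   # stands in for /etc/pip/constraint.txt
    # Truncate the file if it exists; "|| true" keeps the chain alive when it
    # does not, so the command after the final "&&" runs in both cases.
    [ -f "$constraint_file" ] && : > "$constraint_file" || true && \
        echo "next step runs"
}

chain_demo /nonexistent/constraint.txt   # prints "next step runs"
```

Any command appended with `&&` after this point, such as the `uv pip install` line in the suggestion, stops the whole `RUN` step if it fails, which is the same failure behavior a separate `RUN` layer would have.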

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f4cd71f and 7f7c634.

📒 Files selected for processing (1)
  • container/Dockerfile.trtllm (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: trtllm (amd64)
  • GitHub Check: vllm (amd64)
  • GitHub Check: vllm (arm64)
  • GitHub Check: sglang
  • GitHub Check: trtllm (arm64)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (2)
container/Dockerfile.trtllm (2)

198-199: Clear and well-placed cuda-python version lock.

The constraint >=12,<13 aligns with the CUDA 12.9.1 runtime image (line 9) and is positioned correctly before TensorRT-LLM installation. The comment adequately explains the motivation (compatibility with tensorrt-llm 1.0.0rc6).
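
One way to confirm that the lock survives the later TensorRT-LLM installation is to query the installed distribution inside the built image. This is a hedged sketch, not something the PR adds: the function name `check_dist_major` is hypothetical, and it assumes `python3` is on the `PATH` in the environment where the packages were installed. It relies only on `importlib.metadata` from the Python standard library.

```shell
# Hypothetical post-build check (not part of the PR).
# Exit status: 0 if the installed major version matches, 1 if it differs,
# 2 if the distribution is not installed.
check_dist_major() {
    python3 - "$1" "$2" <<'PY'
import sys
from importlib.metadata import PackageNotFoundError, version

dist, want_major = sys.argv[1], sys.argv[2]
try:
    found = version(dist)
except PackageNotFoundError:
    print(f"{dist}: not installed")
    sys.exit(2)
print(f"{dist}: {found}")
sys.exit(0 if found.split(".")[0] == want_major else 1)
PY
}

# Usage inside the container:
#   check_dist_major cuda-python 12
```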


198-235: Verify that unconditional cuda-python installation is intentional.

Line 199 installs cuda-python outside the conditional block (line 209), meaning it executes regardless of whether HAS_TRTLLM_CONTEXT is set to "1" or "0". If TensorRT-LLM installation is optional based on this flag, confirm whether cuda-python should also be conditional. If it's required by other dependencies in both branches (lines 217–224 and 233–234), this unconditional placement is correct.

Comment thread container/Dockerfile.trtllm
@rmccorm4 rmccorm4 merged commit 2c2f7c7 into main Oct 17, 2025
20 of 22 checks passed
@rmccorm4 rmccorm4 deleted the ibhosale_cuda_lock branch October 17, 2025 17:37
indrajit96 added a commit that referenced this pull request Oct 17, 2025
Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>
saturley-hall pushed a commit that referenced this pull request Oct 17, 2025
Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>
ziqifan617 pushed a commit that referenced this pull request Oct 20, 2025
Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>
nv-kmcgill53 pushed a commit that referenced this pull request Oct 23, 2025
Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>
yao531441 pushed a commit to yao531441/dynamo that referenced this pull request May 13, 2026
Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>