feat: Test cuda 12.9 #3064
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -6,10 +6,10 @@ ARG BASE_IMAGE="nvcr.io/nvidia/cuda-dl-base" | |
| # Please check https://github.com/ai-dynamo/dynamo/pull/1065 | ||
| # for details and reproducer to manually test if the image | ||
| # can be updated to later versions. | ||
| -ARG BASE_IMAGE_TAG="25.01-cuda12.8-devel-ubuntu24.04" | ||
| +ARG BASE_IMAGE_TAG="25.01-cuda12.9-devel-ubuntu24.04" | ||
| ARG RELEASE_BUILD | ||
| ARG RUNTIME_IMAGE="nvcr.io/nvidia/cuda" | ||
| -ARG RUNTIME_IMAGE_TAG="12.8.1-runtime-ubuntu24.04" | ||
| +ARG RUNTIME_IMAGE_TAG="12.9.1-runtime-ubuntu24.04" | ||
Comment on lines +9 to +12
Contributor
🧩 Analysis chain

Same base/runtime tag check as vLLM; 25.01 + CUDA 12.9 likely invalid

Ensure both image tags exist. If you confirm a newer HPC track (e.g., 25.09) is required for 12.9, update both this file and vLLM to the same track to keep builds consistent.

🏁 Script executed:

    #!/bin/bash
    set -euo pipefail
    docker manifest inspect nvcr.io/nvidia/cuda-dl-base:25.01-cuda12.9-devel-ubuntu24.04 >/dev/null \
      && echo "FOUND base tag" || echo "MISSING base tag"
    docker manifest inspect nvcr.io/nvidia/cuda:12.9.1-runtime-ubuntu24.04 >/dev/null \
      && echo "FOUND runtime tag" || echo "MISSING runtime tag"

Length of output: 126

Base/runtime tag mismatch: 25.01 doesn't ship CUDA 12.9; update both to a matching 25.xx track

The local manifest check couldn't run here (docker is not installed). NVIDIA release notes show that 25.01 images are built with CUDA 12.8, while CUDA 12.9.1 is provided in later CUDA-DL releases (25.06). (docs.nvidia.com)

nvcr.io/nvidia/cuda-dl-base:25.01-cuda12.9-devel-ubuntu24.04 is therefore likely invalid. Change BASE_IMAGE_TAG to 25.06-cuda12.9-devel-ubuntu24.04, so that the image resolves to nvcr.io/nvidia/cuda-dl-base:25.06-cuda12.9-devel-ubuntu24.04 (or revert CUDA to 12.8 for 25.01), and ensure the runtime tag is on the same 25.xx track; 12.9.1 CUDA images are available on Docker Hub / NGC. (docker.aityp.com)

File: container/Dockerfile.sglang (lines 9–12).
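As a rough illustration of the consistency concern in this comment, the sketch below compares the CUDA major.minor version embedded in the two tag strings. The function names `base_cuda_version`, `runtime_cuda_version`, and `tags_match` are hypothetical helpers, not part of the PR, and the parsing assumes the tag layouts seen in this diff (`<release>-cuda<X.Y>-...` for the base image and `<X.Y.Z>-runtime-...` for the runtime image). It is a string check only and says nothing about whether either tag is actually published.

```shell
# Hypothetical helpers (not from the PR): compare CUDA versions embedded in tag strings.

# Print "12.9" for a base tag such as "25.01-cuda12.9-devel-ubuntu24.04".
# Assumes the tag contains "-cuda<X.Y>-"; other layouts give meaningless output.
base_cuda_version() {
  _rest="${1#*-cuda}"            # drop everything up to and including "-cuda"
  printf '%s\n' "${_rest%%-*}"   # keep the text before the next "-"
}

# Print "12.9" for a runtime tag such as "12.9.1-runtime-ubuntu24.04".
runtime_cuda_version() {
  _full="${1%%-*}"                          # e.g. "12.9.1"
  printf '%s\n' "$_full" | cut -d. -f1,2    # keep major.minor only
}

# Succeed when both tags name the same CUDA major.minor version.
tags_match() {
  [ "$(base_cuda_version "$1")" = "$(runtime_cuda_version "$2")" ]
}

# Example use with the values from this diff:
#   tags_match "25.01-cuda12.9-devel-ubuntu24.04" "12.9.1-runtime-ubuntu24.04" \
#     && echo "CUDA versions agree" || echo "CUDA versions differ"
```

A check like this could run before the slower registry lookup, since it needs neither docker nor network access.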
| # Make sure to update the dependency version in pyproject.toml when updating this | ||
| ARG SGLANG_VERSION="0.5.0rc2" | ||
🧩 Analysis chain
Tag likely invalid: cuda-dl-base 25.01 + CUDA 12.9 combo probably doesn’t exist
25.01 predates CUDA 12.9; nvcr.io/nvidia/cuda-dl-base is unlikely to publish 25.01-cuda12.9-*. This will 404 at build time. Align the HPC track to the first release that actually ships CUDA 12.9 (e.g., 25.09+), or drop the 25.01 constraint if you’re unblocking tests.
Run to confirm the exact valid tag (may require NGC auth):
Optional example fix (adjust to the tag you find):
🏁 Script executed:
Length of output: 591
Invalid base image tag — fix before merge
container/Dockerfile:15 — ARG BASE_IMAGE_TAG="25.01-cuda12.9-devel-ubuntu24.04"
The supplied manifest check shows none of the tested nvcr.io/nvidia/cuda-dl-base:*-cuda12.9-devel-ubuntu24.04 tags exist (docker manifest inspect returned no matches). This will 404 at build time.
Action: replace the ARG with a published nvcr.io/nvidia/cuda-dl-base tag that actually includes CUDA 12.9 (or remove the 25.01 track constraint) and verify with docker manifest inspect or the NGC registry before merging.
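One way to carry out the verification step is to probe a short list of candidate tags and stop at the first one the registry knows about. The sketch below is an assumption-laden illustration, not the script the reviewer ran: `first_published_tag` and the `INSPECT` override variable are hypothetical names, and the candidate tags in the usage comment are only the ones mentioned in this thread, none of which is confirmed to exist. It relies on `docker manifest inspect` exiting non-zero when a manifest cannot be fetched, which may also happen when NGC authentication is missing.

```shell
# INSPECT is a hypothetical override hook so the loop can be dry-run without docker.
# It is expanded unquoted on purpose so the default splits into command + arguments.
: "${INSPECT:=docker manifest inspect}"

# Print the first tag of image $1 (remaining arguments are candidate tags)
# for which the inspect command succeeds; return 1 if none does.
first_published_tag() {
  _image="$1"
  shift
  for _tag in "$@"; do
    if $INSPECT "$_image:$_tag" >/dev/null 2>&1; then
      printf '%s\n' "$_tag"
      return 0
    fi
  done
  return 1
}

# Example use (candidates taken from the review comments, unverified):
#   first_published_tag nvcr.io/nvidia/cuda-dl-base \
#     25.01-cuda12.9-devel-ubuntu24.04 \
#     25.06-cuda12.9-devel-ubuntu24.04 \
#     || echo "no candidate tag found (or registry auth failed)"
```

Because a failed lookup and a failed login look the same to this loop, a "not found" result is worth confirming in the NGC catalog before changing the Dockerfile.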