
docker: Add CUDA 13.2 Docker containers#2843

Merged
yongwww merged 1 commit into flashinfer-ai:main from bkryu:cu132-container on Mar 22, 2026

Conversation

@bkryu (Collaborator) commented Mar 20, 2026

📌 Description

  • Add Dockerfile.cu132 and Dockerfile.cu132.dev based on nvidia/cuda:13.2.0-devel-ubuntu24.04
  • Replace cu131 with cu132 in the CI Docker release workflow, since PyTorch skipped cu131 wheels
  • Use PyTorch nightly (nightly/cu132) until a stable release is available for cu132
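The nightly-wheel approach described above can be sketched as a minimal Dockerfile fragment (an illustrative sketch only, assuming PyTorch's usual nightly index URL layout; in this PR the actual installation is driven by docker/install/install_python_packages.sh):

```dockerfile
# Base image named in the PR description
FROM nvidia/cuda:13.2.0-devel-ubuntu24.04

# Install a PyTorch nightly wheel built against CUDA 13.2.
# The index URL follows PyTorch's standard nightly layout (assumption);
# swap nightly/cu132 for cu132 once a stable release is published.
RUN pip3 install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu132
```

Once stable cu132 wheels ship, only the index suffix needs to change, which is why the Dockerfiles carry the "change to cu132 when torch releases stable version" note.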

🔍 Related Issues

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).

Reviewer Notes

Summary by CodeRabbit

  • Chores
    • Updated CI/CD pipeline to support CUDA 13.2, replacing the previous 13.1 version for all image builds
    • Added Docker runtime container image for CUDA 13.2 with Python 3.12 and production dependencies
    • Added Docker development container image for CUDA 13.2 with comprehensive developer tools, MPI support, and shell utilities

@gemini-code-assist (Contributor)

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates support for CUDA 13.2 into the project's Docker build system. It introduces new Dockerfiles for both standard and development environments, enabling the use of the latest NVIDIA GPU capabilities and PyTorch versions. This update ensures compatibility with current hardware and software stacks, facilitating ongoing development and testing.

Highlights

  • New CUDA 13.2 Dockerfiles: Introduced Dockerfile.cu132 for production and Dockerfile.cu132.dev for development, both based on nvidia/cuda:13.2.0-devel-ubuntu24.04.
  • PyTorch Integration: Configured Docker images to use PyTorch nightly builds for CUDA 13.2, with plans to switch to stable releases once available.
  • CI Workflow Update: Updated the CI Docker release workflow to use cu132 instead of cu131, aligning with PyTorch's wheel availability.


Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/release-ci-docker.yml

@bkryu added the run-ci label Mar 20, 2026
@coderabbitai (Contributor) commented Mar 20, 2026

📝 Walkthrough

This pull request updates the CI workflow to target CUDA 13.2 instead of 13.1 and adds two new Docker images: a production image and a development image based on CUDA 13.2 with Ubuntu 24.04. The workflow matrix and manifest generation now reference the new cu132 variant for all image builds.

Changes

  • CI Workflow Configuration (.github/workflows/release-ci-docker.yml): Updated the build matrix, manifest creation matrix, and generated tag entries to replace cu131 with cu132 for CUDA variant targeting across all image build jobs and downstream manifest operations.
  • Production Docker Image (docker/Dockerfile.cu132): Added a new CUDA 13.2 production Docker image based on NVIDIA CUDA 13.2.0 on Ubuntu 24.04 with a Conda py312 environment, cuBLAS path configuration, and nightly cu132 Python dependencies including tilelang, cuda-tile, and mpi4py.
  • Development Docker Image (docker/Dockerfile.cu132.dev): Added a new CUDA 13.2 development Docker image with developer tools (clang-format, clangd, git, curl), a non-root user with sudo access, pre-commit hooks, oh-my-zsh shell customization, and the same py312 environment and dependencies as the production image.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

docker

Suggested reviewers

  • yzh119
  • cyx-6

Poem

🐰 Hopping through CUDA lands so new,
cu132 shines in every view,
Docker containers, dev and prod,
Ubuntu 24.04, tools that are mod,
Python py312, a merry crew! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Title check: ✅ Passed. The title 'docker: Add CUDA 13.2 Docker containers' accurately and concisely summarizes the main changes: adding two new CUDA 13.2 Dockerfiles and updating the CI workflow for cu132 support.
  • Description check: ✅ Passed. The PR description covers the key changes (new Dockerfiles, the cu131→cu132 migration, PyTorch nightly usage) and marks the PR checklist items, but lacks explicit links to related issues in the 'Related Issues' section.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage; skipping the docstring coverage check.


@gemini-code-assist (Contributor) left a comment


Code Review

This pull request adds Dockerfiles for CUDA 13.2. The changes look good overall. I've made a few suggestions to improve the Dockerfiles by following best practices, such as optimizing apt-get commands to reduce image size and improve build performance. Specifically, I've recommended cleaning up the apt cache in one file and consolidating package installations in the other to reduce redundant operations and image layers.

Comment on lines +6 to +9
RUN apt-get update && apt-get install -y \
    curl \
    git \
    wget

Severity: medium

It's a Docker best practice to clean up the apt cache within the same RUN layer that apt-get update is called. This reduces the final image size. Please add && rm -rf /var/lib/apt/lists/* to this command.

RUN apt-get update && apt-get install -y \
    curl \
    git \
    wget \
    && rm -rf /var/lib/apt/lists/*

Comment on lines +6 to +14
RUN apt-get update && apt-get install -y \
    curl \
    git \
    wget \
    clang-format \
    clangd-19 \
    vim \
    zsh \
    && rm -rf /var/lib/apt/lists/*

Severity: medium

To optimize Docker layers and reduce build time, it's best to install all apt packages in a single RUN instruction. Consider adding sudo here, as it's installed in a separate layer later on. This will also allow removing a redundant apt-get update.

RUN apt-get update && apt-get install -y \
    curl \
    git \
    wget \
    clang-format \
    clangd-19 \
    vim \
    zsh \
    sudo \
    && rm -rf /var/lib/apt/lists/*

Comment on lines +22 to +29
RUN groupadd --gid $USER_GID $USERNAME \
    && useradd --uid $USER_UID --gid $USER_GID -m $USERNAME \
    # [Optional] Add sudo support
    && apt-get update \
    && apt-get install -y sudo \
    && echo $USERNAME ALL=\(root\) NOPASSWD:ALL > /etc/sudoers.d/$USERNAME \
    && chmod 0440 /etc/sudoers.d/$USERNAME \
    && rm -rf /var/lib/apt/lists/*

Severity: medium

Following the previous suggestion to install sudo earlier, this RUN instruction can be simplified to only handle user creation and sudoers configuration. This avoids an unnecessary apt-get update and apt-get install, making the Dockerfile more efficient.

RUN groupadd --gid $USER_GID $USERNAME \
    && useradd --uid $USER_UID --gid $USER_GID -m $USERNAME \
    # [Optional] Add sudo support
    && echo $USERNAME ALL=\(root\) NOPASSWD:ALL > /etc/sudoers.d/$USERNAME \
    && chmod 0440 /etc/sudoers.d/$USERNAME

@yongwww (Member) left a comment


looks good to me

@coderabbitai (Contributor) left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/release-ci-docker.yml (1)

39-61: ⚠️ Potential issue | 🟠 Major

The new .dev Dockerfile variant has no build/publish automation.

The workflow builds only docker/Dockerfile.${{ matrix.cuda }} (line 61), which resolves to the standard variants (cu132, not cu132.dev). No other automation path in the codebase references .dev Dockerfiles or -dev images. If docker/Dockerfile.cu132.dev is being added, either add it to the matrix or implement a separate build path for it.

🧹 Nitpick comments (2)
.github/workflows/release-ci-docker.yml (1)

37-40: Add fail-fast: true to this matrix.

With another CUDA entry in the fan-out, this job burns more self-hosted time after the first hard failure. Failing fast seems like the better default here.

Suggested diff
   build:
     runs-on: [self-hosted, linux, x64, cpu, on-demand]
     needs: generate-tag
     strategy:
+      fail-fast: true
       matrix:
         cuda: [cu126, cu128, cu129, cu130, cu132]
         arch: [amd64, arm64]

Based on learnings, in GitHub Actions workflow files under .github/workflows, set fail-fast: true for matrix jobs to reduce overall test time.

docker/Dockerfile.cu132 (1)

6-9: Trim this apt layer.

Please add --no-install-recommends and clear /var/lib/apt/lists in the same layer. The dev variant already does the cleanup, and keeping it here avoids extra image size and pull latency on every CI pull.

Suggested diff
-RUN apt-get update && apt-get install -y \
+RUN apt-get update && apt-get install -y --no-install-recommends \
     curl \
     git \
-    wget
+    wget \
+    && rm -rf /var/lib/apt/lists/*


📥 Commits

Reviewing files that changed from the base of the PR and between 8e1642a and d16eb34.

📒 Files selected for processing (3)
  • .github/workflows/release-ci-docker.yml
  • docker/Dockerfile.cu132
  • docker/Dockerfile.cu132.dev

Comment on lines +43 to +51
RUN echo "source activate py312" >> ~/.bashrc
ENV PATH="/home/$USERNAME/conda/bin:$PATH"
ENV PATH="/home/$USERNAME/conda/envs/py312/bin:$PATH"

# Install torch and other python packages
# use nightly/cu132 temporarily and change to cu132 when torch releases stable version
COPY requirements.txt /install/requirements.txt
COPY docker/install/install_python_packages.sh /install/install_python_packages.sh
RUN bash /install/install_python_packages.sh nightly/cu132 && pip3 install pre-commit

⚠️ Potential issue | 🟠 Major

Keep the dev image on the same CUDA library search order as the CI image.

docker/Dockerfile.cu132 adds the cu13 wheel libs to LD_LIBRARY_PATH, but this dev variant installs the same nightly/cu132 stack without that override. That can make local repro load a different CUDA user-space than CI.

Suggested diff
 RUN echo "source activate py312" >> ~/.bashrc
 ENV PATH="/home/$USERNAME/conda/bin:$PATH"
 ENV PATH="/home/$USERNAME/conda/envs/py312/bin:$PATH"
+ENV LD_LIBRARY_PATH="/home/$USERNAME/conda/envs/py312/lib/python3.12/site-packages/nvidia/cu13/lib/:$LD_LIBRARY_PATH"

@johnnynunez (Contributor)

super... this adds Jetson AGX Orin SBSA Support in flashinfer

@yongwww yongwww merged commit ff86ea0 into flashinfer-ai:main Mar 22, 2026
64 of 97 checks passed
