[docker] feat: update stable image to vllm==0.12.0, sglang==0.5.6#4653
wuxibin89 merged 32 commits into verl-project:main
Conversation
Code Review
This pull request updates the Docker images for vllm and sglang to newer versions, along with their dependencies like TransformerEngine and Megatron-LM. The base image for vllm has been changed to a more minimal CUDA image, requiring explicit installation of dependencies, which is a good step towards a more controlled environment. My review focuses on Dockerfile best practices to ensure image size optimization and build reproducibility. I've suggested combining some RUN layers and pinning a dependency version for consistency. The Python code changes appear to be necessary workarounds for issues arising from the library updates and seem correct.
docker/Dockerfile.stable.sglang
RUN pip install --no-cache-dir --no-build-isolation flash_attn
#==2.8.1
For reproducible builds, it's recommended to pin package versions. The version for flash_attn has been unpinned. This could lead to different versions being installed in the future, potentially causing unexpected behavior or build failures. The Dockerfile.stable.vllm file pins flash_attn to version 2.8.1. For consistency and reproducibility, consider pinning this dependency as well.
RUN pip install --no-cache-dir --no-build-isolation flash_attn==2.8.1
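If the pin needs to stay in sync across both Dockerfiles, one option (a sketch; the constraints-file path and name are assumptions, not part of this PR) is a shared pip constraints file:

```shell
# Sketch: a shared constraints file that both Dockerfiles could consume
# (the /tmp path here is only for illustration).
cat > /tmp/constraints.txt <<'EOF'
flash_attn==2.8.1
EOF
# Each Dockerfile would then install against it, e.g.:
#   RUN pip install --no-cache-dir --no-build-isolation -c constraints.txt flash_attn
grep 'flash_attn' /tmp/constraints.txt
```

This keeps a single source of truth for the version, so the vllm and sglang images cannot silently drift apart.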
 RUN wget https://developer.nvidia.com/downloads/assets/tools/secure/nsight-systems/2025_6/nsight-systems-2025.6.1_2025.6.1.190-1_amd64.deb && \
     apt-get update && apt-get install -y libxcb-cursor0

-RUN apt-get install -y ./nsight-systems-2025.5.1_2025.5.1.121-1_amd64.deb && \
+RUN apt-get install -y ./nsight-systems-2025.6.1_2025.6.1.190-1_amd64.deb && \
     rm -rf /usr/local/cuda/bin/nsys && \
-    ln -s /opt/nvidia/nsight-systems/2025.3.1/target-linux-x64/nsys /usr/local/cuda/bin/nsys && \
+    ln -s /opt/nvidia/nsight-systems/2025.6.1/target-linux-x64/nsys /usr/local/cuda/bin/nsys && \
     rm -rf /usr/local/cuda/bin/nsys-ui && \
-    ln -s /opt/nvidia/nsight-systems/2025.3.1/target-linux-x64/nsys-ui /usr/local/cuda/bin/nsys-ui && \
-    rm nsight-systems-2025.5.1_2025.5.1.121-1_amd64.deb
+    ln -s /opt/nvidia/nsight-systems/2025.6.1/target-linux-x64/nsys-ui /usr/local/cuda/bin/nsys-ui && \
+    rm nsight-systems-2025.6.1_2025.6.1.190-1_amd64.deb
To optimize the Docker image size and build process, it's recommended to combine related commands into a single RUN layer. The wget, apt-get install, and cleanup steps for nsight-systems are currently split across two RUN layers. Combining them reduces the number of layers in the image. Additionally, it's good practice to clean up the apt cache (rm -rf /var/lib/apt/lists/*) in the same layer as apt-get update and apt-get install.
RUN wget https://developer.nvidia.com/downloads/assets/tools/secure/nsight-systems/2025_6/nsight-systems-2025.6.1_2025.6.1.190-1_amd64.deb && \
apt-get update && apt-get install -y libxcb-cursor0 && \
apt-get install -y ./nsight-systems-2025.6.1_2025.6.1.190-1_amd64.deb && \
rm -rf /usr/local/cuda/bin/nsys && \
ln -s /opt/nvidia/nsight-systems/2025.6.1/target-linux-x64/nsys /usr/local/cuda/bin/nsys && \
rm -rf /usr/local/cuda/bin/nsys-ui && \
ln -s /opt/nvidia/nsight-systems/2025.6.1/target-linux-x64/nsys-ui /usr/local/cuda/bin/nsys-ui && \
rm nsight-systems-2025.6.1_2025.6.1.190-1_amd64.deb && \
rm -rf /var/lib/apt/lists/*
 # limitations under the License.

 from .transformer_impl import MegatronEngine, MegatronEngineWithLMHead
+# Avoid cpu worker trigger cuda jit error
Should we avoid importing megatron in a CPU-only actor?
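One way to address this concern (a sketch, not the PR's actual fix; the helper names and the CUDA check are assumptions) is to gate the Megatron import behind a runtime GPU check, so that CPU-only workers never import megatron at module load time and thus never trigger CUDA JIT compilation:

```python
import importlib.util


def cuda_available() -> bool:
    """True only when torch is installed and actually sees a GPU."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return torch.cuda.is_available()


def load_megatron_engines():
    """Lazily import the Megatron engines.

    Call this from GPU workers only; CPU-only actors that never call it
    never import megatron, so no CUDA JIT path is hit on their side.
    """
    from .transformer_impl import MegatronEngine, MegatronEngineWithLMHead
    return MegatronEngine, MegatronEngineWithLMHead
```

Callers on the GPU path would replace the module-level import with `MegatronEngine, MegatronEngineWithLMHead = load_megatron_engines()`.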
…rl-project#4653)

### What does this PR do?

1. Update the Dockerfiles of the vllm/sglang stable images: vllm from v0.11.0 to v0.12.0 and sglang from v0.5.5 to v0.5.6.
2. Use nvidia/cuda:12.9.1-devel-ubuntu22.04 as the base image for vllm.
3. Add workaround code for updating megatron from 0.14.0 to 0.15.0.

Co-authored-by: Begunner <went@bytedance.com>
Co-authored-by: wuxibin <wuxibin@bytedance.com>