[docker] feat: update stable image to vllm==0.12.0, sglang==0.5.6#4653
wuxibin89 merged 32 commits into verl-project:main
Conversation
Code Review
This pull request updates the Docker images for vllm and sglang to newer versions, along with their dependencies like TransformerEngine and Megatron-LM. The base image for vllm has been changed to a more minimal CUDA image, requiring explicit installation of dependencies, which is a good step towards a more controlled environment. My review focuses on Dockerfile best practices to ensure image size optimization and build reproducibility. I've suggested combining some RUN layers and pinning a dependency version for consistency. The Python code changes appear to be necessary workarounds for issues arising from the library updates and seem correct.
docker/Dockerfile.stable.sglang
RUN pip install --no-cache-dir --no-build-isolation flash_attn
#==2.8.1
For reproducible builds, it's recommended to pin package versions. The version for flash_attn has been unpinned. This could lead to different versions being installed in the future, potentially causing unexpected behavior or build failures. The Dockerfile.stable.vllm file pins flash_attn to version 2.8.1. For consistency and reproducibility, consider pinning this dependency as well.
RUN pip install --no-cache-dir --no-build-isolation flash_attn==2.8.1
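If the pin needs to stay in sync across both Dockerfiles, one option (a sketch; the constraints-file path and name are assumptions, not part of this PR) is a shared pip constraints file:

```shell
# Sketch: a shared constraints file that both Dockerfiles could consume
# (the /tmp path here is only for illustration).
cat > /tmp/constraints.txt <<'EOF'
flash_attn==2.8.1
EOF
# Each Dockerfile would then install against it, e.g.:
#   RUN pip install --no-cache-dir --no-build-isolation -c constraints.txt flash_attn
grep 'flash_attn' /tmp/constraints.txt
```

This keeps a single source of truth for the version, so the vllm and sglang images cannot silently drift apart.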
 RUN wget https://developer.nvidia.com/downloads/assets/tools/secure/nsight-systems/2025_6/nsight-systems-2025.6.1_2025.6.1.190-1_amd64.deb && \
     apt-get update && apt-get install -y libxcb-cursor0

-RUN apt-get install -y ./nsight-systems-2025.5.1_2025.5.1.121-1_amd64.deb && \
+RUN apt-get install -y ./nsight-systems-2025.6.1_2025.6.1.190-1_amd64.deb && \
     rm -rf /usr/local/cuda/bin/nsys && \
-    ln -s /opt/nvidia/nsight-systems/2025.3.1/target-linux-x64/nsys /usr/local/cuda/bin/nsys && \
+    ln -s /opt/nvidia/nsight-systems/2025.6.1/target-linux-x64/nsys /usr/local/cuda/bin/nsys && \
     rm -rf /usr/local/cuda/bin/nsys-ui && \
-    ln -s /opt/nvidia/nsight-systems/2025.3.1/target-linux-x64/nsys-ui /usr/local/cuda/bin/nsys-ui && \
-    rm nsight-systems-2025.5.1_2025.5.1.121-1_amd64.deb
+    ln -s /opt/nvidia/nsight-systems/2025.6.1/target-linux-x64/nsys-ui /usr/local/cuda/bin/nsys-ui && \
+    rm nsight-systems-2025.6.1_2025.6.1.190-1_amd64.deb
To optimize the Docker image size and build process, it's recommended to combine related commands into a single RUN layer. The wget, apt-get install, and cleanup steps for nsight-systems are currently split across two RUN layers. Combining them reduces the number of layers in the image. Additionally, it's good practice to clean up the apt cache (rm -rf /var/lib/apt/lists/*) in the same layer as apt-get update and apt-get install.
RUN wget https://developer.nvidia.com/downloads/assets/tools/secure/nsight-systems/2025_6/nsight-systems-2025.6.1_2025.6.1.190-1_amd64.deb && \
apt-get update && apt-get install -y libxcb-cursor0 && \
apt-get install -y ./nsight-systems-2025.6.1_2025.6.1.190-1_amd64.deb && \
rm -rf /usr/local/cuda/bin/nsys && \
ln -s /opt/nvidia/nsight-systems/2025.6.1/target-linux-x64/nsys /usr/local/cuda/bin/nsys && \
rm -rf /usr/local/cuda/bin/nsys-ui && \
ln -s /opt/nvidia/nsight-systems/2025.6.1/target-linux-x64/nsys-ui /usr/local/cuda/bin/nsys-ui && \
rm nsight-systems-2025.6.1_2025.6.1.190-1_amd64.deb && \
rm -rf /var/lib/apt/lists/*
 # limitations under the License.

 from .transformer_impl import MegatronEngine, MegatronEngineWithLMHead
+# Avoid cpu worker trigger cuda jit error
Should we avoid importing megatron in a CPU-only actor?
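One way to address this concern (a sketch, not the PR's actual fix; the helper names and the CUDA check are assumptions) is to gate the Megatron import behind a runtime GPU check, so that CPU-only workers never import megatron at module load time and thus never trigger CUDA JIT compilation:

```python
import importlib.util


def cuda_available() -> bool:
    """True only when torch is installed and actually sees a GPU."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return torch.cuda.is_available()


def load_megatron_engines():
    """Lazily import the Megatron engines.

    Call this from GPU workers only; CPU-only actors that never call it
    never import megatron, so no CUDA JIT path is hit on their side.
    """
    from .transformer_impl import MegatronEngine, MegatronEngineWithLMHead
    return MegatronEngine, MegatronEngineWithLMHead
```

Callers on the GPU path would replace the module-level import with `MegatronEngine, MegatronEngineWithLMHead = load_megatron_engines()`.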
…rl-project#4653)

### What does this PR do?

1. Update the Dockerfiles of the vllm/sglang stable images: vllm from v0.11.0 to v0.12.0 and sglang from v0.5.5 to v0.5.6.
2. Use nvidia/cuda:12.9.1-devel-ubuntu22.04 as the base image for vllm.
3. Add workaround code for updating megatron from 0.14.0 to 0.15.0.

Co-authored-by: Begunner <went@bytedance.com>
Co-authored-by: wuxibin <wuxibin@bytedance.com>