[docker] fix: new images for sgl056 and vllm012 have compatibility issues #4714

Merged: wuxibin89 merged 2 commits into verl-project:main from Begunner:docker-pr-post on Dec 29, 2025

@Begunner (Collaborator)

What does this PR do?

TransformerEngine v2.8 leads to unexpected crashes, so this PR updates it to v2.10.
It also fixes other resulting compatibility issues.

@gemini-code-assist (bot, Contributor) left a comment

Code Review

This pull request updates the TransformerEngine version from v2.8 to v2.10 in the sglang and vllm Dockerfiles to resolve compatibility issues. The change is correct and addresses the stated problem. My review includes a suggestion to pin the dependency to a specific commit hash instead of a tag to improve build reproducibility and security.

RUN MAX_JOBS=128 pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" git+https://github.com/NVIDIA/apex.git

- RUN export NVTE_FRAMEWORK=pytorch && MAX_JOBS=128 NVTE_BUILD_THREADS_PER_JOB=4 pip3 install --resume-retries 999 --no-cache-dir --no-build-isolation git+https://github.com/NVIDIA/TransformerEngine.git@release_v2.8
+ RUN export NVTE_FRAMEWORK=pytorch && MAX_JOBS=128 NVTE_BUILD_THREADS_PER_JOB=4 pip3 install --resume-retries 999 --no-cache-dir --no-build-isolation git+https://github.com/NVIDIA/TransformerEngine.git@release_v2.10

high

For better reproducibility and security, it's recommended to pin dependencies to a specific commit hash instead of a tag. The tag release_v2.10 can be moved, which could lead to different build results in the future. The commit hash corresponding to this tag is 06082989335780a5f7808246a30146313175883a.

RUN export NVTE_FRAMEWORK=pytorch && MAX_JOBS=128 NVTE_BUILD_THREADS_PER_JOB=4 pip3 install --resume-retries 999 --no-cache-dir --no-build-isolation git+https://github.com/NVIDIA/TransformerEngine.git@06082989335780a5f7808246a30146313175883a
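
Before adopting a suggested hash, it can be worth confirming what the ref currently resolves to, since the hash quoted in the review comment is not verified here. The following is a minimal sketch using `git ls-remote`. The helper names `resolve_ref` and `check_pin` are hypothetical and not part of the PR, and the sketch assumes the name matches a single branch or tag on the remote.

```shell
# Hypothetical helpers (not part of the PR) for checking a pinned commit.

# Print the commit hash that a branch or tag name resolves to on a remote.
# $1 = repository URL or path, $2 = ref name (e.g. release_v2.10)
resolve_ref() {
    # git ls-remote prints "<hash><TAB><refname>" for each matching ref.
    # For an annotated tag, the peeled "^{}" entry (the commit itself) is
    # listed after the tag object, so the last line is taken.
    git ls-remote "$1" "$2" | awk '{ hash = $1 } END { print hash }'
}

# Succeed only if the ref currently resolves to the expected commit hash.
# $1 = repository URL or path, $2 = ref name, $3 = expected full hash
check_pin() {
    [ "$(resolve_ref "$1" "$2")" = "$3" ]
}

# Example (requires network access):
# check_pin https://github.com/NVIDIA/TransformerEngine.git release_v2.10 \
#     06082989335780a5f7808246a30146313175883a && echo "pin matches"
```

Running the check before a rebuild shows whether the ref has moved since the hash was recorded, which is the situation the review comment warns about.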

RUN MAX_JOBS=128 pip install -v --disable-pip-version-check --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" git+https://github.com/NVIDIA/apex.git

- RUN export NVTE_FRAMEWORK=pytorch && MAX_JOBS=128 NVTE_BUILD_THREADS_PER_JOB=4 pip3 install --resume-retries 999 --no-build-isolation git+https://github.com/NVIDIA/TransformerEngine.git@release_v2.8
+ RUN export NVTE_FRAMEWORK=pytorch && MAX_JOBS=128 NVTE_BUILD_THREADS_PER_JOB=4 pip3 install --resume-retries 999 --no-build-isolation git+https://github.com/NVIDIA/TransformerEngine.git@release_v2.10

high

For better reproducibility and security, it's recommended to pin dependencies to a specific commit hash instead of a tag. The tag release_v2.10 can be moved, which could lead to different build results in the future. The commit hash corresponding to this tag is 06082989335780a5f7808246a30146313175883a.

RUN export NVTE_FRAMEWORK=pytorch && MAX_JOBS=128 NVTE_BUILD_THREADS_PER_JOB=4 pip3 install --resume-retries 999 --no-build-isolation git+https://github.com/NVIDIA/TransformerEngine.git@06082989335780a5f7808246a30146313175883a

wuxibin89 merged commit f3a0233 into verl-project:main on Dec 29, 2025
98 of 115 checks passed
boren-ms pushed a commit to boren-ms/verl that referenced this pull request Dec 30, 2025
…sues (verl-project#4714)

### What does this PR do?

> TransformerEngine-v2.8 leads to unexpected crashes. Try to update it
to v2.10.
> Fix other resultant compatibility issues.

---------

Co-authored-by: Begunner <went@bytedance.com>
jsfanfanfan pushed a commit to meituan-search/verl that referenced this pull request Jan 9, 2026
vyomakesh0728 added a commit to vyomakesh0728/verl that referenced this pull request Jan 22, 2026
sophiayyya pushed a commit to sophiayyya/verl that referenced this pull request Jan 25, 2026