15 changes: 15 additions & 0 deletions .github/workflows/e2e_ascend.yml
@@ -78,6 +78,11 @@ jobs:
HF_ENDPOINT: "https://hf-mirror.com"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
steps:
- name: Config third-party dependency download cache
run: |
sed -Ei 's@(ports|archive).ubuntu.com@cache-service.nginx-pypi-cache.svc.cluster.local:8081@g' /etc/apt/sources.list
pip config set global.index-url http://cache-service.nginx-pypi-cache.svc.cluster.local/pypi/simple
pip config set global.trusted-host cache-service.nginx-pypi-cache.svc.cluster.local
- name: Check npu and CANN info
run: |
cat /usr/local/Ascend/ascend-toolkit/latest/"$(uname -i)"-linux/ascend_toolkit_install.info
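
Note on the cache step above: a minimal sketch of what the `sed` substitution does, run against two sample lines rather than the real `/etc/apt/sources.list`. The sample entries (including the `jammy` release name) are assumptions for illustration; the runner's actual file may differ.

```shell
# Hypothetical sources.list entries; not taken from the CI image.
sample='deb http://ports.ubuntu.com/ubuntu-ports jammy main
deb http://archive.ubuntu.com/ubuntu jammy main'

# Same expression as the workflow step, applied to stdin instead of editing a file in place.
rewritten=$(printf '%s\n' "$sample" \
  | sed -E 's@(ports|archive).ubuntu.com@cache-service.nginx-pypi-cache.svc.cluster.local:8081@g')

printf '%s\n' "$rewritten"
```

Both hosts are rewritten to the in-cluster cache on port 8081, and the path after the host is left untouched. Because the pip index is served over plain `http`, the workflow also sets `global.trusted-host` for the same host.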
@@ -123,6 +128,11 @@ jobs:
HF_ENDPOINT: "https://hf-mirror.com"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
steps:
- name: Config third-party dependency download cache
run: |
sed -Ei 's@(ports|archive).ubuntu.com@cache-service.nginx-pypi-cache.svc.cluster.local:8081@g' /etc/apt/sources.list
pip config set global.index-url http://cache-service.nginx-pypi-cache.svc.cluster.local/pypi/simple
pip config set global.trusted-host cache-service.nginx-pypi-cache.svc.cluster.local
- name: Check npu and CANN info
run: |
cat /usr/local/Ascend/ascend-toolkit/latest/"$(uname -i)"-linux/ascend_toolkit_install.info
@@ -186,6 +196,11 @@ jobs:
HF_ENDPOINT: "https://hf-mirror.com"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
steps:
- name: Config third-party dependency download cache
run: |
sed -Ei 's@(ports|archive).ubuntu.com@cache-service.nginx-pypi-cache.svc.cluster.local:8081@g' /etc/apt/sources.list
pip config set global.index-url http://cache-service.nginx-pypi-cache.svc.cluster.local/pypi/simple
pip config set global.trusted-host cache-service.nginx-pypi-cache.svc.cluster.local
- name: Check npu and CANN info
run: |
cat /usr/local/Ascend/ascend-toolkit/latest/"$(uname -i)"-linux/ascend_toolkit_install.info
2 changes: 2 additions & 0 deletions docker/ascend/Dockerfile.ascend_8.3.rc1_a2
@@ -44,6 +44,8 @@ RUN ARCH=$(uname -m) && \
echo "export PYTHONPATH=\$PYTHONPATH:/Megatron-LM" >> ~/.bashrc && \
# Remove existing triton or triton-ascend installed by some third-party packages
pip uninstall -y triton triton-ascend && \
# Install mbridge
pip install mbridge && \
Review comment (marked high):

To keep Docker builds reproducible, pin the version of every installed package. `pip install mbridge` installs whatever version is latest at build time, so a future release could introduce breaking changes and make builds non-deterministic. Please pin this dependency to a specific, known-working version.

    pip install mbridge==<version> && \

# Clear extra files
rm -rf /tmp/* /var/tmp/* && \
pip cache purge
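
One way to find the version to pin, sketched under assumptions: read it from `pip freeze` in an environment that is already known to work. The helper name `pin_from_freeze` is hypothetical, and the sketch only handles plain `name==version` lines (not `name @ file://...` entries that pip prints for local installs).

```shell
# Print the "name==version" pin for one package from pip-freeze-style input on stdin.
# pin_from_freeze is a hypothetical helper, not part of pip or this repository.
pin_from_freeze() {
  grep -i -E "^$1==" | head -n 1
}

# Intended usage in a known-good environment:
#   pip freeze | pin_from_freeze mbridge
```

The printed line can then be pasted into the `pip install` command in place of the unpinned package name.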
2 changes: 2 additions & 0 deletions docker/ascend/Dockerfile.ascend_8.3.rc1_a3
@@ -44,6 +44,8 @@ RUN ARCH=$(uname -m) && \
echo "export PYTHONPATH=\$PYTHONPATH:/Megatron-LM" >> ~/.bashrc && \
# Remove existing triton or triton-ascend installed by some third-party packages
pip uninstall -y triton triton-ascend && \
# Install mbridge
pip install mbridge && \
Review comment (marked high):

To keep Docker builds reproducible, pin the version of every installed package. `pip install mbridge` installs whatever version is latest at build time, so a future release could introduce breaking changes and make builds non-deterministic. Please pin this dependency to a specific, known-working version.

    pip install mbridge==<version> && \

# Clear extra files
rm -rf /tmp/* /var/tmp/* && \
pip cache purge
3 changes: 3 additions & 0 deletions docs/ascend_tutorial/ascend_quick_start.rst
@@ -129,6 +129,9 @@ MindSpeed source installation commands:
# (Optional) To keep the PYTHONPATH environment variable in effect after the shell closes or the system reboots, add it to your .bashrc file
echo "export PYTHONPATH=$PYTHONPATH:\"$(pwd)/Megatron-LM\"" >> ~/.bashrc

# Install mbridge
pip install mbridge
Review comment (marked high):

The installation instructions should specify a fixed version for mbridge to ensure users set up a consistent and working environment. Installing the latest version can lead to unexpected issues if breaking changes are introduced in the dependency. Please add a specific version to the pip install command.

Suggested change:
- pip install mbridge
+ pip install mbridge==<version>
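
A side note on the `echo ... >> ~/.bashrc` context line in this hunk: the quick-start doc writes `$PYTHONPATH` unescaped, while the Dockerfiles in this PR write `\$PYTHONPATH`. The sketch below shows the difference using throwaway variables instead of touching `~/.bashrc`; the starting value `/opt/existing` is an assumption for illustration.

```shell
# Hypothetical starting value, for illustration only.
PYTHONPATH=/opt/existing

# Unescaped: the current value is baked into the line when it is written.
expanded="export PYTHONPATH=$PYTHONPATH:/Megatron-LM"

# Escaped: the literal text $PYTHONPATH is written and resolved each time .bashrc is sourced.
deferred="export PYTHONPATH=\$PYTHONPATH:/Megatron-LM"

printf '%s\n' "$expanded"
printf '%s\n' "$deferred"
```

Both forms work; the unescaped form freezes whatever `PYTHONPATH` held at the time the command ran, and the escaped form appends to whatever it holds at login.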


MindSpeed serves the Megatron-LM backend scenario. Use it as follows:

1. Set the ``strategy`` of the verl worker model to ``megatron``, for example ``actor_rollout_ref.actor.strategy=megatron``.
3 changes: 2 additions & 1 deletion docs/ascend_tutorial/dockerfile_build_guidance.rst
@@ -33,6 +33,7 @@ vLLM-ascend 0.11.0rc1
Megatron-LM       v0.12.1
MindSpeed         (f2b0977e)
triton-ascend     3.2.0rc4
mbridge           latest version
Review comment (marked high):

Using 'latest version' for a dependency is discouraged as it leads to non-reproducible environments and can introduce breaking changes unexpectedly. Please specify a fixed, known-working version for mbridge in this component list to ensure consistency and stability.

Suggested change:
- mbridge latest version
+ mbridge <version>

================= ============


@@ -57,7 +58,7 @@ A3 8.3.RC1 `Dockerfile.ascend_8.3.rc1_a3 <https://github.com/
# Navigate to the directory containing the Dockerfile
cd {verl-root-path}/docker/ascend
# Build the image
- docker build -f Dockerfile.ascend_8.2.rc1_a2 -t verl-ascend:8.2.rc1-a2 .
+ docker build -f Dockerfile.ascend_8.3.rc1_a2 -t verl-ascend:8.3.rc1-a2 .
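
The fix above keeps the image tag in step with the Dockerfile name. A small sketch of deriving one from the other, so the two cannot drift apart; `tag_for_dockerfile` is a hypothetical helper and the naming rule is inferred only from the single example in this hunk.

```shell
# Derive "verl-ascend:<version>-<chip>" from "Dockerfile.ascend_<version>_<chip>".
# tag_for_dockerfile is a hypothetical helper, not part of the repository.
tag_for_dockerfile() {
  suffix=${1#Dockerfile.ascend_}   # e.g. 8.3.rc1_a2
  version=${suffix%_*}             # text before the last underscore
  chip=${suffix##*_}               # text after the last underscore
  printf 'verl-ascend:%s-%s\n' "$version" "$chip"
}

# Intended usage:
#   f=Dockerfile.ascend_8.3.rc1_a2
#   docker build -f "$f" -t "$(tag_for_dockerfile "$f")" .
```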


Public image addresses