[sglang, doc] feat: add NPU GRPO training scripts for Qwen3-30B (Megatron/SGLang backends) and update doc #5060
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,7 +1,7 @@ | ||
| Ascend Quickstart with SGLang Backend | ||
| =================================== | ||
|
|
||
| Last updated: 09/25/2025. | ||
| Last updated: 01/27/2026. | ||
|
|
||
| We are adding support for Huawei Ascend devices to verl. | ||
|
|
||
|
|
@@ -17,97 +17,137 @@ Atlas 800T A3 | |
|
|
||
| Installation | ||
| ----------------------------------- | ||
| Key supported versions | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
|
||
| Base environment preparation | ||
| +-----------+-----------------+ | ||
| | software | version | | ||
| +===========+=================+ | ||
| | Python | == 3.11 | | ||
| +-----------+-----------------+ | ||
| | HDK | >= 25.3.RC1 | | ||
|
Collaborator
Is 25.3.RC1 necessary? HDK updating is really challenging in a real production environment.
Contributor
Author
Resharding in sglang is built on IPC, which is supported in the new HDK. |
||
| +-----------+-----------------+ | ||
| | CANN | >= 8.3.RC1 | | ||
| +-----------+-----------------+ | ||
| | torch | >= 2.7.1 | | ||
| +-----------+-----------------+ | ||
| | torch_npu | >= 2.7.1.post2 | | ||
| +-----------+-----------------+ | ||
| | sglang | v0.5.8 | | ||
| +-----------+-----------------+ | ||
|
|
||
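The minimum versions in the table above can be checked with a small comparison helper. The following is a minimal sketch, not something shipped with verl or sglang: `version_ge` is a hypothetical function name, and it assumes a `sort` that supports `-V` (version sort), as GNU coreutils does. Pre-release suffixes such as `RC1` or `post2` may not order the way the vendor intends, so treat the result as a hint only.

```shell
# version_ge is a hypothetical helper, not part of verl or sglang.
# It succeeds when version $1 is greater than or equal to version $2.
# Assumes a `sort` that supports -V (version sort), e.g. GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Demo with literal strings; replace them with the versions found on your machine.
if version_ge "2.10.0" "2.7.1"; then
    echo "torch version is new enough"
else
    echo "torch version is too old"
fi
```

In practice the first argument could come from `python3 -c 'import torch; print(torch.__version__)'`; note that the reported string may carry a local suffix that is worth stripping before comparing.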
| Install from a Docker image | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
| We provide Dockerfiles for building the image; see `dockerfile_build_guidance <https://github.com/verl-project/verl/blob/main/docs/ascend_tutorial/dockerfile_build_guidance.rst>`_ and choose the build file that matches your device. | ||
|
|
||
| Install from a custom environment | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
|
||
| **1. Install and activate the HDK and CANN dependencies** | ||
|
|
||
| CANN (Compute Architecture for Neural Networks) is Ascend's heterogeneous computing architecture for AI workloads. So that the training and inference engines can make full use of the hardware, install the following `prerequisites <https://www.hiascend.com/document/detail/zh/canncommercial/83RC1/softwareinst/instg/instg_quick.html?Mode=PmIns&InstallType=netconda&OS=openEuler&Software=cannToolKit>`_ | ||
|
|
||
| +-----------+-------------+ | ||
| | software | version | | ||
| +-----------+-------------+ | ||
| | Python | == 3.11 | | ||
| +-----------+-------------+ | ||
| | CANN | == 8.3.RC1 | | ||
| +-----------+-------------+ | ||
| | HDK | == 25.3.RC1 | | ||
| +-----------+-------------+ | ||
| | torch | == 2.6.0 | | ||
| | HDK | >= 25.3.RC1 | | ||
| +-----------+-------------+ | ||
| | torch_npu | == 2.6.0 | | ||
| | CANN | >= 8.3.RC1 | | ||
| +-----------+-------------+ | ||
| After installation, activate the environment: | ||
|
|
||
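| **Currently the sglang NPU backend in verl supports only the HDK, CANN, and PTA versions listed above; a commercially released version is expected in October 2025.** | ||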
| .. code-block:: bash | ||
|
|
||
| To use sglang in verl, install sglang, torch_memory_saver, and verl with the following commands. | ||
| source /usr/local/Ascend/ascend-toolkit/set_env.sh | ||
| source /usr/local/Ascend/nnal/atb/set_env.sh | ||
|
|
||
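If the CANN toolkit or NNAL is installed somewhere other than the default location, sourcing the scripts above fails with a terse error. A minimal sketch of a more defensive activation step follows; `source_if_present` is a hypothetical helper name, and the two paths are simply the defaults quoted in the doc.

```shell
# source_if_present is a hypothetical helper, not part of CANN or verl.
# It sources the given script when it exists and reports a clear error otherwise.
source_if_present() {
    if [ -f "$1" ]; then
        . "$1"
    else
        echo "missing environment script: $1" >&2
        return 1
    fi
}

# Default install locations quoted in the doc; adjust them if your install differs.
for env_script in \
    /usr/local/Ascend/ascend-toolkit/set_env.sh \
    /usr/local/Ascend/nnal/atb/set_env.sh; do
    source_if_present "$env_script" || echo "skipped $env_script" >&2
done
```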
| **2. Create a conda environment** | ||
|
|
||
| sglang | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
| .. code-block:: bash | ||
|
|
||
| # sglang | ||
| git clone https://github.com/sgl-project/sglang.git | ||
| cd sglang | ||
| mv python/pyproject.toml python/pyproject.toml.backup | ||
| mv python/pyproject_other.toml python/pyproject.toml | ||
| pip install -e "python[srt_npu]" | ||
|
|
||
| Install torch_memory_saver | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
| # create conda env | ||
| conda create -n verl-sglang python==3.11 | ||
| conda activate verl-sglang | ||
|
|
||
| **3. Then run the script we provide in verl** `install_sglang_mcore_npu.sh <https://github.com/verl-project/verl/blob/main/scripts/install_sglang_mcore_npu.sh>`_ | ||
|
|
||
| If you hit an error in this step, inspect the script and follow its steps manually. | ||
|
|
||
| .. code-block:: bash | ||
|
|
||
| # torch_memory_saver | ||
| git clone https://github.com/sgl-project/sgl-kernel-npu.git | ||
| cd sgl-kernel-npu | ||
| bash build.sh -a memory-saver | ||
| pip install output/torch_memory_saver*.whl | ||
|
|
||
| Install verl | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
| git clone https://github.com/volcengine/verl.git | ||
| # Make sure you have activated verl conda env | ||
| # NPU_DEVICE=A3 or A2 depends on your device | ||
| NPU_DEVICE=A3 bash verl/scripts/install_sglang_mcore_npu.sh | ||
|
|
||
| **4. Install verl** | ||
|
|
||
| .. code-block:: bash | ||
|
|
||
| git clone https://github.com/volcengine/verl.git | ||
| cd verl | ||
| pip install --no-deps -e . | ||
| pip install -r requirements-npu.txt | ||
|
|
||
|
|
||
| Notes on other third-party libraries | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
| Quick start | ||
| ----------------------------------- | ||
|
|
||
| +--------------+---------------+ | ||
| | software | description | | ||
| +--------------+---------------+ | ||
| | transformers | v4.56.1 | | ||
| +--------------+---------------+ | ||
| | triton_ascend| v3.2.0 | | ||
| +--------------+---------------+ | ||
| **1. Overview of the current NPU sglang scripts** | ||
|
|
||
| 1. sglang depends on transformers v4.56.1 | ||
| 2. sglang depends on triton_ascend v3.2.0 | ||
| 3. Multimodal models are not supported yet; uninstall the related packages torchvision and timm | ||
| .. _Qwen3-30B: https://github.com/verl-project/verl/blob/main/examples/grpo_trainer/run_qwen3moe-30b_sglang_megatron_npu.sh | ||
| .. _Qwen2.5-32B: https://github.com/verl-project/verl/blob/main/examples/grpo_trainer/run_qwen2-32b_sglang_fsdp_npu.sh | ||
| .. _Qwen3-8B-1k: https://github.com/verl-project/verl/blob/main/examples/grpo_trainer/run_qwen3_8b_grpo_sglang_1k_spmd_npu.sh | ||
| .. _Qwen3-8B-32k: https://github.com/verl-project/verl/blob/main/examples/grpo_trainer/run_qwen3_8b_grpo_sglang_32k_spmd_npu.sh | ||
|
|
||
| .. code-block:: bash | ||
|
|
||
| pip uninstall torchvision | ||
| pip uninstall timm | ||
| pip uninstall triton | ||
|
|
||
| pip install transformers==4.56.1 | ||
| pip install -i https://test.pypi.org/simple/ triton-ascend==3.2.0.dev20250925 | ||
| +-----------------+-----------------+----------+------------------------------+ | ||
| | Model           | Recommended NPU | Nodes    | Inference + training backend | | ||
| +=================+=================+==========+==============================+ | ||
| | `Qwen3-30B`_    | Atlas 800T A3   | 1        | SGLang + Megatron            | | ||
| +-----------------+-----------------+----------+------------------------------+ | ||
| | `Qwen2.5-32B`_  | Atlas 900 A2    | 2        | SGLang + FSDP                | | ||
| +-----------------+-----------------+----------+------------------------------+ | ||
| | `Qwen3-8B-1k`_  | Atlas A3/A2     | 1        | SGLang + FSDP                | | ||
| +-----------------+-----------------+----------+------------------------------+ | ||
| | `Qwen3-8B-32k`_ | Atlas A3/A2     | 1        | SGLang + FSDP                | | ||
| +-----------------+-----------------+----------+------------------------------+ | ||
|
|
||
| **2. Best practices** | ||
|
|
||
| Quick start | ||
| ----------------------------------- | ||
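| Before real use, we recommend running a Qwen3-8B GRPO training job to verify that the environment is prepared and installed correctly. | ||
| We provide `best practices <https://github.com/verl-project/verl/blob/main/docs/ascend_tutorial/examples/ascend_sglang_best_practices.rst>`_ for verl + sglang with `Qwen3-30B`_ and `Qwen2.5-32B`_ as a reference. | ||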
|
|
||
| **3. Environment variables and parameters** | ||
|
|
||
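| 1. Download the dataset and preprocess it into parquet format so that it contains the fields needed to compute RL rewards | ||
| The following environment variables are currently required for the sglang backend on NPU: | ||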
|
|
||
| .. code-block:: bash | ||
|
|
||
| python3 examples/data_preprocess/gsm8k.py --local_save_dir ~/data/gsm8k | ||
| # Allow multiple processes per NPU; see https://www.hiascend.com/document/detail/zh/canncommercial/850/commlib/hcclug/hcclug_000091.html | ||
| export HCCL_HOST_SOCKET_PORT_RANGE=60000-60050 | ||
| export HCCL_NPU_SOCKET_PORT_RANGE=61000-61050 | ||
| # Work around Ray being unable to detect device availability through is_npu_available when called on the device side | ||
| export RAY_EXPERIMENTAL_NOSET_ASCEND_RT_VISIBLE_DEVICES=1 | ||
| # Set according to your device and the number of NPUs you need | ||
| export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15 | ||
| # Required when EP (expert parallelism) is enabled for inference | ||
| export SGLANG_DEEPEP_BF16_DISPATCH=1 | ||
|
|
||
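The device list in `ASCEND_RT_VISIBLE_DEVICES` is just the comma-separated indices of the NPUs to expose, so it can be generated from a count instead of typed by hand. A minimal sketch, assuming you want devices `0` through `num_npus - 1`; `num_npus` is an illustrative variable, not something verl or CANN reads.

```shell
# num_npus is an assumed value for illustration; set it to the number of NPUs you want to use.
num_npus=16

devices=""
i=0
while [ "$i" -lt "$num_npus" ]; do
    # Prefix a comma to every index except the first.
    devices="${devices:+$devices,}$i"
    i=$((i + 1))
done

export ASCEND_RT_VISIBLE_DEVICES="$devices"
echo "ASCEND_RT_VISIBLE_DEVICES=$ASCEND_RT_VISIBLE_DEVICES"
```

With `num_npus=16` this reproduces the list used in the doc; a smaller count yields a shorter prefix of that list.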
|
|
||
|
|
||
| 2. Run training | ||
| verl already parses the common inference parameters; see the ServerArgs initialization in `async_sglang_server.py <https://github.com/verl-project/verl/blob/main/verl/workers/rollout/sglang_rollout/async_sglang_server.py>`_. Any other `sglang parameters <https://github.com/sgl-project/sglang/blob/main/docs/advanced_features/server_arguments.md>`_ can be passed through engine_kwargs. | ||
|
|
||
| To convert a vllm-backend inference script to sglang, add or modify the following parameters: | ||
|
|
||
| .. code-block:: bash | ||
|
|
||
| bash verl/examples/grpo_trainer/run_qwen3_8b_grpo_sglang_1k_npu.sh | ||
| # Required | ||
| actor_rollout_ref.rollout.name=sglang | ||
| +actor_rollout_ref.rollout.engine_kwargs.sglang.attention_backend="ascend" | ||
| # Optional | ||
| # Enable EP (expert parallelism) for inference; for details see https://github.com/sgl-project/sgl-kernel-npu/blob/main/python/deep_ep/README_CN.md | ||
| ++actor_rollout_ref.rollout.engine_kwargs.sglang.deepep_mode="auto" | ||
| ++actor_rollout_ref.rollout.engine_kwargs.sglang.moe_a2a_backend="deepep" | ||
| # Must be set to True when an MoE model runs with multiple DP ranks | ||
| +actor_rollout_ref.rollout.engine_kwargs.sglang.enable_dp_attention=False | ||
| # chunked_prefill is disabled by default | ||
| +actor_rollout_ref.rollout.engine_kwargs.sglang.chunked_prefill_size=-1 | ||
|
|
||
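The overrides above are ordinary command-line arguments appended to the trainer invocation. A minimal sketch of how they might be collected in a launch script follows. The entry point `verl.trainer.main_ppo` is an assumption here; keep whatever launch command your existing vllm script already uses. The command is only printed, not run.

```shell
# Collect the overrides as positional parameters so each one stays a single argument.
# The shell strips the double quotes used in the doc, so ascend is passed as a plain string.
set -- \
    actor_rollout_ref.rollout.name=sglang \
    +actor_rollout_ref.rollout.engine_kwargs.sglang.attention_backend=ascend \
    +actor_rollout_ref.rollout.engine_kwargs.sglang.chunked_prefill_size=-1

# Print the command instead of running it; the entry point is an assumption.
echo python3 -m verl.trainer.main_ppo "$@"
```

Keeping the overrides in one place makes it easier to diff a vllm script against its sglang counterpart.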
|
|
||
|
|
||
Does it match the daily image?
same as the DockerFile we provide