From 68c51e3717f985206731797cdbee95ca2044e098 Mon Sep 17 00:00:00 2001
From: sunyi001 <1659275352@qq.com>
Date: Sat, 24 May 2025 20:48:54 +0800
Subject: [PATCH] modify the installation method of vllm-ascend on different architectures and hyperlink

---
 docs/ascend/ascend_vllm073.rst | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/docs/ascend/ascend_vllm073.rst b/docs/ascend/ascend_vllm073.rst
index 6b160dcdd6b..d7c2a60ee1b 100644
--- a/docs/ascend/ascend_vllm073.rst
+++ b/docs/ascend/ascend_vllm073.rst
@@ -55,7 +55,10 @@ vLLM
     git clone -b v0.7.3 --depth 1 https://github.com/vllm-project/vllm.git
     cd vllm
     pip install -r requirements-build.txt
+    # for Atlas 200T A2 Box16
     VLLM_TARGET_DEVICE=empty pip install -e . --extra-index https://download.pytorch.org/whl/cpu/
+    # for Atlas 800T A2
+    VLLM_TARGET_DEVICE=empty pip install -e .
 
 .. code-block:: bash
 
@@ -83,7 +86,7 @@ vLLM
 .. image:: https://github.com/eric-haibin-lin/verl-community/blob/main/docs/loss_comparison.png?raw=true
    :alt: loss_comparison
 
-where N is the number of training steps. For more information, see [Accuracy Calculation Description](https://www.hiascend.com/document/detail/zh/Pytorch/600/ptmoddevg/trainingmigrguide/LMaccuracy_0001.html).
+where N is the number of training steps. For more information, see `Accuracy Calculation Description <https://www.hiascend.com/document/detail/zh/Pytorch/600/ptmoddevg/trainingmigrguide/LMaccuracy_0001.html>`_.
 
 Empirically, for reinforcement learning algorithms such as GRPO, we expect that, under the same configuration, the mean absolute error between the reward on Huawei Ascend devices and the reward on NVIDIA GPUs is at most 4%. For the exact calculation, refer to the Loss calculation.
 