Merged
6 changes: 3 additions & 3 deletions .github/workflows/nightly-test-npu.yml
@@ -23,7 +23,7 @@ jobs:
matrix:
part: [0, 1]
container:
- image: swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/cann:8.3.rc1-a3-ubuntu22.04-py3.11
+ image: swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/cann:8.3.rc2-a3-ubuntu22.04-py3.11
steps:
- name: Checkout code
uses: actions/checkout@v4
@@ -69,7 +69,7 @@ jobs:
matrix:
part: [0]
container:
- image: swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/cann:8.3.rc1-a3-ubuntu22.04-py3.11
+ image: swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/cann:8.3.rc2-a3-ubuntu22.04-py3.11
steps:
- name: Checkout code
uses: actions/checkout@v4
@@ -115,7 +115,7 @@ jobs:
matrix:
part: [0]
container:
- image: swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/cann:8.3.rc1-a3-ubuntu22.04-py3.11
+ image: swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/cann:8.3.rc2-a3-ubuntu22.04-py3.11
steps:
- name: Checkout code
uses: actions/checkout@v4
8 changes: 4 additions & 4 deletions .github/workflows/pr-test-npu.yml
@@ -45,7 +45,7 @@ jobs:
if: needs.check-changes.outputs.main_package == 'true'
runs-on: linux-arm64-npu-1
container:
- image: swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/cann:8.3.rc1-910b-ubuntu22.04-py3.11
+ image: swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/cann:8.3.rc2-910b-ubuntu22.04-py3.11
steps:
- name: Checkout code
uses: actions/checkout@v4
@@ -88,7 +88,7 @@ jobs:
matrix:
part: [0, 1, 2]
container:
- image: swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/cann:8.3.rc1-910b-ubuntu22.04-py3.11
+ image: swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/cann:8.3.rc2-910b-ubuntu22.04-py3.11
steps:
- name: Checkout code
uses: actions/checkout@v4
@@ -127,7 +127,7 @@ jobs:
if: needs.check-changes.outputs.main_package == 'true'
runs-on: linux-arm64-npu-4
container:
- image: swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/cann:8.3.rc1-910b-ubuntu22.04-py3.11
+ image: swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/cann:8.3.rc2-910b-ubuntu22.04-py3.11
steps:
- name: Checkout code
uses: actions/checkout@v4
@@ -170,7 +170,7 @@ jobs:
matrix:
part: [0, 1]
container:
- image: swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/cann:8.3.rc1-a3-ubuntu22.04-py3.11
+ image: swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/cann:8.3.rc2-a3-ubuntu22.04-py3.11
steps:
- name: Checkout code
uses: actions/checkout@v4
4 changes: 2 additions & 2 deletions .github/workflows/release-docker-npu-nightly.yml
@@ -19,7 +19,7 @@ jobs:
runs-on: ubuntu-22.04-arm
strategy:
matrix:
- cann_version: ["8.3.rc1"]
+ cann_version: ["8.3.rc2"]
device_type: ["910b", "a3"]
steps:
- name: Checkout repository
@@ -73,6 +73,6 @@ jobs:
push: ${{ github.repository == 'sgl-project/sglang' && github.event_name != 'pull_request' }}
provenance: false
build-args: |
- SGLANG_KERNEL_NPU_TAG=20251128
+ SGLANG_KERNEL_NPU_TAG=20251206
CANN_VERSION=${{ matrix.cann_version }}
DEVICE_TYPE=${{ matrix.device_type }}
4 changes: 2 additions & 2 deletions .github/workflows/release-docker-npu.yml
@@ -17,7 +17,7 @@ jobs:
runs-on: ubuntu-22.04-arm
strategy:
matrix:
- cann_version: ["8.3.rc1"]
+ cann_version: ["8.3.rc2"]
device_type: ["910b", "a3"]
steps:
- name: Checkout repository
@@ -70,6 +70,6 @@ jobs:
push: ${{ github.repository == 'sgl-project/sglang' && github.event_name != 'pull_request' }}
provenance: false
build-args: |
- SGLANG_KERNEL_NPU_TAG=20251128
+ SGLANG_KERNEL_NPU_TAG=20251206
CANN_VERSION=${{ matrix.cann_version }}
DEVICE_TYPE=${{ matrix.device_type }}
2 changes: 1 addition & 1 deletion docker/npu.Dockerfile
@@ -1,4 +1,4 @@
- ARG CANN_VERSION=8.3.rc1
+ ARG CANN_VERSION=8.3.rc2
ARG DEVICE_TYPE=a3
ARG OS=ubuntu22.04
ARG PYTHON_VERSION=py3.11
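
Reviewer note: as a rough illustration of how these four build arguments line up with the image tags in the workflow hunks, here is a minimal Python sketch. It assumes the tag pattern `<cann_version>-<device_type>-<os>-<python_version>`, which is inferred from the `image:` lines in this diff. The `REGISTRY` constant and the `ci_image` helper are hypothetical names used only for illustration, and the Dockerfile's actual `FROM` line is not shown in this hunk.

```python
# Hypothetical helper: rebuilds the CI image reference seen in the workflow
# files from the same four values the Dockerfile declares as ARGs.
# The registry path is copied from the `image:` lines in this diff.
REGISTRY = "swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/cann"


def ci_image(
    cann_version: str,
    device_type: str,
    os_name: str = "ubuntu22.04",
    python_version: str = "py3.11",
) -> str:
    """Return the image reference for one CANN version / device combination."""
    tag = f"{cann_version}-{device_type}-{os_name}-{python_version}"
    return f"{REGISTRY}:{tag}"


if __name__ == "__main__":
    # The two device types listed in the release workflow matrix.
    for device in ("910b", "a3"):
        print(ci_image("8.3.rc2", device))
```

Bumping `cann_version` in one place and regenerating the references this way would be one option for keeping the workflow files in step, though this PR edits each occurrence by hand.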
2 changes: 1 addition & 1 deletion docs/platforms/ascend_npu.md
@@ -18,7 +18,7 @@ conda activate sglang_npu

#### CANN

- Prior to start work with SGLang on Ascend you need to install CANN Toolkit, Kernels operator package and NNAL version 8.3.RC1 or higher, check the [installation guide](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/83RC1/softwareinst/instg/instg_0008.html?Mode=PmIns&InstallType=local&OS=openEuler&Software=cannToolKit)
+ Prior to start work with SGLang on Ascend you need to install CANN Toolkit, Kernels operator package and NNAL version 8.3.RC2 or higher, check the [installation guide](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/83RC1/softwareinst/instg/instg_0008.html?Mode=PmIns&InstallType=local&OS=openEuler&Software=cannToolKit)

#### MemFabric Adaptor

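
Reviewer note: the "8.3.RC2 or higher" requirement in the changed line can be checked mechanically. The following is a minimal sketch, assuming version strings of the form `MAJOR.MINOR.RCn` as they appear in this diff (matched case-insensitively, since the workflows write `8.3.rc2` and the docs write `8.3.RC2`). `parse_cann_rc` and `meets_minimum` are hypothetical helpers, not part of SGLang or CANN, and final (non-RC) releases are deliberately left unhandled because their ordering relative to release candidates is not stated here.

```python
import re

# Matches strings such as "8.3.RC2" or "8.3.rc2" (assumed format, see note above).
_CANN_RC = re.compile(r"^(\d+)\.(\d+)\.rc(\d+)$", re.IGNORECASE)


def parse_cann_rc(version: str) -> tuple[int, int, int]:
    """Split a release-candidate version string into (major, minor, rc)."""
    match = _CANN_RC.match(version.strip())
    if match is None:
        raise ValueError(f"unsupported CANN version format: {version!r}")
    major, minor, rc = match.groups()
    return int(major), int(minor), int(rc)


def meets_minimum(installed: str, minimum: str = "8.3.RC2") -> bool:
    """Return True if `installed` is at least `minimum` (tuple comparison)."""
    return parse_cann_rc(installed) >= parse_cann_rc(minimum)


if __name__ == "__main__":
    for candidate in ("8.3.rc1", "8.3.RC2"):
        print(candidate, meets_minimum(candidate))
```

Comparing parsed tuples instead of raw strings avoids the case mismatch between the workflow files and the docs, and keeps `RC10` ordered after `RC2`.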
10 changes: 5 additions & 5 deletions docs/platforms/ascend_npu_deepseek_example.md
@@ -30,7 +30,7 @@ python3 -m sglang.launch_server \
--trust-remote-code \
--attention-backend ascend \
--device npu \
- --quantization w8a8_int8 \
+ --quantization modelslim \
--watchdog-timeout 9000 \
--host 127.0.0.1 \
--port 6688 \
@@ -89,7 +89,7 @@ python -m sglang.launch_server \
--mem-fraction-static 0.6 \
--attention-backend ascend \
--device npu \
- --quantization w8a8_int8 \
+ --quantization modelslim \
--disaggregation-transfer-backend ascend \
--max-running-requests 8 \
--context-length 8192 \
@@ -145,7 +145,7 @@ python -m sglang.launch_server \
--max-running-requests 352 \
--attention-backend ascend \
--device npu \
- --quantization w8a8_int8 \
+ --quantization modelslim \
--moe-a2a-backend deepep \
--enable-dp-attention \
--deepep-mode low_latency \
@@ -214,7 +214,7 @@ do
--mem-fraction-static 0.81 \
--attention-backend ascend \
--device npu \
- --quantization w8a8_int8 \
+ --quantization modelslim \
--disaggregation-transfer-backend ascend \
--max-running-requests 8 \
--context-length 8192 \
@@ -275,7 +275,7 @@ do
--max-running-requests 832 \
--attention-backend ascend \
--device npu \
- --quantization w8a8_int8 \
+ --quantization modelslim \
--moe-a2a-backend deepep \
--enable-dp-attention \
--deepep-mode low_latency \
2 changes: 1 addition & 1 deletion docs/references/mindspore_models.md
@@ -7,7 +7,7 @@ MindSpore is a high-performance AI framework optimized for Ascend NPUs. This doc
## Requirements

MindSpore currently only supports Ascend NPU devices. Users need to first install Ascend CANN software packages.
- The CANN software packages can be downloaded from the [Ascend Official Website](https://www.hiascend.com). The recommended version is 8.3.RC1.
+ The CANN software packages can be downloaded from the [Ascend Official Website](https://www.hiascend.com). The recommended version is 8.3.RC2.

## Supported Models

2 changes: 1 addition & 1 deletion scripts/ci/npu_ci_install_dependency.sh
@@ -49,7 +49,7 @@ wget -O "${BISHENG_NAME}" "${BISHENG_URL}" && chmod a+x "${BISHENG_NAME}" && "./


### Install sgl-kernel-npu
- SGL_KERNEL_NPU_TAG="20251128"
+ SGL_KERNEL_NPU_TAG="20251206"
git clone --depth 1 https://github.com/sgl-project/sgl-kernel-npu.git --branch ${SGL_KERNEL_NPU_TAG}
(cd sgl-kernel-npu && bash ./build.sh && ${PIP_INSTALL} output/deep_ep*.whl output/sgl_kernel_npu*.whl && cd "$(python3 -m pip show deep-ep | grep -E '^Location:' | awk '{print $2}')" && ln -s deep_ep/deep_ep_cpp*.so)
