[NPU][Doc] Update GLM-5 docs, enabling deepep by default #23708

Merged
iforgetmyname merged 1 commit into sgl-project:main from cen121212:4-25-sgl-project-main on May 8, 2026

@cen121212 (Contributor) commented Apr 25, 2026

Motivation

Running without DeepEP can cause accuracy issues, so this PR updates the GLM-5 docs to enable DeepEP by default.

Modifications

docs/platforms/ascend/ascend_npu_glm5_examples.md
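
The diff itself is not reproduced in this conversation. Judging only from the review excerpt on lines +184 to +185, the documented launch command gains the two MoE flags shown below. The following is a minimal sketch, assuming a standard `python -m sglang.launch_server` invocation; the model path is a placeholder and every other argument of the real example is omitted. The script only prints the command instead of running it.

```shell
# Hypothetical placeholder; replace with the real GLM-5 checkpoint location.
MODEL_PATH="${MODEL_PATH:-/path/to/GLM-5}"

# The two flags quoted in the review excerpt (lines +184 to +185 of the doc).
MOE_FLAGS="--moe-a2a-backend deepep --deepep-mode auto"

# Assemble the launch command; all other arguments from the docs are omitted here.
LAUNCH_CMD="python -m sglang.launch_server --model-path $MODEL_PATH $MOE_FLAGS"

# Dry run: print the command rather than starting a server.
echo "$LAUNCH_CMD"
```

To start a server, run the printed command on a machine where SGLang and the model are installed, adding the remaining arguments from the documentation page.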

Accuracy Tests

Speed Tests and Profiling

Checklist

Review and Merge Process

  1. Ping Merge Oncalls to start the process. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • Common commands include /tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci
  4. After green CI and required approvals, ask Merge Oncalls or people with Write permission to merge the PR.

@cen121212 cen121212 requested a review from wisclmy0611 as a code owner April 25, 2026 07:43
@github-actions github-actions Bot added documentation Improvements or additions to documentation npu labels Apr 25, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the Ascend NPU GLM5 documentation by removing specific environment variables and increasing the maximum batch size for CUDA graphs. A review comment correctly identifies that the 'deepep' backend is incompatible with Ascend NPU hardware and suggests using 'ascend_fuseep' instead, while also recommending the removal of the 'deepep-mode' flag.

Comment on lines +184 to +185
--moe-a2a-backend deepep \
--deepep-mode auto \

medium

For Ascend NPU, the optimized MoE All-to-All backend is ascend_fuseep. The deepep backend is specifically designed for NVIDIA GPUs using the deep_ep library and will not work on NPU. Additionally, --deepep-mode is not used by the ascend_fuseep backend and should be removed.

Suggested change
- --moe-a2a-backend deepep \
- --deepep-mode auto \
+ --moe-a2a-backend ascend_fuseep \

@iforgetmyname iforgetmyname changed the title from "【NPU】【docs】 fix glm5 docs" to "[NPU][Doc] Update GLM-5 docs, enabling deepep by default" on May 8, 2026
@iforgetmyname (Collaborator) commented

/tag-and-rerun-ci

@github-actions github-actions Bot added the run-ci label May 8, 2026
@iforgetmyname iforgetmyname merged commit 461bc8a into sgl-project:main May 8, 2026
42 checks passed
Dogacel pushed a commit to Dogacel/sglang-fork that referenced this pull request May 8, 2026
LLThomas pushed a commit to LLThomas/sglang that referenced this pull request May 8, 2026
LucQueen pushed a commit to LucQueen/sglang that referenced this pull request May 12, 2026
@zijiexia (Collaborator) commented Jun 4, 2026

Hi @cen121212, we've moved our documentation under docs_new, so your changes here might not be reflected correctly on our documentation page. Could you kindly migrate this change to the corresponding page under docs_new? Thank you so much, and sorry for any confusion.

Labels

documentation Improvements or additions to documentation npu run-ci

3 participants