[Doc] Refactor the DeepSeek-V3.1 tutorial. #4399

MengqingCao merged 1 commit into vllm-project:main
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request adds a comprehensive tutorial for deploying the DeepSeek-V3.1 model. While the document covers various deployment scenarios, I've found several critical errors in the provided code snippets and configurations, particularly for multi-node and prefill-decode disaggregation setups. These issues, including Python syntax errors, incorrect data parallel configurations, and inconsistent model naming, would likely prevent users from successfully following the instructions. My review provides specific corrections to address these critical problems and improve the tutorial's accuracy and usability.
```bash
local_ip="xxxx"

# [Optional] jemalloc
# if `libjemalloc.so` is installed on your machine, you can turn it on.
```
jemalloc is here for better performance; please add some description, otherwise readers may be a little confused. Thanks.
I have added a description: "jemalloc is for better performance; if `libjemalloc.so` is installed on your machine, you can turn it on."
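For readers who want to try it, a minimal sketch of enabling jemalloc, assuming the standard `LD_PRELOAD` approach; the library path below is an assumption that varies by distro, so check where `libjemalloc.so` lives on your machine:

```bash
# Assumed path; verify with `ldconfig -p | grep jemalloc` on your system.
JEMALLOC_SO=/usr/lib/aarch64-linux-gnu/libjemalloc.so
if [ -f "$JEMALLOC_SO" ]; then
    # Preload jemalloc so the vllm process uses it as its allocator.
    export LD_PRELOAD="$JEMALLOC_SO${LD_PRELOAD:+:$LD_PRELOAD}"
fi
```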
### Model Weight

- `DeepSeek-V3.1` (BF16 version): [Download model weight](https://www.modelscope.cn/models/deepseek-ai/DeepSeek-V3.1)
- `DeepSeek-V3.1-w8a8` (quantized version): [Download model weight](https://www.modelscope.cn/models/Eco-Tech/DeepSeek-V3.1-w8a8). Note: modify `torch_dtype` from `float16` to `bfloat16` in `config.json`.
- Quantization method: [DeepSeek-V3.1 W8A8+MTP](https://gitcode.com/Ascend/msit/blob/master/msmodelslim/example/DeepSeek/README.md#deepseek-v31-w8a8-%E6%B7%B7%E5%90%88%E9%87%8F%E5%8C%96-mtp-%E9%87%8F%E5%8C%96)
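As a concrete sketch of the `torch_dtype` note above (the weight directory is a hypothetical local download path):

```bash
MODEL_DIR=/path/to/DeepSeek-V3.1-w8a8  # wherever you downloaded the weights
# Switch torch_dtype from float16 to bfloat16, as the note above requires.
sed -i 's/"torch_dtype": "float16"/"torch_dtype": "bfloat16"/' "$MODEL_DIR/config.json"
```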
DeepSeek-V3.1 W8A8+MTP does not seem to have an available download URL. It would be better to upload it to ModelScope or another platform, since you mention DeepSeek-V3.1 W8A8+MTP below.
OK, we don't have MTP weights on ModelScope, so I put the quantization method here; maybe I should add more details.
```bash
export VLLM_ASCEND_ENABLE_FLASHCOMM1=0
export DISABLE_L2_CACHE=1

vllm serve vllm-ascend/DeepSeek-V3.1_w8a8mix_mtp \
```
In fact, if you use xxx/xxx as a model name, vLLM will search for it on Hugging Face (or, if you set VLLM_USE_MODELSCOPE, on ModelScope). The vllm-ascend/xxx form usually indicates one of the models we publish on ModelScope under vllm-ascend, so it's better to change this to a local path.
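A minimal sketch of the suggested fix, assuming the weights have already been downloaded to a local directory; the path and served model name below are illustrative, not part of the tutorial:

```bash
# Serving from a local path avoids any lookup on Hugging Face or ModelScope.
MODEL_DIR=/data/models/DeepSeek-V3.1_w8a8mix_mtp  # hypothetical location
vllm serve "$MODEL_DIR" \
    --served-model-name DeepSeek-V3.1
```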
```bash
export VLLM_USE_V1=1
export HCCL_BUFFSIZE=200
export PYTORCH_NPU_ALLOC_CONF=expandable_segments:True
export VLLM_ASCEND_ENABLE_MLAPO=1
```
@wangxiyuan Is VLLM_ASCEND_ENABLE_MLAPO=1 also needed for DeepSeek-V3.1? I am not sure it is OK here, since I remember it caused some issues with DeepSeek-V3.2-Exp in 0.11.0rc1.
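Until that question is settled, a conservative sketch of the environment block with the uncertain flag left off; this reflects the concern above, not a confirmed recommendation:

```bash
export VLLM_USE_V1=1
export HCCL_BUFFSIZE=200
export PYTORCH_NPU_ALLOC_CONF=expandable_segments:True
# Unverified for DeepSeek-V3.1: this reportedly caused issues with
# DeepSeek-V3.2-Exp on 0.11.0rc1, so keep it disabled until confirmed.
# export VLLM_ASCEND_ENABLE_MLAPO=1
```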
```bash
    --gpu-memory-utilization 0.92 \
    --speculative-config '{"num_speculative_tokens": 1, "method": "deepseek_mtp"}' \
    --compilation-config '{"cudagraph_mode": "FULL_DECODE_ONLY"}' \
    --additional-config '{"ascend_scheduler_config":{"enabled":false},"torchair_graph_config":{"enabled":false}}'
```
The Ascend scheduler is about to be dropped on main. Refer to #4498.
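Since the option is going away, a sketch of the same invocation with `ascend_scheduler_config` dropped; the model path placeholder is illustrative:

```bash
vllm serve /path/to/DeepSeek-V3.1_w8a8mix_mtp \
    --gpu-memory-utilization 0.92 \
    --speculative-config '{"num_speculative_tokens": 1, "method": "deepseek_mtp"}' \
    --compilation-config '{"cudagraph_mode": "FULL_DECODE_ONLY"}' \
    --additional-config '{"torchair_graph_config":{"enabled":false}}'
```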
LGTM, thanks for your contribution!
### What this PR does / why we need it?

Refactor the DeepSeek-V3.1 tutorial.

- vLLM version: v0.11.2
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2

### Does this PR introduce any user-facing change?

### How was this patch tested?

Signed-off-by: 1092626063 <1092626063@qq.com>