
[Doc] Update DeepSeek V3.1/R1 2P1D doc #5387

Merged
MengqingCao merged 2 commits into vllm-project:main from dragondream-chen:main
Dec 27, 2025
Conversation

@dragondream-chen
Collaborator

@dragondream-chen dragondream-chen commented Dec 26, 2025

What this PR does / why we need it?

The PR updates the documentation for DeepSeek-V3.1 and DeepSeek-R1 in the scenario of prefill-decode disaggregation.

It updates several prefill-decode (PD) disaggregation parameters and recommended configurations. The updated launch script has been verified.

@github-actions
Contributor

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write a clear commit message and fill out the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Dec 26, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request updates the documentation for DeepSeek V3.1 and R1 models, primarily focusing on the 2P1D deployment configurations. The changes involve updating command-line parameters, replacing an embedded script with a link for better maintainability, and removing hardcoded version numbers. My review identified a critical issue in the data parallelism configuration for prefill nodes in DeepSeek-V3.1.md. The provided commands are inconsistent and would result in a non-functional deployment. I have included a detailed comment with a suggested fix for this issue. The other documentation changes appear correct and improve clarity.

Comment on lines +581 to +583
python launch_online_dp.py --dp-size 2 --tp-size 8 --dp-size-local 2 --dp-rank-start 0 --dp-address 141.xx.xx.1 --dp-rpc-port 12321 --vllm-start-port 7100
# p1
python launch_dp_program.py --dp-size 2 --tp-size 8 --dp-size-local 2 --dp-rank-start 0 --dp-address 141.xx.xx.2 --dp-rpc-port 12321 --vllm-start-port 7100
python launch_online_dp.py --dp-size 2 --tp-size 8 --dp-size-local 2 --dp-rank-start 0 --dp-address 141.xx.xx.2 --dp-rpc-port 12321 --vllm-start-port 7100
Contributor


critical

The data parallelism configuration for the prefill nodes (p0, p1) appears to be incorrect and will likely lead to errors.

Here are the inconsistencies:

  1. dp-size vs. number of workers: dp-size is set to 2, but with dp-size-local=2 on two nodes (p0 and p1), you are attempting to launch 4 workers in total. The total number of workers should match dp-size.
  2. Rank Conflict: dp-rank-start is 0 for both nodes. This will cause data parallel rank conflicts as both nodes will try to launch workers with ranks 0 and 1.
  3. DP Master Address: dp-address is different for p0 and p1. For a single data parallel group, all workers should point to the same master address.
  4. Connector Config: The kv-transfer-config in the run_dp_template.sh for prefill nodes specifies "dp_size": 2, which is inconsistent with launching 4 workers.

To fix this for a 2-node, 4-worker prefill setup, you should:

  • Use dp-size=4.
  • Assign unique ranks (e.g., dp-rank-start=0 for p0, dp-rank-start=2 for p1).
  • Use a single master dp-address.
  • Update the dp_size to 4 in kv-transfer-config in the run_dp_template.sh scripts.

I've provided a code suggestion to correct the launch commands, assuming 141.xx.xx.1 is the master. You will also need to update the dp_size to 4 in the prefill section of kv_connector_extra_config within the run_dp_template.sh scripts for both prefill nodes.

Suggested change
-python launch_online_dp.py --dp-size 2 --tp-size 8 --dp-size-local 2 --dp-rank-start 0 --dp-address 141.xx.xx.1 --dp-rpc-port 12321 --vllm-start-port 7100
-# p1
-python launch_dp_program.py --dp-size 2 --tp-size 8 --dp-size-local 2 --dp-rank-start 0 --dp-address 141.xx.xx.2 --dp-rpc-port 12321 --vllm-start-port 7100
-python launch_online_dp.py --dp-size 2 --tp-size 8 --dp-size-local 2 --dp-rank-start 0 --dp-address 141.xx.xx.2 --dp-rpc-port 12321 --vllm-start-port 7100
+python launch_online_dp.py --dp-size 4 --tp-size 8 --dp-size-local 2 --dp-rank-start 0 --dp-address 141.xx.xx.1 --dp-rpc-port 12321 --vllm-start-port 7100
+# p1
+python launch_online_dp.py --dp-size 4 --tp-size 8 --dp-size-local 2 --dp-rank-start 2 --dp-address 141.xx.xx.1 --dp-rpc-port 12321 --vllm-start-port 7100
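The consistency rules the review describes (total local workers must equal dp-size, rank ranges must not overlap and must cover 0..dp_size-1, and every node must point at the same DP master address) can be sketched as a small validation helper. This is a hypothetical illustration, not part of vllm-ascend or its launch scripts; the dict keys simply mirror the CLI flags used in the commands above.

```python
# Hypothetical sketch: check that per-node DP launch configs are mutually
# consistent, following the rules in the review comment above.
# Keys mirror the CLI flags: --dp-size, --dp-size-local, --dp-rank-start,
# --dp-address.

def check_dp_configs(nodes):
    """Return a list of human-readable inconsistency messages (empty = OK)."""
    errors = []
    # Rule: every node must agree on the global dp-size.
    if len({n["dp_size"] for n in nodes}) != 1:
        errors.append("all nodes must agree on dp-size")
    # Rule: sum of local workers must equal the global dp-size.
    total_workers = sum(n["dp_size_local"] for n in nodes)
    if total_workers != nodes[0]["dp_size"]:
        errors.append(
            f"dp-size ({nodes[0]['dp_size']}) != total local workers ({total_workers})"
        )
    # Rule: rank ranges must not overlap and must cover 0..dp_size-1.
    ranks = []
    for n in nodes:
        ranks.extend(range(n["dp_rank_start"], n["dp_rank_start"] + n["dp_size_local"]))
    if len(ranks) != len(set(ranks)):
        errors.append("dp-rank ranges overlap across nodes")
    if sorted(ranks) != list(range(len(ranks))):
        errors.append("dp ranks must cover 0..dp_size-1 without gaps")
    # Rule: a single DP group has a single master address.
    if len({n["dp_address"] for n in nodes}) != 1:
        errors.append("all nodes must point at the same dp-address (DP master)")
    return errors

# The original p0/p1 commands from the doc (flagged as broken by the review):
bad = [
    {"dp_size": 2, "dp_size_local": 2, "dp_rank_start": 0, "dp_address": "141.xx.xx.1"},
    {"dp_size": 2, "dp_size_local": 2, "dp_rank_start": 0, "dp_address": "141.xx.xx.2"},
]
# The suggested fix (4 workers, unique ranks, one master):
good = [
    {"dp_size": 4, "dp_size_local": 2, "dp_rank_start": 0, "dp_address": "141.xx.xx.1"},
    {"dp_size": 4, "dp_size_local": 2, "dp_rank_start": 2, "dp_address": "141.xx.xx.1"},
]
```

Running the helper against `bad` reports all four inconsistencies from the review (worker count, rank overlap, rank gaps, mismatched master address), while `good` passes cleanly.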

@dragondream-chen dragondream-chen force-pushed the main branch 2 times, most recently from 575970f to f299b34 Compare December 26, 2025 09:09
@dragondream-chen dragondream-chen force-pushed the main branch 2 times, most recently from 748455f to 8e7dbe6 Compare December 26, 2025 10:42
Signed-off-by: chenmenglong <chenmenglong1@huawei.com>
@MengqingCao MengqingCao merged commit b8b5521 into vllm-project:main Dec 27, 2025
6 of 8 checks passed
845473182 pushed a commit to 845473182/vllm-ascend that referenced this pull request Dec 29, 2025
…to eplb_refactor

* 'main' of https://github.com/vllm-project/vllm-ascend: (46 commits)
  [Feature] Support to use fullgraph with eagle (vllm-project#5118)
  [EPLB][refactor] Modification of the initialization logic for expert_map and log2phy(depend on pr5285) (vllm-project#5311)
  [Refactor]6/N Extract common code of class AscendMLAImpl (vllm-project#5314)
  [Refactor] cache cos/sin in mla & remove parameter model in builder. (vllm-project#5277)
  update vllm pin to 12.27 (vllm-project#5412)
  [ReleaseNote] Add release note for v0.13.0rc1 (vllm-project#5334)
  [Bugfix] Correctly handle the output shape in multimodal attention (vllm-project#5443)
  Fix nightly (vllm-project#5413)
  [bugfix] fix typo of _skip_all_reduce_across_dp_group (vllm-project#5435)
  [Doc]modify pcp tutorial doc (vllm-project#5440)
  [Misc] fast fail for exiting if tools/install_flash_infer_attention_score_ops_a2.sh (vllm-project#5422)
  [Doc] Update DeepSeek V3.1/R1 2P1D doc (vllm-project#5387)
  [DOC]Fix model weight download links (vllm-project#5436)
  [Doc] Modify DeepSeek-R1/V3.1 documentation (vllm-project#5426)
  Revert "[feat] enable hierarchical mc2 ops on A2 by default (vllm-project#5300)" (vllm-project#5434)
  [Bugfix] fix greedy temperature detection (vllm-project#5417)
  [doc] Update Qwen3-235B doc for reproducing latest performance (vllm-project#5323)
  [feat] enable hierarchical mc2 ops on A2 by default (vllm-project#5300)
  [Doc] delete environment variable HCCL_OP_EXPANSION_MODE in DeepSeekV3.1/R1 (vllm-project#5419)
  [Doc] add long_sequence feature user guide (vllm-project#5343)
  ...
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
### What this PR does / why we need it?
The PR updates the documentation for DeepSeek-V3.1 and DeepSeek-R1 in
the scenario of prefill-decode disaggregation.

Updated some PD separation-related setting parameters and optimal
configurations. This script has been verified.

- vLLM version: release/v0.13.0
- vLLM main:
vllm-project/vllm@bc0a5a0

Signed-off-by: chenmenglong <chenmenglong1@huawei.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026

Labels

documentation Improvements or additions to documentation

3 participants