
[CI] Add multi-nodes longseq configs of DeepSeek-R1-W8A8 & Qwen3-235B-W8A8 #5381

Merged
MengqingCao merged 10 commits into vllm-project:main from dsxsteven:12_26_add_longseq_nightly
Jan 4, 2026

Conversation

@dsxsteven (Contributor) commented Dec 26, 2025

What this PR does / why we need it?

Add DeepSeek-R1-W8A8 and Qwen3-235B-W8A8 configs for the multi-node, long-sequence (PCP & DCP) scenario.

Does this PR introduce any user-facing change?

NO

How was this patch tested?

Signed-off-by: daishixun <dsxsteven@sina.com>
@github-actions (Contributor)

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling out the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

@gemini-code-assist bot left a comment

Code Review

This pull request introduces new multi-node long-sequence test configurations for DeepSeek-R1-W8A8 and Qwen3-235B-W8A8. The changes are straightforward additions of YAML configuration files. My review has identified two potential issues: one is a likely incorrect engine_id in the DeepSeek configuration, and the other is an incomplete benchmarks section in the Qwen3 configuration. Both issues could lead to incorrect or ineffective testing and should be addressed.

'{"kv_connector": "MooncakeConnectorV1",
"kv_role": "kv_consumer",
"kv_port": "30200",
"engine_id": "2",
Contributor

high

The engine_id for the consumer node is set to "2", which appears to be incorrect for a 2-node setup. In a producer-consumer configuration, engine IDs are typically sequential, starting from "0". The accompanying Qwen3-235B-W8A8-longseq.yaml config uses "0" and "1", which is the expected pattern. Please correct this to "1" to ensure proper node communication.

        "engine_id": "1",
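To illustrate the pairing the reviewer is describing, here is a minimal sketch of how the two nodes' connector configs would line up in a 2-node producer/consumer setup. Only the field names come from the snippet above; the `kv_producer` role value and the port numbers are illustrative assumptions, not taken from the PR:

```yaml
# Hypothetical sketch of a 2-node disaggregated setup.
# Field names follow the snippet above; values are illustrative.
# Node 0 (producer side):
kv_transfer_config: '{"kv_connector": "MooncakeConnectorV1",
                      "kv_role": "kv_producer",
                      "kv_port": "30100",
                      "engine_id": "0"}'
# Node 1 (consumer side):
kv_transfer_config: '{"kv_connector": "MooncakeConnectorV1",
                      "kv_role": "kv_consumer",
                      "kv_port": "30200",
                      "engine_id": "1"}'
```

The point of the suggestion is simply that the two engine IDs should be consecutive and start from "0", as in the accompanying Qwen3-235B-W8A8-longseq.yaml.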

}
}
}'
benchmarks:
Contributor

high

The benchmarks section is empty. This will result in the performance and accuracy tests for this model being skipped, making the test configuration ineffective. Please provide the necessary benchmark configurations for perf and acc, similar to the DeepSeek-R1-W8A8-longseq.yaml file.
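For reference, a filled-in `benchmarks` section could look like the sketch below. This is a hypothetical shape only; the case names, fields, and values are assumptions, and the real keys should be copied from DeepSeek-R1-W8A8-longseq.yaml as the review suggests:

```yaml
# Hypothetical sketch; mirror the actual schema used in
# DeepSeek-R1-W8A8-longseq.yaml rather than this example.
benchmarks:
  perf:
    - case_type: performance
      max_concurrency: 4
      num_prompts: 16
  acc:
    - case_type: accuracy
      dataset: gsm8k
      baseline: 0.95
```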

Collaborator

@dsxsteven is this skip expected?

Contributor Author

yes

--max-num-seqs 4
--max-model-len 32768
--max-num-batched-tokens 16384
--trust-remote-code
Collaborator

Subsequent cases need to be supplemented for TP asymmetry.

Contributor Author

TODO after #5224 is merged.

Signed-off-by: daishixun <dsxsteven@sina.com>
@dsxsteven (Contributor Author) commented:

Local successful test results

Signed-off-by: daishixun <dsxsteven@sina.com>
@weiguihua2 added and then removed the ready (read for review) and ready-for-test (start test by label for PR) labels on Dec 30, 2025
--trust-remote-code
--no-enable-prefix-caching
--gpu-memory-utilization 0.9
--compilation_config '{"cudagraph_capture_sizes":[1,2,4,8,16,32], "cudagraph_mode": "FULL_DECODE_ONLY"}'
Collaborator

why do we need to specify "cudagraph_capture_sizes":[1,2,4,8,16,32] here?

Contributor Author

fixed
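For context, once the explicit capture sizes are dropped (letting vLLM choose its defaults), the flag presumably reduces to something like the sketch below. This is an assumption about the fix; the final config is not shown in the thread:

```yaml
# Hypothetical simplified form of the flag after the fix:
# --compilation_config '{"cudagraph_mode": "FULL_DECODE_ONLY"}'
```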

Signed-off-by: daishixun <dsxsteven@sina.com>
@MengqingCao MengqingCao merged commit 3c7e6c6 into vllm-project:main Jan 4, 2026
16 checks passed
wjunLu pushed a commit to wjunLu/vllm-ascend that referenced this pull request Jan 4, 2026
…-W8A8 (vllm-project#5381)

### What this PR does / why we need it?
add DeepSeek-R1-W8A8 and Qwen3-235B-W8A8 configs in multi-nodes and
longseq (PCP&DCP) scenario

- vLLM version: release/v0.13.0
- vLLM main:
vllm-project/vllm@bc0a5a0
---------
Signed-off-by: daishixun <dsxsteven@sina.com>
wjunLu pushed a commit to wjunLu/vllm-ascend that referenced this pull request Jan 4, 2026
@dsxsteven dsxsteven deleted the 12_26_add_longseq_nightly branch January 6, 2026 08:02
Rozwel-dx pushed a commit to Rozwel-dx/vllm-ascend that referenced this pull request Jan 8, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026

Labels

ci/build, module:tests, ready (read for review), ready-for-test (start test by label for PR)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants