
[Nightly][Refactor] Migrate nightly single-node model tests from `.py` to `.yaml` #6503

Merged
wangxiyuan merged 5 commits into vllm-project:main from MrZ20:single_nightly on Mar 3, 2026
Conversation

@MrZ20
Contributor

@MrZ20 MrZ20 commented Feb 3, 2026

What this PR does / why we need it?

This PR refactors the nightly single-node model test by migrating test configurations from Python scripts to a more maintainable YAML-based format.

| Original PR | Python (`.py`) | YAML (`.yaml`) |
| :--- | :--- | :--- |
| #3568 | `test_deepseek_r1_0528_w8a8_eplb.py` | `DeepSeek-R1-0528-W8A8.yaml` |
| #3631 | `test_deepseek_r1_0528_w8a8.py` | `DeepSeek-R1-0528-W8A8.yaml` |
| #5874 | `test_deepseek_r1_w8a8_hbm.py` | `DeepSeek-R1-W8A8-HBM.yaml` |
| #3908 | `test_deepseek_v3_2_w8a8.py` | `DeepSeek-V3.2-W8A8.yaml` |
| #5682 | `test_kimi_k2_thinking.py` | `Kimi-K2-Thinking.yaml` |
| #4111 | `test_mtpx_deepseek_r1_0528_w8a8.py` | `MTPX-DeepSeek-R1-0528-W8A8.yaml` |
| #3733 | `test_prefix_cache_deepseek_r1_0528_w8a8.py` | `Prefix-Cache-DeepSeek-R1-0528-W8A8.yaml` |
| #6543 | `test_qwen3_235b_w8a8.py` | `Qwen3-235B-A22B-W8A8.yaml` |
| #6543 | `test_qwen3_235b_a22b_w8a8_eplb.py` | `Qwen3-235B-A22B-W8A8.yaml` |
| #3973 | `test_qwen3_30b_w8a8.py` | `Qwen3-30B-A3B-W8A8.yaml` |
| #3541 | `test_qwen3_32b_int8.py` | `Qwen3-32B-Int8.yaml` |
| #3757 | `test_qwq_32b.py` | `QwQ-32B.yaml` |
| #5616 | `test_qwen3_next_w8a8.py` | `Qwen3-Next-80B-A3B-Instruct-W8A8.yaml` |
| #3541 | `test_qwen2_5_vl_7b.py` | `Qwen2.5-VL-7B-Instruct.yaml` |
| #5301 | `test_qwen2_5_vl_7b_epd.py` | `Qwen2.5-VL-7B-Instruct-EPD.yaml` |
| #3707 | `test_qwen2_5_vl_32b.py` | `Qwen2.5-VL-32B-Instruct.yaml` |
| #3676 | `test_qwen3_32b_int8_a3_feature_stack3.py` | `Qwen3-32B-Int8-A3-Feature-Stack3.yaml` |
| #3709 | `test_prefix_cache_qwen3_32b_int8.py` | `Prefix-Cache-Qwen3-32B-Int8.yaml` |
| #5395 | `test_qwen3_next.py` | `Qwen3-Next-80B-A3B-Instruct-A2.yaml` |
| #3474 | `test_qwen3_32b.py` | `Qwen3-32B.yaml` |
| #3541 | `test_qwen3_32b_int8.py` | `Qwen3-32B-Int8-A2.yaml` |

Does this PR introduce any user-facing change?

How was this patch tested?

@github-actions
Contributor

github-actions bot commented Feb 3, 2026

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message and fill out the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

@gemini-code-assist
Contributor

Summary of Changes

Hello @MrZ20, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the existing nightly single-node model tests by transitioning their setup from Python code to a declarative YAML format. This change aims to streamline the definition and management of test parameters, making it simpler to configure and execute various model tests. The new structure enhances maintainability and allows for easier extension of test scenarios without modifying core test logic.

Highlights

  • YAML Configuration for Nightly Tests: Migrated nightly single-node model test configurations from Python scripts to a more maintainable YAML-based format, improving readability and ease of management.
  • New Configuration Loader: Introduced SingleNodeConfigLoader in single_node_config.py to parse and validate the new YAML test configurations, abstracting the configuration loading logic.
  • Updated Test Execution: The test_single_node.py script now dynamically loads test parameters and environment variables from the YAML configuration, enabling flexible test execution for different models and benchmarks.
  • DeepSeek Model Test Added: Added a new YAML configuration file for the DeepSeek-R1-0528-W8A8 model, including specific server commands, environment variables, and benchmark settings for accuracy testing with Expert Parallel Load Balancing (EPLB).
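The highlights above describe a declarative per-model config file. A minimal sketch of what such a YAML file might contain — every field name here is an illustrative assumption, not the actual vllm-ascend schema:

```yaml
# Hypothetical single-node nightly test config (field names are assumptions,
# loosely modeled on the DeepSeek-R1-0528-W8A8 config described above)
model: DeepSeek-R1-0528-W8A8
env:
  MODEL_PATH: /weights/deepseek-r1-0528-w8a8
server_cmd: >
  vllm serve $MODEL_PATH --tensor-parallel-size 16
benchmarks:
  - type: accuracy
    dataset: gsm8k
```

A config like this lets a new model be added to the nightly suite by dropping in one file, with no change to the shared test logic.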


Changelog
  • tests/e2e/nightly/single_node/config/DeepSeek-R1-0528-W8A8-EPLB.yaml
    • Added a new YAML configuration file for the DeepSeek-R1-0528-W8A8 model, defining its server command, common environment variables, and accuracy benchmark settings, including Expert Parallel Load Balancing (EPLB) configurations.
  • tests/e2e/nightly/single_node/scripts/__init__.py
    • Added an empty __init__.py file to designate the scripts directory as a Python package.
  • tests/e2e/nightly/single_node/scripts/single_node_config.py
    • Added a new Python module containing SingleNodeConfig and SingleNodeConfigLoader classes.
    • SingleNodeConfig encapsulates model, environment variables, server commands, and benchmark commands.
    • SingleNodeConfigLoader provides methods to load, parse, and validate test configurations from YAML files, including environment variable expansion within commands.
  • tests/e2e/nightly/single_node/scripts/test_single_node.py
    • Modified the test_single_node function to utilize SingleNodeConfigLoader.from_yaml() to load test configurations.
    • The test now dynamically uses the model, server command, server port, and environment variables defined in the loaded YAML configuration.
    • Integrated aisbench test cases (accuracy and performance) based on the configurations provided in the YAML file.
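The changelog describes SingleNodeConfigLoader as loading, validating, and env-expanding YAML configs. A rough sketch of that pattern follows — this is not the actual vllm-ascend code; the class and field names are assumptions, and the loader takes an already-parsed dict (e.g. the output of `yaml.safe_load`) to stay self-contained:

```python
import os
from dataclasses import dataclass, field


@dataclass
class SingleNodeConfig:
    """Hypothetical container mirroring the fields described in the changelog."""
    model: str
    server_cmd: str
    env: dict = field(default_factory=dict)
    benchmarks: list = field(default_factory=list)


def load_config(raw: dict) -> SingleNodeConfig:
    """Validate required keys and expand $VARS in the server command."""
    for key in ("model", "server_cmd"):
        if key not in raw:
            raise ValueError(f"missing required key: {key}")
    env = raw.get("env", {})
    # Export the config's env vars so expandvars can resolve them below.
    os.environ.update({k: str(v) for k, v in env.items()})
    return SingleNodeConfig(
        model=raw["model"],
        server_cmd=os.path.expandvars(raw["server_cmd"]),
        env=env,
        benchmarks=raw.get("benchmarks", []),
    )
```

A test harness built on this could then parametrize one generic `test_single_node` over every file in the config directory, which is the maintainability win the PR is after.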
Ignored Files
  • Ignored by pattern: .github/workflows/** (2)
    • .github/workflows/_e2e_nightly_single_node_yaml.yaml
    • .github/workflows/schedule_nightly_test_a3.yaml

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request successfully migrates the nightly single-node model test configuration to a YAML-based format, improving maintainability. However, there are a couple of critical issues related to the default configuration loading and a high-severity inconsistency in configuration validation that need to be addressed to ensure the new YAML configuration is correctly utilized and validated.

Comment thread tests/e2e/nightly/single_node/scripts/single_node_config.py Outdated
Comment thread tests/e2e/nightly/single_node/scripts/test_single_node.py Outdated
Comment thread tests/e2e/nightly/single_node/scripts/single_node_config.py Outdated
@github-actions
Contributor

github-actions bot commented Feb 6, 2026

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@MrZ20 MrZ20 force-pushed the single_nightly branch 6 times, most recently from eeb4598 to f8fd663 Compare March 2, 2026 07:46
@MrZ20 MrZ20 marked this pull request as ready for review March 2, 2026 08:12
@MrZ20 MrZ20 requested review from Yikun and wangxiyuan as code owners March 2, 2026 08:12
Member

@Yikun Yikun left a comment


Looks a great improvement

```yaml
- name: Checkout PR 6503
  working-directory: /vllm-workspace/vllm-ascend
  run: |
    echo "Fetching PR 6503..."
```
Member


Why is this needed? Any plan to remove it?

Contributor Author


Currently under testing; it will be removed before merging.

@github-actions
Contributor

github-actions bot commented Mar 3, 2026

This pull request has conflicts, please resolve those before we can evaluate the pull request.

1 similar comment

MrZ20 added 3 commits March 3, 2026 10:33

Commit messages, in order (each Signed-off-by: MrZ20 <2609716663@qq.com>):

  • v1
  • add test
  • fix
  • add port diy
  • add fun diy
  • add pd func
  • refactor
  • start nightly test
  • start nightly test 2
  • fix (×3)
  • test
  • fix (×2)

MrZ20 added 2 commits March 3, 2026 10:38
@wangxiyuan wangxiyuan merged commit 859f2c2 into vllm-project:main Mar 3, 2026
25 checks passed
@MrZ20 MrZ20 deleted the single_nightly branch March 4, 2026 06:05
845473182 pushed a commit to 845473182/vllm-ascend that referenced this pull request Mar 5, 2026
…to qwen3next_graph

* 'main' of https://github.com/vllm-project/vllm-ascend: (40 commits)
  [Feature] Add docs of batch invariance and make some extra operators patch (vllm-project#6910)
  [bugfix]Qwen2.5VL accurate question (vllm-project#6975)
  [CI] Add DeepSeek-V3.2 large EP nightly ci (vllm-project#6378)
  [Ops][BugFix] Fix RoPE shape mismatch for mtp models with flashcomm v1 enabled (vllm-project#6939)
  [bugfix]fix file not found error in nightly of single-node (vllm-project#6976)
  [Bugfix] Fix the acceptance rates dorp issue when applying eagle3 to QuaRot model (vllm-project#6914)
  [CI] Enable auto upgrade e2e estimated time for auto-partition suites (vllm-project#6840)
  [Doc][Misc] Fix msprobe_guide.md documentation issues (vllm-project#6965)
  [Nightly][Refactor]Migrate nightly single-node model tests from `.py` to `.yaml` (vllm-project#6503)
  [BugFix] Improve GDN layer detection for multimodal models (vllm-project#6941)
  [feat]ds3.2 pcp support mtp and chunkprefill (vllm-project#6917)
  [CPU binding] Implement global CPU slicing and improve IRQ binding for Ascend NPUs (vllm-project#6945)
  [Triton] Centralize Ascend extension op dispatch in triton_utils (vllm-project#6937)
  [csrc][bugfix] Add compile-time Ascend950/910_95 compatibility for custom ops between CANN8.5 and 9.0 (vllm-project#6936)
  [300I][Bugfix] fix unquant model weight nd2nz error (vllm-project#6851)
  [doc] fix supported_models (vllm-project#6930)
  [CI] nightly test timeout (vllm-project#6912)
  [CI] Upgrade CANN to 8.5.1 (vllm-project#6897)
  [Model]Add Qwen3-Omni quantization Ascend NPU adaptation and optimization (vllm-project#6828)
  [P/D][v0.16.0]Adapt to RecomputeScheduler in vLLM 0.16.0 (vllm-project#6898)
  ...
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026

… to `.yaml` (vllm-project#6503)

- vLLM version: v0.15.0
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.15.0

Signed-off-by: MrZ20 <2609716663@qq.com>