[Nightly][Refactor] Migrate nightly single-node model tests from `.py` to `.yaml` (#6503)

wangxiyuan merged 5 commits into vllm-project:main

Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.
Summary of Changes

Hello @MrZ20, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request refactors the existing nightly single-node model tests by transitioning their setup from Python code to a declarative YAML format. This change aims to streamline the definition and management of test parameters, making it simpler to configure and execute the various model tests. The new structure enhances maintainability and allows test scenarios to be extended more easily without modifying the core test logic.
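As a rough illustration of what such a declarative test definition can look like, a hypothetical config might resemble the sketch below; the field names are illustrative assumptions, not the actual schema introduced by this PR:

```yaml
# Hypothetical single-node model test config; all keys are illustrative,
# not the actual vllm-ascend schema.
model_name: Qwen3-32B
tensor_parallel_size: 2
max_model_len: 4096
accuracy:
  dataset: gsm8k
  baseline: 0.85
```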
Code Review
The pull request successfully migrates the nightly single-node model test configuration to a YAML-based format, improving maintainability. However, there are two critical issues with the default configuration loading, and a high-severity inconsistency in configuration validation, that need to be addressed to ensure the new YAML configuration is correctly loaded and validated.
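To make the "configuration validation" point concrete: a YAML-driven harness typically checks each loaded config for required fields before launching a test. The sketch below is a minimal, hypothetical illustration; the key names and error handling are assumptions, not the actual vllm-ascend schema:

```python
# Hypothetical validation step for a YAML-driven test harness.
# REQUIRED_KEYS and the error message are illustrative assumptions.

REQUIRED_KEYS = {"model_name", "tensor_parallel_size", "max_model_len"}


def validate_config(config: dict) -> dict:
    """Raise ValueError if any required key is missing; return the config."""
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"config missing required keys: {sorted(missing)}")
    return config


if __name__ == "__main__":
    cfg = {
        "model_name": "Qwen3-32B",
        "tensor_parallel_size": 2,
        "max_model_len": 4096,
    }
    print(validate_config(cfg)["model_name"])
```

Validating every config at load time, rather than failing mid-run, is the usual reason reviewers flag inconsistent validation in this kind of migration.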
This pull request has conflicts, please resolve those before we can evaluate the pull request.
```yaml
- name: Checkout PR 6503
  working-directory: /vllm-workspace/vllm-ascend
  run: |
    echo "Fetching PR 6503..."
```
Why is this needed? Any plan to remove it?

Currently under testing; it will be removed before merging.
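For context, a temporary step like this usually fetches the PR's head ref so CI runs against the in-flight changes. A hedged sketch of what the full step might look like (the exact commands in the actual step are not shown in this diff; GitHub exposes each PR's head as the `pull/<N>/head` ref):

```yaml
# Illustrative only: check out a PR head ref inside the workflow container.
- name: Checkout PR 6503
  working-directory: /vllm-workspace/vllm-ascend
  run: |
    echo "Fetching PR 6503..."
    git fetch origin pull/6503/head:pr-6503
    git checkout pr-6503
```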
Commit messages (each signed off by MrZ20 <2609716663@qq.com>):
- add port diy
- add fun diy
- add pd func
- refactor
- start nightly test
- start nightly test 2
- fix
- fix
- fix
- test
- fix
- fix
…to qwen3next_graph * 'main' of https://github.com/vllm-project/vllm-ascend: (40 commits)
- [Feature] Add docs of batch invariance and make some extra operators patch (vllm-project#6910)
- [bugfix] Qwen2.5VL accurate question (vllm-project#6975)
- [CI] Add DeepSeek-V3.2 large EP nightly ci (vllm-project#6378)
- [Ops][BugFix] Fix RoPE shape mismatch for mtp models with flashcomm v1 enabled (vllm-project#6939)
- [bugfix] fix file not found error in nightly of single-node (vllm-project#6976)
- [Bugfix] Fix the acceptance rates dorp issue when applying eagle3 to QuaRot model (vllm-project#6914)
- [CI] Enable auto upgrade e2e estimated time for auto-partition suites (vllm-project#6840)
- [Doc][Misc] Fix msprobe_guide.md documentation issues (vllm-project#6965)
- [Nightly][Refactor] Migrate nightly single-node model tests from `.py` to `.yaml` (vllm-project#6503)
- [BugFix] Improve GDN layer detection for multimodal models (vllm-project#6941)
- [feat] ds3.2 pcp support mtp and chunkprefill (vllm-project#6917)
- [CPU binding] Implement global CPU slicing and improve IRQ binding for Ascend NPUs (vllm-project#6945)
- [Triton] Centralize Ascend extension op dispatch in triton_utils (vllm-project#6937)
- [csrc][bugfix] Add compile-time Ascend950/910_95 compatibility for custom ops between CANN8.5 and 9.0 (vllm-project#6936)
- [300I][Bugfix] fix unquant model weight nd2nz error (vllm-project#6851)
- [doc] fix supported_models (vllm-project#6930)
- [CI] nightly test timeout (vllm-project#6912)
- [CI] Upgrade CANN to 8.5.1 (vllm-project#6897)
- [Model] Add Qwen3-Omni quantization Ascend NPU adaptation and optimization (vllm-project#6828)
- [P/D][v0.16.0] Adapt to RecomputeScheduler in vLLM 0.16.0 (vllm-project#6898)
- ...
… to `.yaml` (vllm-project#6503)

### What this PR does / why we need it?

This PR refactors the nightly single-node model tests by migrating test configurations from Python scripts to a more maintainable YAML-based format.

| Original PR | Python (`.py`) | YAML (`.yaml`) |
| :--- | :--- | :--- |
| [vllm-project#3568](vllm-project#3568) | `test_deepseek_r1_0528_w8a8_eplb.py` | `DeepSeek-R1-0528-W8A8.yaml` |
| [vllm-project#3631](vllm-project#3631) | `test_deepseek_r1_0528_w8a8.py` | `DeepSeek-R1-0528-W8A8.yaml` |
| [vllm-project#5874](vllm-project#5874) | `test_deepseek_r1_w8a8_hbm.py` | `DeepSeek-R1-W8A8-HBM.yaml` |
| [vllm-project#3908](vllm-project#3908) | `test_deepseek_v3_2_w8a8.py` | `DeepSeek-V3.2-W8A8.yaml` |
| [vllm-project#5682](vllm-project#5682) | `test_kimi_k2_thinking.py` | `Kimi-K2-Thinking.yaml` |
| [vllm-project#4111](vllm-project#4111) | `test_mtpx_deepseek_r1_0528_w8a8.py` | `MTPX-DeepSeek-R1-0528-W8A8.yaml` |
| [vllm-project#3733](vllm-project#3733) | `test_prefix_cache_deepseek_r1_0528_w8a8.py` | `Prefix-Cache-DeepSeek-R1-0528-W8A8.yaml` |
| [vllm-project#6543](vllm-project#6543) | `test_qwen3_235b_w8a8.py` | `Qwen3-235B-A22B-W8A8.yaml` |
| [vllm-project#6543](vllm-project#6543) | `test_qwen3_235b_a22b_w8a8_eplb.py` | `Qwen3-235B-A22B-W8A8.yaml` |
| [vllm-project#3973](vllm-project#3973) | `test_qwen3_30b_w8a8.py` | `Qwen3-30B-A3B-W8A8.yaml` |
| [vllm-project#3541](vllm-project#3541) | `test_qwen3_32b_int8.py` | `Qwen3-32B-Int8.yaml` |
| [vllm-project#3757](vllm-project#3757) | `test_qwq_32b.py` | `QwQ-32B.yaml` |
| [vllm-project#5616](vllm-project#5616) | `test_qwen3_next_w8a8.py` | `Qwen3-Next-80B-A3B-Instruct-W8A8.yaml` |
| [vllm-project#3541](vllm-project#3541) | `test_qwen2_5_vl_7b.py` | `Qwen2.5-VL-7B-Instruct.yaml` |
| [vllm-project#5301](vllm-project#5301) | `test_qwen2_5_vl_7b_epd.py` | `Qwen2.5-VL-7B-Instruct-EPD.yaml` |
| [vllm-project#3707](vllm-project#3707) | `test_qwen2_5_vl_32b.py` | `Qwen2.5-VL-32B-Instruct.yaml` |
| [vllm-project#3676](vllm-project#3676) | `test_qwen3_32b_int8_a3_feature_stack3.py` | `Qwen3-32B-Int8-A3-Feature-Stack3.yaml` |
| [vllm-project#3709](vllm-project#3709) | `test_prefix_cache_qwen3_32b_int8.py` | `Prefix-Cache-Qwen3-32B-Int8.yaml` |
| [vllm-project#5395](vllm-project#5395) | `test_qwen3_next.py` | `Qwen3-Next-80B-A3B-Instruct-A2.yaml` |
| [vllm-project#3474](vllm-project#3474) | `test_qwen3_32b.py` | `Qwen3-32B.yaml` |
| [vllm-project#3541](vllm-project#3541) | `test_qwen3_32b_int8.py` | `Qwen3-32B-Int8-A2.yaml` |

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.15.0
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.15.0

Signed-off-by: MrZ20 <2609716663@qq.com>
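With one YAML file per model as in the mapping above, a test harness typically discovers configs by globbing the config directory rather than hard-coding file names. A minimal stdlib-only sketch of that pattern (the directory layout and function name are hypothetical, not the actual vllm-ascend code):

```python
# Hypothetical discovery step for a YAML-driven test runner; names are
# illustrative only. Each .yaml file becomes one test case.
from pathlib import Path


def discover_configs(config_dir: str) -> list[str]:
    """Return the sorted names of all .yaml model configs in config_dir."""
    return sorted(p.name for p in Path(config_dir).glob("*.yaml"))


if __name__ == "__main__":
    import tempfile

    # Simulate a config directory containing two model configs and a README.
    with tempfile.TemporaryDirectory() as d:
        for name in ("Qwen3-32B.yaml", "QwQ-32B.yaml", "README.md"):
            (Path(d) / name).write_text("")
        print(discover_configs(d))  # only the .yaml files, sorted
```

A runner built this way picks up a new model test by adding one YAML file, with no changes to the Python test logic, which is the maintainability win this PR describes.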