[Test] add stability test case for wan2.2, qwen-tts, qwen3-omni and qwen-image model and modified conftest.py in test/dfx/ #2817
Merged
hsliuustc0106 merged 39 commits on Apr 23, 2026
…ions and scripts
- Introduced new stability test scripts for Qwen3-Omni and Wan2.2 models, including `test_stability_qwen3_omni.py` and `test_stability_wan22.py`.
- Added corresponding JSON configuration files for both models to define benchmark parameters.
- Updated existing documentation to reflect changes in stability testing configurations and methods.
- Enhanced the `conftest.py` files to support new test structures and parameters.

These additions aim to improve the stability testing framework and provide comprehensive benchmarks for the new models.

Signed-off-by: zhumingjue <zhumingjue@huawei.com>
…mands and benchmark execution
- Added L5 stability testing commands for Qwen3-Omni and Wan2.2 models in the test guide.
- Introduced a new `run_benchmark` function in `conftest.py` to streamline benchmark execution and result handling.
- Refactored existing stability test scripts to utilize the new benchmark execution method, improving code organization and maintainability.

These updates aim to enhance the stability testing capabilities and provide clearer guidance for executing benchmarks.

Signed-off-by: zhumingjue <zhumingjue@huawei.com>
…8/vllm-omni into main-longterm-wan22
Signed-off-by: zhumingjue <zhumingjue@huawei.com>
Signed-off-by: zhumingjue <zhumingjue@huawei.com>
Collaborator
The test_wan22.json config only runs 3 prompts (num_prompts_per_batch=3) over 300 seconds with max_concurrency=1. Is this sufficient for a stability test? Consider increasing num_prompts_per_batch to get better coverage.
Signed-off-by: zhumingjue138 <zhumingjue@huawei.com>
…string in conftest.py Signed-off-by: zhumingjue <zhumingjue@huawei.com>
…server params creation and update OmniServer fixture to accommodate stage config paths. Signed-off-by: zhumingjue <zhumingjue@huawei.com>
… improve performance stability. Signed-off-by: zhumingjue <zhumingjue@huawei.com>
…s; introduce serve_args support for OmniServer fixture and streamline unique server params creation. Signed-off-by: zhumingjue <zhumingjue@huawei.com>
Signed-off-by: zhumingjue <zhumingjue@huawei.com>
Signed-off-by: zhumingjue <zhumingjue@huawei.com>
…iptions Signed-off-by: zhumingjue <zhumingjue@huawei.com>
Signed-off-by: zhumingjue <zhumingjue@huawei.com>
… gracefully. Updated paths to use 'deploy' directory instead of 'stage_configs'. Signed-off-by: zhumingjue <zhumingjue@huawei.com>
…bility_qwen3_omni.py to allow for longer initialization periods. Signed-off-by: zhumingjue <zhumingjue@huawei.com>
Signed-off-by: zhumingjue <zhumingjue@huawei.com>
…e stability test functionality. Signed-off-by: zhumingjue <zhumingjue@huawei.com>
…t.py and update test configurations
- Introduced functions to sample integer values from specified ranges and to handle bucket key sampling.
- Updated benchmark parameters in test_qwen3_omni.json to use range specifications for input and output lengths, and adjusted request rates.
- Changed dataset names from "random" to "random-mm" for clarity.

Signed-off-by: zhumingjue <zhumingjue@huawei.com>
…pecifications for bucket keys - Modified the bucket configuration from "(0, 60, 3)" to "(0, 1-60, 1-3)" for improved clarity and consistency with recent changes in sampling functionality. Signed-off-by: zhumingjue <zhumingjue@huawei.com>
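The range syntax in the bucket key can be read as follows: each component is either a fixed integer or a `low-high` range from which a value is drawn per batch. The sketch below illustrates that reading under stated assumptions: bounds are inclusive, values are non-negative, and sampling is uniform. `sample_int_spec` and `sample_bucket_key` are hypothetical names for illustration, not the functions added in this PR.

```python
import random


def sample_int_spec(spec, rng):
    """Return an int from a fixed value ("3" or 3) or an inclusive range ("1-60").

    Assumption: values are non-negative, so "-" only ever separates two bounds.
    """
    if isinstance(spec, int):
        return spec
    text = spec.strip()
    if "-" in text:
        low_text, high_text = text.split("-", 1)
        low, high = int(low_text), int(high_text)
        if low > high:
            raise ValueError(f"invalid range: {spec!r}")
        return rng.randint(low, high)  # randint includes both endpoints
    return int(text)


def sample_bucket_key(key, rng):
    """Turn a key such as "(0, 1-60, 1-3)" into a concrete tuple of ints."""
    parts = key.strip().strip("()").split(",")
    return tuple(sample_int_spec(part, rng) for part in parts)


rng = random.Random(0)  # seeded so that a run can be reproduced
concrete = sample_bucket_key("(0, 1-60, 1-3)", rng)
print(concrete)
```

A key without ranges, such as "(0, 60, 3)", passes through unchanged, so the old and new notations can share one code path.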
…y testing adjustments Signed-off-by: zhumingjue <zhumingjue@huawei.com>
…event ffmpeg encoding failures - Added logic to ensure that height and width are even numbers when the number of frames is greater than one, addressing potential encoding/decoding issues. Signed-off-by: zhumingjue <zhumingjue@huawei.com>
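For context, common video encoder settings (for example H.264 with 4:2:0 chroma subsampling) reject odd frame dimensions, which is why sampled sizes need adjusting for multi-frame outputs. A minimal sketch of the check described in this commit, assuming odd values are rounded up (the commit does not say whether it rounds up or down); `round_up_to_even` and `normalize_video_size` are hypothetical names:

```python
def round_up_to_even(value):
    """Return value unchanged if it is even, otherwise the next even integer."""
    return value + (value % 2)


def normalize_video_size(height, width, num_frames):
    """Force even height and width only for multi-frame (video) outputs."""
    if num_frames > 1:
        return round_up_to_even(height), round_up_to_even(width)
    # Single-frame (image) outputs are left untouched.
    return height, width


print(normalize_video_size(479, 853, 16))  # (480, 854)
print(normalize_video_size(479, 853, 1))   # (479, 853)
```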
…onfigurations
- Introduced new test scripts for Qwen-Image and Qwen3-TTS stability benchmarks, utilizing parameterized test cases to handle various server configurations and benchmark parameters.
- Updated the `_sample_stability_batch_params` function in `conftest.py` to include additional fields for width and height, enhancing the sampling capabilities for stability tests.

Signed-off-by: zhumingjue <zhumingjue@huawei.com>
- Introduced JSON test files for Qwen-Image and Qwen3-TTS, defining server and benchmark parameters for stability testing.
- Each test includes detailed configurations such as model specifications, dataset names, and various performance metrics.

Signed-off-by: zhumingjue <zhumingjue@huawei.com>
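To illustrate how a JSON file that pairs server parameters with benchmark parameters can drive parameterized test cases, here is a small sketch. The schema (`test_name`, `server_params`, `benchmark_params`), the model name, and the `build_cases` helper are all assumptions made for illustration; the actual files in this PR may be laid out differently.

```python
import json

# Hypothetical config: one server definition with two benchmark variants.
CONFIG_TEXT = """
[
  {
    "test_name": "example_case",
    "server_params": {"model": "example/model"},
    "benchmark_params": [
      {"dataset_name": "random", "num_prompts_per_batch": 3, "max_concurrency": 1},
      {"dataset_name": "random", "num_prompts_per_batch": 8, "max_concurrency": 2}
    ]
  }
]
"""


def build_cases(config_text):
    """Expand each config entry into (case_id, server_params, benchmark_params)."""
    cases = []
    for entry in json.loads(config_text):
        for index, bench in enumerate(entry["benchmark_params"]):
            case_id = f'{entry["test_name"]}-{index}'
            cases.append((case_id, entry["server_params"], bench))
    return cases


cases = build_cases(CONFIG_TEXT)
for case_id, server, bench in cases:
    print(case_id, server["model"], bench["max_concurrency"])
```

A list built this way could be passed to `pytest.mark.parametrize`, with the case ids supplied through its `ids` argument so that each server and benchmark combination shows up as a separately named test.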
- Removed redundant parameters and adjusted the random_range_ratio to 0.0 for improved stability testing.
- Updated random_mm_bucket_config to use range specifications for clarity.
- Cleaned up the JSON structure by eliminating unnecessary dataset entries.

Signed-off-by: zhumingjue <zhumingjue@huawei.com>
yenuo26 reviewed on Apr 21, 2026
os.environ.pop("BENCHMARK_DIR")

def _run_one_diffusion_batch(
Collaborator
I think these can be moved to helpers.py.
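A side note on the `os.environ.pop("BENCHMARK_DIR")` line in the snippet under review: called without a default, `pop` raises `KeyError` if the variable was never set. One way to make setup and cleanup symmetric, wherever the helpers end up living, is to scope the variable with a context manager. This is only a sketch; `temporary_env` is a hypothetical helper and the path is a placeholder, neither taken from the PR.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_env(name, value):
    """Set an environment variable for the duration of a block, then restore it."""
    previous = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        if previous is None:
            # pop with a default never raises, even if the block removed the key.
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous


with temporary_env("BENCHMARK_DIR", "/tmp/benchmark-results"):
    print(os.environ["BENCHMARK_DIR"])
```

Because the restore step runs in `finally`, the variable is cleaned up even when the benchmark inside the block raises.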
- Moved benchmark helper functions from `conftest.py` to `helpers.py` for better organization and clarity.
- Updated test scripts to import benchmark functions from `helpers.py`, ensuring a cleaner structure.
- Enhanced the documentation in `conftest.py` to reflect the new organization of helper functions.

Signed-off-by: zhumingjue <zhumingjue@huawei.com>
Contributor
Author
@Gaohan123 @hsliuustc0106 This PR is ready. Can it be merged?
Signed-off-by: zhumingjue <zhumingjue@huawei.com>
…8/vllm-omni into main-longterm-wan22
hongzhi-gao pushed a commit to hongzhi-gao/vllm-omni that referenced this pull request on Apr 23, 2026
…wen-image model and modified conftest.py in test/dfx/ (vllm-project#2817) Signed-off-by: zhumingjue <zhumingjue@huawei.com> Signed-off-by: zhumingjue138 <zhumingjue@huawei.com> Signed-off-by: hongzhigao <761417898@qq.com>
lengrongfu pushed a commit to lengrongfu/vllm-omni that referenced this pull request on May 1, 2026
…wen-image model and modified conftest.py in test/dfx/ (vllm-project#2817) Signed-off-by: zhumingjue <zhumingjue@huawei.com> Signed-off-by: zhumingjue138 <zhumingjue@huawei.com>
clodaghwalsh17 pushed a commit to clodaghwalsh17/nm-vllm-omni-ent that referenced this pull request on May 12, 2026
…wen-image model and modified conftest.py in test/dfx/ (vllm-project#2817) Signed-off-by: zhumingjue <zhumingjue@huawei.com> Signed-off-by: zhumingjue138 <zhumingjue@huawei.com>
Purpose
Add stability test cases for the Wan2.2 model and modify conftest.py in tests/dfx/.
Test Plan
1. Modified conftest.py in tests/dfx/.
2. Split the file "tests/dfx/stability/scripts/test_benchmark_stability.py" by model name into "tests/dfx/stability/scripts/test_stability_qwen3_omni.py" and "tests/dfx/stability/scripts/test_stability_wan22.py".
pytest -s -v tests/dfx/perf/scripts/run_benchmark.py
pytest -s -v tests/dfx/stability/scripts/test_stability_qwen3_omni.py
pytest -s -v tests/dfx/stability/scripts/test_stability_wan22.py
pytest -s -v tests/dfx/stability/scripts/test_stability_qwen_image.py
pytest -s -v tests/dfx/stability/scripts/test_stability_qwen3_tts.py
Test Result
pytest -s -v tests/dfx/perf/scripts/run_benchmark.py

pytest -s -v tests/dfx/stability/scripts/test_stability_qwen3_omni.py
pytest -s -v tests/dfx/stability/scripts/test_stability_qwen_image.py
pytest -s -v tests/dfx/stability/scripts/test_stability_qwen3_tts.py
pytest -s -v tests/dfx/stability/scripts/test_stability_wan22.py
nightly CI

24h test:
related issue: #2928
12h test:
another 12h test