[Test] L4 complete diffusion feature test for Qwen-Image-Edit models#1682
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 302cdf9ea2
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
Pull request overview
Adds an L4 “complete diffusion feature” test suite for Qwen-Image-Edit models, covering both online serving and offline inference modes, and refactors test utilities to support these scenarios.
Changes:
- Added new e2e tests for Qwen-Image-Edit and Qwen-Image-Edit-2509 diffusion features (online + offline).
- Refactored hardware test marking to expose reusable
hardware_marks(...)(with optionalparallelmarking). - Centralized Omni server + media validation helpers in
tests/conftest.pyand removed duplicated server code from an existing test.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/utils.py | Introduces hardware_marks(...) and extends hardware_test(...) with an optional parallel mark. |
| tests/e2e/online_serving/test_qwen_image_edit_expansion.py | Adds online-serving diffusion feature matrix tests for Qwen-Image-Edit models. |
| tests/e2e/offline_inference/test_qwen_image_edit_expansion.py | Adds offline-only diffusion feature tests (TP / VAE patch parallel). |
| tests/e2e/online_serving/test_image_gen_edit.py | Removes local OmniServer implementation and switches to shared fixture/utilities. |
| tests/conftest.py | Adds shared image/video/audio assertions and base64 image decoding helper. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
hsliuustc0106
left a comment
There was a problem hiding this comment.
Review
Rating: 8.5/10 | Verdict: ✅ Approved
Summary
Comprehensive L4 test coverage for Qwen-Image-Edit models across diffusion features (TeaCache, Cache-DiT, parallelism modes, CPU offload). Well-structured test organization following the multi-level testing system.
Highlights
- ✅ Covers 18 test combinations (online + offline)
- ✅ Proper test markers (diffusion, parallel, distributed_cuda)
- ✅ Tests both single-image and multi-image variants
- ✅ Reusable test utilities in conftest.py
- ✅ Clear test plan and collection verification
Minor Suggestions
- Test results section mentions "Running on my side..." but no results provided yet
- Consider adding test timeout configuration for long-running diffusion tests
Recommendation
Ready to merge once test results are provided.
Reviewed by OpenClaw with vllm-omni-skills 🦐
adc26d6 to
2a962dd
Compare
|
PR is ready. @yenuo26 please check
@wtomin @Bounty-hunter please also check if it fits our previous discussions. @hsliuustc0106 please add a ready tag, thanks |
|
@congw729 please check the modified part of the mark |
e5580ea to
6727cdb
Compare
| def omni_server(request, run_level, model_prefix): | ||
| """Start vLLM-Omni server as a subprocess with actual model weights. | ||
| Uses session scope so the server starts only once for the entire test session. | ||
| Multi-stage initialization can take 10-20+ minutes. |
There was a problem hiding this comment.
Please give documentation for these arguments here.
There was a problem hiding this comment.
Done. I did not add Args: section in the docstring because pytest fixtures are not intended to be "called" by users. pytest will call them internally, and create variables with the same names. The arguments are all fixtures themselves, defined above (except request, which is a pytest internal argument) and auto-loaded by pytest.
So I add
- Type annotations of all arguments, for future developers of this helper fixture
- Docstring for
run_levelandmodel_prefixfixtures, which are defined above in this file.
| To ensure project maintainability and sustainable development, please submit test code (unit tests, system tests, or end-to-end tests) alongside their code changes. | ||
| For comprehensive testing guidelines and the definition of test levels (L1-L5), please refer to the [Test File Structure and Style Guide](../ci/tests_style.md). | ||
| The following tests are required to add: | ||
| - L4 test of the model's full *functionality* (i.e., all the *diffusion features* that are supported by this model), including several [parallelism acceleration methods](https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/parallelism_acceleration/), [CPU offloading](https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/cpu_offload_diffusion/), [TeaCache](https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/teacache/) and [Cache-DiT](https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/cache_dit_acceleration/) cache backends, [quantization methods](https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/quantization/overview/). |
There was a problem hiding this comment.
This doc works for me temporally. In the future, we should have detailed docs under docs/contribution/ci/ for different levels tests for diffusion models.
d7a3ee6 to
4d8dd88
Compare
ab3a24b to
1031c30
Compare
|
Note: The tests are intended to be nightly. I have previously temporarily make it run on CI with the "[WIP] ..." commit. Results are as follow:
|
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
2613584 to
aca42cb
Compare
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
…llm-project#1682) Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com> Signed-off-by: Megha Agarwal <agarwalmegha1308@gmail.com>
…llm-project#1682) Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Add L4 expansion tests for Qwen-Image and Qwen-Image-2512 (text-to-image) following the structure of PR vllm-project#1682 (Qwen-Image-Edit). Covers: - TeaCache + layerwise offload (single card) - Cache-DiT + Ulysses=2, Ring=2, CFG-Parallel=2, TP=2 + VAE-Patch-Parallel=2 Tests are picked up by nightly: test_*_expansion.py -m 'advanced_model and diffusion and H100'. Made-with: Cursor Signed-off-by: samithuang <285365963@qq.com>
…llm-project#1682) Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Purpose
To follow the recent establishment of multi-level testing system, this PR adds L4 test for Qwen-Image-Edit models (single-image model and 2509 multi-image model). It can also serve as an example for future contribution of other diffusion models.
Test Plan
serveCLI implementation file. Note that tensor parallelism is supported by vLLM underneath, so it is not present in this file. Then, if the feature is not available in online mode, add another filetests/e2e/offline_inference/test_{model}_expansion.py.Test Result
Complete Test
On my own side, I am running the tests in smaller groups. And they all pass
Quick sanity check of test markers:
pytest tests/e2e/online_serving/test_qwen_image_edit_expansion.py --collect-only -m diffusionoutputpytest tests/e2e/online_serving/test_qwen_image_edit_expansion.py --collect-only -m paralleloutputpytest tests/e2e/online_serving/test_qwen_image_edit_expansion.py --collect-only -m distributed_cudaoutputEssential Elements of an Effective PR Description Checklist
(Optional) The necessary documentation update, such as updatingsupported_models.mdandexamplesfor a new model. Please runmkdocs serveto sync the documentation editions to./docs.(Optional) Release notes update. If your change is user-facing, please update the release notes draft.BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)