
[Test] L4 complete diffusion feature test for Qwen-Image-Edit models#1682

Merged
wtomin merged 10 commits into
vllm-project:mainfrom
fhfuih:test-qwen-image-edit
Mar 11, 2026

Conversation

@fhfuih
Contributor

@fhfuih fhfuih commented Mar 5, 2026

Purpose

Following the recently established multi-level testing system, this PR adds L4 tests for the Qwen-Image-Edit models (the single-image model and the 2509 multi-image model). It can also serve as an example for future contributions of other diffusion models.

Test Plan

  • The complete set of diffusion features is taken into consideration. The most recent list of diffusion features is at [RFC]: Continuous Diffusion Model Acceleration Support #1217, which may be more up to date than the Diffusion Acceleration doc.
  • The features that the tested model does not yet support are excluded.
    • For example, for Qwen-Image-Edit models, Quantization and HSDP are excluded. (At the time of writing, VAE Patching Parallel is marked as "pending", but the PR was just merged.) This is the list of features tested for Qwen-Image-Edit(-2509):
      • TeaCache
      • Cache-DiT
      • Ulysses=2
      • Ring=2
      • CFG-Parallel=2
      • CPU Offload (model)=2
      • CPU Offload (Layerwise)=2
      • Tensor-Parallel=2
      • VAE-Patch-Parallel=2
  • Currently, all diffusion features are supported in online serving mode. If more features are added in the future, check whether they are available in online serving mode in the serve CLI implementation file. Note that tensor parallelism is supported by vLLM underneath, so it does not appear in this file. If a feature is not available in online mode, add another file, tests/e2e/offline_inference/test_{model}_expansion.py, for it.
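The selection rule in the test plan above can be sketched as a small feature matrix. This is a minimal illustration, not the code in this PR: the Feature class, the plan_test_files function, and the ids given to the two excluded features are hypothetical names. Only the nine tested case ids and the two file path patterns come from this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    case_id: str     # id used for the parametrized test case
    supported: bool  # whether the model under test supports the feature
    online: bool     # whether the feature is reachable in online serving mode


# Case ids as they appear in the collection output of this PR.
TESTED = [
    "cache_tea_cache", "cache_cache_dit", "ulysses_2", "ring_2",
    "cfg_parallel_2", "cpu_offload", "layerwise_offload",
    "tensor_parallel_2", "vae_patch_parallel_2",
]
# Hypothetical ids for the features this model does not support yet.
EXCLUDED = ["quantization", "hsdp"]

FEATURES = (
    [Feature(c, supported=True, online=True) for c in TESTED]
    + [Feature(c, supported=False, online=True) for c in EXCLUDED]
)


def plan_test_files(model, features):
    """Group supported features by the test file that should cover them."""
    plan = {}
    for feature in features:
        if not feature.supported:
            continue  # unsupported features are excluded from the L4 test
        mode = "online_serving" if feature.online else "offline_inference"
        path = f"tests/e2e/{mode}/test_{model}_expansion.py"
        plan.setdefault(path, []).append(feature.case_id)
    return plan


plan = plan_test_files("qwen_image_edit", FEATURES)
```

Here plan has a single key, the online-serving file, holding the nine tested case ids. A feature declared with online=False would add a second entry under tests/e2e/offline_inference/.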

Test Result

Complete Test

On my side, I am running the tests in smaller groups, and they all pass.

Quick sanity check of test markers:

Output of pytest tests/e2e/online_serving/test_qwen_image_edit_expansion.py --collect-only -m diffusion:
<Dir vllm-omni>
  <Package tests>
    <Package e2e>
      <Package online_serving>
        <Module test_qwen_image_edit_expansion.py>
          <Function test_qwen_image_edit_single[cache_tea_cache]>
          <Function test_qwen_image_edit_multi[cache_tea_cache]>
          <Function test_qwen_image_edit_single[cache_cache_dit]>
          <Function test_qwen_image_edit_multi[cache_cache_dit]>
          <Function test_qwen_image_edit_single[ulysses_2]>
          <Function test_qwen_image_edit_multi[ulysses_2]>
          <Function test_qwen_image_edit_single[ring_2]>
          <Function test_qwen_image_edit_multi[ring_2]>
          <Function test_qwen_image_edit_single[cfg_parallel_2]>
          <Function test_qwen_image_edit_multi[cfg_parallel_2]>
          <Function test_qwen_image_edit_single[vae_patch_parallel_2]>
          <Function test_qwen_image_edit_multi[vae_patch_parallel_2]>
          <Function test_qwen_image_edit_single[tensor_parallel_2]>
          <Function test_qwen_image_edit_multi[tensor_parallel_2]>
          <Function test_qwen_image_edit_single[cpu_offload]>
          <Function test_qwen_image_edit_multi[cpu_offload]>
          <Function test_qwen_image_edit_single[layerwise_offload]>
          <Function test_qwen_image_edit_multi[layerwise_offload]>
===== 18 tests collected in 0.01s =====
Output of pytest tests/e2e/online_serving/test_qwen_image_edit_expansion.py --collect-only -m parallel:
<Dir vllm-omni>
  <Package tests>
    <Package e2e>
      <Package online_serving>
        <Module test_qwen_image_edit_expansion.py>
          <Function test_qwen_image_edit_single[ulysses_2]>
          <Function test_qwen_image_edit_multi[ulysses_2]>
          <Function test_qwen_image_edit_single[ring_2]>
          <Function test_qwen_image_edit_multi[ring_2]>
          <Function test_qwen_image_edit_single[cfg_parallel_2]>
          <Function test_qwen_image_edit_multi[cfg_parallel_2]>
          <Function test_qwen_image_edit_single[vae_patch_parallel_2]>
          <Function test_qwen_image_edit_multi[vae_patch_parallel_2]>
          <Function test_qwen_image_edit_single[tensor_parallel_2]>
          <Function test_qwen_image_edit_multi[tensor_parallel_2]>
===== 10/18 tests collected (8 deselected) in 0.01s =====
Output of pytest tests/e2e/online_serving/test_qwen_image_edit_expansion.py --collect-only -m distributed_cuda:
<Dir vllm-omni>
  <Package tests>
    <Package e2e>
      <Package online_serving>
        <Module test_qwen_image_edit_expansion.py>
          <Function test_qwen_image_edit_single[ulysses_2]>
          <Function test_qwen_image_edit_multi[ulysses_2]>
          <Function test_qwen_image_edit_single[ring_2]>
          <Function test_qwen_image_edit_multi[ring_2]>
          <Function test_qwen_image_edit_single[cfg_parallel_2]>
          <Function test_qwen_image_edit_multi[cfg_parallel_2]>
          <Function test_qwen_image_edit_single[vae_patch_parallel_2]>
          <Function test_qwen_image_edit_multi[vae_patch_parallel_2]>
          <Function test_qwen_image_edit_single[tensor_parallel_2]>
          <Function test_qwen_image_edit_multi[tensor_parallel_2]>
===== 10/18 tests collected (8 deselected) in 0.01s =====
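The counts in the three collection outputs above can be reproduced with a few lines of arithmetic. This is only a sketch of the selection logic: the marker assignment in markers_for is inferred from which ids each -m run selected, and the helper names are hypothetical.

```python
from itertools import product

MODELS = ["single", "multi"]
CASES = [
    "cache_tea_cache", "cache_cache_dit", "ulysses_2", "ring_2",
    "cfg_parallel_2", "vae_patch_parallel_2", "tensor_parallel_2",
    "cpu_offload", "layerwise_offload",
]
# Cases that the "parallel" and "distributed_cuda" runs selected.
MULTI_GPU = {
    "ulysses_2", "ring_2", "cfg_parallel_2",
    "vae_patch_parallel_2", "tensor_parallel_2",
}


def markers_for(case_id):
    """Return the markers a case carries (inferred, not read from the test file)."""
    marks = {"diffusion"}
    if case_id in MULTI_GPU:
        marks |= {"parallel", "distributed_cuda"}
    return marks


def collect(marker):
    """Mimic `--collect-only -m <marker>` for a single marker name."""
    return [
        f"test_qwen_image_edit_{model}[{case}]"
        for case, model in product(CASES, MODELS)
        if marker in markers_for(case)
    ]


for name in ("diffusion", "parallel", "distributed_cuda"):
    print(name, len(collect(name)))
```

Nine cases times two models gives the 18 collected tests, and the five multi-GPU cases times two models gives the 10 selected by the other two markers.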

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan. Please provide the test scripts & test commands. Please state the reasons if your codes don't require additional test scripts. For test file guidelines, please check the test style doc
  • The test results. Please paste the results comparison before and after, or the e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs.
  • (Optional) Release notes update. If your change is user-facing, please update the release notes draft.

BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)

@fhfuih fhfuih marked this pull request as ready for review March 5, 2026 08:41
Copilot AI review requested due to automatic review settings March 5, 2026 08:41

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 302cdf9ea2

Comment thread tests/e2e/offline_inference/test_qwen_image_edit_expansion.py Outdated
Comment thread tests/conftest.py Outdated
Contributor

Copilot AI left a comment

Pull request overview

Adds an L4 “complete diffusion feature” test suite for Qwen-Image-Edit models, covering both online serving and offline inference modes, and refactors test utilities to support these scenarios.

Changes:

  • Added new e2e tests for Qwen-Image-Edit and Qwen-Image-Edit-2509 diffusion features (online + offline).
  • Refactored hardware test marking to expose reusable hardware_marks(...) (with optional parallel marking).
  • Centralized Omni server + media validation helpers in tests/conftest.py and removed duplicated server code from an existing test.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

Show a summary per file
| File | Description |
| --- | --- |
| tests/utils.py | Introduces hardware_marks(...) and extends hardware_test(...) with an optional parallel mark. |
| tests/e2e/online_serving/test_qwen_image_edit_expansion.py | Adds online-serving diffusion feature matrix tests for Qwen-Image-Edit models. |
| tests/e2e/offline_inference/test_qwen_image_edit_expansion.py | Adds offline-only diffusion feature tests (TP / VAE patch parallel). |
| tests/e2e/online_serving/test_image_gen_edit.py | Removes local OmniServer implementation and switches to shared fixture/utilities. |
| tests/conftest.py | Adds shared image/video/audio assertions and base64 image decoding helper. |
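As an illustration of what a shared base64 image helper of this kind might check, here is a standard-library-only sketch. It is not the helper added to tests/conftest.py: the function name decode_b64_png_size is hypothetical, it handles PNG only, and the real helper may well rely on an imaging library instead.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_b64_png_size(payload: str) -> tuple[int, int]:
    """Decode a base64 PNG and return (width, height) read from its IHDR chunk."""
    # Accept a bare base64 string or a "data:image/png;base64,..." URL.
    if payload.startswith("data:"):
        payload = payload.split(",", 1)[1]
    raw = base64.b64decode(payload)
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type,
    # then width and height as big-endian 32-bit integers.
    if raw[:8] != PNG_SIGNATURE or raw[12:16] != b"IHDR":
        raise ValueError("payload is not a PNG image")
    width, height = struct.unpack(">II", raw[16:24])
    return width, height
```

A test could then assert that the returned size matches the requested output resolution before running any heavier validation.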

Comment thread tests/conftest.py Outdated
Comment thread tests/e2e/offline_inference/test_qwen_image_edit_expansion.py Outdated
Comment thread tests/e2e/offline_inference/test_qwen_image_edit_expansion.py Outdated
Comment thread tests/e2e/offline_inference/test_qwen_image_edit_expansion.py Outdated
Comment thread tests/e2e/offline_inference/test_qwen_image_edit_expansion.py Outdated
Comment thread tests/e2e/online_serving/test_qwen_image_edit_expansion.py Outdated
Collaborator

@hsliuustc0106 hsliuustc0106 left a comment

Review

Rating: 8.5/10 | Verdict: ✅ Approved

Summary

Comprehensive L4 test coverage for Qwen-Image-Edit models across diffusion features (TeaCache, Cache-DiT, parallelism modes, CPU offload). Well-structured test organization following the multi-level testing system.

Highlights

  • ✅ Covers 18 test combinations (online + offline)
  • ✅ Proper test markers (diffusion, parallel, distributed_cuda)
  • ✅ Tests both single-image and multi-image variants
  • ✅ Reusable test utilities in conftest.py
  • ✅ Clear test plan and collection verification

Minor Suggestions

  • Test results section mentions "Running on my side..." but no results provided yet
  • Consider adding test timeout configuration for long-running diffusion tests

Recommendation

Ready to merge once test results are provided.


Reviewed by OpenClaw with vllm-omni-skills 🦐

@fhfuih fhfuih force-pushed the test-qwen-image-edit branch from adc26d6 to 2a962dd Compare March 6, 2026 08:11
@fhfuih
Contributor Author

fhfuih commented Mar 6, 2026

PR is ready. @yenuo26 please check

  1. I added a new param to omni_server and omni_runner.
  2. I use pytest parametrization, so the test case names look as follows, which differs from test_qwen3_omni_expansion.
<Dir vllm-omni>
  <Package tests>
    <Package e2e>
      <Package offline_inference>
        <Module test_qwen_image_edit_expansion.py>
          <Function test_qwen_image_edit_single[tp_2]>
          <Function test_qwen_image_edit_single[vae_parallel_2]>
          <Function test_qwen_image_edit_multi[tp_2]>
          <Function test_qwen_image_edit_multi[vae_parallel_2]>
      <Package online_serving>
        <Module test_qwen_image_edit_expansion.py>
          <Function test_qwen_image_edit_single[cache_tea_cache]>
          <Function test_qwen_image_edit_single[cache_cache_dit]>
          <Function test_qwen_image_edit_single[ulysses_2]>
          <Function test_qwen_image_edit_single[ring_2]>
          <Function test_qwen_image_edit_single[cfg_parallel_2]>
          <Function test_qwen_image_edit_single[cpu_offload]>
          <Function test_qwen_image_edit_single[layerwise_offload]>
          <Function test_qwen_image_edit_multi[cache_tea_cache]>
          <Function test_qwen_image_edit_multi[cache_cache_dit]>
          <Function test_qwen_image_edit_multi[ulysses_2]>
          <Function test_qwen_image_edit_multi[ring_2]>
          <Function test_qwen_image_edit_multi[cfg_parallel_2]>
          <Function test_qwen_image_edit_multi[cpu_offload]>
          <Function test_qwen_image_edit_multi[layerwise_offload]>
===== 18 tests collected in 0.01s =====

@wtomin @Bounty-hunter please also check if it fits our previous discussions.

@hsliuustc0106 please add a ready tag, thanks

Comment thread .buildkite/test-nightly.yml Outdated
Comment thread tests/e2e/online_serving/test_image_gen_edit.py
Comment thread tests/e2e/online_serving/test_qwen_image_edit_expansion.py Outdated
Comment thread tests/e2e/online_serving/test_image_gen_edit.py
@yenuo26
Collaborator

yenuo26 commented Mar 6, 2026

@congw729 please check the modified part of the mark

Comment thread tests/e2e/online_serving/test_image_gen_edit.py
@fhfuih fhfuih force-pushed the test-qwen-image-edit branch from e5580ea to 6727cdb Compare March 9, 2026 02:26
Comment thread docs/contributing/ci/tests_style.md Outdated
Comment thread tests/e2e/online_serving/test_qwen_image_edit_expansion.py Outdated
Comment thread tests/e2e/online_serving/test_qwen_image_edit_expansion.py
@wtomin wtomin requested a review from gcanlin March 9, 2026 07:38
Comment thread tests/e2e/online_serving/test_qwen_image_edit_expansion.py Outdated
Comment thread tests/conftest.py
def omni_server(request, run_level, model_prefix):
    """Start vLLM-Omni server as a subprocess with actual model weights.
    Uses session scope so the server starts only once for the entire test session.
    Multi-stage initialization can take 10-20+ minutes.
Collaborator

Please give documentation for these arguments here.

Contributor Author

Done. I did not add an Args: section to the docstring because pytest fixtures are not intended to be "called" by users. pytest calls them internally and creates variables with the same names. The arguments are all fixtures themselves, defined above (except request, which is a built-in pytest fixture) and auto-loaded by pytest.

So I added:

  1. Type annotations of all arguments, for future developers of this helper fixture
  2. Docstring for run_level and model_prefix fixtures, which are defined above in this file.
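The point about fixtures being resolved by argument name can be shown with a toy resolver. This is a deliberately simplified stand-in, not pytest's implementation: the fixture decorator and resolve function below are hypothetical, the returned values are made up, and the request argument of the real omni_server is left out.

```python
import inspect

FIXTURES = {}  # toy registry: fixture name -> function


def fixture(func):
    """Register a function under its own name, loosely like a fixture decorator."""
    FIXTURES[func.__name__] = func
    return func


def resolve(name, cache=None):
    """Build a fixture value by resolving each of its parameters by name first."""
    cache = {} if cache is None else cache
    if name not in cache:
        func = FIXTURES[name]
        kwargs = {p: resolve(p, cache) for p in inspect.signature(func).parameters}
        cache[name] = func(**kwargs)
    return cache[name]


@fixture
def run_level() -> str:
    """Test level being run (value is illustrative)."""
    return "L4"


@fixture
def model_prefix() -> str:
    """Prefix prepended to model names (value is illustrative)."""
    return "local/"


@fixture
def omni_server(run_level: str, model_prefix: str) -> dict:
    # Nobody calls this directly; the resolver supplies both arguments.
    return {"run_level": run_level, "model": model_prefix + "Qwen-Image-Edit"}


server = resolve("omni_server")
```

Because the caller never passes run_level or model_prefix explicitly, type annotations on the parameters and docstrings on the fixtures they name carry the information an Args: section would otherwise hold.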

@wtomin wtomin added the ready label to trigger buildkite CI label Mar 9, 2026
To ensure project maintainability and sustainable development, please submit test code (unit tests, system tests, or end-to-end tests) alongside their code changes.
For comprehensive testing guidelines and the definition of test levels (L1-L5), please refer to the [Test File Structure and Style Guide](../ci/tests_style.md).
The following tests are required to add:
- L4 test of the model's full *functionality* (i.e., all the *diffusion features* that are supported by this model), including several [parallelism acceleration methods](https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/parallelism_acceleration/), [CPU offloading](https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/cpu_offload_diffusion/), [TeaCache](https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/teacache/) and [Cache-DiT](https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/cache_dit_acceleration/) cache backends, [quantization methods](https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/quantization/overview/).
Collaborator

This doc works for me for now. In the future, we should have detailed docs under docs/contribution/ci/ covering the different test levels for diffusion models.

@fhfuih fhfuih force-pushed the test-qwen-image-edit branch from d7a3ee6 to 4d8dd88 Compare March 9, 2026 10:50
Comment thread tests/e2e/online_serving/test_qwen3_omni.py Outdated
@fhfuih fhfuih force-pushed the test-qwen-image-edit branch from ab3a24b to 1031c30 Compare March 10, 2026 03:43
@fhfuih
Contributor Author

fhfuih commented Mar 10, 2026

Note: Do not merge this PR yet if this bolded line is not deleted. This PR is ready to be merged

The tests are intended to run nightly. I previously made them run on CI temporarily with the "[WIP] ..." commit. Results are as follows:

fhfuih added 2 commits March 11, 2026 09:31
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
@fhfuih fhfuih force-pushed the test-qwen-image-edit branch from 2613584 to aca42cb Compare March 11, 2026 01:35
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
fhfuih added 2 commits March 11, 2026 10:51
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Comment thread tests/utils.py
Comment thread docs/contributing/model/adding_diffusion_model.md Outdated
fhfuih added 4 commits March 11, 2026 14:16
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
@Gaohan123 Gaohan123 added this to the v0.18.0 milestone Mar 11, 2026
@wtomin wtomin merged commit a190982 into vllm-project:main Mar 11, 2026
6 of 7 checks passed
meghaagr13 pushed a commit to meghaagr13/vllm-omni that referenced this pull request Mar 12, 2026
…llm-project#1682)

Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Signed-off-by: Megha Agarwal <agarwalmegha1308@gmail.com>
meghaagr13 pushed a commit to meghaagr13/vllm-omni that referenced this pull request Mar 12, 2026
…llm-project#1682)

Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
SamitHuang added a commit to SamitHuang/vllm-omni that referenced this pull request Mar 13, 2026
Add L4 expansion tests for Qwen-Image and Qwen-Image-2512 (text-to-image)
following the structure of PR vllm-project#1682 (Qwen-Image-Edit). Covers:
- TeaCache + layerwise offload (single card)
- Cache-DiT + Ulysses=2, Ring=2, CFG-Parallel=2, TP=2 + VAE-Patch-Parallel=2

Tests are picked up by nightly: test_*_expansion.py -m 'advanced_model and diffusion and H100'.

Made-with: Cursor

Signed-off-by: samithuang <285365963@qq.com>
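The nightly selection rule quoted in the commit message above combines a file-name pattern with a marker expression. A rough sketch of that rule, assuming a hypothetical helper named picked_up and handling only the plain "and" conjunction (pytest's -m accepts richer boolean expressions):

```python
from fnmatch import fnmatch

REQUIRED_MARKERS = ("advanced_model", "diffusion", "H100")


def picked_up(filename, markers, pattern="test_*_expansion.py"):
    """Return True if a test file and its markers match the nightly rule."""
    # The file name must match the glob, and every required marker must be present.
    return fnmatch(filename, pattern) and all(m in markers for m in REQUIRED_MARKERS)
```

For example, a test in test_qwen_image_edit_expansion.py carrying all three markers is selected, while the same test without the H100 marker, or any test in test_image_gen_edit.py, is not.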
@fhfuih fhfuih deleted the test-qwen-image-edit branch March 16, 2026 06:51
clodaghwalsh17 pushed a commit to clodaghwalsh17/nm-vllm-omni-ent that referenced this pull request May 12, 2026
…llm-project#1682)

Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

Labels

ready label to trigger buildkite CI


8 participants