[Test] L4 complete diffusion feature test for Qwen-Image-Edit models by fhfuih · Pull Request #1682 · vllm-project/vllm-omni

fhfuih · 2026-03-05T08:41:34Z

Purpose

Following the recently established multi-level testing system, this PR adds L4 tests for the Qwen-Image-Edit models (the single-image model and the 2509 multi-image model). It can also serve as an example for future contributions of other diffusion models.

Test Plan

The complete set of diffusion features is taken into consideration. The most recent list of diffusion features is at [RFC]: Continuous Diffusion Model Acceleration Support #1217, which may be more up to date than the Diffusion Acceleration doc.
Features that the tested model does not yet support are excluded.
- For example, for Qwen-Image-Edit models, Quantization and HSDP are excluded. (At the time of writing, VAE Patching Parallel is marked as "pending", but the PR was just merged.) This is the list of features tested for Qwen-Image-Edit(-2509):
  - TeaCache
  - Cache-DiT
  - Ulysses=2
  - Ring=2
  - CFG-Parallel=2
  - CPU Offload (model)=2
  - CPU Offload (Layerwise)=2
  - Tensor-Parallel=2
  - VAE-Patch-Parallel=2
Currently, all diffusion features are supported in online serving mode. If more features are added in the future, check which of them are available in online serving mode in the serve CLI implementation file. Note that tensor parallelism is provided by vLLM underneath, so it does not appear in that file. If a feature is not available in online mode, add its test to a separate file, tests/e2e/offline_inference/test_{model}_expansion.py.
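The routing rule in the test plan (online file by default, offline file only for features the serve CLI does not expose) can be sketched as below. This is a minimal illustration, not code from this PR: `ONLINE_FEATURES`, `expansion_test_file`, and the feature keys are hypothetical names, and the online path pattern is assumed to mirror the offline one quoted above.

```python
from pathlib import PurePosixPath

# Hypothetical feature keys, for illustration only. In practice this set would
# be read off the serve CLI implementation file. Tensor parallelism is added by
# hand because vLLM provides it and it is not listed in that file.
ONLINE_FEATURES = frozenset({
    "tea_cache", "cache_dit", "ulysses", "ring", "cfg_parallel",
    "cpu_offload", "layerwise_offload", "tensor_parallel", "vae_patch_parallel",
})


def expansion_test_file(model: str, feature: str,
                        online_features: frozenset = ONLINE_FEATURES) -> PurePosixPath:
    """Return the L4 test file that should hold the case for `feature`."""
    # Features exposed in online serving go to the online test file;
    # anything else falls back to the offline-inference test file.
    mode = "online_serving" if feature in online_features else "offline_inference"
    return PurePosixPath("tests/e2e") / mode / f"test_{model}_expansion.py"


print(expansion_test_file("qwen_image_edit", "ring"))
print(expansion_test_file("qwen_image_edit", "some_future_feature"))
```

With the assumed set, every feature tested in this PR maps to the online file; only a feature missing from the set is routed to the offline file.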

Test Result

Complete Test

On my side, I am running the tests in smaller groups, and they all pass.

Quick sanity check of test markers:

pytest tests/e2e/online_serving/test_qwen_image_edit_expansion.py --collect-only -m diffusion output

<Dir vllm-omni>
  <Package tests>
    <Package e2e>
      <Package online_serving>
        <Module test_qwen_image_edit_expansion.py>
          <Function test_qwen_image_edit_single[cache_tea_cache]>
          <Function test_qwen_image_edit_multi[cache_tea_cache]>
          <Function test_qwen_image_edit_single[cache_cache_dit]>
          <Function test_qwen_image_edit_multi[cache_cache_dit]>
          <Function test_qwen_image_edit_single[ulysses_2]>
          <Function test_qwen_image_edit_multi[ulysses_2]>
          <Function test_qwen_image_edit_single[ring_2]>
          <Function test_qwen_image_edit_multi[ring_2]>
          <Function test_qwen_image_edit_single[cfg_parallel_2]>
          <Function test_qwen_image_edit_multi[cfg_parallel_2]>
          <Function test_qwen_image_edit_single[vae_patch_parallel_2]>
          <Function test_qwen_image_edit_multi[vae_patch_parallel_2]>
          <Function test_qwen_image_edit_single[tensor_parallel_2]>
          <Function test_qwen_image_edit_multi[tensor_parallel_2]>
          <Function test_qwen_image_edit_single[cpu_offload]>
          <Function test_qwen_image_edit_multi[cpu_offload]>
          <Function test_qwen_image_edit_single[layerwise_offload]>
          <Function test_qwen_image_edit_multi[layerwise_offload]>
===== 18 tests collected in 0.01s =====

pytest tests/e2e/online_serving/test_qwen_image_edit_expansion.py --collect-only -m parallel output

<Dir vllm-omni>
  <Package tests>
    <Package e2e>
      <Package online_serving>
        <Module test_qwen_image_edit_expansion.py>
          <Function test_qwen_image_edit_single[ulysses_2]>
          <Function test_qwen_image_edit_multi[ulysses_2]>
          <Function test_qwen_image_edit_single[ring_2]>
          <Function test_qwen_image_edit_multi[ring_2]>
          <Function test_qwen_image_edit_single[cfg_parallel_2]>
          <Function test_qwen_image_edit_multi[cfg_parallel_2]>
          <Function test_qwen_image_edit_single[vae_patch_parallel_2]>
          <Function test_qwen_image_edit_multi[vae_patch_parallel_2]>
          <Function test_qwen_image_edit_single[tensor_parallel_2]>
          <Function test_qwen_image_edit_multi[tensor_parallel_2]>
===== 10/18 tests collected (8 deselected) in 0.01s =====

pytest tests/e2e/online_serving/test_qwen_image_edit_expansion.py --collect-only -m distributed_cuda output

<Dir vllm-omni>
  <Package tests>
    <Package e2e>
      <Package online_serving>
        <Module test_qwen_image_edit_expansion.py>
          <Function test_qwen_image_edit_single[ulysses_2]>
          <Function test_qwen_image_edit_multi[ulysses_2]>
          <Function test_qwen_image_edit_single[ring_2]>
          <Function test_qwen_image_edit_multi[ring_2]>
          <Function test_qwen_image_edit_single[cfg_parallel_2]>
          <Function test_qwen_image_edit_multi[cfg_parallel_2]>
          <Function test_qwen_image_edit_single[vae_patch_parallel_2]>
          <Function test_qwen_image_edit_multi[vae_patch_parallel_2]>
          <Function test_qwen_image_edit_single[tensor_parallel_2]>
          <Function test_qwen_image_edit_multi[tensor_parallel_2]>
===== 10/18 tests collected (8 deselected) in 0.01s =====
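The marker checks above rely on marks attached to individual parametrized cases. The sketch below shows one way such a matrix could be declared with `pytest.param`; it is not the code from this PR. The ids and the `parallel` / `diffusion` marker names are taken from the collection output, while `CASES`, `ids_with_mark`, the feature strings, and the placeholder test body are assumptions.

```python
import pytest

# Marker name taken from the `-m parallel` command above.
parallel = pytest.mark.parallel

# One entry per feature; ids match the names in the collection output.
CASES = [
    pytest.param("tea_cache", id="cache_tea_cache"),
    pytest.param("cache_dit", id="cache_cache_dit"),
    pytest.param("ulysses", id="ulysses_2", marks=parallel),
    pytest.param("ring", id="ring_2", marks=parallel),
    pytest.param("cfg_parallel", id="cfg_parallel_2", marks=parallel),
    pytest.param("vae_patch_parallel", id="vae_patch_parallel_2", marks=parallel),
    pytest.param("tensor_parallel", id="tensor_parallel_2", marks=parallel),
    pytest.param("cpu_offload", id="cpu_offload"),
    pytest.param("layerwise_offload", id="layerwise_offload"),
]


def ids_with_mark(cases, mark_name):
    """List the ids of the cases that carry a mark named `mark_name`."""
    return [c.id for c in cases if any(m.name == mark_name for m in c.marks)]


@pytest.mark.diffusion
@pytest.mark.parametrize("feature", CASES)
def test_qwen_image_edit_single(feature):
    # Placeholder body: a real test would start the server with this feature
    # enabled and validate the edited image.
    assert isinstance(feature, str)


print(ids_with_mark(CASES, "parallel"))
```

Five cases carry the `parallel` mark here; applied to both the single-image and multi-image test functions, that gives the 10 of 18 tests selected above.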


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 302cdf9ea2


Copilot

Pull request overview

Adds an L4 “complete diffusion feature” test suite for Qwen-Image-Edit models, covering both online serving and offline inference modes, and refactors test utilities to support these scenarios.

Changes:

Added new e2e tests for Qwen-Image-Edit and Qwen-Image-Edit-2509 diffusion features (online + offline).
Refactored hardware test marking to expose reusable hardware_marks(...) (with optional parallel marking).
Centralized Omni server + media validation helpers in tests/conftest.py and removed duplicated server code from an existing test.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| tests/utils.py | Introduces `hardware_marks(...)` and extends `hardware_test(...)` with an optional `parallel` mark. |
| tests/e2e/online_serving/test_qwen_image_edit_expansion.py | Adds online-serving diffusion feature matrix tests for Qwen-Image-Edit models. |
| tests/e2e/offline_inference/test_qwen_image_edit_expansion.py | Adds offline-only diffusion feature tests (TP / VAE patch parallel). |
| tests/e2e/online_serving/test_image_gen_edit.py | Removes local OmniServer implementation and switches to shared fixture/utilities. |
| tests/conftest.py | Adds shared image/video/audio assertions and base64 image decoding helper. |


hsliuustc0106

Review

Rating: 8.5/10 | Verdict: ✅ Approved

Summary

Comprehensive L4 test coverage for Qwen-Image-Edit models across diffusion features (TeaCache, Cache-DiT, parallelism modes, CPU offload). Well-structured test organization following the multi-level testing system.

Highlights

✅ Covers 18 test combinations (online + offline)
✅ Proper test markers (diffusion, parallel, distributed_cuda)
✅ Tests both single-image and multi-image variants
✅ Reusable test utilities in conftest.py
✅ Clear test plan and collection verification

Minor Suggestions

Test results section mentions "Running on my side..." but no results provided yet
Consider adding test timeout configuration for long-running diffusion tests

Recommendation

Ready to merge once test results are provided.

Reviewed by OpenClaw with vllm-omni-skills 🦐

fhfuih · 2026-03-06T08:16:52Z

PR is ready. @yenuo26 please check

I added a new parameter to omni_server and omni_runner.
I use pytest parametrization, so the test case names look as follows, which differs from test_qwen3_omni_expansion.

<Dir vllm-omni>
  <Package tests>
    <Package e2e>
      <Package offline_inference>
        <Module test_qwen_image_edit_expansion.py>
          <Function test_qwen_image_edit_single[tp_2]>
          <Function test_qwen_image_edit_single[vae_parallel_2]>
          <Function test_qwen_image_edit_multi[tp_2]>
          <Function test_qwen_image_edit_multi[vae_parallel_2]>
      <Package online_serving>
        <Module test_qwen_image_edit_expansion.py>
          <Function test_qwen_image_edit_single[cache_tea_cache]>
          <Function test_qwen_image_edit_single[cache_cache_dit]>
          <Function test_qwen_image_edit_single[ulysses_2]>
          <Function test_qwen_image_edit_single[ring_2]>
          <Function test_qwen_image_edit_single[cfg_parallel_2]>
          <Function test_qwen_image_edit_single[cpu_offload]>
          <Function test_qwen_image_edit_single[layerwise_offload]>
          <Function test_qwen_image_edit_multi[cache_tea_cache]>
          <Function test_qwen_image_edit_multi[cache_cache_dit]>
          <Function test_qwen_image_edit_multi[ulysses_2]>
          <Function test_qwen_image_edit_multi[ring_2]>
          <Function test_qwen_image_edit_multi[cfg_parallel_2]>
          <Function test_qwen_image_edit_multi[cpu_offload]>
          <Function test_qwen_image_edit_multi[layerwise_offload]>
===== 18 tests collected in 0.01s =====

@wtomin @Bounty-hunter please also check if it fits our previous discussions.

@hsliuustc0106 please add a ready tag, thanks

yenuo26 · 2026-03-06T08:44:40Z

@congw729 please check the modified part of the mark

wtomin · 2026-03-09T07:49:27Z

+def omni_server(request, run_level, model_prefix):
    """Start vLLM-Omni server as a subprocess with actual model weights.
    Uses session scope so the server starts only once for the entire test session.
    Multi-stage initialization can take 10-20+ minutes.


Please give documentation for these arguments here.

Done. I did not add an Args: section to the docstring because pytest fixtures are not meant to be called by users. pytest calls them internally and creates variables with the same names. The arguments are all fixtures themselves, defined above (except request, which is a pytest built-in) and loaded automatically by pytest.

So I added:

Type annotations for all arguments, for future developers of this helper fixture

Docstrings for the run_level and model_prefix fixtures, which are defined above in this file.

wtomin · 2026-03-09T08:22:40Z

+To ensure project maintainability and sustainable development, please submit test code (unit tests, system tests, or end-to-end tests) alongside their code changes.
+For comprehensive testing guidelines and the definition of test levels (L1-L5), please refer to the [Test File Structure and Style Guide](../ci/tests_style.md).
+The following tests are required to add:
+- L4 test of the model's full *functionality* (i.e., all the *diffusion features* that are supported by this model), including several [parallelism acceleration methods](https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/parallelism_acceleration/), [CPU offloading](https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/cpu_offload_diffusion/), [TeaCache](https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/teacache/) and [Cache-DiT](https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/cache_dit_acceleration/) cache backends, [quantization methods](https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/quantization/overview/).


This doc works for me for now. In the future, we should have detailed docs under docs/contribution/ci/ covering the different test levels for diffusion models.

fhfuih · 2026-03-10T06:55:37Z

Note: ~~Do not merge this PR yet if this bolded line is not deleted.~~ This PR is ready to be merged

The tests are intended to run nightly. I previously made them run on CI temporarily with the "[WIP] ..." commit. Results are as follows:

The latest version:
- https://buildkite.com/vllm/vllm-omni/builds/3969/steps/canvas?sid=019cdab5-3fbb-4f59-9623-56e0fefb0614&tab=output
- Combines features to reduce the number of test cases: from 9 to 5 for Qwen-Image-Edit, and theoretically from 11 to 6 for a model that supports all features
- Runs for ~15 min (~7.5 min for Qwen-Image-Edit and ~7.5 min for Qwen-Image-Edit-2509)
A previous version:
- https://buildkite.com/vllm/vllm-omni/builds/3907/steps/canvas?sid=019cd64c-5ee1-4d18-9d5e-a3a9b48db7f7&tab=output
- Enables 1 diffusion feature per test case: 9 in total for Qwen-Image-Edit, theoretically 11 at most
- Runs for ~30 min (~15 min for each model)
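The reported case counts are consistent with enabling at most two features per test case, although this comment does not state the actual grouping. The sketch below only checks that arithmetic; `combined_case_count` and the `per_case=2` default are assumptions for illustration.

```python
import math


def combined_case_count(n_features: int, per_case: int = 2) -> int:
    """Number of test cases when up to `per_case` features share one case."""
    return math.ceil(n_features / per_case)


# 9 features are tested for Qwen-Image-Edit; 11 is the stated theoretical maximum.
for n_features in (9, 11):
    print(n_features, "features ->", combined_case_count(n_features), "cases")
```

Which features can actually share a case depends on their compatibility, so a real grouping has to be chosen by hand rather than derived from the count.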

Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>

…llm-project#1682) Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com> Signed-off-by: Megha Agarwal <agarwalmegha1308@gmail.com>

…llm-project#1682) Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

Add L4 expansion tests for Qwen-Image and Qwen-Image-2512 (text-to-image) following the structure of PR vllm-project#1682 (Qwen-Image-Edit). Covers: - TeaCache + layerwise offload (single card) - Cache-DiT + Ulysses=2, Ring=2, CFG-Parallel=2, TP=2 + VAE-Patch-Parallel=2 Tests are picked up by nightly: test_*_expansion.py -m 'advanced_model and diffusion and H100'. Made-with: Cursor Signed-off-by: samithuang <285365963@qq.com>


fhfuih marked this pull request as ready for review March 5, 2026 08:41

Copilot AI review requested due to automatic review settings March 5, 2026 08:41

chatgpt-codex-connector Bot reviewed Mar 5, 2026

Comment thread tests/e2e/offline_inference/test_qwen_image_edit_expansion.py Outdated

Comment thread tests/conftest.py Outdated

Copilot AI reviewed Mar 5, 2026

Copilot started reviewing on behalf of fhfuih March 5, 2026 09:21

fhfuih mentioned this pull request Mar 6, 2026

[RFC]: Supplement use cases for L1, L3, and L4 JiusiServe/vllm-omni#163

Closed

1 task

hsliuustc0106 approved these changes Mar 6, 2026

fhfuih force-pushed the test-qwen-image-edit branch from adc26d6 to 2a962dd Compare March 6, 2026 08:11

yenuo26 reviewed Mar 6, 2026

Comment thread .buildkite/test-nightly.yml Outdated

Comment thread tests/e2e/online_serving/test_image_gen_edit.py

yenuo26 reviewed Mar 6, 2026

Comment thread tests/e2e/online_serving/test_qwen_image_edit_expansion.py Outdated

Comment thread tests/e2e/online_serving/test_image_gen_edit.py

hsliuustc0106 reviewed Mar 7, 2026

Comment thread tests/e2e/online_serving/test_image_gen_edit.py

fhfuih force-pushed the test-qwen-image-edit branch from e5580ea to 6727cdb Compare March 9, 2026 02:26