[Test] Add performance tests for Qwen-Image-Layered model#2807
Merged
hsliuustc0106 merged 5 commits into vllm-project:main on Apr 16, 2026
"baseline": {
    "throughput_qps": 0.02,
    "latency_mean": 40.0,
    "peak_memory_mb_max": 82000,
Collaborator
82 GB? A single H100 device only has 80 GB, so how could this become the baseline?
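The reviewer's concern can be checked mechanically. The following is a minimal sketch, not part of the PR: the function name `fits_on_device` is hypothetical, and it assumes decimal units (1 GB = 1000 MB) and that `peak_memory_mb_max` is expressed in MB, neither of which is confirmed by the thread.

```python
def fits_on_device(peak_memory_mb: float, device_capacity_gb: float) -> bool:
    """Return True if a peak-memory baseline (in MB) fits within the
    device capacity (in GB).

    Assumption: decimal units, i.e. 1 GB = 1000 MB.
    """
    return peak_memory_mb <= device_capacity_gb * 1000


# Values taken from the review thread: an 82000 MB baseline on an 80 GB device.
print(fits_on_device(82000, 80))  # False
```

A check like this could run when the test JSON is loaded, so that a baseline larger than the target device is rejected before any benchmark starts.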
"baseline": {
    "throughput_qps": 0.005,
    "latency_mean": 150.0,
    "peak_memory_mb_max": 90000,
"enable-negative-prompt": true,
"baseline": {
    "throughput_qps": 0.02,
    "latency_mean": 40.0,
Collaborator
What are the latency results when tested locally?
Collaborator
What is the relationship between this and PR #2772?
Contributor
Author
#2772 focuses on accuracy, while this PR targets performance.
f5e20dc to 2275bc3
Add a new perf test JSON for the Qwen/Qwen-Image-Layered model using the vllm-omni server. Defines a single-device baseline with diffusion pipeline profiler enabled and two benchmark scenarios (640x640, 20 steps; 1024x1024, 35 steps) including baseline throughput, latency, and peak memory targets.
Signed-off-by: John Liu BUAA <liukecheng97@gmail.com>

Run an additional pytest using tests/dfx/perf/tests/test_qwen_image_layered_vllm_omni.json in the nightly diffusion pipeline. Capture its exit code as EXIT4 and include it in the success conditional and final exit bitwise OR so the step considers this new test when uploading artifacts and determining overall status.
Signed-off-by: John Liu BUAA <liukecheng97@gmail.com>

Run an additional diffusion benchmark (tests/dfx/perf/tests/test_qwen_image_layered_vllm_omni.json) in the nightly pipeline, capture its exit code (EXIT4), and include it in the success condition so artifacts (results and logs) are uploaded if any benchmark succeeds.
Signed-off-by: John Liu BUAA <liukecheng97@gmail.com>
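The exit-code handling described in these commit messages can be sketched as follows. This is an illustrative Python rendering only: the actual pipeline step is presumably a shell script, and the function names `combined_exit_code` and `should_upload_artifacts` are hypothetical.

```python
from functools import reduce
from operator import or_
from typing import Iterable


def combined_exit_code(exit_codes: Iterable[int]) -> int:
    """Bitwise-OR all exit codes: the result is 0 only if every run returned 0."""
    return reduce(or_, exit_codes, 0)


def should_upload_artifacts(exit_codes: Iterable[int]) -> bool:
    """Upload results and logs if at least one benchmark succeeded."""
    return any(code == 0 for code in exit_codes)


# Hypothetical exit codes for four benchmark runs; the fourth plays the
# role of EXIT4 from the commit messages.
codes = [0, 1, 0, 0]
print(should_upload_artifacts(codes))  # True
print(combined_exit_code(codes))       # 1
```

The bitwise OR keeps the step's final status non-zero whenever any single run fails, while the separate "any succeeded" condition still lets partial results be uploaded.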
fhfuih (Contributor) reviewed on Apr 16, 2026 and left a comment:
LGTM. The errors in the latest CI pipeline seem unrelated: https://buildkite.com/vllm/vllm-omni/builds/6894/steps/canvas
- Output (Accuracy?) assertion in Omni model
- Timeout in diffusion functionality test
lvliang-intel pushed a commit to lvliang-intel/vllm-omni that referenced this pull request on Apr 20, 2026
…ct#2807) Signed-off-by: John Liu BUAA <liukecheng97@gmail.com>
This pull request adds a new performance test configuration for the Qwen Image Layered model using the vLLM-Omni server. The test includes single-device baselines for two different image sizes and step counts, with detailed benchmark parameters and expected baseline metrics.
Performance testing:
- Adds test_qwen_image_layered_single_device to test_qwen_image_layered_vllm_omni.json, providing single-device performance baselines for the Qwen Image Layered model on the vLLM-Omni server. It covers two benchmark scenarios with different image resolutions and inference steps.
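A baseline of this kind is typically compared against measured metrics after each run. The sketch below shows one way such a comparison might look. It is not the repository's test harness: the `Baseline` class, the `check_against_baseline` function, the `tolerance` parameter, and the measured values are all hypothetical. Only the baseline field names and the baseline numbers come from the diff quoted in the review.

```python
from dataclasses import dataclass


@dataclass
class Baseline:
    throughput_qps: float      # minimum acceptable throughput
    latency_mean: float        # maximum acceptable mean latency
    peak_memory_mb_max: float  # maximum acceptable peak memory


def check_against_baseline(measured: dict, baseline: Baseline,
                           tolerance: float = 0.0) -> list[str]:
    """Return the names of metrics that violate the baseline.

    Assumption: throughput must not fall below the baseline, latency and
    peak memory must not exceed it. `tolerance` is a relative slack
    applied to throughput and latency only.
    """
    failures = []
    if measured["throughput_qps"] < baseline.throughput_qps * (1 - tolerance):
        failures.append("throughput_qps")
    if measured["latency_mean"] > baseline.latency_mean * (1 + tolerance):
        failures.append("latency_mean")
    if measured["peak_memory_mb"] > baseline.peak_memory_mb_max:
        failures.append("peak_memory_mb")
    return failures


# Baseline values as quoted in the review; measured values are made up
# purely for illustration and are not results from this PR.
baseline = Baseline(throughput_qps=0.02, latency_mean=40.0, peak_memory_mb_max=82000)
measured = {"throughput_qps": 0.025, "latency_mean": 45.0, "peak_memory_mb": 60000}
print(check_against_baseline(measured, baseline))  # ['latency_mean']
```

Returning the list of failing metric names, rather than a single boolean, makes it easier to see in CI logs which target regressed.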