[BugFix] Surface diffusion metrics in chat completions; sanitize Bagel img2img mm kwargs (#2932)
Merged
Gaohan123 merged 4 commits into vllm-project:main · Apr 22, 2026
Conversation
…lback Signed-off-by: NumberWan <wantszkin2003@gmail.com>
Collaborator
hsliuustc0106
left a comment
BLOCKING:
- Test Coverage — Missing regression test. Please add an automated test that verifies:
- Diffusion profiler metrics (stage_durations, peak_memory_mb) are exposed in chat completions responses for image outputs
- The benchmark fallback logic correctly populates metrics from top-level when message-level metrics are missing
- Bagel img2img mode works correctly with height/width parameters without HF processor errors
The current test plan only provides manual verification. Please add automated tests with assertions.
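The requested regression test could be sketched roughly as follows (the helper name, response shape, and values here are illustrative assumptions based on the fields named above, not the actual vllm-omni test code):

```python
# Hypothetical shape of an image chat-completions response body carrying
# diffusion profiler metrics; field names follow the review comment above.
def assert_diffusion_metrics_exposed(response_body: dict) -> None:
    """Fail loudly if profiler metrics are missing from the image choice."""
    metrics = response_body["choices"][0]["message"]["metrics"]
    assert "stage_durations" in metrics
    assert metrics["peak_memory_mb"] > 0

sample = {
    "choices": [
        {"message": {"metrics": {"stage_durations": {"denoise": 1.2},
                                 "peak_memory_mb": 2048.0}}}
    ]
}
assert_diffusion_metrics_exposed(sample)  # passes for a well-formed response
```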
Contributor
Author
The 3 tests have been added and all pass. Please refer to the "Test" and "Test Result" parts in the description.
Contributor
Author
Collaborator
I think you need to paste the results without this commit; otherwise I cannot see the benefits of this PR.
Contributor
Author
The description has been updated; before & after results are shown in the description.
qinganrice
pushed a commit
to qinganrice/vllm-omni
that referenced
this pull request
Apr 23, 2026
…l img2img mm kwargs (vllm-project#2932) Signed-off-by: NumberWan <wantszkin2003@gmail.com>
Purpose
Fixes #2931
Bug Description
Run Code:

```shell
pytest -s -v /home/w00917303/vllm-omni/tests/dfx/perf/scripts/run_diffusion_benchmark.py -- --config-file /home/w00917303/vllm-omni/tests/dfx/perf/tests/test_bagel_vllm_omni.json
```

In the t2i test, memory-related information cannot be read correctly because metric merging has a problem in the multi-stage model.
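A minimal sketch of the benchmark-side fallback this PR describes for `benchmarks/diffusion/backends.py` (the helper name `merge_metrics` and the exact response shape are illustrative assumptions): when message-level metrics lack `stage_durations` or `peak_memory_mb`, fall back to the top-level `metrics` in the JSON body.

```python
FALLBACK_KEYS = ("stage_durations", "peak_memory_mb")

def merge_metrics(body: dict) -> dict:
    """Prefer message-level metrics; fill missing keys from top-level metrics."""
    message = body.get("choices", [{}])[0].get("message", {})
    merged = dict(message.get("metrics") or {})
    top = body.get("metrics") or {}
    for key in FALLBACK_KEYS:
        if key not in merged and key in top:
            merged[key] = top[key]
    return merged

# Message-level metrics are empty, so both keys come from the top level.
body = {"choices": [{"message": {"metrics": {}}}],
        "metrics": {"stage_durations": {"denoise": 0.8}, "peak_memory_mb": 1024.0}}
print(merge_metrics(body))
# → {'stage_durations': {'denoise': 0.8}, 'peak_memory_mb': 1024.0}
```

When a key is present at the message level it wins, matching the precedence the tests below assert.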

In the i2i test, the server returned 400 Bad Request. Root cause clarification: in the img2img path, `OmniBagelProcessor` forwards `target_h`/`target_w` (derived from `width`/`height` in the benchmark `extra_body`) into `tokenizer()`, which causes the failure. The SigLIP warning only indicates that image preprocessing ignores these two arguments.
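The fix described above can be sketched as stripping the offending keys before the HF processor call (the helper name and key tuple are assumptions based on this description, not the actual `bagel.py` code):

```python
# Keys derived from width/height that Bagel's tokenizer() call rejects.
UNSUPPORTED_TOKENIZER_KEYS = ("target_h", "target_w")

def sanitize_bagel_img2img_kwargs(mm_kwargs: dict) -> dict:
    """Drop dimension kwargs before forwarding to the Bagel HF processor."""
    return {k: v for k, v in mm_kwargs.items()
            if k not in UNSUPPORTED_TOKENIZER_KEYS}

print(sanitize_bagel_img2img_kwargs(
    {"target_h": 512, "target_w": 512, "prompt": "edit"}))
# → {'prompt': 'edit'}
```

Sanitizing at the model-processor layer keeps the HTTP layer free of Bagel-specific branching.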
Summary
- Diffusion profiler metrics (`stage_durations`, `peak_memory_mb`) are visible to clients and benchmarks on the OpenAI-compatible chat completions path for image outputs.
- When `height`/`width` are provided, always pass `target_h`/`target_w` via `mm_processor_kwargs` for both t2i and i2i so GLM-style processors keep consistent behavior.
- Sanitize `target_h`/`target_w` inside `OmniBagelMultiModalProcessor` before Bagel img2img HF calls so Bagel stays compatible without model-specific branching in the HTTP layer.

Changes
- `vllm_omni/entrypoints/openai/serving_chat.py`: merge / default the image response `metrics` with omni profiler fields; keep unified `mm_processor_kwargs` for dimensions on the omni multistage image path.
- `benchmarks/diffusion/backends.py`: if message-level metrics lack `stage_durations` or `peak_memory_mb`, fall back to top-level `metrics` in the JSON body.
- `vllm_omni/model_executor/models/bagel/bagel.py`: add `_mm_kwargs_for_bagel_img2img_hf()` and use it on the img2img `super()._call_hf_processor(...)` paths.

Testing
- Added `test_create_image_choice_exposes_diffusion_metrics` in `tests/entrypoints/openai_api/test_serving_chat_metrics.py` to verify `stage_durations`/`peak_memory_mb` are exposed in image chat completions.
- Added `tests/benchmarks/test_diffusion_backends_metrics.py` with:
  - `test_chat_completions_metrics_fallback_to_top_level`
  - `test_chat_completions_metrics_message_level_takes_precedence`
- Updated `test_bagel_img2img_online` in `tests/e2e/online_serving/test_bagel_online.py` to include explicit `height`/`width` in `extra_body`.

Test Results
In the t2i test, memory-related information is read correctly.

In the i2i test, all 30 requests were answered successfully.

- `python -m pytest tests/entrypoints/openai_api/test_serving_chat_metrics.py -vv`
  - `test_create_image_choice_exposes_diffusion_metrics` ✅ PASSED
- `python -m pytest tests/benchmarks/test_diffusion_backends_metrics.py -vv`
  - `test_chat_completions_metrics_fallback_to_top_level` ✅ PASSED
  - `test_chat_completions_metrics_message_level_takes_precedence` ✅ PASSED
- `test_bagel_img2img_online[omni_server0]` ✅ PASSED

Notes
- The fix avoids model-specific `if model == ...` branching in `serving_chat`.