[Bugfix] Fix image quality in /v1/images/generations for multi-stage pipeline by RuixiangMa · Pull Request #2267 · vllm-project/vllm-omni

RuixiangMa · 2026-03-27T09:29:29Z

Purpose

/v1/images/generations and /v1/chat/completions used different request-construction paths in multi-stage pipelines, so stage inputs diverged (especially AR/comprehension conditioning), causing quality drift (e.g., noisy outputs).

for example:

vllm serve --model ZhipuAI/GLM-Image --omni --port 8004

curl -X POST http://localhost:8004/v1/images/generations   -H "Content-Type: application/json"   -d '{
       "prompt": "A beautiful sunset over the ocean with sailing boats",
       "size": "1024x1024",
       "num_inference_steps": 50,
       "guidance_scale": 1.5,
       "seed": 42
     }' | jq -r '.data[0].b64_json' | base64 -d > output.png

Unified multi-stage image request construction through shared logic in serving_chat
Kept single-stage behavior intact with explicit fallback paths.

Test Plan

Test Result

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan. Please provide the test scripts & test commands. Please state the reasons if your codes don't require additional test scripts. For test file guidelines, please check the test style doc
The test results. Please paste the results comparison before and after, or the e2e results.
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs.
(Optional) Release notes update. If your change is user-facing, please update the release notes draft.

BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)

…pipeline Signed-off-by: Lancer <maruixiang6688@gmail.com>

Signed-off-by: Lancer <maruixiang6688@gmail.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8a9151a61f

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-03-27T09:36:03Z

+
    try:
+        if _should_route_images_via_chat(stage_configs):
+            chat_handler = Omnichat(raw_request)


Handle missing openai_serving_chat before routing /images

Guard this lookup with getattr before invoking Omnichat: Omnichat(raw_request) ultimately reads request.app.state.openai_serving_chat as a direct attribute, so when that state key is absent (e.g., compatibility setups that only provide engine_client/stage_configs) it raises AttributeError before your if chat_handler is None check runs. That turns valid multi-stage /v1/images/generations calls into a 500 instead of a controlled fallback/error response.

Useful? React with 👍 / 👎.

RuixiangMa · 2026-03-27T09:43:04Z

@JaredforReal PTAL

hsliuustc0106

🏗️ Architecture Review

Excellent bugfix with architectural improvement! This is a textbook example of identifying and fixing a root cause correctly.

✅ What Works Well

1. Root Cause Analysis 🎯

Correctly identified the request construction divergence between and
Understood the impact on multi-stage pipelines (AR/comprehension conditioning)

2. Elegant Solution 🏗️

```
Before: Divergent paths
/v1/images → Direct path (missing stage conditioning)
/v1/chat → Chat path (correct stage inputs)

After: Unified path for multi-stage
/v1/images (multi-stage) → Chat path → Consistent quality
/v1/images (single-stage) → Direct path → Backward compatible
```

3. Backward Compatibility 🛡️

Single-stage behavior remains unchanged
No breaking changes
Smooth migration path

4. Code Quality ✨

Clean extraction of shared logic into `generate_diffusion_images()`
Good separation of concerns
Proper error handling

5. Test Coverage 📊

Added regression test for multi-stage path
Unit test for `_build_multistage_generation_inputs()`
Mock design is appropriate

⚠️ Suggestions for Improvement

1. Documentation Update (Documentation)

Add architecture change documentation to help users understand the fix:

```markdown

Architecture Change

Unified multi-stage image generation through
`OmniOpenAIServingChat.generate_diffusion_images()`
Single-stage pipelines: No change (backward compatible)
Multi-stage pipelines: Now route through chat handler
Fixes quality drift in models like GLM-Image multi-stage pipeline
```

2. Test Enhancement (Test Coverage)

Consider adding edge case tests:

```python
def test_multistage_error_handling():
"""Test error handling in multi-stage path"""
# Test missing chat_handler
# Test generate_diffusion_images error response

def test_single_stage_backward_compatibility():
"""Ensure single-stage behavior unchanged"""
# Verify exact same behavior as before
```

3. Performance Monitoring (Reliability)

Monitor performance impact in production:

Multi-stage path adds one layer of indirection
Should be negligible, but worth tracking
Consider caching if performance becomes an issue

📊 Summary

BLOCKER scan:

✅ Correctness: PASS
✅ Reliability/Safety: PASS
✅ Breaking Changes: PASS (backward compatible)
✅ Test Coverage: PASS
⚠️ Documentation: Needs architecture change doc
✅ Security: PASS

OVERALL: NO CRITICAL BLOCKERS

VERDICT: COMMENT (ready to approve after doc update)

🎯 Architecture Highlights

This is a perfect example of proper architectural thinking:

✅ Identify root cause - not just symptoms
✅ Design elegant solution - unified path via routing
✅ Maintain compatibility - single-stage unchanged
✅ Add tests - regression + unit tests
✅ Clean code - extracted shared logic

Great work on this architectural fix! The solution is elegant and well-executed. 🏗️

Architecture Score: 5/5 - Excellent design with minor doc needs

lishunyang12

left a few comments, mostly around the clone fallback and the param-setting repetition.

lishunyang12 · 2026-04-02T15:47:07Z

+
    try:
+        if _should_route_images_via_chat(stage_configs):
+            chat_handler = Omnichat(raw_request)


Omnichat() accesses request.app.state.openai_serving_chat directly, so it raises AttributeError before the None check ever runs. Use getattr(raw_request.app.state, "openai_serving_chat", None) instead.

Suggested change

chat_handler = Omnichat(raw_request)

chat_handler = getattr(raw_request.app.state, "openai_serving_chat", None)

This now reads openai_serving_chat via getattr(..., None) instead of calling Omnichat(...) before the None check.

lishunyang12 · 2026-04-02T15:47:07Z

                default_params = SamplingParams(**default_params)
            if idx == comprehension_idx:
                params = self._apply_request_overrides(default_params, request)
                sampling_params_list.append(params)


Silent fallback to the original (un-cloned) object means every request mutates the engine defaults in-place. After the first request with custom params, all subsequent requests see corrupted defaults. This needs to either raise or deep-copy.

I removed the silent fallback to the original object. Non-comprehension stage defaults must now clone successfully, otherwise we raise instead of reusing and mutating shared defaults.

lishunyang12 · 2026-04-02T15:47:07Z

+            if stage_type == "diffusion":
+                if hasattr(default_stage_params, "height") and height is not None:
+                    default_stage_params.height = height
+                if hasattr(default_stage_params, "width") and width is not None:


The 12 repeated if hasattr(dp, X) and X is not None: dp.X = X blocks are hard to maintain — easy to forget a field or introduce a typo. Consider a helper like:

def _set_if_supported(obj, **kwargs): for k, v in kwargs.items(): if v is not None and hasattr(obj, k): setattr(obj, k, v)

then call _set_if_supported(default_stage_params, height=height, width=width, ...) once.

Good suggestion,thx

lishunyang12 · 2026-04-02T15:47:07Z


+    def _should_route_images_via_chat(configs: list[Any]) -> bool:
+        # Unify request construction for any multi-stage pipeline to avoid
+        # divergence between /v1/images and /v1/chat/completions.


Nit: _should_route_images_via_chat is a one-liner nested function. Just inline the condition.

Suggested change

# divergence between /v1/images and /v1/chat/completions.

if len(stage_configs) > 1:

# Conflicts: # tests/entrypoints/openai_api/test_image_server.py # vllm_omni/entrypoints/openai/api_server.py # vllm_omni/entrypoints/openai/serving_chat.py

Signed-off-by: Lancer <maruixiang6688@gmail.com>

hsliuustc0106

lgtm

hsliuustc0106 · 2026-04-16T04:22:38Z

fix ci please

hsliuustc0106 · 2026-04-16T04:24:42Z

+        if guidance_scale_2 is not None:
+            gen_params.guidance_scale_2 = guidance_scale_2
+        if layers is not None:
+            gen_params.layers = layers


This removes the call that existed in the old . For where comes from , invalid values (e.g., ) will now silently pass through instead of returning a 400 error. Please add validation here.

hsliuustc0106 · 2026-04-16T04:24:42Z

+            effective_seed = request.seed if request.seed is not None else random.randint(0, MAX_UINT32_SEED)
+            extra_body: dict[str, Any] = {
+                "seed": effective_seed,
+                "num_outputs_per_prompt": request.n,


This new multistage path calls but is missing the call that the single-stage path has (line 1391). Oversized image requests will bypass the safety guard.

hsliuustc0106 · 2026-04-16T04:24:42Z

+                if hasattr(default_stage_params, "clone"):
+                    try:
+                        default_stage_params = default_stage_params.clone()
+                    except Exception:


Silent hides clone failures. This proceeds with the un-cloned default params object, which would mutate shared defaults across requests. Should either let the exception propagate or log a warning.

hsliuustc0106 · 2026-04-16T04:24:42Z

+        # divergence between /v1/images and /v1/chat/completions.
+        if len(stage_configs) > 1:
+            chat_handler = getattr(raw_request.app.state, "openai_serving_chat", None)
+            if chat_handler is None:


This multistage path doesn't include and that the single-stage path passes via (lines 1371-1374). If any model relies on these for correct generation quality, the multistage path will silently diverge.

Signed-off-by: Lancer <maruixiang6688@gmail.com>

…pipeline (vllm-project#2267) Signed-off-by: Lancer <maruixiang6688@gmail.com> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

RuixiangMa added 3 commits March 27, 2026 15:54

[Bugfix] Fix image quality in /v1/images/generations for multi-stage …

b9d2ed1

…pipeline Signed-off-by: Lancer <maruixiang6688@gmail.com>

Merge remote-tracking branch 'origin/main' into fixmutistageimage

efb727a

upd

8a9151a

Signed-off-by: Lancer <maruixiang6688@gmail.com>

RuixiangMa requested a review from hsliuustc0106 as a code owner March 27, 2026 09:29

RuixiangMa changed the title ~~Fix image quality in /v1/images/generations for multi-stage pipelin~~ [Bugfix] Fix image quality in /v1/images/generations for multi-stage pipelin Mar 27, 2026

chatgpt-codex-connector Bot reviewed Mar 27, 2026

View reviewed changes

RuixiangMa changed the title ~~[Bugfix] Fix image quality in /v1/images/generations for multi-stage pipelin~~ [Bugfix] Fix image quality in /v1/images/generations for multi-stage pipeline Mar 27, 2026

hsliuustc0106 reviewed Mar 27, 2026

View reviewed changes

This comment was marked as duplicate.

Sign in to view

lishunyang12 reviewed Apr 2, 2026

View reviewed changes

RuixiangMa added 2 commits April 5, 2026 15:48

Merge remote-tracking branch 'origin/main' into fixmutistageimage

af34e82

# Conflicts: # tests/entrypoints/openai_api/test_image_server.py # vllm_omni/entrypoints/openai/api_server.py # vllm_omni/entrypoints/openai/serving_chat.py

upd

688aac3

Signed-off-by: Lancer <maruixiang6688@gmail.com>

RuixiangMa mentioned this pull request Apr 15, 2026

[Bug]: glm-image with sp support generate wrong pictures on ascend device #2814

Closed

1 task

RuixiangMa added 2 commits April 15, 2026 17:30

Merge origin/main into fixmutistageimage

4e7bca4

upd

8981654

Signed-off-by: Lancer <maruixiang6688@gmail.com>

hsliuustc0106 approved these changes Apr 16, 2026

View reviewed changes

hsliuustc0106 added the ready label to trigger buildkite CI label Apr 16, 2026

Merge branch 'main' into fixmutistageimage

c192610

hsliuustc0106 reviewed Apr 16, 2026

View reviewed changes

RuixiangMa added 2 commits April 17, 2026 15:45

Resolve merge conflict in serving_chat.py

b4834e0

upd

f1ce40f

Signed-off-by: Lancer <maruixiang6688@gmail.com>

RuixiangMa force-pushed the fixmutistageimage branch from 583f5aa to f1ce40f Compare April 17, 2026 09:42

hsliuustc0106 approved these changes Apr 17, 2026

View reviewed changes

hsliuustc0106 merged commit cf75ae6 into vllm-project:main Apr 17, 2026
8 checks passed

	chat_handler = Omnichat(raw_request)
	chat_handler = getattr(raw_request.app.state, "openai_serving_chat", None)

	# divergence between /v1/images and /v1/chat/completions.
	if len(stage_configs) > 1:

Conversation

RuixiangMa commented Mar 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Purpose

Test Plan

Test Result

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

RuixiangMa commented Mar 27, 2026

Uh oh!

hsliuustc0106 left a comment

Choose a reason for hiding this comment

🏗️ Architecture Review

✅ What Works Well

1. Root Cause Analysis 🎯

2. Elegant Solution 🏗️

3. Backward Compatibility 🛡️

4. Code Quality ✨

5. Test Coverage 📊

⚠️ Suggestions for Improvement

1. Documentation Update (Documentation)

Architecture Change

2. Test Enhancement (Test Coverage)

3. Performance Monitoring (Reliability)

📊 Summary

🎯 Architecture Highlights

Uh oh!

This comment was marked as duplicate.

Uh oh!

lishunyang12 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hsliuustc0106 left a comment

Choose a reason for hiding this comment

Uh oh!

hsliuustc0106 commented Apr 16, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

RuixiangMa commented Mar 27, 2026 •

edited

Loading