Skip to content

[Bugfix] Fix image quality in /v1/images/generations for multi-stage pipeline#2267

Merged
hsliuustc0106 merged 10 commits into
vllm-project:mainfrom
RuixiangMa:fixmutistageimage
Apr 17, 2026
Merged

[Bugfix] Fix image quality in /v1/images/generations for multi-stage pipeline#2267
hsliuustc0106 merged 10 commits into
vllm-project:mainfrom
RuixiangMa:fixmutistageimage

Conversation

@RuixiangMa
Copy link
Copy Markdown
Collaborator

@RuixiangMa RuixiangMa commented Mar 27, 2026

Purpose

/v1/images/generations and /v1/chat/completions used different request-construction paths in multi-stage pipelines, so stage inputs diverged (especially AR/comprehension conditioning), causing quality drift (e.g., noisy outputs).

for example:

vllm serve --model ZhipuAI/GLM-Image --omni --port 8004

curl -X POST http://localhost:8004/v1/images/generations   -H "Content-Type: application/json"   -d '{
       "prompt": "A beautiful sunset over the ocean with sailing boats",
       "size": "1024x1024",
       "num_inference_steps": 50,
       "guidance_scale": 1.5,
       "seed": 42
     }' | jq -r '.data[0].b64_json' | base64 -d > output.png
output
  • Unified multi-stage image request construction through shared logic in serving_chat
  • Kept single-stage behavior intact with explicit fallback paths.

Test Plan

Test Result

output1
Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan. Please provide the test scripts & test commands. Please state the reasons if your codes don't require additional test scripts. For test file guidelines, please check the test style doc
  • The test results. Please paste the results comparison before and after, or the e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs.
  • (Optional) Release notes update. If your change is user-facing, please update the release notes draft.

BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)

…pipeline

Signed-off-by: Lancer <maruixiang6688@gmail.com>
Signed-off-by: Lancer <maruixiang6688@gmail.com>
@RuixiangMa RuixiangMa changed the title Fix image quality in /v1/images/generations for multi-stage pipelin [Bugfix] Fix image quality in /v1/images/generations for multi-stage pipelin Mar 27, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8a9151a61f

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".


try:
if _should_route_images_via_chat(stage_configs):
chat_handler = Omnichat(raw_request)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Handle missing openai_serving_chat before routing /images

Guard this lookup with getattr before invoking Omnichat: Omnichat(raw_request) ultimately reads request.app.state.openai_serving_chat as a direct attribute, so when that state key is absent (e.g., compatibility setups that only provide engine_client/stage_configs) it raises AttributeError before your if chat_handler is None check runs. That turns valid multi-stage /v1/images/generations calls into a 500 instead of a controlled fallback/error response.

Useful? React with 👍 / 👎.

@RuixiangMa
Copy link
Copy Markdown
Collaborator Author

@JaredforReal PTAL

@RuixiangMa RuixiangMa changed the title [Bugfix] Fix image quality in /v1/images/generations for multi-stage pipelin [Bugfix] Fix image quality in /v1/images/generations for multi-stage pipeline Mar 27, 2026
Copy link
Copy Markdown
Collaborator

@hsliuustc0106 hsliuustc0106 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🏗️ Architecture Review

Excellent bugfix with architectural improvement! This is a textbook example of identifying and fixing a root cause correctly.


✅ What Works Well

1. Root Cause Analysis 🎯

  • Correctly identified the request construction divergence between and
  • Understood the impact on multi-stage pipelines (AR/comprehension conditioning)

2. Elegant Solution 🏗️

```
Before: Divergent paths
/v1/images → Direct path (missing stage conditioning)
/v1/chat → Chat path (correct stage inputs)

After: Unified path for multi-stage
/v1/images (multi-stage) → Chat path → Consistent quality
/v1/images (single-stage) → Direct path → Backward compatible
```

3. Backward Compatibility 🛡️

  • Single-stage behavior remains unchanged
  • No breaking changes
  • Smooth migration path

4. Code Quality

  • Clean extraction of shared logic into `generate_diffusion_images()`
  • Good separation of concerns
  • Proper error handling

5. Test Coverage 📊

  • Added regression test for multi-stage path
  • Unit test for `_build_multistage_generation_inputs()`
  • Mock design is appropriate

⚠️ Suggestions for Improvement

1. Documentation Update (Documentation)

Add architecture change documentation to help users understand the fix:

```markdown

Architecture Change

  • Unified multi-stage image generation through
    `OmniOpenAIServingChat.generate_diffusion_images()`
  • Single-stage pipelines: No change (backward compatible)
  • Multi-stage pipelines: Now route through chat handler
  • Fixes quality drift in models like GLM-Image multi-stage pipeline
    ```

2. Test Enhancement (Test Coverage)

Consider adding edge case tests:

```python
def test_multistage_error_handling():
"""Test error handling in multi-stage path"""
# Test missing chat_handler
# Test generate_diffusion_images error response

def test_single_stage_backward_compatibility():
"""Ensure single-stage behavior unchanged"""
# Verify exact same behavior as before
```

3. Performance Monitoring (Reliability)

Monitor performance impact in production:

  • Multi-stage path adds one layer of indirection
  • Should be negligible, but worth tracking
  • Consider caching if performance becomes an issue

📊 Summary

BLOCKER scan:

  • ✅ Correctness: PASS
  • ✅ Reliability/Safety: PASS
  • ✅ Breaking Changes: PASS (backward compatible)
  • ✅ Test Coverage: PASS
  • ⚠️ Documentation: Needs architecture change doc
  • ✅ Security: PASS

OVERALL: NO CRITICAL BLOCKERS

VERDICT: COMMENT (ready to approve after doc update)


🎯 Architecture Highlights

This is a perfect example of proper architectural thinking:

  1. Identify root cause - not just symptoms
  2. Design elegant solution - unified path via routing
  3. Maintain compatibility - single-stage unchanged
  4. Add tests - regression + unit tests
  5. Clean code - extracted shared logic

Great work on this architectural fix! The solution is elegant and well-executed. 🏗️

Architecture Score: 5/5 - Excellent design with minor doc needs

hsliuustc0106

This comment was marked as duplicate.

Copy link
Copy Markdown
Collaborator

@lishunyang12 lishunyang12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a few comments, mostly around the clone fallback and the param-setting repetition.


try:
if _should_route_images_via_chat(stage_configs):
chat_handler = Omnichat(raw_request)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Omnichat() accesses request.app.state.openai_serving_chat directly, so it raises AttributeError before the None check ever runs. Use getattr(raw_request.app.state, "openai_serving_chat", None) instead.

Suggested change
chat_handler = Omnichat(raw_request)
chat_handler = getattr(raw_request.app.state, "openai_serving_chat", None)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now reads openai_serving_chat via getattr(..., None) instead of calling Omnichat(...) before the None check.

default_params = SamplingParams(**default_params)
if idx == comprehension_idx:
params = self._apply_request_overrides(default_params, request)
sampling_params_list.append(params)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silent fallback to the original (un-cloned) object means every request mutates the engine defaults in-place. After the first request with custom params, all subsequent requests see corrupted defaults. This needs to either raise or deep-copy.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the silent fallback to the original object. Non-comprehension stage defaults must now clone successfully, otherwise we raise instead of reusing and mutating shared defaults.

if stage_type == "diffusion":
if hasattr(default_stage_params, "height") and height is not None:
default_stage_params.height = height
if hasattr(default_stage_params, "width") and width is not None:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 12 repeated if hasattr(dp, X) and X is not None: dp.X = X blocks are hard to maintain — easy to forget a field or introduce a typo. Consider a helper like:

def _set_if_supported(obj, **kwargs):
    for k, v in kwargs.items():
        if v is not None and hasattr(obj, k):
            setattr(obj, k, v)

then call _set_if_supported(default_stage_params, height=height, width=width, ...) once.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion,thx


def _should_route_images_via_chat(configs: list[Any]) -> bool:
# Unify request construction for any multi-stage pipeline to avoid
# divergence between /v1/images and /v1/chat/completions.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: _should_route_images_via_chat is a one-liner nested function. Just inline the condition.

Suggested change
# divergence between /v1/images and /v1/chat/completions.
if len(stage_configs) > 1:

# Conflicts:
#	tests/entrypoints/openai_api/test_image_server.py
#	vllm_omni/entrypoints/openai/api_server.py
#	vllm_omni/entrypoints/openai/serving_chat.py
Signed-off-by: Lancer <maruixiang6688@gmail.com>
Signed-off-by: Lancer <maruixiang6688@gmail.com>
Copy link
Copy Markdown
Collaborator

@hsliuustc0106 hsliuustc0106 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@hsliuustc0106 hsliuustc0106 added the ready label to trigger buildkite CI label Apr 16, 2026
@hsliuustc0106
Copy link
Copy Markdown
Collaborator

fix ci please

if guidance_scale_2 is not None:
gen_params.guidance_scale_2 = guidance_scale_2
if layers is not None:
gen_params.layers = layers
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This removes the call that existed in the old . For where comes from , invalid values (e.g., ) will now silently pass through instead of returning a 400 error. Please add validation here.

effective_seed = request.seed if request.seed is not None else random.randint(0, MAX_UINT32_SEED)
extra_body: dict[str, Any] = {
"seed": effective_seed,
"num_outputs_per_prompt": request.n,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new multistage path calls but is missing the call that the single-stage path has (line 1391). Oversized image requests will bypass the safety guard.

if hasattr(default_stage_params, "clone"):
try:
default_stage_params = default_stage_params.clone()
except Exception:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silent hides clone failures. This proceeds with the un-cloned default params object, which would mutate shared defaults across requests. Should either let the exception propagate or log a warning.

# divergence between /v1/images and /v1/chat/completions.
if len(stage_configs) > 1:
chat_handler = getattr(raw_request.app.state, "openai_serving_chat", None)
if chat_handler is None:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This multistage path doesn't include and that the single-stage path passes via (lines 1371-1374). If any model relies on these for correct generation quality, the multistage path will silently diverge.

Signed-off-by: Lancer <maruixiang6688@gmail.com>
@hsliuustc0106 hsliuustc0106 merged commit cf75ae6 into vllm-project:main Apr 17, 2026
8 checks passed
lvliang-intel pushed a commit to lvliang-intel/vllm-omni that referenced this pull request Apr 20, 2026
…pipeline (vllm-project#2267)

Signed-off-by: Lancer <maruixiang6688@gmail.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
lengrongfu pushed a commit to lengrongfu/vllm-omni that referenced this pull request May 1, 2026
…pipeline (vllm-project#2267)

Signed-off-by: Lancer <maruixiang6688@gmail.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
clodaghwalsh17 pushed a commit to clodaghwalsh17/nm-vllm-omni-ent that referenced this pull request May 12, 2026
…pipeline (vllm-project#2267)

Signed-off-by: Lancer <maruixiang6688@gmail.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
daixinning pushed a commit to daixinning/vllm-omni that referenced this pull request May 28, 2026
…pipeline (vllm-project#2267)

Signed-off-by: Lancer <maruixiang6688@gmail.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready label to trigger buildkite CI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants