[Bugfix] Fix image quality in /v1/images/generations for multi-stage pipeline#2267
Conversation
…pipeline Signed-off-by: Lancer <maruixiang6688@gmail.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8a9151a61f
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
|
||
| try: | ||
| if _should_route_images_via_chat(stage_configs): | ||
| chat_handler = Omnichat(raw_request) |
There was a problem hiding this comment.
Handle missing openai_serving_chat before routing /images
Guard this lookup with getattr before invoking Omnichat: Omnichat(raw_request) ultimately reads request.app.state.openai_serving_chat as a direct attribute, so when that state key is absent (e.g., compatibility setups that only provide engine_client/stage_configs) it raises AttributeError before your if chat_handler is None check runs. That turns valid multi-stage /v1/images/generations calls into a 500 instead of a controlled fallback/error response.
Useful? React with 👍 / 👎.
|
@JaredforReal PTAL |
hsliuustc0106
left a comment
There was a problem hiding this comment.
🏗️ Architecture Review
Excellent bugfix with architectural improvement! This is a textbook example of identifying and fixing a root cause correctly.
✅ What Works Well
1. Root Cause Analysis 🎯
- Correctly identified the request construction divergence between and
- Understood the impact on multi-stage pipelines (AR/comprehension conditioning)
2. Elegant Solution 🏗️
```
Before: Divergent paths
/v1/images → Direct path (missing stage conditioning)
/v1/chat → Chat path (correct stage inputs)
After: Unified path for multi-stage
/v1/images (multi-stage) → Chat path → Consistent quality
/v1/images (single-stage) → Direct path → Backward compatible
```
3. Backward Compatibility 🛡️
- Single-stage behavior remains unchanged
- No breaking changes
- Smooth migration path
4. Code Quality ✨
- Clean extraction of shared logic into `generate_diffusion_images()`
- Good separation of concerns
- Proper error handling
5. Test Coverage 📊
- Added regression test for multi-stage path
- Unit test for `_build_multistage_generation_inputs()`
- Mock design is appropriate
⚠️ Suggestions for Improvement
1. Documentation Update (Documentation)
Add architecture change documentation to help users understand the fix:
```markdown
Architecture Change
- Unified multi-stage image generation through
`OmniOpenAIServingChat.generate_diffusion_images()` - Single-stage pipelines: No change (backward compatible)
- Multi-stage pipelines: Now route through chat handler
- Fixes quality drift in models like GLM-Image multi-stage pipeline
```
2. Test Enhancement (Test Coverage)
Consider adding edge case tests:
```python
def test_multistage_error_handling():
"""Test error handling in multi-stage path"""
# Test missing chat_handler
# Test generate_diffusion_images error response
def test_single_stage_backward_compatibility():
"""Ensure single-stage behavior unchanged"""
# Verify exact same behavior as before
```
3. Performance Monitoring (Reliability)
Monitor performance impact in production:
- Multi-stage path adds one layer of indirection
- Should be negligible, but worth tracking
- Consider caching if performance becomes an issue
📊 Summary
BLOCKER scan:
- ✅ Correctness: PASS
- ✅ Reliability/Safety: PASS
- ✅ Breaking Changes: PASS (backward compatible)
- ✅ Test Coverage: PASS
⚠️ Documentation: Needs architecture change doc- ✅ Security: PASS
OVERALL: NO CRITICAL BLOCKERS
VERDICT: COMMENT (ready to approve after doc update)
🎯 Architecture Highlights
This is a perfect example of proper architectural thinking:
- ✅ Identify root cause - not just symptoms
- ✅ Design elegant solution - unified path via routing
- ✅ Maintain compatibility - single-stage unchanged
- ✅ Add tests - regression + unit tests
- ✅ Clean code - extracted shared logic
Great work on this architectural fix! The solution is elegant and well-executed. 🏗️
Architecture Score: 5/5 - Excellent design with minor doc needs
lishunyang12
left a comment
There was a problem hiding this comment.
left a few comments, mostly around the clone fallback and the param-setting repetition.
|
|
||
| try: | ||
| if _should_route_images_via_chat(stage_configs): | ||
| chat_handler = Omnichat(raw_request) |
There was a problem hiding this comment.
Omnichat() accesses request.app.state.openai_serving_chat directly, so it raises AttributeError before the None check ever runs. Use getattr(raw_request.app.state, "openai_serving_chat", None) instead.
| chat_handler = Omnichat(raw_request) | |
| chat_handler = getattr(raw_request.app.state, "openai_serving_chat", None) |
There was a problem hiding this comment.
This now reads openai_serving_chat via getattr(..., None) instead of calling Omnichat(...) before the None check.
| default_params = SamplingParams(**default_params) | ||
| if idx == comprehension_idx: | ||
| params = self._apply_request_overrides(default_params, request) | ||
| sampling_params_list.append(params) |
There was a problem hiding this comment.
Silent fallback to the original (un-cloned) object means every request mutates the engine defaults in-place. After the first request with custom params, all subsequent requests see corrupted defaults. This needs to either raise or deep-copy.
There was a problem hiding this comment.
I removed the silent fallback to the original object. Non-comprehension stage defaults must now clone successfully, otherwise we raise instead of reusing and mutating shared defaults.
| if stage_type == "diffusion": | ||
| if hasattr(default_stage_params, "height") and height is not None: | ||
| default_stage_params.height = height | ||
| if hasattr(default_stage_params, "width") and width is not None: |
There was a problem hiding this comment.
The 12 repeated if hasattr(dp, X) and X is not None: dp.X = X blocks are hard to maintain — easy to forget a field or introduce a typo. Consider a helper like:
def _set_if_supported(obj, **kwargs):
for k, v in kwargs.items():
if v is not None and hasattr(obj, k):
setattr(obj, k, v)then call _set_if_supported(default_stage_params, height=height, width=width, ...) once.
There was a problem hiding this comment.
Good suggestion,thx
|
|
||
| def _should_route_images_via_chat(configs: list[Any]) -> bool: | ||
| # Unify request construction for any multi-stage pipeline to avoid | ||
| # divergence between /v1/images and /v1/chat/completions. |
There was a problem hiding this comment.
Nit: _should_route_images_via_chat is a one-liner nested function. Just inline the condition.
| # divergence between /v1/images and /v1/chat/completions. | |
| if len(stage_configs) > 1: |
# Conflicts: # tests/entrypoints/openai_api/test_image_server.py # vllm_omni/entrypoints/openai/api_server.py # vllm_omni/entrypoints/openai/serving_chat.py
|
fix ci please |
| if guidance_scale_2 is not None: | ||
| gen_params.guidance_scale_2 = guidance_scale_2 | ||
| if layers is not None: | ||
| gen_params.layers = layers |
There was a problem hiding this comment.
This removes the call that existed in the old . For where comes from , invalid values (e.g., ) will now silently pass through instead of returning a 400 error. Please add validation here.
| effective_seed = request.seed if request.seed is not None else random.randint(0, MAX_UINT32_SEED) | ||
| extra_body: dict[str, Any] = { | ||
| "seed": effective_seed, | ||
| "num_outputs_per_prompt": request.n, |
There was a problem hiding this comment.
This new multistage path calls but is missing the call that the single-stage path has (line 1391). Oversized image requests will bypass the safety guard.
| if hasattr(default_stage_params, "clone"): | ||
| try: | ||
| default_stage_params = default_stage_params.clone() | ||
| except Exception: |
There was a problem hiding this comment.
Silent hides clone failures. This proceeds with the un-cloned default params object, which would mutate shared defaults across requests. Should either let the exception propagate or log a warning.
| # divergence between /v1/images and /v1/chat/completions. | ||
| if len(stage_configs) > 1: | ||
| chat_handler = getattr(raw_request.app.state, "openai_serving_chat", None) | ||
| if chat_handler is None: |
There was a problem hiding this comment.
This multistage path doesn't include and that the single-stage path passes via (lines 1371-1374). If any model relies on these for correct generation quality, the multistage path will silently diverge.
583f5aa to
f1ce40f
Compare
…pipeline (vllm-project#2267) Signed-off-by: Lancer <maruixiang6688@gmail.com> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
…pipeline (vllm-project#2267) Signed-off-by: Lancer <maruixiang6688@gmail.com> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
…pipeline (vllm-project#2267) Signed-off-by: Lancer <maruixiang6688@gmail.com> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
…pipeline (vllm-project#2267) Signed-off-by: Lancer <maruixiang6688@gmail.com> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Purpose
/v1/images/generations and /v1/chat/completions used different request-construction paths in multi-stage pipelines, so stage inputs diverged (especially AR/comprehension conditioning), causing quality drift (e.g., noisy outputs).
for example:
vllm serve --model ZhipuAI/GLM-Image --omni --port 8004
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model. Please runmkdocs serveto sync the documentation editions to./docs.BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)