
[WIP] Add HTTP 400 error propagation to diffusion pipelines via OmniInputVa… #3119

Open
vraiti wants to merge 1 commit into vllm-project:main from vraiti:feat_model-len-guard

Conversation

Contributor

@vraiti vraiti commented Apr 24, 2026

Purpose

Diffusion pipeline input-validation errors (oversized prompts, too many images, malformed input) currently raise ValueError, which surfaces as HTTP 500 in the API server. #2840 added the ability for QwenImageEditPlusPipeline to return a 400 error when too many input images are passed, preventing OOM. This PR generalizes that mechanism to all diffusion models for any input validation via OmniInputValidationError.

Also adds max_multimodal_text_tokens for text input size validation.
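
The pattern can be sketched as follows (hypothetical names; the real vllm_omni/exceptions.py and pipeline validation code may differ in detail):

```python
# Hypothetical sketch of the pattern this PR introduces; the actual
# vllm_omni/exceptions.py implementation may differ in detail.

class OmniInputValidationError(Exception):
    """Raised when user-supplied input fails pipeline validation.

    Entrypoints catch this and return HTTP 400 instead of a generic 500.
    """


def check_image_count(images, max_images):
    """Reject requests that exceed the configured image-input limit."""
    if max_images is not None and len(images) > max_images:
        raise OmniInputValidationError(
            f"Received {len(images)} input images, but this model "
            f"accepts at most {max_images}."
        )
```

Keeping the exception in a zero-dependency module means pipeline code can raise it without importing anything from the entrypoints layer.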

Changes:

  1. vllm_omni/exceptions.py (new) — Defines OmniInputValidationError(Exception) in a zero-dependency module so pipeline code can import it without pulling in entrypoints.

  2. vllm_omni/entrypoints/omni_base.py — Re-exports OmniInputValidationError for backward compatibility. _check_engine_output_error inspects error_type == "OmniInputValidationError" and re-raises accordingly. Adds OmniEngineDeadError with error_stage_id for richer diagnostics.

  3. vllm_omni/entrypoints/openai/api_server.py — Image generation, image editing, chat completion, speech, and video endpoints catch OmniInputValidationError and return HTTP 400. Engine error handling refactored into _create_engine_error_json_response / _build_engine_error_payload with error_stage_id and request_id in the response body. Adds _get_int_limit helper and _get_max_multimodal_text_tokens accessor.

  4. vllm_omni/entrypoints/openai/serving_chat.py — Catches OmniInputValidationError in chat completion and returns error response instead of 500.

  5. vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image_edit_plus.py — Pre-process uses od_config.max_multimodal_image_inputs (falls back to hardcoded 4) and raises OmniInputValidationError. forward() caps max_sequence_length with od_config.max_multimodal_text_tokens and wraps encode_prompt calls to convert ValueError to OmniInputValidationError.

  6. vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image_edit.py — Pre-process raises OmniInputValidationError for missing or multiple images.

  7. vllm_omni/diffusion/models/flux2/pipeline_flux2.py — Flux2ImageProcessor.check_image_input() raises OmniInputValidationError for wrong type, too-small images, or extreme aspect ratio.

  8. vllm_omni/diffusion/model_metadata.py — Adds max_multimodal_text_tokens field to DiffusionModelMetadata. Registers defaults for Flux2Pipeline (512), QwenImagePipeline (1024), QwenImageLayeredPipeline (1024, 1 image), QwenImageEditPipeline (1024, 1 image), QwenImageEditPlusPipeline (1024, 4 images).

  9. vllm_omni/diffusion/data.py — Adds max_multimodal_text_tokens and error_type fields to OmniDiffusionConfig / DiffusionOutput. update_multimodal_support() respects user-supplied stage config values for both max_multimodal_image_inputs and max_multimodal_text_tokens (metadata defaults only fill None).
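
The endpoint-side half of the change (items 3–4 above) amounts to catching the validation error and building a 400 payload. A minimal sketch, with illustrative helper names rather than the actual api_server.py code:

```python
# Hypothetical sketch of mapping OmniInputValidationError to an HTTP 400
# payload; function names here are illustrative, not the actual
# vllm_omni/entrypoints/openai/api_server.py helpers.

class OmniInputValidationError(Exception):
    pass


def _build_error_payload(exc, request_id):
    """Shape the error body with type, message, and request id."""
    return {
        "error": {
            "type": type(exc).__name__,
            "message": str(exc),
            "request_id": request_id,
        }
    }


def handle_request(run_pipeline, request, request_id):
    """Run the pipeline; turn validation failures into (400, payload)."""
    try:
        return 200, run_pipeline(request)
    except OmniInputValidationError as exc:
        return 400, _build_error_payload(exc, request_id)
```

Including request_id in the body mirrors the refactored engine-error responses described in item 3.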

Test Plan

# Unit tests — input validation and max_sequence_length enforcement
pytest tests/diffusion/models/qwen_image/test_qwen_image_edit_plus.py -v
pytest tests/diffusion/models/qwen_image/test_qwen_image_max_sequence_length.py -v
  • test_qwen_image_edit_plus_rejects_too_many_input_images — verifies OmniInputValidationError raised at default limit (4 images)
  • test_qwen_image_edit_plus_rejects_images_exceeding_config_limit — verifies config-driven limit (max_multimodal_image_inputs=2)
  • test_forward_caps_max_sequence_length_with_max_multimodal_text_tokens — verifies od_config.max_multimodal_text_tokens caps token count and raises OmniInputValidationError through forward()
  • Existing test_qwen_image_max_sequence_length.py tests continue to pass (prompt-length validation unchanged)
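
A standalone analogue of the first test above, using plain asserts instead of the repo's pytest fixtures (check_image_count stands in for the pipeline's pre-process hook):

```python
# Standalone analogue of test_qwen_image_edit_plus_rejects_too_many_input_images;
# the real test exercises the actual pipeline via pytest fixtures.

class OmniInputValidationError(Exception):
    pass


def check_image_count(images, max_images=4):
    """Stand-in for the pipeline pre-process hook's image-count check."""
    if len(images) > max_images:
        raise OmniInputValidationError(
            f"{len(images)} images exceeds limit of {max_images}"
        )


def test_rejects_too_many_input_images():
    try:
        check_image_count([object()] * 5)  # default limit of 4
    except OmniInputValidationError:
        return  # expected
    raise AssertionError("expected OmniInputValidationError")
```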

Test Result

22/22 relevant tests passed.


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan. Please provide the test scripts and commands, or state why your change doesn't require additional test scripts. For test file guidelines, please check the test style doc.
  • The test results. Please paste the results comparison before and after, or the e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs.
  • (Optional) Release notes update. If your change is user-facing, please update the release notes draft.


…lidationError

Signed-off-by: vraiti <vraiti@redhat.com>
@vraiti vraiti requested a review from hsliuustc0106 as a code owner April 24, 2026 21:30
@vraiti vraiti changed the title Add HTTP 400 error propagation to diffusion pipelines via OmniInputVa… [WIP] Add HTTP 400 error propagation to diffusion pipelines via OmniInputVa… Apr 24, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 724b225c64


Comment on lines 367 to +370:

      req_id=sched_req_id,
      step_index=None,
      finished=True,
-     result=DiffusionOutput(error=str(exc)),
+     result=DiffusionOutput(error=str(exc), error_type=type(exc).__name__),

P1: Preserve validation exception type through DiffusionEngine.step

This change stores error_type in DiffusionOutput, but DiffusionEngine.step() still collapses any output.error into RuntimeError (raise RuntimeError(output.error)), so input-validation failures raised in pipeline forward() (including the new OmniInputValidationError prompt-length path) are surfaced upstream as RuntimeError. Because _check_engine_output_error only converts error_type == "OmniInputValidationError" into a 400, these requests still return 500 instead of the intended 400. Please preserve/re-raise the original type from output.error_type when output.error is present.
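
One way to implement this suggestion (a sketch only; the real DiffusionEngine.step and its surrounding types differ) is to map the recorded error_type string back to the exception class before raising:

```python
# Sketch of the suggested fix: re-raise the original exception type recorded
# in DiffusionOutput.error_type rather than collapsing every engine error
# into RuntimeError. Names here are illustrative.

class OmniInputValidationError(Exception):
    pass


# Registry of error-type names that should survive the engine boundary.
_ERROR_TYPES = {
    "OmniInputValidationError": OmniInputValidationError,
}


def raise_engine_error(error, error_type=None):
    """Re-raise an engine error, preserving its original type when known."""
    exc_cls = _ERROR_TYPES.get(error_type, RuntimeError)
    raise exc_cls(error)
```

With this in place, _check_engine_output_error would see the original OmniInputValidationError and the API server could return the intended 400.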


