[WIP] Add HTTP 400 error propagation to diffusion pipelines via OmniInputValidationError #3119
vraiti wants to merge 1 commit into
Conversation
Signed-off-by: vraiti <vraiti@redhat.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 724b225c64
```diff
 req_id=sched_req_id,
 step_index=None,
 finished=True,
-result=DiffusionOutput(error=str(exc)),
+result=DiffusionOutput(error=str(exc), error_type=type(exc).__name__),
```
Preserve validation exception type through `DiffusionEngine.step`
This change stores `error_type` in `DiffusionOutput`, but `DiffusionEngine.step()` still collapses any `output.error` into a `RuntimeError` (`raise RuntimeError(output.error)`), so input-validation failures raised in pipeline `forward()` (including the new `OmniInputValidationError` prompt-length path) are surfaced upstream as `RuntimeError`. Because `_check_engine_output_error` only converts `error_type == "OmniInputValidationError"` into a 400, these requests still return 500 instead of the intended 400. Please preserve/re-raise the original type from `output.error_type` when `output.error` is present.
Purpose
Diffusion pipeline input-validation errors (oversized prompts, too many images, malformed input) currently raise `ValueError`, which surfaces as HTTP 500 in the API server. #2840 added the ability for QwenImageEditPlusPipeline to return a 400 error when too many images are passed as input, to prevent OOM. This PR generalizes that feature to all diffusion models for any input validation via `OmniInputValidationError`. It also adds `max_multimodal_text_tokens` for text input size validation.

Changes:

- `vllm_omni/exceptions.py` (new) — Defines `OmniInputValidationError(Exception)` in a zero-dependency module so pipeline code can import it without pulling in entrypoints.
- `vllm_omni/entrypoints/omni_base.py` — Re-exports `OmniInputValidationError` for backward compatibility. `_check_engine_output_error` inspects `error_type == "OmniInputValidationError"` and re-raises accordingly. Adds `OmniEngineDeadError` with `error_stage_id` for richer diagnostics.
- `vllm_omni/entrypoints/openai/api_server.py` — Image generation, image editing, chat completion, speech, and video endpoints catch `OmniInputValidationError` and return HTTP 400. Engine error handling refactored into `_create_engine_error_json_response` / `_build_engine_error_payload` with `error_stage_id` and `request_id` in the response body. Adds a `_get_int_limit` helper and a `_get_max_multimodal_text_tokens` accessor.
- `vllm_omni/entrypoints/openai/serving_chat.py` — Catches `OmniInputValidationError` in chat completion and returns an error response instead of a 500.
- `vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image_edit_plus.py` — Pre-process uses `od_config.max_multimodal_image_inputs` (falls back to hardcoded 4) and raises `OmniInputValidationError`. `forward()` caps `max_sequence_length` with `od_config.max_multimodal_text_tokens` and wraps `encode_prompt` calls to convert `ValueError` to `OmniInputValidationError`.
- `vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image_edit.py` — Pre-process raises `OmniInputValidationError` for missing or multiple images.
- `vllm_omni/diffusion/models/flux2/pipeline_flux2.py` — `Flux2ImageProcessor.check_image_input()` raises `OmniInputValidationError` for wrong type, too small, or extreme aspect ratio.
- `vllm_omni/diffusion/model_metadata.py` — Adds a `max_multimodal_text_tokens` field to `DiffusionModelMetadata`. Registers defaults for Flux2Pipeline (512), QwenImagePipeline (1024), QwenImageLayeredPipeline (1024, 1 image), QwenImageEditPipeline (1024, 1 image), QwenImageEditPlusPipeline (1024, 4 images).
- `vllm_omni/diffusion/data.py` — Adds `max_multimodal_text_tokens` and `error_type` fields to `OmniDiffusionConfig` / `DiffusionOutput`. `update_multimodal_support()` respects user-supplied stage config values for both `max_multimodal_image_inputs` and `max_multimodal_text_tokens` (metadata defaults only fill `None`).

Test Plan

```shell
# Unit tests — input validation and max_sequence_length enforcement
pytest tests/diffusion/models/qwen_image/test_qwen_image_edit_plus.py -v
pytest tests/diffusion/models/qwen_image/test_qwen_image_max_sequence_length.py -v
```

- `test_qwen_image_edit_plus_rejects_too_many_input_images` — verifies `OmniInputValidationError` is raised at the default limit (4 images)
- `test_qwen_image_edit_plus_rejects_images_exceeding_config_limit` — verifies the config-driven limit (`max_multimodal_image_inputs=2`)
- `test_forward_caps_max_sequence_length_with_max_multimodal_text_tokens` — verifies `od_config.max_multimodal_text_tokens` caps the token count and `OmniInputValidationError` propagates through `forward()`
- `test_qwen_image_max_sequence_length.py` tests continue to pass (prompt-length validation unchanged)

Test Result
22/22 relevant tests passed.
Essential Elements of an Effective PR Description Checklist
`supported_models.md` and `examples` for a new model. Please run `mkdocs serve` to sync the documentation editions to `./docs`.
BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)