[Feature] online HunyuanImage-3.0 IT2I (image editing) support #3410
💡 Codex Review
Reviewed commit: fff228c533
| lora_dict = _get_lora_from_json_str(lora) | ||
| _parse_lora_request(lora_dict) | ||
| extra_body["lora"] = lora_dict | ||
| if hunyuan_task is not None: |
Initialize hunyuan_task before conditional extraction
hunyuan_task is only assigned inside the JSON-only branch, but it is later referenced unconditionally when building extra_body for multi-stage edits. For non-JSON requests (the existing multipart/form-data path), this raises UnboundLocalError at runtime once len(stage_configs) > 1, so regular image-edit requests to multi-stage pipelines will fail before dispatch.
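The fix described above can be sketched as follows. This is a minimal illustration, not the PR's actual handler: `build_extra_body`, `json_data`, and `num_stages` are hypothetical stand-ins for the real endpoint code, and only the pattern of binding the name before the conditional branch is the point.

```python
from typing import Optional


def build_extra_body(json_data: Optional[dict], num_stages: int) -> dict:
    """Hypothetical stand-in for the part of the handler that builds extra_body."""
    # Bind the name up front so the multipart/form-data path
    # (json_data is None) never reaches the later check unassigned.
    hunyuan_task: Optional[str] = None

    if json_data is not None:
        # JSON-only branch: the only place the value is extracted.
        hunyuan_task = json_data.get("hunyuan_task")

    extra_body: dict = {}
    if num_stages > 1 and hunyuan_task is not None:
        extra_body["hunyuan_task"] = hunyuan_task
    return extra_body
```

Without the initial `hunyuan_task = None`, calling this with `json_data=None` and `num_stages=2` would raise `UnboundLocalError` at the `is not None` check.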
| """ | ||
| # Handle JSON request | ||
| json_data = None | ||
| if raw_request.headers.get("Content-Type") == "application/json": |
Relax JSON content-type check to handle charset parameters
The JSON parsing path only runs when Content-Type is exactly application/json. Many clients send valid JSON headers like application/json; charset=utf-8; those requests skip this branch, leaving form fields unset and causing false 422 validation errors instead of processing the JSON payload. This makes the new JSON mode unreliable across common HTTP clients.
Force-pushed from fff228c to 1e6d0d2
| lora_dict = _get_lora_from_json_str(lora) | ||
| _parse_lora_request(lora_dict) | ||
| extra_body["lora"] = lora_dict | ||
| if hunyuan_task is not None: |
UnboundLocalError: hunyuan_task is only assigned in the JSON branch (lines 1703-1722) but is referenced unconditionally at line 1936. For non-JSON requests using multi-stage pipelines, this will crash before dispatch. Initialize it before the JSON branch.
| """ | ||
| # Handle JSON request | ||
| json_data = None | ||
| if raw_request.headers.get("Content-Type") == "application/json": |
JSON content-type check is too strict. Many clients send "application/json; charset=utf-8" instead of just "application/json". Those requests will skip this branch and cause 422 errors instead of processing the JSON payload. Use "application/json" in raw_request.headers.get("Content-Type", "") to handle charset parameters.
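As an alternative to the substring test, the media type can be compared after stripping parameters. The sketch below is an assumption about how that could look, with `is_json_content_type` as a hypothetical helper name; unlike a plain `in` check, it does not match related types such as `application/json-patch+json`.

```python
from typing import Optional


def is_json_content_type(header_value: Optional[str]) -> bool:
    """Return True when the Content-Type media type is application/json.

    Parameters such as "; charset=utf-8" are ignored, and the comparison
    is case-insensitive, since media types are case-insensitive in HTTP.
    """
    if not header_value:
        return False
    # Everything before the first ";" is the media type itself.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type == "application/json"
```

In the handler this would be used as `is_json_content_type(raw_request.headers.get("Content-Type"))`.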
| layers = extra_body.get("layers") | ||
| resolution = extra_body.get("resolution") | ||
| hunyuan_task = extra_body.get("hunyuan_task") | ||
|
|
Debug print statement should be removed before merge.
| # `attention_mask` to `pixel_attention_mask` so the dict key must match | ||
| # the expected forward signature. | ||
| vit_kwargs = {"spatial_shapes": [], "pixel_attention_mask": []} | ||
| vit_kwargs = {"spatial_shapes": [], "attention_mask": []} |
The comment above says "transformers >=5.54 renamed the kwarg from attention_mask to pixel_attention_mask", but this change does the opposite. Are you using an old transformers version? If so, please add a version constraint or explain why this is safe.
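One way to avoid hard-coding either key is to pick it from the callee's signature. This is only a sketch under assumptions: `pick_mask_kwarg`, `forward_a`, and `forward_b` are hypothetical names, and the real forward method may accept `**kwargs`, in which case signature inspection cannot decide and a version constraint would still be needed.

```python
import inspect
from typing import Callable


def pick_mask_kwarg(forward: Callable) -> str:
    """Return whichever mask keyword the given forward function declares."""
    params = inspect.signature(forward).parameters
    for name in ("pixel_attention_mask", "attention_mask"):
        if name in params:
            return name
    raise TypeError("forward declares neither mask keyword")


# Hypothetical stand-ins for the two forward signatures under discussion.
def forward_a(pixel_values, spatial_shapes, attention_mask):
    return None


def forward_b(pixel_values, spatial_shapes, pixel_attention_mask):
    return None


# Build the kwargs dict with the key that matches the callee.
vit_kwargs = {"spatial_shapes": [], pick_mask_kwarg(forward_a): []}
```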
This PR relies on the bug fix from PR #3395 to run. In fact, this change is a bug fix for the previous implementation rather than online support. I will delete it later.
| images.append(img) | ||
| except Exception as e: | ||
| raise ValueError(f"Failed to open uploaded file: {e}") | ||
| # 4. Local file path |
Local file path support via os.path.exists needs security review. This allows arbitrary file access from the server filesystem. Consider restricting to a whitelist directory or adding path validation to prevent directory traversal attacks.
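The path validation suggested here could look roughly like the sketch below. It assumes a single allowed base directory; `resolve_within` and `base_dir` are hypothetical names, not part of this PR.

```python
from pathlib import Path


def resolve_within(base_dir: str, user_path: str) -> Path:
    """Resolve user_path and reject it unless it stays inside base_dir."""
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks, so the
    # containment check below runs on the real target path.
    candidate = (base / user_path).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"Path escapes allowed directory: {user_path!r}")
    return candidate
```

An absolute `user_path` replaces the base when joined, so it is rejected by the same check unless it already points inside the allowed directory.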
Thanks for the suggestion. Local file upload has been removed; images are now passed via base64 instead.
Force-pushed from 69a4caa to 5459cb5
Have you rebased onto main? This has been removed in main.
Not yet, will rebase now.
| # Extract parameters from JSON | ||
| image = json_data.get("image") | ||
| prompt = json_data.get("prompt") | ||
| model = json_data.get("model") |
Image edit is a standard interface. Why do these fields need to be added? Just reuse the existing input parameters such as image and prompt.
Fixed. Removed all standard field extractions from the JSON block (image, prompt, model, n, size, etc.). These are already declared as Form(...) / File(...) parameters in the function signature and are correctly parsed by FastAPI for standard multipart/form-data requests.
| lora = json_data.get("lora") | ||
| layers = json_data.get("layers") | ||
| resolution = json_data.get("resolution") | ||
| hunyuan_task = json_data.get("hunyuan_task") |
Fixed. Renamed hunyuan_task to bot_task.
Force-pushed from 4bf4543 to 8e13eda
Force-pushed from 8e13eda to 8851bbb
Signed-off-by: skf1999 <13234016272@163.com>
…project#3410) Signed-off-by: skf1999 <13234016272@163.com>
Purpose
Support online HunyuanImage-3.0 IT2I (image editing) inference
This PR needs to run on top of the bug fix from PR #3395.
Test Plan
Online Inference
Online Request
bot_task can be set to either "it2i_think" or "it2i_recaption".
Test Result
An example for "it2i_think":
