[Feature] online HunyuanImage-3.0 IT2I (image editing) support by skf-1999 · Pull Request #3410 · vllm-project/vllm-omni

skf-1999 · 2026-05-07T09:10:51Z

Purpose

Support online HunyuanImage-3.0 IT2I (image editing) inference.
This PR depends on the bug fix in PR #3395 and must be run on top of it.

Test Plan

Online Inference

vllm serve "/data/HunyuanImage-3.0-Instruct" \
    --omni \
    --port "8091" \
    --tensor_parallel_size 8 \
    --stage-configs-path vllm_omni/model_executor/stage_configs/hunyuan_image3_it2i.yaml \
    --enforce-eager

Online Request

curl -X POST http://localhost:8091/v1/images/edits \
  -F "image=@/data/s00957182/0506/edit_dog.png" \
  -F "prompt=新年宠物海报，Q版圆润的可爱标题\"新年快乐汪\"，副标题\"HAPPY NEW YEAR\"。 鱼眼镜头，背景是房间门口，近景，上传的主体歪头笑，围着红色围巾，戴着红色毛线帽，高清，绒毛细节，面部特写。 宝丽莱相纸，超现实主义，写实主义，胶片摄影，打印颗粒感肌理。肌理，超写实，复古感。" \
  -F "bot_task=it2i_think" \
  -F "n=1" \
  -F "num_inference_steps=50" \
  -F "guidance_scale=2.5" \
  -F "seed=42" \
  | jq -r '.data[0].b64_json' \
  | base64 -d > result.png

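bot_task can be set to either "it2i_think" or "it2i_recaption".

English translation of the prompt above: New Year pet poster with a cute, rounded chibi-style title "新年快乐汪" ("Happy New Year, woof") and the subtitle "HAPPY NEW YEAR". Fisheye lens, background is a room doorway, close shot, the uploaded subject tilts its head and smiles, wearing a red scarf and a red knitted hat, high definition, fur detail, facial close-up. Polaroid paper, surrealism, realism, film photography, printed grain texture. Texture, hyper-realistic, retro feel.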
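For readers who prefer a scripted client, the following is a minimal Python sketch of the same request shape. It only builds the non-file form fields and decodes `data[0].b64_json`, mirroring the `jq | base64 -d` step of the curl command. The helper names `build_edit_fields` and `extract_first_image` are hypothetical and not part of vllm-omni.

```python
import base64
import json

# The two bot_task values named in this PR description.
ALLOWED_BOT_TASKS = {"it2i_think", "it2i_recaption"}


def build_edit_fields(prompt, bot_task="it2i_think", seed=42):
    """Return the non-file form fields used by the curl command (hypothetical helper)."""
    if bot_task not in ALLOWED_BOT_TASKS:
        raise ValueError(f"unsupported bot_task: {bot_task!r}")
    # Multipart form values are sent as strings, so numbers are stringified here.
    return {
        "prompt": prompt,
        "bot_task": bot_task,
        "n": "1",
        "num_inference_steps": "50",
        "guidance_scale": "2.5",
        "seed": str(seed),
    }


def extract_first_image(response_body):
    """Decode data[0].b64_json from the JSON response body into raw image bytes."""
    payload = json.loads(response_body)
    return base64.b64decode(payload["data"][0]["b64_json"])
```

With an HTTP client such as `requests`, the fields could be sent as `requests.post(url, data=fields, files={"image": open(path, "rb")})`, and the bytes returned by `extract_first_image(resp.text)` written to `result.png`.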

Test Result

An example for "it2i_think":

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fff228c533

chatgpt-codex-connector · 2026-05-07T09:13:22Z

                lora_dict = _get_lora_from_json_str(lora)
                _parse_lora_request(lora_dict)
                extra_body["lora"] = lora_dict
+            if hunyuan_task is not None:


Initialize hunyuan_task before conditional extraction

hunyuan_task is only assigned inside the JSON-only branch, but it is later referenced unconditionally when building extra_body for multi-stage edits. For non-JSON requests (the existing multipart/form-data path), this raises UnboundLocalError at runtime once len(stage_configs) > 1, so regular image-edit requests to multi-stage pipelines will fail before dispatch.
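The failure mode described here can be avoided by binding the name on every path before any branch runs. The sketch below is a simplified stand-in for the handler, not the actual code: `build_extra_body` and its `multi_stage` flag are hypothetical, and only the `hunyuan_task` name comes from the diff.

```python
from typing import Any, Dict, Optional


def build_extra_body(json_data: Optional[Dict[str, Any]], multi_stage: bool) -> Dict[str, Any]:
    # Bind the name up front so it exists on the multipart/form-data path too.
    # None means "the client did not provide it".
    hunyuan_task: Optional[str] = None

    if json_data is not None:
        # JSON-only branch: may override the default.
        hunyuan_task = json_data.get("hunyuan_task")

    extra_body: Dict[str, Any] = {}
    # Safe on both paths: the name is always bound by this point.
    if multi_stage and hunyuan_task is not None:
        extra_body["hunyuan_task"] = hunyuan_task
    return extra_body
```

Without the first assignment, calling this with `json_data=None` and `multi_stage=True` would raise `UnboundLocalError` at the `hunyuan_task is not None` check.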

chatgpt-codex-connector · 2026-05-07T09:13:22Z

    """
+    # Handle JSON request
+    json_data = None
+    if raw_request.headers.get("Content-Type") == "application/json":


Relax JSON content-type check to handle charset parameters

The JSON parsing path only runs when Content-Type is exactly application/json. Many clients send valid JSON headers like application/json; charset=utf-8; those requests skip this branch, leaving form fields unset and causing false 422 validation errors instead of processing the JSON payload. This makes the new JSON mode unreliable across common HTTP clients.

hsliuustc0106 · 2026-05-07T09:26:04Z

                lora_dict = _get_lora_from_json_str(lora)
                _parse_lora_request(lora_dict)
                extra_body["lora"] = lora_dict
+            if hunyuan_task is not None:


UnboundLocalError: hunyuan_task is only assigned in the JSON branch (lines 1703-1722) but is referenced unconditionally at line 1936. For non-JSON requests using multi-stage pipelines, this will crash before dispatch. Initialize it before the JSON branch.

Already addressed.

hsliuustc0106 · 2026-05-07T09:26:49Z

    """
+    # Handle JSON request
+    json_data = None
+    if raw_request.headers.get("Content-Type") == "application/json":


JSON content-type check is too strict. Many clients send "application/json; charset=utf-8" instead of just "application/json". Those requests will skip this branch and cause 422 errors instead of processing the JSON payload. Use "application/json" in raw_request.headers.get("Content-Type", "") to handle charset parameters.
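One possible way to make the check tolerant of parameters is to compare only the media type, as sketched below. The helper name `is_json_content_type` is hypothetical; it is a stricter alternative to the substring test suggested above, which would also accept values such as `application/json-seq`.

```python
def is_json_content_type(header_value):
    """Return True when the media type is application/json, ignoring parameters and case."""
    if not header_value:
        return False
    # "application/json; charset=utf-8" -> "application/json"
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type == "application/json"
```

In the handler this could be called as `is_json_content_type(raw_request.headers.get("Content-Type"))`.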

hsliuustc0106 · 2026-05-07T09:26:59Z

        layers = extra_body.get("layers")
        resolution = extra_body.get("resolution")
+        hunyuan_task = extra_body.get("hunyuan_task")



Debug print statement should be removed before merge.

hsliuustc0106 · 2026-05-07T09:27:09Z

-            # `attention_mask` to `pixel_attention_mask` so the dict key must match
-            # the expected forward signature.
-            vit_kwargs = {"spatial_shapes": [], "pixel_attention_mask": []}
+            vit_kwargs = {"spatial_shapes": [], "attention_mask": []}


The comment above says "transformers >=5.54 renamed the kwarg from attention_mask to pixel_attention_mask", but this change does the opposite. Are you using an old transformers version? If so, please add a version constraint or explain why this is safe.

This PR depends on the bug fix in PR #3395. This particular change is a bug fix for the previous implementation rather than part of the online support. I will delete it later.
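As an aside on the kwarg-name question, one version-independent option is to inspect the callable's signature and use whichever mask keyword it declares. This is only a sketch under that assumption, not what the PR does: `mask_kwarg_name`, `make_vit_kwargs`, and the two stand-in `forward_*` functions are hypothetical.

```python
import inspect


def mask_kwarg_name(forward_fn):
    """Pick whichever mask keyword the callable actually declares."""
    params = inspect.signature(forward_fn).parameters
    if "pixel_attention_mask" in params:
        return "pixel_attention_mask"
    if "attention_mask" in params:
        return "attention_mask"
    raise TypeError("callable accepts neither mask keyword")


def make_vit_kwargs(forward_fn):
    """Build the empty kwargs dict with a key that matches the forward signature."""
    return {"spatial_shapes": [], mask_kwarg_name(forward_fn): []}


# Hypothetical stand-ins for the two forward signatures under discussion.
def forward_with_attention_mask(pixel_values, spatial_shapes, attention_mask):
    return None


def forward_with_pixel_attention_mask(pixel_values, spatial_shapes, pixel_attention_mask):
    return None
```

The trade-off is that signature inspection hides the dependency on a specific library version, whereas an explicit version constraint documents it.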

hsliuustc0106 · 2026-05-07T09:27:24Z

                images.append(img)
            except Exception as e:
                raise ValueError(f"Failed to open uploaded file: {e}")
+        # 4. Local file path


Local file path support via os.path.exists needs security review. This allows arbitrary file access from the server filesystem. Consider restricting to a whitelist directory or adding path validation to prevent directory traversal attacks.

Thanks for the suggestion. Local file upload has been removed; images are now passed via base64 instead.
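Accepting base64 keeps the server from touching client-supplied filesystem paths, but the payload still benefits from validation. Below is a minimal sketch of such a decoder; the function name `decode_image_b64`, the 20 MiB limit, and the restriction to PNG and JPEG magic bytes are all assumptions for illustration, not behavior confirmed by this PR.

```python
import base64
import binascii

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"


def decode_image_b64(value, max_bytes=20 * 1024 * 1024):
    """Decode a base64 string or data URL, rejecting invalid, oversized, or non-image input."""
    if value.startswith("data:"):
        # Expected shape: data:image/png;base64,<payload>
        header, sep, value = value.partition(",")
        if not sep or not header.endswith(";base64"):
            raise ValueError("expected a base64 data URL")
    try:
        # validate=True rejects characters outside the base64 alphabet,
        # so a string like "../../etc/passwd" fails here.
        raw = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("image is not valid base64") from exc
    if len(raw) > max_bytes:
        raise ValueError("image exceeds size limit")
    if not raw.startswith((PNG_MAGIC, JPEG_MAGIC)):
        raise ValueError("unsupported image format")
    return raw
```

The returned bytes could then be wrapped in `io.BytesIO` and opened with an image library in the same way as an uploaded file.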

hsliuustc0106

Bounty-hunter · 2026-05-08T02:14:03Z

Have you rebased onto main? This has been removed in main.

Not yet, will rebase now.

Bounty-hunter · 2026-05-08T02:24:49Z

+        # Extract parameters from JSON
+        image = json_data.get("image")
+        prompt = json_data.get("prompt")
+        model = json_data.get("model")


Image edit is a standard interface. Why add these fields? Just reuse the existing input parameters (image, prompt).

Fixed. Removed all standard field extractions from the JSON block (image, prompt, model, n, size, etc.). These are already declared as Form(...) / File(...) parameters in the function signature and are correctly parsed by FastAPI for standard multipart/form-data requests.

Bounty-hunter · 2026-05-08T02:25:33Z

+        lora = json_data.get("lora")
+        layers = json_data.get("layers")
+        resolution = json_data.get("resolution")
+        hunyuan_task = json_data.get("hunyuan_task")


just use bot_task?

Fixed. Renamed hunyuan_task to bot_task.

Bounty-hunter

LGTM

skf-1999 requested a review from hsliuustc0106 as a code owner May 7, 2026 09:10

chatgpt-codex-connector Bot reviewed May 7, 2026

skf-1999 force-pushed the feat/image-edit-api branch from fff228c to 1e6d0d2 Compare May 7, 2026 09:16

hsliuustc0106 reviewed May 7, 2026

hsliuustc0106 requested changes May 7, 2026

zengchuang-hw mentioned this pull request May 7, 2026

[RFC]: Support Hunyuan image AR + DIT JiusiServe/vllm-omni#183

Closed

1 task

Bounty-hunter mentioned this pull request May 7, 2026

[Feature]: [Hunyuanimage]Support DIT reuse kv from AR stage JiusiServe/vllm-omni#216

Open

1 task

skf-1999 force-pushed the feat/image-edit-api branch from 69a4caa to 5459cb5 Compare May 8, 2026 02:22

Bounty-hunter reviewed May 8, 2026

skf-1999 force-pushed the feat/image-edit-api branch 2 times, most recently from 4bf4543 to 8e13eda Compare May 8, 2026 06:20

skf-1999 requested a review from tzhouam as a code owner May 8, 2026 06:20

skf-1999 force-pushed the feat/image-edit-api branch from 8e13eda to 8851bbb Compare May 8, 2026 06:35

Bounty-hunter approved these changes May 8, 2026

hsliuustc0106 added the ready label to trigger buildkite CI label May 8, 2026

hsliuustc0106 merged commit 039a09a into vllm-project:main May 8, 2026
8 checks passed

Bounty-hunter mentioned this pull request May 8, 2026

[Feature] HunyuanImage-3.0 IT2I: multi-image input + prompt API cleanup #3444

Merged

12 tasks

[Feature] online HunyuanImage-3.0 IT2I (image editing) support

8851bbb

Signed-off-by: skf1999 <13234016272@163.com>

Bounty-hunter mentioned this pull request May 10, 2026

[RFC]: HunyuanImage Model deployment optimization #2015

Open

clodaghwalsh17 pushed a commit to clodaghwalsh17/nm-vllm-omni-ent that referenced this pull request May 12, 2026

[Feature] online HunyuanImage-3.0 IT2I (image editing) support (vllm-…

52943b5

…project#3410) Signed-off-by: skf1999 <13234016272@163.com>
