[Feat] Enable VAE parallel in HunyuanImage3 by Fishermanykx · Pull Request #3091 · vllm-project/vllm-omni

Fishermanykx · 2026-04-24T03:02:12Z

Summary

Enable VAE parallel support in HunyuanImage3.

Current changes:

add a distributed Hunyuan VAE wrapper at vllm_omni/diffusion/distributed/autoencoders/autoencoder_kl_hunyuan.py
wire HunyuanImage3Pipeline to use the distributed autoencoder wrapper
remove the NPU fused MoE init hook in vllm_omni/platforms/npu/models/hunyuan_fused_moe.py
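
As a rough illustration of why a patch-parallel VAE decode can reproduce the single-device result, the toy sketch below splits a grid into row bands, gives each band a one-row halo, runs a stand-in "decoder" on every band, and stitches the cropped outputs back together. This is not the wrapper added in autoencoder_kl_hunyuan.py: `smooth_rows`, `patch_parallel`, and the halo scheme are hypothetical names and assumptions chosen for illustration, and the ranks are simulated with a plain loop rather than real devices.

```python
def smooth_rows(grid):
    """Stand-in for a decoder layer: 3-tap vertical mean with edge clamping."""
    height = len(grid)
    out = []
    for r in range(height):
        up = grid[max(r - 1, 0)]
        down = grid[min(r + 1, height - 1)]
        out.append([(a + b + c) / 3.0 for a, b, c in zip(up, grid[r], down)])
    return out


def patch_parallel(grid, world_size, halo=1):
    """Process row bands independently (one per simulated rank) and stitch them."""
    height = len(grid)
    band = (height + world_size - 1) // world_size  # rows per rank, rounded up
    result = []
    for rank in range(world_size):
        start, stop = rank * band, min((rank + 1) * band, height)
        if start >= stop:
            continue  # this rank has no rows when height < world_size * band
        # Extend the band by `halo` rows so border rows see their real neighbours.
        lo, hi = max(start - halo, 0), min(stop + halo, height)
        local = smooth_rows(grid[lo:hi])
        # Drop the halo rows again before stitching.
        offset = start - lo
        result.extend(local[offset:offset + (stop - start)])
    return result


grid = [[float(r * 4 + c) for c in range(4)] for r in range(8)]
print(patch_parallel(grid, world_size=4) == smooth_rows(grid))
```

The halo width has to cover the receptive field of whatever runs on each patch. A real VAE decoder stacks many convolutions, so an actual implementation needs a wider overlap or blending between patches than this one-row toy.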

unified deploy yaml in #3172

Validation

static checks only so far (py_compile, diff checks)
runtime validation is still pending

Test Plan

Tested on 4xAscend NPU

server

vllm serve $model --omni --port "8031" \
    --log-stats \
    --stage-configs-path "vllm_omni/platforms/npu/stage_configs/hunyuan_image3_t2i.yaml"

vae_patch_parallel_size is set to 4

client

curl -X POST http://localhost:8031/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": 
    "A cinematic medium shot captures a single Asian woman seated on a chair within a dimly lit room, creating an intimate and theatrical atmosphere. The composition is focused on the subject, rendered with rich colors and intricate textures that evoke a nostalgic and moody feeling.\n\nThe primary subject is a young Asian woman with a thoughtful and expressive countenance, her gaze directed slightly away from the camera. She is seated in a relaxed yet elegant posture on an ornate, vintage armchair. The chair is upholstered in a deep red velvet, its fabric showing detailed, intricate textures and slight signs of wear. She wears a simple, elegant dress in a dark teal hue, the material catching the light in a way that reveals its fine-woven texture. Her skin has a soft, matte quality, and the light delicately models the contours of her face and arms.\n\nThe surrounding room is characterized by its vintage decor, which contributes to the historic and evocative mood. In the immediate background, partially blurred due to a shallow depth of field consistent with a f/2.8 aperture, the wall is covered with wallpaper featuring a subtle, damask pattern. The overall color palette is a carefully balanced interplay of deep teal and rich red hues, creating a visually compelling and cohesive environment. The entire scene is detailed, from the fibers of the upholstery to the subtle patterns on the wall.\n\nThe lighting is highly dramatic and artistic, defined by high contrast and pronounced shadow play. A single key light source, positioned off-camera, projects gobo lighting patterns onto the scene, casting intricate shapes of light and shadow across the woman and the back wall. These dramatic shadows create a strong scense of depth and a theatrical quality. While some shadows are deep and defined, others remain soft, gently wrapping around the subject and preventing the loss of detail in darker areas. 
The soft focus on the background enhances the intimate feeling, drawing all attention to the expressive subject. The overall image presents a cinematic, photorealistic photography style.",
    "num_inference_steps": 2,
    "guidance_scale": "1.0",
    "n": 1,
    "size": "1024x1024",
    "seed": 42
  }' | jq -r '.data[0].b64_json' | base64 -d > output.png
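
For scripted runs, the same request can be sent from Python using only the standard library. The sketch below mirrors the endpoint, port, and JSON fields of the curl command above; the helper names (`build_request`, `extract_png`, `generate`) are hypothetical, and the response shape is assumed to be the `data[0].b64_json` layout that the `jq` filter reads.

```python
import base64
import json
import urllib.request

URL = "http://localhost:8031/v1/images/generations"


def build_request(prompt, seed=42):
    """Build the POST request with the same fields as the curl command."""
    payload = {
        "prompt": prompt,
        "num_inference_steps": 2,
        "guidance_scale": "1.0",  # sent as a string, as in the curl command
        "n": 1,
        "size": "1024x1024",
        "seed": seed,
    }
    return urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_png(response_body):
    """Decode the first image, equivalent to jq '.data[0].b64_json' | base64 -d."""
    body = json.loads(response_body)
    return base64.b64decode(body["data"][0]["b64_json"])


def generate(prompt, out_path="output.png"):
    """Send the request to a running server and write the decoded image to disk."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        png = extract_png(resp.read())
    with open(out_path, "wb") as f:
        f.write(png)
```

Calling `generate("<prompt text>")` against a running server writes `output.png`, matching the curl pipeline. Building the payload with `json.dumps` also escapes any newlines in a long prompt.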

Test Result

output

VAE decode time 625.7ms -> 355ms

Without VAE parallel

With VAE parallel
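
To put the reported decode times in perspective, the short calculation below derives the speedup, the latency reduction, and the parallel efficiency from the two numbers above. It assumes that all four ranks implied by vae_patch_parallel_size = 4 took part in the decode; the variable names are illustrative only.

```python
baseline_ms = 625.7   # VAE decode time without VAE parallel (from the PR)
parallel_ms = 355.0   # VAE decode time with VAE parallel (from the PR)
ranks = 4             # vae_patch_parallel_size used in the test plan

speedup = baseline_ms / parallel_ms
reduction = (baseline_ms - parallel_ms) / baseline_ms
efficiency = speedup / ranks  # 1.0 would mean perfect linear scaling

print(f"speedup:    {speedup:.2f}x")
print(f"reduction:  {reduction * 100:.1f}%")
print(f"efficiency: {efficiency:.2f}")
```

This works out to roughly a 1.76x speedup, a 43.3% lower decode latency, and a parallel efficiency of about 0.44 on four devices.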

Fishermanykx · 2026-04-24T03:37:57Z

PTAL @gcanlin @Semmer2

hsliuustc0106 · 2026-04-24T07:30:08Z

Does it work on GPU as well?
Does it affect accuracy?

Bounty-hunter

LGTM

BLANKETusers · 2026-05-14T08:28:30Z

Test Plan

Tested on 2xH200 GPU

VAE

python vllm-omni/examples/offline_inference/hunyuan_image3/end2end.py \
  --model tencent/HunyuanImage-3.0-Instruct \
  --modality text2img \
  --deploy-config vllm-omni/vllm_omni/deploy/hunyuan_image3_dit.yaml \
  --prompts "A cinematic medium shot captures a single Asian woman seated on a chair within a dimly lit room, creating an intimate and theatrical atmosphere. The composition is focused on the subject, rendered with rich colors and intricate textures that evoke a nostalgic and moody feeling.\n\nThe primary subject is a young Asian woman with a thoughtful and expressive countenance, her gaze directed slightly away from the camera. She is seated in a relaxed yet elegant posture on an ornate, vintage armchair. The chair is upholstered in a deep red velvet, its fabric showing detailed, intricate textures and slight signs of wear. She wears a simple, elegant dress in a dark teal hue, the material catching the light in a way that reveals its fine-woven texture. Her skin has a soft, matte quality, and the light delicately models the contours of her face and arms.\n\nThe surrounding room is characterized by its vintage decor, which contributes to the historic and evocative mood. In the immediate background, partially blurred due to a shallow depth of field consistent with a f/2.8 aperture, the wall is covered with wallpaper featuring a subtle, damask pattern. The overall color palette is a carefully balanced interplay of deep teal and rich red hues, creating a visually compelling and cohesive environment. The entire scene is detailed, from the fibers of the upholstery to the subtle patterns on the wall.\n\nThe lighting is highly dramatic and artistic, defined by high contrast and pronounced shadow play. A single key light source, positioned off-camera, projects gobo lighting patterns onto the scene, casting intricate shapes of light and shadow across the woman and the back wall. These dramatic shadows create a strong scense of depth and a theatrical quality. While some shadows are deep and defined, others remain soft, gently wrapping around the subject and preventing the loss of detail in darker areas. 
The soft focus on the background enhances the intimate feeling, drawing all attention to the expressive subject. The overall image presents a cinematic, photorealistic photography style." \
  --output ./output/output_offline_vae \
  --vae-use-tiling

No VAE

python vllm-omni/examples/offline_inference/hunyuan_image3/end2end.py \
  --model tencent/HunyuanImage-3.0-Instruct \
  --modality text2img \
  --deploy-config vllm-omni/vllm_omni/deploy/hunyuan_image3_dit.yaml \
  --prompts "A cinematic medium shot captures a single Asian woman seated on a chair within a dimly lit room, creating an intimate and theatrical atmosphere. The composition is focused on the subject, rendered with rich colors and intricate textures that evoke a nostalgic and moody feeling.\n\nThe primary subject is a young Asian woman with a thoughtful and expressive countenance, her gaze directed slightly away from the camera. She is seated in a relaxed yet elegant posture on an ornate, vintage armchair. The chair is upholstered in a deep red velvet, its fabric showing detailed, intricate textures and slight signs of wear. She wears a simple, elegant dress in a dark teal hue, the material catching the light in a way that reveals its fine-woven texture. Her skin has a soft, matte quality, and the light delicately models the contours of her face and arms.\n\nThe surrounding room is characterized by its vintage decor, which contributes to the historic and evocative mood. In the immediate background, partially blurred due to a shallow depth of field consistent with a f/2.8 aperture, the wall is covered with wallpaper featuring a subtle, damask pattern. The overall color palette is a carefully balanced interplay of deep teal and rich red hues, creating a visually compelling and cohesive environment. The entire scene is detailed, from the fibers of the upholstery to the subtle patterns on the wall.\n\nThe lighting is highly dramatic and artistic, defined by high contrast and pronounced shadow play. A single key light source, positioned off-camera, projects gobo lighting patterns onto the scene, casting intricate shapes of light and shadow across the woman and the back wall. These dramatic shadows create a strong scense of depth and a theatrical quality. While some shadows are deep and defined, others remain soft, gently wrapping around the subject and preventing the loss of detail in darker areas. 
The soft focus on the background enhances the intimate feeling, drawing all attention to the expressive subject. The overall image presents a cinematic, photorealistic photography style." \
  --output ./output/output_offline_vae

Test Result

VAE

No VAE

CLIP Score

99.85/100
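
The comment does not say how the CLIP score was computed. One plausible reading, stated here purely as an assumption, is a cosine similarity between the CLIP image embeddings of the two outputs, scaled to 0-100. The sketch below shows only that final arithmetic on made-up vectors; `similarity_score` is a hypothetical helper, and producing real embeddings requires a CLIP model, which is omitted.

```python
import math


def similarity_score(a, b):
    """Cosine similarity between two embedding vectors, scaled to 0-100."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("embeddings must be non-zero")
    return 100.0 * dot / norm


# Made-up vectors standing in for the embeddings of the two generated images.
with_vae = [0.12, 0.80, 0.35, 0.44]
without_vae = [0.11, 0.81, 0.36, 0.43]
print(f"{similarity_score(with_vae, without_vae):.2f}/100")
```

A score close to 100 under this reading would indicate that the two images are nearly identical in CLIP embedding space, which is the kind of evidence the accuracy question above asks for.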

Gaohan123

Here are some suggestions:

Please add a simple unit test (UT) for it.
I didn't see any NPU-related modification, which is inconsistent with your PR description.

Fishermanykx · 2026-05-15T07:01:34Z

remove the NPU fused MoE init hook in vllm_omni/platforms/npu/models/hunyuan_fused_moe.py

done
remove the NPU fused MoE init hook in vllm_omni/platforms/npu/models/hunyuan_fused_moe.py: this was done in pull 2979, which had not been merged when this PR was proposed. After I rebased my code, the change no longer appears in this PR.

Signed-off-by: KexiongYu <yukexiong1@huawei.com>

Fishermanykx force-pushed the yukexiong/hunyuan_vae_opt branch 2 times, most recently from 421d557 to c69899e Compare April 24, 2026 03:36

Fishermanykx changed the title ~~[WIP][Feat.] Enable VAE parallel in HunyuanImage3~~ [Feat.] Enable VAE parallel in HunyuanImage3 Apr 24, 2026

Fishermanykx marked this pull request as ready for review April 24, 2026 03:36

Fishermanykx requested a review from hsliuustc0106 as a code owner April 24, 2026 03:36

Fishermanykx changed the title ~~[Feat.] Enable VAE parallel in HunyuanImage3~~ [Feat] Enable VAE parallel in HunyuanImage3 Apr 24, 2026

Fishermanykx force-pushed the yukexiong/hunyuan_vae_opt branch 2 times, most recently from ee9b0b3 to a4502c4 Compare April 24, 2026 07:23

Fishermanykx force-pushed the yukexiong/hunyuan_vae_opt branch 4 times, most recently from a4fc4ec to 378289a Compare April 30, 2026 02:50

wtomin mentioned this pull request May 7, 2026

[RFC]: Continuous Diffusion Model Acceleration Support #1217

Open

1 task

Bounty-hunter mentioned this pull request May 10, 2026

[RFC]: HunyuanImage Model deployment optimization #2015

Open

Bounty-hunter approved these changes May 10, 2026

Fishermanykx force-pushed the yukexiong/hunyuan_vae_opt branch from 378289a to 2eacaf2 Compare May 11, 2026 12:03

Fishermanykx requested review from Isotr0py, RuixiangMa, SamitHuang, ZJY0516, david6666666, princepride and wtomin as code owners May 11, 2026 12:03

Fishermanykx force-pushed the yukexiong/hunyuan_vae_opt branch 5 times, most recently from ae99ecc to c6f0e06 Compare May 14, 2026 02:42

Fishermanykx force-pushed the yukexiong/hunyuan_vae_opt branch from c6f0e06 to 8c4b866 Compare May 14, 2026 09:13

Gaohan123 added this to the v0.22.0 milestone May 14, 2026

Gaohan123 reviewed May 14, 2026

Fishermanykx requested a review from yenuo26 as a code owner May 15, 2026 07:02

Fishermanykx force-pushed the yukexiong/hunyuan_vae_opt branch from f338f4d to 014b54b Compare May 15, 2026 07:05

Fishermanykx added 2 commits May 15, 2026 16:46

[WIP][Feat.] Enable VAE parallel in HunyuanImage3

4e17c90

Signed-off-by: KexiongYu <yukexiong1@huawei.com>

[UT] Add Hunyuan distributed VAE tests

272bc98

Signed-off-by: KexiongYu <yukexiong1@huawei.com>

Fishermanykx force-pushed the yukexiong/hunyuan_vae_opt branch from 014b54b to 272bc98 Compare May 15, 2026 08:46

Fishermanykx wants to merge 2 commits into vllm-project:main from Fishermanykx:yukexiong/hunyuan_vae_opt
