
[Feature] Adding decode vae patch parallel supports for LTX-2 #2135

Open

erfgss wants to merge 17 commits into vllm-project:main from erfgss:videoGen_vae

Conversation

@erfgss (Contributor) commented Mar 24, 2026:


Purpose

This PR adds support for VAE patch parallelism in the LTX-2 text-to-video and image-to-video pipelines.

By enabling distributed VAE decoding when --vae-patch-parallel-size > 1, this change improves multi-GPU utilization and reduces VAE decode latency for video generation workloads.
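For readers new to the technique, here is a minimal single-process sketch of what patch-parallel VAE decoding means. This is my illustration, not this PR's implementation: `decode_fn` stands in for the VAE decoder and is assumed to preserve spatial size (a real VAE also upsamples, which scales the seam arithmetic by the upsampling factor). The latent is split along width with a small overlap, each patch is decoded independently (on a different GPU in the real feature), and the overlapping seam is blended:

```python
import torch

def patch_parallel_decode_2way(decode_fn, z: torch.Tensor, overlap: int = 4) -> torch.Tensor:
    """Split latents along width, decode each half (plus an overlap margin)
    independently, then linearly blend the overlapping seam."""
    w = z.shape[-1]
    mid = w // 2
    left = decode_fn(z[..., : mid + overlap])    # rank 0's share in the real feature
    right = decode_fn(z[..., mid - overlap :])   # rank 1's share
    ramp = torch.linspace(0.0, 1.0, steps=2 * overlap)  # 0 -> 1 across the seam
    seam = left[..., mid - overlap :] * (1 - ramp) + right[..., : 2 * overlap] * ramp
    return torch.cat([left[..., : mid - overlap], seam, right[..., 2 * overlap :]], dim=-1)

# Toy check with an identity "decoder":
z = torch.randn(1, 4, 3, 8, 16)   # (B, C, T, H, W) video latents
assert torch.allclose(patch_parallel_decode_2way(lambda t: t, z), z)
```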

Test Plan

  • Run LTX-2 text-to-video and image-to-video inference
  • Set --tensor-parallel-size=2
  • Enable --vae-use-tiling
  • Run with:
    • --vae-patch-parallel-size=1
    • --vae-patch-parallel-size=2
  • Compare VAE decode time reported in logs

Test Result

| Model | Task           | Tensor Parallel Size | VAE Patch Parallel Size | VAE Decode (ms) |
|-------|----------------|----------------------|-------------------------|-----------------|
| LTX-2 | text-to-video  | 2                    | 1                       | 11474.19        |
| LTX-2 | text-to-video  | 2                    | 2                       | 293.67          |
| LTX-2 | image-to-video | 2                    | 1                       | 730.73          |
| LTX-2 | image-to-video | 2                    | 2                       | 397.07          |
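(At VAE patch parallel size 2, this is roughly a 39× reduction in VAE decode time for text-to-video and about 1.8× for image-to-video.)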

text-to-video VAE Patch Parallel Size=1

ltx2_t2v_diffvae1.mp4

text-to-video VAE Patch Parallel Size=2

ltx2_t2v_diffvae2.mp4

image-to-video VAE Patch Parallel Size=1

ltx2_i2v_vae1.mp4

image-to-video VAE Patch Parallel Size=2

ltx2_i2v_vae2.mp4

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan. Please provide the test scripts & test commands. Please state the reasons if your codes don't require additional test scripts. For test file guidelines, please check the test style doc
  • The test results. Please paste the results comparison before and after, or the e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs.
  • (Optional) Release notes update. If your change is user-facing, please update the release notes draft.


Signed-off-by: Chen Yang <2082464740@qq.com>
@erfgss erfgss requested a review from hsliuustc0106 as a code owner March 24, 2026 12:07
@erfgss (Contributor Author) commented Mar 24, 2026:

@david6666666

@david6666666 (Collaborator) commented:

Add a unit test and an output video comparison.

Signed-off-by: Chen Yang <2082464740@qq.com>
@chatgpt-codex-connector (Bot) left a comment:

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c98eb84932


Comment on lines +122 to +124:

    timestep: torch.Tensor | None = None,
    return_dict: bool = True,
    *args: Any,

P2: Preserve base decode argument order for the causal flag

AutoencoderKLLTX2Video.decode takes (z, temb=None, causal=None, return_dict=True), but this override changes the positional order to (z, timestep=None, return_dict=True, ...). Any caller that passes causal positionally (for example decode(z, temb, False)) will now set return_dict=False instead, silently changing the return type and leaving causal unset. This is a behavioral regression in the public method contract and can break wrappers that rely on the original positional API.
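A minimal repro of the binding hazard the bot is pointing at (signatures paraphrased from this comment, not copied from the diff):

```python
# Hypothetical sketch: how reordering positional parameters silently
# rebinds a caller's arguments.

class Base:
    def decode(self, z, temb=None, causal=None, return_dict=True): ...

class Override(Base):
    def decode(self, z, timestep=None, return_dict=True, *args): ...

# A caller written against the base signature, e.g. decode(z, temb, False),
# intends causal=False; against Override it instead binds False to
# return_dict, leaving the causal flag unset and changing the return type.
```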


@david6666666 david6666666 changed the title [Feature]Adding vae patch parallel supports for VideoGen [Feature]Adding vae patch parallel supports for LTX-2 Mar 24, 2026
@david6666666 (Collaborator) commented:

Update LTX-2 image-to-video as well, and update vllm-omni/docs/user_guide/diffusion_acceleration.md.

@erfgss (Contributor Author) commented Mar 24, 2026:

> Update LTX-2 image-to-video as well, and update vllm-omni/docs/user_guide/diffusion_acceleration.md.

I will update these.

erfgss added 5 commits March 25, 2026 08:37
Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>
Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>
@david6666666 (Collaborator) commented:

@Bounty-hunter PTAL, thanks.

Comment on code:

    ) -> torch.Tensor:
        """Decode a single latent tile into video space."""
        tile = task.tensor
        if hasattr(self, "clear_cache"):
A Contributor left a review comment on this code; @erfgss (Contributor Author) replied on Mar 30, 2026.

Comment on code:

        dec = torch.clamp(dec, min=-1.0, max=1.0)
        return dec

    def patch_split(self, z: torch.Tensor) -> tuple[list[TileTask], GridSpec]:
A Contributor commented:

Have you evaluated the performance gain from patch splitting? When the height and width are small, the split size (plus the blend overlap) is almost equal to the total size, so the performance improvement may be limited. In that scenario, temporal tiled decode parallelism might be a better choice: https://github.com/huggingface/diffusers/blob/f2be8bd6b3dc4035bd989dc467f15d86bf3c9c12/src/diffusers/models/autoencoders/autoencoder_kl_ltx2.py#L1497

@erfgss (Contributor Author) replied:

> Have you evaluated the performance gain from patch splitting? When the height and width are small, the split size (plus the blend overlap) is almost equal to the total size, so the performance improvement may be limited. In that scenario, temporal tiled decode parallelism might be a better choice: https://github.com/huggingface/diffusers/blob/f2be8bd6b3dc4035bd989dc467f15d86bf3c9c12/src/diffusers/models/autoencoders/autoencoder_kl_ltx2.py#L1497

When generating a 24-frame video, temporal tiled decoding does not bring obvious gains; instead, it adds overhead.
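For intuition on the reviewer's point, a back-of-envelope sketch (my own arithmetic, not from this PR): with a 2-way width split and a blend margin of `b` latent columns, each rank decodes `(W/2 + b)/W` of the frame, so the margin dominates when `W` is small:

```python
def per_rank_fraction(width: int, blend: int, parts: int = 2) -> float:
    """Fraction of the full frame each rank decodes after an even split
    with a `blend` overlap margin added to each patch."""
    return (width / parts + blend) / width

for w in (32, 64, 128, 256):
    print(f"W={w:4d}: {per_rank_fraction(w, blend=16):.2f} of the full decode per rank")
# W=32 gives 1.00 (no speedup at all); W=256 gives 0.56, close to the ideal 0.50.
```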

@wtomin (Collaborator) commented Mar 30, 2026:

A recent PR changed the diffusion features docs structure. Please take a look at #1928.

erfgss and others added 3 commits March 31, 2026 10:13
Signed-off-by: Chen Yang <2082464740@qq.com>
Signed-off-by: Chen Yang <2082464740@qq.com>
@wtomin (Collaborator) commented Apr 2, 2026:

@erfgss Can you help create an L4 e2e test for the LTX-2 model, covering the existing supported diffusion features (see #1217)? For how to create an L4 e2e test, please refer to #1832.

Please update the document docs/user_guide/diffusion_features.md

@erfgss (Contributor Author) commented Apr 2, 2026:

> @erfgss Can you help create an L4 e2e test for the LTX-2 model, covering the existing supported diffusion features (see #1217)? For how to create an L4 e2e test, please refer to #1832.

OK, I can do this.

erfgss and others added 3 commits April 3, 2026 11:24
Signed-off-by: Chen Yang <2082464740@qq.com>
Signed-off-by: Chen Yang <2082464740@qq.com>
@wtomin (Collaborator) left a comment:

LGTM. Please resolve the conflicts.

@david6666666 (Collaborator) commented:

A follow-up PR can refer to #2368.

erfgss added 2 commits April 8, 2026 22:09
Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>
@erfgss erfgss changed the title [Feature]Adding vae patch parallel supports for LTX-2 [Feature]Adding decode vae patch parallel supports for LTX-2 Apr 9, 2026
erfgss added 2 commits April 13, 2026 09:20
Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>
