[WIP][Feature] Pipeline Parallelism & Stream Batch for Real-Time Video (#2280) #2

Closed
mnasser02 wants to merge 155 commits into main from stream-diffusion

mnasser02 (Collaborator) commented Apr 16, 2026

PLEASE FILL IN THE PR DESCRIPTION HERE ENSURING ALL CHECKLIST ITEMS (AT THE BOTTOM) HAVE BEEN CONSIDERED.

Purpose

Implements vllm-project#2280: depth-partitioned pipeline parallelism plus Stream Batch temporal scheduling for streaming video diffusion. Targets 40+ FPS on 4×H100 with sub-500 ms time to first frame (TTFF).

Stacks on vllm-project#2322.
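
As a rough illustration of what "depth-partitioned" means here, the sketch below splits a stack of transformer blocks into contiguous per-stage ranges, one range per pipeline stage (GPU). The function name `partition_layers` and the layer count in the usage note are hypothetical and not taken from this PR; the actual partitioning logic may differ.

```python
def partition_layers(num_layers: int, num_stages: int) -> list[range]:
    """Split layer indices 0..num_layers-1 into contiguous per-stage ranges.

    Hypothetical helper for illustration only; not the PR's implementation.
    """
    if num_stages <= 0 or num_layers < num_stages:
        raise ValueError("need at least one layer per stage")
    base, extra = divmod(num_layers, num_stages)
    ranges: list[range] = []
    start = 0
    for stage in range(num_stages):
        # The first `extra` stages each take one additional layer.
        size = base + (1 if stage < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges
```

For example, `partition_layers(10, 4)` assigns layers 0-2, 3-5, 6-7, and 8-9 to stages 0 through 3. Each stage then runs only its own slice and passes activations to the next stage.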

| Phase | Deliverable | Status |
|---|---|---|
| 1 | SupportsStepExecution on Wan22Pipeline | |
| 2 | StreamBatchScheduler (warmup/steady/cooldown) driving micro-step execution pipeline | 🔲 |
| 3 | SLO-adaptive step count + config | 🔲 |
| 4 | StreamVAE with chunk-wise decode + 3D conv caching | 🔲 |
| 5 | CFG parallel, DistributedKVCache stub (vllm-project#1987), VAE-as-stage (vllm-project#2089) | 🔲 |
| 6 | Benchmarks vs SP baseline, PP×SP hybrid, docs | 🔲 |
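
The warmup/steady/cooldown phases named in Phase 2 can be pictured with a small simulation. This is a minimal sketch that assumes the usual Stream Batch idea: one new frame enters per tick, and each in-flight frame advances one denoising step per tick, so a full batch holds frames at different steps. The function `stream_batch_schedule` and its phase rules are assumptions for illustration, not the PR's `StreamBatchScheduler`.

```python
from typing import Iterator


def stream_batch_schedule(
    num_frames: int, num_steps: int
) -> Iterator[tuple[str, list[tuple[int, int]]]]:
    """Yield (phase, [(frame, step), ...]) for each scheduler tick.

    Hypothetical model: frame f enters at tick f and is at denoising
    step (tick - f) until it has completed num_steps steps.
    """
    if num_frames <= 0 or num_steps <= 0:
        raise ValueError("num_frames and num_steps must be positive")
    total_ticks = num_frames + num_steps - 1
    for tick in range(total_ticks):
        active = [
            (f, tick - f) for f in range(num_frames) if 0 <= tick - f < num_steps
        ]
        if len(active) == num_steps:
            phase = "steady"    # every denoising slot is occupied
        elif tick < num_frames:
            phase = "warmup"    # frames still entering, slots not yet full
        else:
            phase = "cooldown"  # no new frames, draining in-flight ones
        yield phase, active
```

With 3 frames and 2 denoising steps this yields four ticks: one warmup tick, two steady ticks, and one cooldown tick. In this model, once the batch is full, one frame finishes on every tick.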

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan. Please provide the test scripts & test commands. Please state the reasons if your code doesn't require additional test scripts. For test file guidelines, please check the test style doc.
  • The test results. Please paste the results comparison before and after, or the e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs.
  • (Optional) Release notes update. If your change is user-facing, please update the release notes draft.

BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)

hadipash and others added 30 commits April 10, 2026 15:40
Signed-off-by: Rustam Khadipash <16683750+hadipash@users.noreply.github.com>

# Conflicts:
#	docs/user_guide/diffusion_features.md
Signed-off-by: Rustam Khadipash <16683750+hadipash@users.noreply.github.com>

# Conflicts:
#	vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_ti2v.py
#	vllm_omni/diffusion/models/wan2_2/wan2_2_transformer.py
…#2524)

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>
…llm-project#2690)

Signed-off-by: Sy03 <1370724210@qq.com>
Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
Co-authored-by: Yueqian Lin <linyueqian@outlook.com>
Co-authored-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Signed-off-by: Rustam Khadipash <16683750+hadipash@users.noreply.github.com>

# Conflicts:
#	docs/user_guide/diffusion_features.md
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>
…e single-GPU performance (vllm-project#2604)

Signed-off-by: Ian Carrasco <ian.carrasco@baseten.co>
Signed-off-by: Zhang <jianmusings@gmail.com>
Signed-off-by: Zhang Jian <jianmusings@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-N-senders to support Bagel TP/CFG parallel (vllm-project#2731)

Signed-off-by: natureofnature <wzliu@connect.hku.hk>
Signed-off-by: rongfu.leng <lenronfu@gmail.com>
Signed-off-by: hsliuustc0106 <liuhongsheng4@huawei.com>
Signed-off-by: neptune <neptune@hust.edu.cn>
Co-authored-by: neptune <neptune@hust.edu.cn>
Signed-off-by: Alex Brooks <albrooks@redhat.com>
…ed task

Signed-off-by: Mahdi Nasser <94046147+mnasser02@users.noreply.github.com>
… pipeline (B and T hardcoded for now)

Signed-off-by: Mahdi Nasser <94046147+mnasser02@users.noreply.github.com>
…instead of sync send/recv

Signed-off-by: Mahdi Nasser <94046147+mnasser02@users.noreply.github.com>
Plain P2P on size-2 PG triggers lazy sub-comm creation that requires the peer present.

Signed-off-by: Mahdi Nasser <94046147+mnasser02@users.noreply.github.com>
- Separate comms stream
- Double buffering
- Set rcv buffers for a new req
- Revert changes regarding Async structs

Signed-off-by: Mahdi Nasser <94046147+mnasser02@users.noreply.github.com>