[WIP][Feature] Pipeline Parallelism & Stream Batch for Real-Time Video (#2280) #2
Closed
mnasser02 wants to merge 155 commits into
Signed-off-by: Rustam Khadipash <16683750+hadipash@users.noreply.github.com>
Signed-off-by: Rustam Khadipash <16683750+hadipash@users.noreply.github.com> # Conflicts: # docs/user_guide/diffusion_features.md
Signed-off-by: Rustam Khadipash <16683750+hadipash@users.noreply.github.com> # Conflicts: # vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_ti2v.py # vllm_omni/diffusion/models/wan2_2/wan2_2_transformer.py
…#2524) Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>
…llm-project#2690) Signed-off-by: Sy03 <1370724210@qq.com> Signed-off-by: Yueqian Lin <linyueqian@outlook.com> Co-authored-by: Yueqian Lin <linyueqian@outlook.com> Co-authored-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
…ject#2750) Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>
…t#2735) Signed-off-by: samithuang <285365963@qq.com>
…e single-GPU performance (vllm-project#2604) Signed-off-by: Ian Carrasco <ian.carrasco@baseten.co>
Signed-off-by: Zhang <jianmusings@gmail.com> Signed-off-by: Zhang Jian <jianmusings@gmail.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-N-senders to support Bagel TP/CFG parallel (vllm-project#2731) Signed-off-by: natureofnature <wzliu@connect.hku.hk>
Signed-off-by: rongfu.leng <lenronfu@gmail.com>
…llm-project#2598) Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>
Signed-off-by: hsliuustc0106 <liuhongsheng4@huawei.com>
Signed-off-by: neptune <neptune@hust.edu.cn> Co-authored-by: neptune <neptune@hust.edu.cn>
Signed-off-by: wangyu <410167048@qq.com>
Signed-off-by: Alex Brooks <albrooks@redhat.com>
…ed task Signed-off-by: Mahdi Nasser <94046147+mnasser02@users.noreply.github.com>
… pipeline (B and T hardcoded for now) Signed-off-by: Mahdi Nasser <94046147+mnasser02@users.noreply.github.com>
Signed-off-by: Mahdi Nasser <94046147+mnasser02@users.noreply.github.com>
9967162 to 1d341e8
…instead of sync send/recv Signed-off-by: Mahdi Nasser <94046147+mnasser02@users.noreply.github.com>
Plain P2P on size-2 PG triggers lazy sub-comm creation that requires the peer present. Signed-off-by: Mahdi Nasser <94046147+mnasser02@users.noreply.github.com>
- Separate comms stream
- Double buffering
- Set rcv buffers for a new req
- Revert changes regarding Async structs

Signed-off-by: Mahdi Nasser <94046147+mnasser02@users.noreply.github.com>
1d341e8 to 4c8e2b7
8a0f16b to b34cf19
Purpose
Implements vllm-project#2280: depth-partitioned pipeline parallelism plus Stream Batch temporal scheduling for streaming video diffusion. Targets 40+ FPS on 4×H100 with sub-500 ms time to first frame (TTFF).
Stacks on vllm-project#2322.
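To make the scheduling idea concrete, here is a minimal sketch of how a depth-partitioned pipeline fills, runs at full occupancy, and drains as chunks stream through it. It is not the code in this PR: `MicroStep`, `stream_batch_schedule`, and the rule used to label phases are hypothetical names and assumptions chosen for illustration, and only the warmup/steady/cooldown phase names come from the scheduler description in this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroStep:
    """One scheduler tick (hypothetical structure, for illustration only)."""
    tick: int
    phase: str    # "warmup", "steady", or "cooldown"
    work: tuple   # (stage, chunk) pairs that run concurrently in this tick


def stream_batch_schedule(num_stages: int, num_chunks: int) -> list[MicroStep]:
    """Enumerate which chunk each pipeline stage processes at every tick.

    Assumption: stage s handles chunk (tick - s), so a chunk advances one
    stage per tick and a new chunk enters stage 0 each tick while any remain.
    """
    if num_stages < 1 or num_chunks < 1:
        raise ValueError("num_stages and num_chunks must be positive")

    steps = []
    for tick in range(num_stages + num_chunks - 1):
        work = tuple(
            (stage, tick - stage)
            for stage in range(num_stages)
            if 0 <= tick - stage < num_chunks
        )
        if tick >= num_chunks:
            phase = "cooldown"   # no new chunk enters; the pipeline drains
        elif tick < num_stages - 1:
            phase = "warmup"     # the last stage has not received work yet
        else:
            phase = "steady"     # every stage is busy
        steps.append(MicroStep(tick, phase, work))
    return steps


if __name__ == "__main__":
    for step in stream_batch_schedule(num_stages=4, num_chunks=6):
        print(step.tick, step.phase, step.work)
```

With 4 stages, all stages are busy only during the steady ticks, so the warmup length (here `num_stages - 1` ticks) is one factor a real scheduler would need to keep small to meet a time-to-first-frame target.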
- `SupportsStepExecution` on `Wan22Pipeline`
- `StreamBatchScheduler` (warmup/steady/cooldown) driving micro-step execution pipeline
- `StreamVAE` with chunk-wise decode + 3D conv caching
- `DistributedKVCache` stub (vllm-project#1987), VAE-as-stage (vllm-project#2089)

Test Plan
Test Result