[WIP][Feature] Temporal Pipeline Parallelism & Stream Batch for Real-Time Video #3099
Draft
mnasser02 wants to merge 56 commits into
Signed-off-by: Rustam Khadipash <16683750+hadipash@users.noreply.github.com>
Signed-off-by: Rustam Khadipash <16683750+hadipash@users.noreply.github.com> # Conflicts: # docs/user_guide/diffusion_features.md
Signed-off-by: Rustam Khadipash <16683750+hadipash@users.noreply.github.com> # Conflicts: # vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_ti2v.py # vllm_omni/diffusion/models/wan2_2/wan2_2_transformer.py
Signed-off-by: Rustam Khadipash <16683750+hadipash@users.noreply.github.com> # Conflicts: # docs/user_guide/diffusion_features.md # vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2.py # vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_i2v.py # vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_ti2v.py # vllm_omni/diffusion/models/wan2_2/wan2_2_transformer.py
Implement SupportsStepExecution protocol on Wan22Pipeline, decomposing the monolithic forward() into prepare_encode, denoise_step, step_scheduler, and post_decode. Add denoise_micro_step for temporal PP. Signed-off-by: Mahdi Nasser <94046147+mnasser02@users.noreply.github.com>
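To make the decomposition in this commit message concrete, here is a minimal sketch of what a step-execution protocol could look like. Only the method names (prepare_encode, denoise_step, step_scheduler, post_decode) come from the commit message; the signatures, the RequestState class, the ToyPipeline stand-in, and the run_stepwise driver are assumptions made for illustration and are not taken from the PR.

```python
from dataclasses import dataclass, field
from typing import Protocol, runtime_checkable


@dataclass
class RequestState:
    """Hypothetical per-request state carried between steps."""
    prompt: str
    num_steps: int
    step: int = 0
    latents: list[float] = field(default_factory=list)


@runtime_checkable
class SupportsStepExecution(Protocol):
    # Method names follow the commit message; the signatures are assumed.
    def prepare_encode(self, state: RequestState) -> None: ...
    def denoise_step(self, state: RequestState) -> list[float]: ...
    def step_scheduler(self, state: RequestState, noise_pred: list[float]) -> None: ...
    def post_decode(self, state: RequestState) -> list[float]: ...


class ToyPipeline:
    """Stand-in pipeline: the 'model' predicts half of each latent as noise."""

    def prepare_encode(self, state: RequestState) -> None:
        state.latents = [1.0, 2.0, 4.0]
        state.step = 0

    def denoise_step(self, state: RequestState) -> list[float]:
        return [x * 0.5 for x in state.latents]

    def step_scheduler(self, state: RequestState, noise_pred: list[float]) -> None:
        # Subtract the predicted noise and advance the step counter.
        state.latents = [x - n for x, n in zip(state.latents, noise_pred)]
        state.step += 1

    def post_decode(self, state: RequestState) -> list[float]:
        return list(state.latents)


def run_stepwise(pipe: SupportsStepExecution, state: RequestState) -> list[float]:
    """Drive the pipeline one step at a time instead of one monolithic forward()."""
    pipe.prepare_encode(state)
    while state.step < state.num_steps:
        noise_pred = pipe.denoise_step(state)
        pipe.step_scheduler(state, noise_pred)
    return pipe.post_decode(state)
```

Because the loop lives outside the pipeline, an external scheduler can interleave steps from several requests between calls, which is presumably what step-level execution is meant to enable.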
Different ranks work on different chunks. A context manager that views a rank's request state as a chunk state lets the existing functionality be reused. Signed-off-by: Mahdi Nasser <94046147+mnasser02@users.noreply.github.com>
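One way to picture the context manager described in this commit message is sketched below. It is an illustration under assumptions, not the PR's code: RequestState, chunk_view, and denoise are hypothetical names, and plain Python lists stand in for latent tensors.

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class RequestState:
    """Hypothetical request state; latents holds one value per frame."""
    latents: list[int]
    step: int = 0


@contextmanager
def chunk_view(state: RequestState, start: int, stop: int):
    """Temporarily narrow state.latents to one chunk, then write results back."""
    full = state.latents
    state.latents = full[start:stop]
    try:
        yield state
    finally:
        # Merge the processed chunk into the full buffer and restore the view.
        full[start:stop] = state.latents
        state.latents = full


def denoise(state: RequestState) -> None:
    """Existing per-request routine that knows nothing about chunks."""
    state.latents = [x - 1 for x in state.latents]


state = RequestState(latents=[10, 10, 10, 10])
with chunk_view(state, 2, 4) as chunk_state:
    denoise(chunk_state)  # only frames 2 and 3 are touched
```

The point of the pattern is that denoise stays unchanged: inside the with block it sees what looks like an ordinary request state, while each rank would pick its own start and stop.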
…ed task Signed-off-by: Mahdi Nasser <94046147+mnasser02@users.noreply.github.com>
… pipeline (B and T hardcoded for now) Signed-off-by: Mahdi Nasser <94046147+mnasser02@users.noreply.github.com>
…instead of sync send/recv Signed-off-by: Mahdi Nasser <94046147+mnasser02@users.noreply.github.com>
Plain P2P on a size-2 process group (PG) triggers lazy sub-communicator creation, which requires the peer to be present. Signed-off-by: Mahdi Nasser <94046147+mnasser02@users.noreply.github.com>
- Separate comms stream
- Double buffering
- Set rcv buffers for a new req
- Revert changes regarding Async structs
Signed-off-by: Mahdi Nasser <94046147+mnasser02@users.noreply.github.com>
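The double-buffering item above can be illustrated with a CPU-only analogy. In this sketch a single worker thread plays the role of the separate comms stream, and recv_into, compute, and run are hypothetical helpers; the PR itself presumably works with GPU streams and distributed receives, which are not shown here.

```python
from concurrent.futures import ThreadPoolExecutor


def recv_into(buf: list[int], chunk_id: int) -> None:
    """Stand-in for an asynchronous receive that fills a preallocated buffer."""
    for i in range(len(buf)):
        buf[i] = chunk_id * 10 + i


def compute(buf: list[int]) -> int:
    """Stand-in for the work done on a received chunk."""
    return sum(buf)


def run(num_chunks: int, buf_len: int = 4) -> list[int]:
    # Two receive buffers: one is being consumed while the other is being filled.
    buffers = [[0] * buf_len, [0] * buf_len]
    results = []
    with ThreadPoolExecutor(max_workers=1) as comm:  # plays the comms stream
        pending = comm.submit(recv_into, buffers[0], 0)
        for chunk_id in range(num_chunks):
            pending.result()  # wait until the current buffer is filled
            current = buffers[chunk_id % 2]
            if chunk_id + 1 < num_chunks:
                # Start receiving the next chunk into the other buffer.
                pending = comm.submit(
                    recv_into, buffers[(chunk_id + 1) % 2], chunk_id + 1
                )
            results.append(compute(current))  # overlaps with the receive
    return results
```

With two buffers, the receive for chunk k+1 never writes into the buffer that chunk k is still reading, so communication and compute can overlap without extra copies.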
1d341e8 to 4c8e2b7
Collaborator
Ready for full review when WIP status is removed. Preliminary scan available on request. Note: Test plan and test results sections are currently empty. Please provide:
8a0f16b to b34cf19
25 tasks
Purpose
Implements #2280.
Targets 40+ FPS on 4×H100 and a sub-500 ms time to first frame (TTFF) with Wan2.1-1.3B.
Stacks on #2322.
- SupportsStepExecution on Wan22Pipeline
- StreamBatchScheduler driving micro-step execution pipeline
- StreamVAE
Test Plan
(NOT FINALIZED)
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs.
BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)