[Model] Add HSDP support for LTX-2 #2899
Signed-off-by: hanzheli <hanzheli@kuaishou.com>
Can you profile with HSDP and check whether there are idle bubbles in the first few blocks with FSDP on? We noticed similar issues with WAN 2.2.
gcanlin
left a comment
Only one issue needs to be fixed. Otherwise LGTM.
Well-structured PR adding HSDP support to LTX-2. Good test coverage and documentation.
I’ll profile LTX-2 with HSDP enabled to investigate this as soon as possible. Are there any related issues already?
How about the peak VRAM consumption?
Peak VRAM remains the same with HSDP enabled; no noticeable difference was observed.
Signed-off-by: fywc <hanzheli@kuaishou.com>
Thanks. I hope you can profile higher-resolution inputs, which may benefit more from HSDP.
Signed-off-by: hanzheli <hanzheli@kuaishou.com> Signed-off-by: fywc <hanzheli@kuaishou.com> Signed-off-by: nainiu258 <cperfect02@163.com>
Signed-off-by: hanzheli <hanzheli@kuaishou.com> Signed-off-by: fywc <hanzheli@kuaishou.com>




Purpose
Add LTX-2 HSDP support.
This PR resolves the LTX-2 gap in HSDP support from RFC #1217 by:
LTX2VideoTransformer3DModel so HSDP can shard repeated transformer blocks
Test Plan
Test Result
INFO 04-18 07:58:31 [diffusers_loader.py:324] Loading weights took 2.77 seconds
INFO 04-18 07:58:31 [hsdp.py:128] HSDP Inference: replicate_size=1, shard_size=2, world_size=2, rank=0, fs_world_size=2, fs_rank=0
INFO 04-18 07:58:34 [platform.py:77] Defaulting to diffusion attention backend FLASH_ATTN
INFO 04-18 07:58:37 [hsdp.py:128] HSDP Inference: replicate_size=1, shard_size=2, world_size=2, rank=1, fs_world_size=2, fs_rank=1
INFO 04-18 07:58:39 [hsdp.py:202] Sharded 912 modules + root
INFO 04-18 07:58:39 [hsdp.py:173] HSDP applied to model: FSDPLTX2VideoTransformer3DModel
INFO 04-18 07:58:40 [diffusion_model_runner.py:142] Model loading took 45.5504 GiB and 23.878847 seconds
INFO 04-18 07:58:40 [diffusion_model_runner.py:147] Model runner: Model loaded successfully.
INFO 04-18 07:58:40 [diffusion_model_runner.py:188] Model runner: Initialization complete.
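To help read the `HSDP Inference` log lines above, here is a minimal sketch of how a global rank could map to replicate and shard coordinates. It assumes a row-major 2D mesh of shape `(replicate_size, shard_size)` with the shard dimension innermost. That layout, along with the names `HSDPCoord` and `hsdp_coord`, is hypothetical and not taken from `hsdp.py`; the sketch only reproduces the values shown in the log (`rank=0 -> fs_rank=0`, `rank=1 -> fs_rank=1`, `fs_world_size=2`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HSDPCoord:
    """Hypothetical container for one rank's position in the HSDP mesh."""
    rank: int            # global rank
    replicate_rank: int  # index of the replica group this rank belongs to
    fs_rank: int         # position inside the shard (FSDP) group
    fs_world_size: int   # number of ranks in the shard group


def hsdp_coord(rank: int, replicate_size: int, shard_size: int) -> HSDPCoord:
    """Map a global rank to mesh coordinates.

    Assumption: ranks are laid out row-major over a mesh of shape
    (replicate_size, shard_size), so consecutive ranks share a shard group.
    """
    world_size = replicate_size * shard_size
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} is outside world_size {world_size}")
    replicate_rank, fs_rank = divmod(rank, shard_size)
    return HSDPCoord(rank, replicate_rank, fs_rank, shard_size)


if __name__ == "__main__":
    # Same configuration as the test run: replicate_size=1, shard_size=2.
    for r in range(2):
        print(hsdp_coord(r, replicate_size=1, shard_size=2))
```

With `replicate_size=1`, every rank sits in a single shard group, so the run behaves like plain FSDP across both GPUs; a larger `replicate_size` would add replica groups that each hold a full sharded copy.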