[Doc] Add diffusion attention backend docs #3011
Merged
hsliuustc0106 merged 1 commit on Apr 23, 2026
Conversation
Signed-off-by: david6666666 <530634352@qq.com>
david6666666 (Collaborator, Author) commented:
Validation

I ran local offline validation on H20 GPUs after rebuilding.

HunyuanVideo-1.5
Config:
Results:
- FLASH_ATTN: hv15_retest_fa3_steps8.mp4
- SAGE_ATTN: hv15_retest_sage_steps8.mp4

Wan2.2 TI2V 5B
Config:
Results:
- FLASH_ATTN: wan22_fa3.mp4
- SAGE_ATTN: wan22_sage.mp4

Notes
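For readers who want to reproduce a comparison like the one above, here is a minimal sketch of the general shape: set the backend through the `DIFFUSION_ATTENTION_BACKEND` environment variable this PR documents before the engine starts, run the same generation under each backend, and compare the outputs. `run_once.py` and its flags are hypothetical stand-ins, not part of the vLLM-Omni API.

```python
import os
import subprocess

# Backend names come from this PR's docs; run_once.py is a hypothetical
# offline-inference script you would write around your model of choice.
BACKENDS = ["FLASH_ATTN", "SAGE_ATTN"]

for backend in BACKENDS:
    # Build a fresh environment per run: the variable must be set before the
    # engine is constructed, so mutating os.environ mid-process after the
    # engine import would be too late.
    env = dict(os.environ, DIFFUSION_ATTENTION_BACKEND=backend)
    subprocess.run(
        ["python", "run_once.py", "--out", f"hv15_{backend.lower()}.mp4"],
        env=env,
        check=True,
    )
```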
Force-pushed from 6e448a3 to 30e8977.
Force-pushed from 30e8977 to 739787d.
david6666666 (Collaborator, Author) commented:
@ZJY0516 @lishunyang12 Please take a look, thanks.
hsliuustc0106 (Collaborator) left a comment:
Good documentation. Covers all diffusion attention backends with clear installation and usage examples.
lishunyang12 approved these changes on Apr 22, 2026.
lengrongfu pushed a commit to lengrongfu/vllm-omni that referenced this pull request on May 1, 2026:
Signed-off-by: david6666666 <530634352@qq.com>
clodaghwalsh17 pushed a commit to clodaghwalsh17/nm-vllm-omni-ent that referenced this pull request on May 12, 2026:
Signed-off-by: david6666666 <530634352@qq.com>
Summary
This PR adds user-facing documentation for diffusion attention backend selection in vLLM-Omni.
What Changed
- `docs/user_guide/diffusion/attention_backends.md`: documents `DIFFUSION_ATTENTION_BACKEND` selection and the available backend options

Why
Diffusion users need one place to understand how to switch attention backends, how to install SageAttention, and what to validate when comparing `SAGE_ATTN` against the default FlashAttention path. A short sketch of that switch follows.
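As a rough sketch of the workflow the new page covers. The environment-variable name and backend names appear in this PR; the pip package name is an assumption, so defer to the docs page itself if it differs.

```python
# Assumed install step (check the docs page added in this PR for the exact
# package name and any version pin; this line is an assumption):
#   pip install sageattention
import os

# DIFFUSION_ATTENTION_BACKEND is the variable this PR documents. Set it
# before constructing the diffusion engine so the choice takes effect.
os.environ["DIFFUSION_ATTENTION_BACKEND"] = "SAGE_ATTN"  # or "FLASH_ATTN" (default)
```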
Validation

- `pre-commit run --all-files`
- `pytest -q tests/diffusion/attention/test_flash_attn.py`
- HunyuanVideo-1.5
- Wan2.2-TI2V-5B-Diffusers