[Bugfix] Reduce _npu_flash_attention mask to 128x128 for memory savings #1100

Closed
ApsarasX wants to merge 1 commit into vllm-project:main from ApsarasX:community-attn_mask

Commits

Commits on Aug 11, 2025