
Fix missed attention_chunk_divmod param for block specifics in mma_pv. #1582

Merged

tridao merged 1 commit into Dao-AILab:main from wanderingai:patch-1 on Apr 10, 2025

Conversation

@wanderingai
Contributor

No description provided.
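The PR title says a precomputed `attention_chunk_divmod` parameter was missing from the block-specific code paths in `mma_pv`. The actual fix lives in CUDA device code; as a hedged illustration only, here is a minimal Python analogue of what such a fast-divmod parameter does: it maps an absolute key/value position to a (chunk index, offset-within-chunk) pair. The class name, chunk size, and call site below are all illustrative assumptions, not the repository's real API.

```python
# Hypothetical sketch (not the actual CUDA code): a stand-in for a
# precomputed divmod object like cutlass::FastDivmod, where the divisor
# is fixed once so repeated divides can be made cheap on device.
class FastDivmod:
    def __init__(self, divisor: int):
        assert divisor > 0
        self.divisor = divisor

    def __call__(self, n: int):
        # Returns (quotient, remainder). The real CUDA version replaces
        # the hardware integer divide with a precomputed multiply-shift;
        # here plain divmod shows the semantics only.
        return divmod(n, self.divisor)

# Example: with an attention chunk size of 128 (an example value),
# absolute position 300 lands in chunk 2 at offset 44.
attention_chunk_divmod = FastDivmod(128)
chunk_idx, offset = attention_chunk_divmod(300)
```

Forgetting to pass such a parameter into one code path means that path cannot locate chunk boundaries, which is consistent with the one-line fix this PR describes.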

@tridao tridao merged commit 7ff73af into Dao-AILab:main Apr 10, 2025
@wanderingai wanderingai deleted the patch-1 branch April 11, 2025 11:20
shcho1118 pushed a commit to shcho1118/flash-attention that referenced this pull request Apr 22, 2025
Don't use FusedDense anymore to simplify code

Fix FA3 qkvpacked interface

Launch more thread blocks in layer_norm_bwd

check valid tile before storing num_splits in split_idx (Dao-AILab#1578)

Tune rotary kernel to use 2 warps if rotary_dim <= 64

Implement attention_chunk

Fix missed attention chunk size param for block specifics in `mma_pv`. (Dao-AILab#1582)

[AMD ROCm] Support MI350 (Dao-AILab#1586)

* enable gfx950 support

* update ck for gfx950

---------

Co-authored-by: illsilin <Illia.Silin@amd.com>

Make attention_chunk work for non-causal cases

Use tile size 128 x 96 for hdim 64,256

Fix kvcache tests for attention_chunk when precomputing metadata

Fix kvcache test with precomputed metadata: pass in max_seqlen_q

Pass 0 as attention_chunk in the bwd for now

[LayerNorm] Implement option for zero-centered weight
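The commit above only names the feature; as a sketch of what a zero-centered weight option typically means (an assumption based on the common formulation, not on this repository's code), the learned scale is stored as `gamma - 1` and applied as `1 + weight`, so that a zero-initialized parameter yields the identity scale:

```python
import numpy as np

def layer_norm(x, weight, bias, eps=1e-5, zero_centered_weight=False):
    # Normalize over the last dimension.
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    x_hat = (x - mu) / np.sqrt(var + eps)
    # Zero-centered option: the parameter stores (gamma - 1), so a
    # zeros-initialized weight behaves like a ones-initialized gamma.
    scale = 1.0 + weight if zero_centered_weight else weight
    return x_hat * scale + bias
```

With this convention, weight decay pulls the effective scale toward 1 rather than toward 0, which is the usual motivation for the option.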

Make hopper build more robust (Dao-AILab#1598)

In certain environments, a relative path to the vendored nvcc is not resolved correctly. In this PR, I simply make it absolute.
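The description above explains the whole fix: a relative path depends on the process's working directory, which build systems do not always preserve. A minimal sketch of the idea (the function name and path below are illustrative, not taken from the repository's setup code):

```python
import os

def resolve_nvcc(nvcc_path: str) -> str:
    # Resolving the path once, against the current working directory,
    # makes the compiler location stable no matter which directory the
    # build is later invoked from.
    return os.path.abspath(nvcc_path)

nvcc = resolve_nvcc("third_party/nvcc/bin/nvcc")  # now absolute
```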

Fix L2 swizzle in causal tile scheduler

Use LPT scheduler for causal backward pass
playerzer0x pushed a commit to Liqhtworks/flash-attention that referenced this pull request Jul 24, 2025
elewarr pushed a commit to elewarr/flash-attention that referenced this pull request Feb 4, 2026


2 participants