
Make hopper build more robust #1598

Merged
tridao merged 1 commit into Dao-AILab:main from classner:patch-1
Apr 17, 2025
Conversation

@classner
Contributor

In certain environments, the relative path to the vendored nvcc is not picked up correctly. This PR simply makes the path absolute.

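The change described above boils down to resolving the vendored nvcc location to an absolute path before the build system sees it. A minimal sketch of that idea in Python follows; the function name and example path are illustrative, not taken from the actual setup.py:

```python
import os

def resolve_nvcc_path(nvcc_path: str) -> str:
    """Return an absolute path to the vendored nvcc binary.

    A relative path (e.g. one computed relative to the source tree) may be
    interpreted against a different working directory by the build backend,
    so we resolve it up front. Path and name here are hypothetical.
    """
    return os.path.abspath(nvcc_path)

# Example: a path relative to the current working directory becomes absolute.
print(resolve_nvcc_path("bin/nvcc"))
```

Resolving once, early, means every later consumer (compiler invocations, subprocess calls) sees the same unambiguous location regardless of its working directory.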
tridao merged commit 934f6ad into Dao-AILab:main on Apr 17, 2025
shcho1118 pushed a commit to shcho1118/flash-attention that referenced this pull request Apr 22, 2025
Don't use FusedDense anymore to simplify code

Fix FA3 qkvpacked interface

Launch more thread blocks in layer_norm_bwd

check valid tile before storing num_splits in split_idx (Dao-AILab#1578)

Tune rotary kernel to use 2 warps if rotary_dim <= 64

Implement attention_chunk

Fix missed attention chunk size param for block specifics in `mma_pv`. (Dao-AILab#1582)

[AMD ROCm] Support MI350 (Dao-AILab#1586)

* enable gfx950 support

* update ck for gfx950

---------

Co-authored-by: illsilin <Illia.Silin@amd.com>

Make attention_chunk work for non-causal cases

Use tile size 128 x 96 for hdim 64,256

Fix kvcache tests for attention_chunk when precomputing metadata

Fix kvcache test with precomputed metadata: pass in max_seqlen_q

Pass 0 as attention_chunk in the bwd for now

[LayerNorm] Implement option for zero-centered weight

Make hopper build more robust (Dao-AILab#1598)

In certain environments the relative path to the vendored nvcc is not picked up correctly if provided relative. In this PR, I just make it absolute.

Fix L2 swizzle in causal tile scheduler

Use LPT scheduler for causal backward pass
playerzer0x pushed a commit to Liqhtworks/flash-attention that referenced this pull request Jul 24, 2025
In certain environments the relative path to the vendored nvcc is not picked up correctly if provided relative. In this PR, I just make it absolute.
elewarr pushed a commit to elewarr/flash-attention that referenced this pull request Feb 4, 2026
In certain environments the relative path to the vendored nvcc is not picked up correctly if provided relative. In this PR, I just make it absolute.