@ptillet ptillet commented Oct 16, 2024

No description provided.

ptillet and others added 2 commits October 16, 2024 00:16
Adjust the placement of LDS writes and reads to immediately follow the
definition of their operands in the case where the LDS write is in the
loop but its operand is not. This is a heuristic for optimizing fused
attention by hoisting Q tensor LDS read/write operations out of the loop:
Q is loop invariant, so it can be loaded once before entering the loop.

In the previous implementation, the heuristic incorrectly assumed that
the operand of the LDS write had to be a load operation, which is
unnecessary. Additionally, there was no explicit check to verify whether
the LDS write was in the loop while its defining operand was not. This
PR addresses both issues.
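
The two conditions described above can be illustrated with a toy model. This is only a sketch in Python, not the actual pass from this PR: the `Op` class, the `hoist_lds_ops` function, and the op kinds are hypothetical names invented for illustration, and the program is modeled as two flat lists (ops before the loop and ops in the loop body). It checks explicitly that the LDS write is in the loop while its operand is defined outside it, and it does not require the operand to be a load.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)  # eq=False: ops are compared by identity, like IR nodes
class Op:
    name: str
    kind: str                       # hypothetical kinds: "load", "compute", "lds_write", "lds_read"
    operand: Optional["Op"] = None  # single operand is enough for this sketch


def hoist_lds_ops(preheader: List[Op], loop_body: List[Op]) -> None:
    """Move loop-invariant LDS writes (and the reads of them) out of the loop."""
    for op in list(loop_body):  # iterate over a copy, since loop_body is mutated
        if op.kind != "lds_write":
            continue
        src = op.operand
        # Explicit check: the write is in the loop, its operand is defined outside.
        if src is None or src not in preheader:
            continue
        # Deliberately no check that src.kind == "load": any defining op qualifies.
        moved = [op] + [r for r in loop_body
                        if r.kind == "lds_read" and r.operand is op]
        for m in moved:
            loop_body.remove(m)
        # Place the write and its reads immediately after the operand's definition.
        at = preheader.index(src) + 1
        preheader[at:at] = moved


# Q is loaded and scaled before the loop; K is loaded inside the loop.
q = Op("q", "load")
q_scaled = Op("q_scaled", "compute", q)      # operand of the write is not a load
preheader = [q, q_scaled, Op("other", "compute")]

q_write = Op("q_write", "lds_write", q_scaled)
q_read = Op("q_read", "lds_read", q_write)
k = Op("k", "load")
k_write = Op("k_write", "lds_write", k)
k_read = Op("k_read", "lds_read", k_write)
loop_body = [q_write, q_read, k, k_write, k_read, Op("dot", "compute", k_read)]

hoist_lds_ops(preheader, loop_body)
print([o.name for o in preheader])  # ['q', 'q_scaled', 'q_write', 'q_read', 'other']
print([o.name for o in loop_body])  # ['k', 'k_write', 'k_read', 'dot']
```

In this model the Q write and read leave the loop even though the write's operand is a compute op rather than a load, while the K write stays in place because its operand is defined inside the loop.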

---------

Co-authored-by: Ognjen Plavsic <[email protected]>
@antiagainst antiagainst merged commit 8fb7342 into main Oct 16, 2024
@antiagainst antiagainst deleted the phil/amd-more-revert branch October 16, 2024 15:24
alexsamardzic pushed a commit to alexsamardzic/triton that referenced this pull request Oct 16, 2024
jtang10 pushed a commit to ROCm/triton that referenced this pull request Oct 21, 2024
Luosuu pushed a commit to Luosuu/triton that referenced this pull request Nov 13, 2024
guacamoleo pushed a commit to guacamoleo/triton that referenced this pull request Nov 14, 2024
bertmaher pushed a commit to bertmaher/triton that referenced this pull request Dec 10, 2024