
Merge OpenAI Triton commit 100e2aa #1227

Merged
whitneywhtsang merged 13 commits into llvm-target from whitneywhtsang/merge on Jun 2, 2024

Conversation

@whitneywhtsang
Contributor

@whitneywhtsang whitneywhtsang commented Jun 1, 2024

This PR changes the Triton base from 021fb72 to 100e2aa (May 28).
Pass rate: 98.64%

Please do not squash and merge this PR.

christopherhesse and others added 12 commits May 26, 2024 17:58
The `tl.dot` docs have a couple of typos and seem to be missing a mention
of lower-precision dtypes.
Co-authored-by: Tori <vwbaker@google.com>
- Correct some index errors.
When processing loops we were incorrectly setting all the local defs as
live-in.
Since part of the code assumes that only live-in variables can be
loop-carried dependencies, that was causing a mismatch in the logic. Change
it to enforce that local defs that are not live-in cannot be loop carried.
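The distinction the fix relies on can be illustrated with a minimal liveness sketch (hypothetical helper, not Triton's actual C++ implementation): a variable is live-in to a loop body only if it is read before any write inside the body, and only such variables can carry a value across iterations.

```python
# Illustrative sketch: compute the live-in set of a loop body.
# Each statement is a (defs, uses) pair, in program order.
def live_ins(body):
    live, defined = set(), set()
    for defs, uses in body:
        # A use that has no earlier local def must come from outside
        # the loop body, i.e. it is live-in.
        live |= set(uses) - defined
        defined |= set(defs)
    return live

# Body: t = f(x); x = g(t)
# x is read before any write -> live-in (may be loop-carried);
# t is a local def that is never live-in -> must not be loop-carried.
body = [(("t",), ("x",)), (("x",), ("t",))]
assert live_ins(body) == {"x"}
```

Marking `t` as live-in, as the old code effectively did, would wrongly let a purely local definition look like a loop-carried dependency.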
The TritonGPUPipeline pass has unused pass options, and the
TritonGPUAccelerateMatmul pass option could instead be read from the
module attributes, where the data already exists. The goal is to reduce
redundancy.
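The idea can be sketched as follows (hypothetical names and attribute key, not the actual MLIR/C++ code): instead of duplicating a value as a pass option, the pass reads it from the module attributes, which remain the single source of truth.

```python
# Sketch of "read from module attributes instead of a pass option".
class Module:
    def __init__(self, attrs):
        # Attribute key is an assumption for illustration only.
        self.attrs = attrs  # e.g. {"triton_gpu.num-warps": 4}

def accelerate_matmul(module):
    # Read the value where it already lives rather than threading a
    # redundant pass option through the pipeline.
    num_warps = module.attrs["triton_gpu.num-warps"]
    return f"lowering matmul for {num_warps} warps"

assert accelerate_matmul(Module({"triton_gpu.num-warps": 4})) \
    == "lowering matmul for 4 warps"
```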

---------

Signed-off-by: Finlay Marno <finlay.marno@codeplay.com>
… it is not needed (#3790)

This PR:
- moves the shortcut check earlier, to avoid computing the scratch buffer
shape when it is not needed
- raises the priority of AMD-specific conversions over common conversions,
to eliminate uncertainty about which pattern is applied
- adds a regression test for the MFMA to Dot Op shortcut
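The reordering above is a classic cheap-check-first refactor. A minimal sketch with hypothetical names (not the actual Triton code) shows why moving the shortcut check earlier means the scratch-buffer shape is never computed on the fast path:

```python
# Sketch: test the cheap shortcut condition before any expensive work.
def is_mfma_to_dot_shortcut(src, dst):
    # Stand-in predicate; the real check inspects GPU layouts.
    return src == "mfma" and dst == "dot"

def convert(src_layout, dst_layout, compute_scratch_shape):
    if is_mfma_to_dot_shortcut(src_layout, dst_layout):
        return "shortcut"              # no scratch buffer needed
    shape = compute_scratch_shape()    # only computed when required
    return f"general path, scratch shape {shape}"

calls = []
def expensive():
    calls.append(1)
    return (64, 64)

assert convert("mfma", "dot", expensive) == "shortcut"
assert calls == []   # the shortcut path never computed the scratch shape
```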
This is a follow-up PR of #3832.
`wave` has been replaced with `warp` for consistency between GPUs.
Unfortunately there are still remaining uses of `wave` in the code, as
listed below, although I've tried to minimize them.

## Referencing AMD features (HIP API or AMDGPU)
third_party/amd/backend/include/hip/:
third_party/amd/backend/include/roctracer/:
third_party/amd/backend/include/hsa/*:
- Cannot completely replace `wave` because the definition comes from
outside, e.g., __AMDGCN_WAVEFRONT_SIZE, hsa_wavefront_info_t
- Mixing `warp` and `wave` together in the same place could be even
worse.

## Using amdgpu compiler option
third_party/amd/backend/compiler.py:
python/tutorials/03-matrix-multiplication.py:
python/tutorials/06-fused-attention.py:
- `waves_per_eu`, which is supposed to map to the Clang attribute
`amdgpu-waves-per-eu`
- It is an AMD-only option, so it makes more sense to keep it
This PR enables support of 3d dot for RDNA GPUs.
Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>
@whitneywhtsang whitneywhtsang self-assigned this Jun 1, 2024
@whitneywhtsang whitneywhtsang marked this pull request as ready for review June 1, 2024 23:20
@whitneywhtsang whitneywhtsang requested a review from pbchekin June 1, 2024 23:20
@whitneywhtsang whitneywhtsang changed the title from Merge OpenAI Triton commit 18d691e to Merge OpenAI Triton commit 100e2aa on Jun 2, 2024
@whitneywhtsang whitneywhtsang merged commit 0cdcffa into llvm-target Jun 2, 2024
@whitneywhtsang whitneywhtsang deleted the whitneywhtsang/merge branch June 2, 2024 14:52
@vlad-penkin vlad-penkin linked an issue Jun 22, 2024 that may be closed by this pull request
wdziurdz pushed a commit that referenced this pull request Apr 7, 2026
Reverts intel-tools/intel-xpu-backend-for-triton#1130

This change causes problems in the main-js CI when it is merged into
main-js. Also, with #1179 we will have a different way of running
llama_kernels on a schedule with profiling.


Development

Successfully merging this pull request may close these issues.

Merge OpenAI Triton till June 7th