[GEN] Add sub_group_reduce operator #1214

Merged
whitneywhtsang merged 3 commits into llvm-target from whitneywhtsang/wave on May 31, 2024

Conversation

@whitneywhtsang
Contributor

@whitneywhtsang whitneywhtsang commented May 30, 2024

The gen.sub_group_reduce operation is invoked by every work item in a subgroup, and each work item provides a $value. The $size argument partitions the subgroup into clusters of $size consecutive work items. Each cluster performs the reduction identified by $kind, and the result is broadcast to every work item in that cluster.
The operation lowers to either WaveAll or WaveClustered, depending on the given $size.
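To make the cluster semantics described above concrete, here is a minimal reference simulation in plain Python. It is only a sketch of the behavior stated in the description, not the dialect's implementation or its lowering. The function name `sub_group_reduce`, the `REDUCE_KINDS` table, and the kind names in it are hypothetical, and the requirement that the cluster size divide the subgroup size is an assumption this PR description does not state.

```python
from functools import reduce
import operator

# Hypothetical reduction kinds for illustration only; the $kind values
# actually accepted by the op are not listed in the PR description.
REDUCE_KINDS = {
    "add": operator.add,
    "mul": operator.mul,
    "min": min,
    "max": max,
}


def sub_group_reduce(values, kind, size):
    """Simulate a clustered reduction over one subgroup.

    values[i] stands for the $value supplied by work item i.
    Returns a list in which entry i is the result seen by work item i.
    """
    # Assumption: the cluster size is positive and evenly divides the subgroup.
    if size <= 0 or len(values) % size != 0:
        raise ValueError("size must be positive and divide the subgroup size")
    op = REDUCE_KINDS[kind]
    result = []
    for start in range(0, len(values), size):
        # A cluster is `size` consecutive work items.
        cluster = values[start:start + size]
        reduced = reduce(op, cluster)
        # Every work item in the cluster receives the cluster's result.
        result.extend([reduced] * len(cluster))
    return result
```

For example, with eight work items holding the values 1 through 8, `sub_group_reduce(list(range(1, 9)), "add", 4)` gives the first four work items the value 10 and the last four the value 26. With a cluster size of 8, every work item receives 36.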

@whitneywhtsang whitneywhtsang self-assigned this May 30, 2024
@whitneywhtsang whitneywhtsang linked an issue May 31, 2024 that may be closed by this pull request
@whitneywhtsang whitneywhtsang force-pushed the whitneywhtsang/wave branch 2 times, most recently from 07b8f7b to 6849f19 Compare May 31, 2024 01:10
@whitneywhtsang whitneywhtsang force-pushed the whitneywhtsang/wave branch 3 times, most recently from 6a8c524 to cadaf9f Compare May 31, 2024 03:47
@whitneywhtsang whitneywhtsang marked this pull request as ready for review May 31, 2024 03:47
Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>
Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>
Contributor

@Dewei-Wang-sh Dewei-Wang-sh left a comment

overall lgtm

Comment thread third_party/intel/include/Dialect/TritonGEN/IR/TritonGENOps.td Outdated
Comment thread third_party/intel/include/Dialect/TritonGEN/IR/TritonGENOps.td Outdated
Comment thread third_party/intel/include/Dialect/TritonGEN/IR/TritonGENOps.td Outdated
Comment thread third_party/intel/include/Dialect/TritonGEN/IR/TritonGENOps.td Outdated
Comment thread third_party/intel/lib/Dialect/TritonGEN/IR/TritonGENOps.cpp
Comment thread third_party/intel/include/TritonGENToLLVM/Passes.td Outdated
Comment thread third_party/intel/lib/TritonGENToLLVM/TritonGENToLLVMPass.cpp Outdated
Comment thread third_party/intel/lib/TritonGENToLLVM/TritonGENToLLVMPass.cpp
Comment thread third_party/intel/lib/TritonGENToLLVM/TritonGENToLLVMPass.cpp
Comment thread test/TritonGEN/tritongen-to-llvm.mlir
Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>
Comment thread third_party/intel/lib/TritonGENToLLVM/TritonGENToLLVMPass.cpp
@whitneywhtsang whitneywhtsang merged commit 5cf6579 into llvm-target May 31, 2024
@whitneywhtsang whitneywhtsang deleted the whitneywhtsang/wave branch May 31, 2024 14:43
whitneywhtsang added a commit that referenced this pull request Jun 3, 2024
address code review comment:
#1214 (comment)

Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>
wdziurdz pushed a commit that referenced this pull request Apr 7, 2026
This fix is needed for #1179. Currently, the ocloc installed in the CRI
environment as part of NEO supports only the CRI architecture and
recognizes no other GPUs, so this is a quick change to allow
benchmarks to build for CRI.

---------

Signed-off-by: Gregory Shimansky <gregory.shimansky@intel.com>

Development

Successfully merging this pull request may close these issues.

[GEN] Add Wave[All|Clustered] operator

3 participants