Skip to content

[GEN] Add cache controls decoration to 2d_block_read builtin calls#1247

Merged
whitneywhtsang merged 6 commits into
llvm-targetfrom
victor-eds/2dblockread-attr
Jun 17, 2024
Merged

[GEN] Add cache controls decoration to 2d_block_read builtin calls#1247
whitneywhtsang merged 6 commits into
llvm-targetfrom
victor-eds/2dblockread-attr

Conversation

@victor-eds
Copy link
Copy Markdown

@victor-eds victor-eds commented Jun 4, 2024

#1224 converts triton_gen.2Dblockload to OCL builtin 2d_block_read calls. Add cache controls decoration capturing original cache_control semantics.

@victor-eds victor-eds added this to the 4.3 [Performance] Tracking milestone Jun 4, 2024
@victor-eds victor-eds self-assigned this Jun 4, 2024
@victor-eds
Copy link
Copy Markdown
Author

Addresses #1233

@whitneywhtsang whitneywhtsang linked an issue Jun 4, 2024 that may be closed by this pull request
Comment thread third_party/intel/lib/TritonGENToLLVM/TritonGENToLLVMPass.cpp Outdated
@whitneywhtsang
Copy link
Copy Markdown
Contributor

Can you please check if the expected GenISA with expected cache control is generated at the end from shader dump?

@whitneywhtsang whitneywhtsang force-pushed the whitneywhtsang/2dblockread branch from 12d6da7 to 0ba9684 Compare June 5, 2024 05:11
@victor-eds
Copy link
Copy Markdown
Author

Can you please check if the expected GenISA with expected cache control is generated at the end from shader dump?

Issue's this won't work till IGC picks KhronosGroup/SPIRV-LLVM-Translator@c8bfc33.

@whitneywhtsang
Copy link
Copy Markdown
Contributor

Can you please check if the expected GenISA with expected cache control is generated at the end from shader dump?

Issue's this won't work till IGC picks KhronosGroup/SPIRV-LLVM-Translator@c8bfc33.

IIUC the feature is needed for LLVMIR -> SPV translation, which doesn't need IGC to update the SPV -> LLVMIR portion. #1251 updated the SPIRV-LLVM-Translator Triton uses.

@victor-eds
Copy link
Copy Markdown
Author

IIUC the feature is needed for LLVMIR -> SPV translation, which doesn't need IGC to update the SPV -> LLVMIR portion. #1251 updated the SPIRV-LLVM-Translator Triton uses.

Running latest llvm-spirv gives a sensible output:

5 Decorate 36 CacheControlLoadINTEL 0 0
5 Decorate 37 CacheControlLoadINTEL 1 0
...
4 Bitcast 4 38 37
4 Bitcast 8 39 31
10 FunctionCall 2 40 10 38 21 22 23 34 39

I don't have access to an IGC development version handling this, tho, so I cannot currently test on my side.

@victor-eds victor-eds requested a review from whitneywhtsang June 5, 2024 12:44
@victor-eds
Copy link
Copy Markdown
Author

ola.txt

Here's the full SPIR-V dump. It's .txt as apparently GitHub cares about extensions 🤦

@whitneywhtsang whitneywhtsang force-pushed the whitneywhtsang/2dblockread branch 4 times, most recently from f721dde to a24edde Compare June 7, 2024 04:42
Copy link
Copy Markdown
Contributor

@jopperm jopperm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just test case nits; code LGTM. You probably have to watch out for #1259 which extends the name mangling for the OCL builtins (transform/transpose).

Comment thread test/TritonGEN/tritongen-to-llvm.mlir Outdated
Comment thread test/TritonGEN/tritongen-to-llvm.mlir Outdated
@victor-eds victor-eds force-pushed the victor-eds/2dblockread-attr branch from 7dee82f to 66386ac Compare June 10, 2024 10:10
@victor-eds victor-eds closed this Jun 10, 2024
@victor-eds victor-eds reopened this Jun 10, 2024
@victor-eds
Copy link
Copy Markdown
Author

Just test case nits; code LGTM. You probably have to watch out for #1259 which extends the name mangling for the OCL builtins (transform/transpose).

I think this should work regardless of that. We would just accept more cases.

@whitneywhtsang whitneywhtsang force-pushed the whitneywhtsang/2dblockread branch from 761892b to c99dc29 Compare June 10, 2024 17:16
Comment thread test/TritonGEN/tritongen-to-llvm.mlir
Comment thread test/TritonGEN/tritongen-to-llvm.mlir Outdated
@whitneywhtsang
Copy link
Copy Markdown
Contributor

This change looks good to me, let's merge this after verifying it works end to end.

@whitneywhtsang whitneywhtsang force-pushed the whitneywhtsang/2dblockread branch from c99dc29 to 75411ea Compare June 10, 2024 22:08
Base automatically changed from whitneywhtsang/2dblockread to llvm-target June 10, 2024 22:39
Victor Perez added 5 commits June 11, 2024 08:22
!1224 converts `triton_gen.2Dblockload` to OCL builtin
`2d_block_read` calls. Add cache controls decoration capturing
original `cache_control` semantics.

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
Signed-off-by: Victor Perez <victor.perez@codeplay.com>
@victor-eds victor-eds force-pushed the victor-eds/2dblockread-attr branch from 66386ac to 76bf92a Compare June 11, 2024 07:45
@whitneywhtsang whitneywhtsang merged commit 31f64d2 into llvm-target Jun 17, 2024
@whitneywhtsang whitneywhtsang deleted the victor-eds/2dblockread-attr branch June 17, 2024 17:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[GEN] Use cache control attribute for 2D block read OCL lowering

3 participants