[TritonGPU] Split MemDescSubview into MemDescIndex and MemDescSubslice #7622
Conversation
The first one will be used just for pipelining and is equivalent to `x[i]`; the second one takes a full slice of constant shape, e.g. `x[:i1, :i2]`.
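The intended semantics of the two new ops can be pictured with ordinary array slicing. The sketch below is a hedged NumPy model, not the actual MLIR operations; the function names and the 3-buffer allocation are illustrative stand-ins.

```python
import numpy as np

# Stand-in for a triple-buffered shared-memory allocation of shape (3, 128, 32).
buffers = np.zeros((3, 128, 32), dtype=np.float16)

def memdesc_index(x, i):
    # Models ttg.memdesc_index: pick one buffer along the leading
    # dimension, dropping that dimension -- like x[i]. Used for pipelining.
    return x[i]

def memdesc_subslice(x, offsets, shape):
    # Models ttg.memdesc_subslice: a full slice of constant shape at a
    # constant offset -- like x[o1:o1+s1, o2:o2+s2].
    slices = tuple(slice(o, o + s) for o, s in zip(offsets, shape))
    return x[slices]

one_buffer = memdesc_index(buffers, 1)                  # shape (128, 32)
tile = memdesc_subslice(one_buffer, (0, 0), (64, 32))   # shape (64, 32)
```

Note how `memdesc_index` reduces the rank by one while `memdesc_subslice` preserves it, which is the key difference between the two ops.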
ThomasRaoux left a comment:
Awesome, great cleanup! A few minor comments.
lib/Dialect/TritonNvidiaGPU/Transforms/TensorMemoryAllocation.cpp
test/Analysis/test-alias.mlir
%cst_0 = ttg.local_alloc : () -> !ttg.memdesc<128x32xf16, #A_SHARED, #ttg.shared_memory, mutable>
// expected-remark @below {{%2 -> %0}}
- %0 = ttg.memdesc_subview %cst[%idx, %idx] : !ttg.memdesc<128x32xf16, #A_SHARED, #ttg.shared_memory, mutable> -> !ttg.memdesc<128x32xf16, #A_SHARED, #ttg.shared_memory, mutable>
+ %0 = ttg.memdesc_subslice %cst {offsets=array<i32: 0, 0>} : !ttg.memdesc<128x32xf16, #A_SHARED, #ttg.shared_memory, mutable> -> !ttg.memdesc<128x32xf16, #A_SHARED, #ttg.shared_memory, mutable>
nit: we could probably have better printing, like `%cst[0, 0]`, but it can be done as a follow-up.
Vibe-coded it with an agent in 5 minutes in c89d9c6.
I didn't even know about that custom API...
bool is1D =
    srcTy.getRank() == 1 && dstTy.getRank() == 1 && dstTy.getDimSize(0) == 1;
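The predicate above can be modeled with plain shape tuples. This is a hypothetical Python sketch; `src_shape`/`dst_shape` are stand-ins for the MLIR memdesc type's rank and dimension-size accessors.

```python
def is_1d_index(src_shape, dst_shape):
    # Mirrors the C++ predicate: both source and destination are rank-1,
    # and the destination has a single element along its only dimension.
    # This is the case hit when indexing a single barrier out of a
    # 1D allocation of barriers during pipelining.
    return len(src_shape) == 1 and len(dst_shape) == 1 and dst_shape[0] == 1

# Indexing one barrier out of a 1D allocation of 4 barriers:
print(is_1d_index((4,), (1,)))
```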
when do we need the 1d case?
when we pipeline barriers and things like that.
In Gluon we do Nx1xi64 to get around having to support this in the APIs. Changing that in the compiler however would mean needing to update a LOT of tests...
After this PR I'm not scared of having to change a lot of tests (in reality it was horrible).
On a different note, this would be a lovely task for an agent.
yeah would be nice to clean up
ThomasRaoux left a comment:
Looks great
#7622 introduced `ttg.memdesc_index` which applies a constant offset to the base pointer of the smem object. For padded layouts we need to add padding based on the offset, similar to what #7404 did for the old subview operation. I also adjusted the lit test to check we actually generate padding from the ttg.memdesc_index. The previous version did not fail because it matched the lowering of the `ttg.local_load/store` as well.
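One way to picture the padding adjustment: a padded layout inserts extra elements at regular strides, so the constant base offset that `ttg.memdesc_index` applies must be shifted past the padding it skips over. The sketch below is an illustrative model under the assumption of a single (interval, pad) pair; the function name and parameters are not the actual compiler API.

```python
def padded_offset(linear_offset, interval, pad):
    # Assume the layout inserts `pad` extra elements after every
    # `interval` elements. An index into the unpadded element space
    # must be shifted by one pad per full interval it crosses.
    return linear_offset + (linear_offset // interval) * pad

# Indexing buffer 2 of a pipelined allocation whose per-buffer size is
# 1024 elements, with 4 elements of padding every 64 elements:
base = 2 * 1024
print(padded_offset(base, interval=64, pad=4))  # prints 2176
```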
third_party/tlx/run_all.sh

[TLX-3.5] Fix memdesc_subview refactoring from triton-lang#7622
  pytest python/test/unit/language/test_tlx.py::test_load_store_smem_with_tl_load
  pytest python/test/unit/language/test_tlx.py::test_local_store
  pytest python/test/unit/language/test_tlx.py::test_local_load
TODO: fix TLX layout propagation LITs using memdesc_subview

[TLX-3.5] Fix barrier ops broken by 1D tensor handling in memdesc_index
  pytest python/test/unit/language/test_tlx.py::test_wait_arrive_non_ws
The root cause is that memdesc_index fails its `verify()` in the 1D tensor case, caused by a bug introduced while merging conflicts. More related discussion: https://github.com/triton-lang/triton/pull/7622/files#r2227788997

[TLX-3.5] Fix all UTs
  pytest python/test/unit/language/test_tlx.py::test_async_dot