
[BACKEND] Reinterpreted memory should represent the same amount of memory #10243

Merged: lezcano merged 1 commit into main from reinterpret on May 7, 2026

Conversation

@lezcano (Contributor) commented May 6, 2026

We also disallow reinterpreting the layouts of subslices, as they'd be rather cursed when the subslice is not contiguous. Instead, we ask the user to reinterpret the base layout.

We also improve the API for _reinterpret in gluon by allowing you to pass in just the attributes you want to change.
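
As a rough sketch of the API change (a minimal illustration, not the exact implementation: the keyword-only call and the defaults for omitted attributes are assumptions based on the description above; the full positional form is the one used in the repro later in this thread):

from triton.experimental.gluon import language as gl

# `smem` is assumed to be a shared-memory descriptor returned by
# gl.allocate_shared_memory inside a @gluon.jit kernel.
new_layout = gl.SwizzledSharedLayout(1, 1, 1, [1, 0])

# Before: every attribute spelled out, even the unchanged ones.
view = smem._reinterpret(gl.uint32, smem.shape, layout=new_layout)

# After (assumed): pass only the attribute you actually want to change.
view = smem._reinterpret(layout=new_layout)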

@lezcano lezcano requested review from Jokeren and Mogball May 6, 2026 12:23

@chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e9c3b61f8d


Review comment threads:
  • lib/Tools/LinearLayout.cpp (outdated)
  • include/triton/Tools/LinearLayout.h (outdated)
  • lib/Tools/LinearLayout.cpp (outdated)
  • lib/Dialect/TritonGPU/IR/Ops.cpp
@lezcano (Contributor, Author) commented May 6, 2026

addressed

…mory

To do this, we look at the offset size while working with the pseudoinverse of the layout. We extend the pseudoinverse to take a shape so that subviews are handled properly (their layout shape is the alloc shape, which may differ from the tensor shape).
@lezcano (Contributor, Author) commented May 6, 2026

Changed the PR to disallow reinterpreting subslices, which heavily simplifies the implementation. Can you please review, @Mogball @ThomasRaoux @Jokeren?
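
For illustration, a minimal sketch of the restriction (the exact error behaviour is assumed; .index() and _reinterpret are the calls used in the repro further down, and num_bufs / other_layout are hypothetical names):

from triton.experimental.gluon import language as gl

# Inside a @gluon.jit kernel.
smem = gl.allocate_shared_memory(gl.uint32, [num_bufs, 2],
                                 gl.SwizzledSharedLayout(1, 1, 1, [1, 0]))

# Rejected after this PR (assumed): a subslice may not be contiguous in the
# allocation, so reinterpreting it directly is disallowed.
# view = smem.index(0)._reinterpret(gl.uint32, [2], layout=other_layout)

# Suggested pattern: reinterpret the base allocation, then take the subslice.
view = smem._reinterpret(gl.uint32, smem.shape, layout=other_layout).index(0)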

@peterbell10 (Contributor) left a comment


Looks like an unrelated change was committed?

Review comment threads:
  • python/tutorials/gluon/07-persistence.py
  • python/examples/gluon/01-attention-forward.py
@lezcano lezcano requested a review from peterbell10 May 7, 2026 08:53
@lezcano lezcano merged commit 40e899b into main May 7, 2026
16 of 18 checks passed
@lezcano lezcano deleted the reinterpret branch May 7, 2026 09:58
@FindDefinition

@lezcano this PR disallows reinterpreting smem that has an "index" virtual dim as smem without the "index" dim (this works on triton 3.7 and hits an LLVM error on triton 3.6). Is it possible to support it? Otherwise we need terrible hacks to implement "buffered processing", e.g. flushing profile events when the smem buffer is full, or running a bitonic sort on the whole smem buffer when it is full.

  • bitonic sort: we use global memory instead of smem as the buffer.
  • custom profile events (not proton): we use inline PTX to generate an illegal non-uniform scalar, then use it to index smem.

@lezcano (Contributor, Author) commented May 9, 2026

Can you share a minimised repro of the issue?

@FindDefinition

@lezcano Here is a script that works on triton 3.7.0, errors on triton main, and hits an LLVM error on triton 3.6.0:

import triton.language as tl
from triton.experimental.gluon import language as gl
from triton.experimental import gluon
import torch

@gluon.jit
def smem_flush_kernel(out_ptr):
    num_smem_ev: tl.constexpr = 32
    layout: gl.constexpr = gl.BlockedLayout([1, 2], [32, 1], [1, 1], [1, 0])
    # Shared buffer with a leading "index" dim, filled row by row via .index(j).
    smem = gl.allocate_shared_memory(gl.uint32, [num_smem_ev, 2], gl.SwizzledSharedLayout(1, 1, 1, [0]))
    for j in range(32):
        value = gl.full([2], j, dtype=gl.uint32, layout=gl.SliceLayout(0, layout))
        smem.index(j).store(value)

    # Reinterpret the whole buffer with a 2D swizzled layout, then load it back.
    smem_rep = smem._reinterpret(gl.uint32, smem.shape, layout=gl.SwizzledSharedLayout(1, 1, 1, [1, 0]))
    smem_val = smem_rep.load(layout)
    offs_m = gl.arange(0, smem.shape[0], layout=gl.AutoLayout())
    offs_n = gl.arange(0, smem.shape[1], layout=gl.AutoLayout())
    gl.store(out_ptr + offs_m[:, None] * 2 + offs_n[None, :], smem_val)

def main():
    out = torch.zeros((32, 2), dtype=torch.uint32, device="cuda")
    smem_flush_kernel[(1,)](out, num_warps=1)
    print(out)

if __name__ == "__main__":
    main()

@lezcano (Contributor, Author) commented May 9, 2026

Ah, right. Yeah, that we can support. Will add support for that on Monday

@lezcano (Contributor, Author) commented May 11, 2026

fixed in #10286

lezcano added a commit that referenced this pull request May 13, 2026
We also check that when reinterpreting a pipelining buffer, the initial
dimensions are the same.

Addresses
#10243 (comment)
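
For context, a hedged sketch of what that check means (the exact shapes, the rejection behaviour, and num_stages are assumptions; only gl.allocate_shared_memory and _reinterpret from this thread are used):

from triton.experimental.gluon import language as gl

# Inside a @gluon.jit kernel; a pipelining buffer whose leading dimension
# indexes the stage/buffer.
smem = gl.allocate_shared_memory(gl.uint32, [num_stages, 32, 2],
                                 gl.SwizzledSharedLayout(1, 1, 1, [2, 1, 0]))

# OK (assumed): leading pipelining dimension unchanged, inner dims reshaped
# to the same amount of memory per stage.
ok = smem._reinterpret(gl.uint32, [num_stages, 64],
                       layout=gl.SwizzledSharedLayout(1, 1, 1, [1, 0]))

# Rejected (assumed): the leading dimension no longer matches.
# bad = smem._reinterpret(gl.uint32, [2 * num_stages, 32],
#                         layout=gl.SwizzledSharedLayout(1, 1, 1, [1, 0]))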
