
Conversation

@xingxing588

detail log:

test_attention_ops.py::attention_ref

    output = torch.einsum("bhts,bshd->bthd", attention_drop, drop_v)
    if query_padding_mask is not None:
        output.masked_fill_((~query_padding_mask)[:, :, None, None], 0.0)
    log: output.shape   torch.Size([1, 2, 4, 128])
    log: output.stride  (256, 128, 256, 1)
    output = output.contiguous()   # the reference output needs .contiguous() here
    log: output.shape   torch.Size([1, 2, 4, 128])
    log: output.stride  (1024, 512, 128, 1)

flag_gems/ops/attention.py::flash_attention_forward

    query = query.transpose(1, 2)  # the transpose produces a temporary, non-contiguous layout
    log: out.shape                         torch.Size([1, 4, 2, 128])
    log: out.stride                        (1024, 128, 512, 1)
    log: out.is_contiguous()               False
    out = out.contiguous()
    log: out.contiguous().shape            torch.Size([1, 4, 2, 128])
    log: out.contiguous().stride           (1024, 128, 512, 1)
    log: out.contiguous().is_contiguous()  False
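
For reference, the allocation behaviour behind this log can be reproduced in plain PyTorch. The sketch below is a standalone illustration (not code from the test file or from flag_gems; the shapes and variable names are mine and simply mirror the log): torch.empty_like defaults to memory_format=torch.preserve_format and therefore copies the transposed query's permuted strides, while torch.empty with an explicit shape always allocates a contiguous buffer.

```python
import torch

# (B, S, H, D) query, transposed to (B, H, S, D) as in flash_attention_forward.
q = torch.randn(1, 2, 4, 128)
q_t = q.transpose(1, 2)          # a view: same storage, permuted strides

print(q_t.shape, q_t.stride(), q_t.is_contiguous())
# torch.Size([1, 4, 2, 128]) (1024, 128, 512, 1) False

# empty_like keeps the permuted strides (memory_format=torch.preserve_format
# by default), so the output buffer is also non-contiguous.
out_like = torch.empty_like(q_t)
print(out_like.stride(), out_like.is_contiguous())
# (1024, 128, 512, 1) False

# empty with an explicit shape allocates a fresh, contiguous buffer.
out_new = torch.empty(q_t.shape, device=q_t.device, dtype=q_t.dtype)
print(out_new.stride(), out_new.is_contiguous())
# (1024, 256, 128, 1) True
```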

PR Category

Type of Change

Description

Issue

Progress

  • Change is properly reviewed (1 reviewer required, 2 recommended).
  • Change responds to an issue.
  • Change is fully covered by a UT.

Performance

@CLAassistant

CLAassistant commented Nov 25, 2025

CLA assistant check
All committers have signed the CLA.

Collaborator

@kiddyjinjin left a comment


lgtm

non_null_window_right = -1

- out = torch.empty_like(query)
+ out = torch.empty(query.shape, device=query.device, dtype=query.dtype)
Collaborator


Why use torch.empty instead of torch.empty_like?
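
As a side note on that trade-off (an illustration only, not part of the PR's code): torch.empty_like follows the source tensor's layout by default, so on the transposed query it returns a non-contiguous buffer, while torch.empty(query.shape, ...) is always contiguous; passing memory_format=torch.contiguous_format to empty_like would be another way to get a dense buffer.

```python
import torch

query = torch.randn(1, 2, 4, 128).transpose(1, 2)   # non-contiguous, as in the detail log

a = torch.empty_like(query)                          # preserves the permuted strides
b = torch.empty(query.shape, device=query.device, dtype=query.dtype)
c = torch.empty_like(query, memory_format=torch.contiguous_format)

print(a.is_contiguous(), b.is_contiguous(), c.is_contiguous())
# False True True
```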

