
optimize 3264x3072x1536xNxN #2558

Closed
jfactory07 wants to merge 3 commits into users/common/cvs_dev from users/jzhou/3264x3072x1536xNxN-2


jfactory07 commented Nov 10, 2025

Motivation

Optimize performance for the 3264x3072x1536xNxN problem size.

Technical Details

Change MT to 192x256x64 so that this size can leverage the current CMS.
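
A quick arithmetic check of how this tile shape fits the problem. This is only a sketch under assumptions not stated in the PR: that the size reads as M=3264, N=3072, K=1536, and that MT 192x256x64 means a 192x256 output tile with a K step of 64. The helper names `tile_grid` and `padding_overhead` are hypothetical and not part of the library.

```python
def tile_grid(m, n, k, mt_m, mt_n, mt_k):
    """Return (tiles along M, tiles along N, K-loop iterations), rounding up."""
    def ceil_div(a, b):
        return -(-a // b)
    return ceil_div(m, mt_m), ceil_div(n, mt_n), ceil_div(k, mt_k)


def padding_overhead(m, n, mt_m, mt_n):
    """Fraction of extra output elements covered by tiles beyond the M x N problem."""
    tiles_m, tiles_n, _ = tile_grid(m, n, 1, mt_m, mt_n, 1)
    return (tiles_m * mt_m * tiles_n * mt_n) / (m * n) - 1.0


# Assumed reading of the problem size: M x N x K
M, N, K = 3264, 3072, 1536

print(tile_grid(M, N, K, 192, 256, 64))   # (17, 12, 24)
print(padding_overhead(M, N, 192, 256))   # 0.0
```

Under these assumptions the tile divides all three dimensions evenly (17 x 12 output tiles, 24 K iterations), so no tile is partially filled. For comparison, a hypothetical 256x256 tile would need 13 tiles along M with the last one partly empty. Whether this even fit contributes to the measured uplift is not stated in the PR.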

Test Result

Tested on 3264x3072x1536xNxN and measured a 15% uplift.

Submission Checklist

jfactory07 added the "gfx950" (run CI on gfx950) label Nov 10, 2025
jfactory07 marked this pull request as ready for review November 11, 2025 01:16
jfactory07 requested a review from a team as a code owner November 11, 2025 01:16
jfactory07 requested a review from b-shi November 11, 2025 01:16
jfactory07 mentioned this pull request Nov 11, 2025
1 task
jfactory07 closed this Nov 11, 2025
jfactory07 deleted the users/jzhou/3264x3072x1536xNxN-2 branch November 11, 2025 02:49
ammallya pushed a commit that referenced this pull request Feb 3, 2026
ammallya pushed a commit that referenced this pull request Feb 3, 2026
…" (#2584)

This reverts commit a111f65.

[ROCm/composable_kernel commit: b80099c]
