optimize 3264x3072x1536xNxN #2521

Closed
jfactory07 wants to merge 3 commits into develop from users/jzhou/3264x3072x1536xNxN
@jfactory07
Contributor

Motivation

Optimize performance for the 3264x3072x1536xNxN problem size.

Technical Details

Change the macro tile (MT) to 192x256x64.
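
As a rough sanity check on why this tile might suit this size, the sketch below counts how many macro tiles cover the output. It assumes the size reads as M x N x K = 3264 x 3072 x 1536 and the macro tile as MT0 x MT1 x DepthU = 192 x 256 x 64; neither reading is stated in the PR, and `tile_grid` is a hypothetical helper, not part of Tensile.

```python
def tile_grid(m, n, k, mt0, mt1, depth_u):
    """Count macro tiles and K-loop steps for one GEMM problem.

    Assumption: the problem is M x N x K and the macro tile is
    MT0 x MT1 x DepthU. This is an illustrative helper only.
    """
    tiles_m = -(-m // mt0)      # ceiling division along M
    tiles_n = -(-n // mt1)      # ceiling division along N
    k_steps = -(-k // depth_u)  # ceiling division along K
    # Fraction of the padded tile grid that maps to real output elements.
    coverage = (m * n) / (tiles_m * mt0 * tiles_n * mt1)
    return {
        "tiles_m": tiles_m,
        "tiles_n": tiles_n,
        "tiles": tiles_m * tiles_n,
        "k_steps": k_steps,
        "coverage": coverage,
    }


result = tile_grid(3264, 3072, 1536, 192, 256, 64)
print(result)
```

Under these assumptions the tile divides the problem evenly (17 x 12 = 204 tiles, 24 steps along K), so no tile is partially filled. Whether that is the reason for the change is not stated in the PR.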

Submission Checklist

@math-ci

math-ci Bot commented Nov 7, 2025

perfci run on commit 67dae2f

math-ci run

jsandham pushed a commit that referenced this pull request Nov 7, 2025
… monorepo. (#2521)

Signed-off-by: MaheshRavishankar <mravisha@amd.com>
@msujon-AMD
Collaborator

@jfactory07 , pls close the PR and submit it again to users/common/cvs_dev. Pls add tests in Tensile/Tests/common/gemm/gfx950/custom_mainloop_scheduling.yaml in the branch.

@jfactory07 jfactory07 closed this Nov 10, 2025
@msujon-AMD
Collaborator

> @jfactory07 , pls close the PR and submit it again to users/common/cvs_dev. Pls add tests in Tensile/Tests/common/gemm/gfx950/custom_mainloop_scheduling.yaml in the branch.

Sorry, I misunderstood. This PR uses an existing CMS kernel to optimize for this size. I am adding Babak to review it.

@msujon-AMD msujon-AMD reopened this Nov 10, 2025
@msujon-AMD msujon-AMD requested a review from babakpst November 10, 2025 15:01
@math-ci

math-ci Bot commented Nov 10, 2025

perfci run on commit 0b26925

math-ci run

@jfactory07
Contributor Author

jfactory07 commented Nov 11, 2025

replaced by #2596

@jfactory07 jfactory07 closed this Nov 11, 2025
@jfactory07 jfactory07 deleted the users/jzhou/3264x3072x1536xNxN branch December 9, 2025 07:41
ammallya pushed a commit that referenced this pull request Feb 3, 2026
[ROCm/composable_kernel commit: 7fc000d]