[CUTLASS] Add GeMM kernels for Blackwell GPUs #18033

MasterJH5574 · 2025-06-02T14:46:19Z

This PR introduces CUTLASS gemm kernels, groupwise-scaled gemm kernels, and group gemm kernels for Blackwell GPUs.

Files are reorganized a bit so that the exposed global functions are now architecture-agnostic. Prior to this PR, our global function names for CUTLASS kernels usually ended with "_sm90", which adds complexity when the frontend compiler has to dispatch kernels across multiple supported architectures, such as Hopper and Blackwell.

Therefore, this PR renames those global functions so that their names are architecture-agnostic. At build time, only the kernels that the target architecture supports are built.
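
To illustrate why the rename simplifies dispatch, here is a toy sketch in plain Python. It is not the TVM or CUTLASS API: the `toy.gemm` names, the registry dictionary, and the helper functions are all hypothetical, and "sm100" is assumed here as the label for Blackwell. It only models the idea that one build registers the kernels of one architecture, under either a suffixed or an architecture-agnostic name.

```python
# Toy illustration only: none of these names are real TVM or CUTLASS APIs.
from typing import Callable, Dict

Registry = Dict[str, Callable[[], str]]


def build_registry(target_arch: str, arch_suffixed: bool) -> Registry:
    """Simulate the global functions exposed by a build for one architecture."""
    kernels = {
        "sm90": lambda: "gemm kernel compiled for Hopper",
        "sm100": lambda: "gemm kernel compiled for Blackwell",  # assumed label
    }
    impl = kernels[target_arch]  # only the target architecture's kernel is built
    name = f"toy.gemm_{target_arch}" if arch_suffixed else "toy.gemm"
    return {name: impl}


def frontend_lookup_suffixed(registry: Registry) -> Callable[[], str]:
    """Old scheme: the caller probes one name per supported architecture."""
    for arch in ("sm90", "sm100"):
        func = registry.get(f"toy.gemm_{arch}")
        if func is not None:
            return func
    raise KeyError("no gemm kernel registered")


def frontend_lookup_agnostic(registry: Registry) -> Callable[[], str]:
    """New scheme: a single name, whichever architecture the build targeted."""
    return registry["toy.gemm"]
```

With the suffixed scheme, every newly supported architecture adds another name the caller has to know about; with the agnostic scheme, the lookup stays the same and the build decides which kernel sits behind the name.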

MasterJH5574 · 2025-06-03T00:53:45Z

@tvm-bot rerun

MasterJH5574 · 2025-06-03T13:50:30Z

@tvm-bot rerun

MasterJH5574 · 2025-06-03T20:44:19Z

@tvm-bot rerun

tqchen approved these changes Jun 5, 2025

MasterJH5574 force-pushed the tvm-dev/2025-06-02-cutlass-blackwell branch 2 times, most recently from af75beb to 4dd1743 Compare June 6, 2025 03:43

MasterJH5574 force-pushed the tvm-dev/2025-06-02-cutlass-blackwell branch from 4dd1743 to 5f8598e Compare June 6, 2025 03:52

tqchen merged commit fd9c091 into apache:main Jun 6, 2025
12 checks passed

MasterJH5574 added a commit to MasterJH5574/tvm that referenced this pull request Jun 16, 2025

[CUTLASS] Fix CUTLASS kernel build on Hopper

66edd74

The cutlass kernel build on Hopper GPU was broken since apache#18033. This PR fixes the issue.

MasterJH5574 mentioned this pull request Jun 16, 2025

[CUTLASS] Fix CUTLASS kernel build on Hopper #18064

Merged

tqchen pushed a commit that referenced this pull request Jun 17, 2025

[CUTLASS] Fix CUTLASS kernel build on Hopper (#18064)

6c540e0

The cutlass kernel build on Hopper GPU was broken since #18033. This PR fixes the issue.

ysh329 mentioned this pull request Jul 16, 2025

[Release] v0.21.0 Release Candidate Notes #18150

Closed

ShiboXing pushed a commit to ShiboXing/tvm that referenced this pull request Aug 10, 2025

[CUTLASS] Fix CUTLASS kernel build on Hopper (apache#18064)

13eb0d4

The cutlass kernel build on Hopper GPU was broken since apache#18033. This PR fixes the issue.
