@vinx13 vinx13 commented Mar 20, 2024

This adds sm90a group GEMM kernels from CUTLASS. It includes the CMake improvements from #16638.

Supersedes #16638.

Co-authored-by: Chris Sullivan [email protected]
Co-authored-by: masahi [email protected]

cc @csullivan @masahi @tqchen

csullivan and others added 9 commits March 19, 2024 21:46
… versions.

 * Each cutlass-based submodule library now uses its own cutlass submodule dependency
 * TVM's cutlass submodule is decoupled from others and is bumped to
 v3.4.1 for H100 support
 * Add scaffold for new cutlass fp8 dequant gemm interface targeting
 TVM's cutlass submodule
@github-actions github-actions bot requested review from csullivan, masahi and tqchen March 20, 2024 00:26
@tqchen tqchen merged commit 89e9028 into apache:main Mar 20, 2024
thaisacs pushed a commit to thaisacs/tvm that referenced this pull request Apr 3, 2024
* [CMAKE][CUTLASS] Improve dependency management with different cutlass versions.
 * Each cutlass-based submodule library now uses its own cutlass submodule dependency
 * TVM's cutlass submodule is decoupled from others and is bumped to
 v3.4.1 for H100 support
 * Add scaffold for new cutlass fp8 dequant gemm interface targeting
 TVM's cutlass submodule

* Remove handling for moe_gemm.cc and flash_decoding.cu which are no longer used upstream.

* Add cutlass fp8 group gemm

* Add fp16 grouped gemm support for sm90

* [Cutlass] Support alpha scaling in fp8 group gemm

* [Cutlass] Support device alpha_ptr for fp8 group gemm
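
For readers unfamiliar with the term, a grouped GEMM with alpha scaling can be sketched in plain Python as below. This is only a reference for the semantics, not the kernel added in this PR. The function name `group_gemm_reference`, the row-partitioned layout (rows of `x` split by cumulative offsets in `indptr`, one `N x K` weight matrix per group), and the host-side `alpha` float are all assumptions made for illustration; the actual kernels run on sm90a and can read alpha from a device pointer.

```python
def group_gemm_reference(x, weights, indptr, alpha=1.0):
    """Reference semantics for a grouped GEMM (illustrative only).

    x:       list of M rows, each of length K
    weights: list of G matrices, each N x K (one per group; assumed layout)
    indptr:  list of G cumulative row-end offsets into x (assumed layout)
    alpha:   scalar applied to every output element
    Returns an M x N list of lists.
    """
    out = []
    start = 0
    for group, end in enumerate(indptr):
        w = weights[group]
        # Rows start..end-1 of x belong to this group and use its weight matrix.
        for row in x[start:end]:
            out.append(
                [alpha * sum(a * b for a, b in zip(row, w_row)) for w_row in w]
            )
        start = end
    return out


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
weights = [
    [[1.0, 0.0], [0.0, 1.0]],   # group 0: identity
    [[1.0, 1.0], [1.0, -1.0]],  # group 1
]
indptr = [2, 3]  # rows 0-1 go to group 0, row 2 goes to group 1
result = group_gemm_reference(x, weights, indptr, alpha=0.5)
print(result)  # [[0.5, 1.0], [1.5, 2.0], [5.5, -0.5]]
```

Each group is an independent matrix multiply with its own row count, which is why a single grouped launch is attractive for workloads where per-group sizes vary.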



---------

Co-authored-by: Chris Sullivan <[email protected]>
Co-authored-by: masahi <[email protected]>