Xiashangning
Found some bugs and typos in the source code while running the tests.

algo=wmma_implicit_gemm failed on an A100 with PyTorch 2.4.1 and CUDA 12.1.

algo=cutlass_implicit_gemm returned -1 during benchmarking.

Therefore, only the explicit and implicit algorithms work on my side.

PS: the test code seems to be out of date. Could you please have a look and maybe update it according to the latest source code?

@chrischoy
Collaborator

WMMA is not supported for fp32/fp64. CUTLASS does not support them either, but I convert the inputs to fp16/bf16 inside the kernel. Also, the CUTLASS kernel can only run with channel counts that are multiples of 16, or with configurations the CUTLASS engine allows, so it is more restrictive.
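To illustrate the restrictions described above, here is a minimal sketch of how such an algorithm choice could be made. `pick_conv_algo` is a hypothetical helper for illustration only, not the library's actual API; it assumes tensor-core (wmma) paths require half-precision inputs and that the CUTLASS path needs channel counts divisible by 16.

```python
def pick_conv_algo(dtype: str, in_channels: int, out_channels: int) -> str:
    """Illustrative algorithm selector (hypothetical helper, not the
    library's real API), reflecting the restrictions described above."""
    # Tensor-core (wmma) kernels operate on half-precision inputs, so
    # fp32/fp64 tensors cannot use them directly.
    if dtype in ("fp16", "bf16"):
        return "wmma_implicit_gemm"
    # The CUTLASS path converts fp32 to fp16/bf16 inside the kernel, but
    # only supports channel counts that are multiples of 16 (a tile-size
    # constraint of the engine), so it is more restrictive.
    if in_channels % 16 == 0 and out_channels % 16 == 0:
        return "cutlass_implicit_gemm"
    # Otherwise fall back to the explicit/implicit GEMM implementations.
    return "implicit_gemm"
```

With this sketch, an fp32 convolution with 30 input channels would fall through to the plain implicit GEMM path, matching the failures reported above.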
