From 1cb0438f7ab3b6b28a7a64c51630723dc22cd45d Mon Sep 17 00:00:00 2001
From: AviralGoelAMD
Date: Wed, 15 Oct 2025 17:44:29 +0000
Subject: [PATCH 1/2] docs: add quant mode comparison to readme

---
 example/ck_tile/38_block_scale_gemm/README.md | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/example/ck_tile/38_block_scale_gemm/README.md b/example/ck_tile/38_block_scale_gemm/README.md
index b7b14f9d132..67857ce7eb2 100644
--- a/example/ck_tile/38_block_scale_gemm/README.md
+++ b/example/ck_tile/38_block_scale_gemm/README.md
@@ -7,6 +7,15 @@ This folder contains examples of quant GEMMs using the ck_tile tile-programming
 - Row and Column-wise scaled: All of the rowwise elements in A Matrix and columwise elements in B Matrix will share the same quantization element and the elementwisde operation will complete in epilogue.
 - Tensor-wise scaled: Share the same scalar scale across the whole tensor of A or B
 
+## Quantization Mode Comparison
+
+| Quant Mode | A Matrix Organization | A Scale Shape | B Matrix Organization | B Scale Shape |
+|------------|----------------------|---------------|----------------------|---------------|
+| **AQuant** | Blocks along K dimension<br>Each M×GroupSize block shares one scale | `[M, K/GroupSize]` | Not quantized | N/A |
+| **BQuant** | Not quantized | N/A | Blocks along K dimension<br>Each GroupSize×N block shares one scale | `[K/GroupSize, N]` |
+| **RowColQuant** | Per-row quantization<br>All K elements in each row share one scale | `[M, 1]` | Per-column quantization<br>All K elements in each column share one scale | `[1, N]` |
+| **TensorQuant** | Tensor-wise quantization<br>All M×K elements share one scale | `[1]` | Tensor-wise quantization<br>All K×N elements share one scale | `[1]` |
+
 ---
 
 ## Features

From 4181b48768cd38744e90e3ea230184b79675f3c2 Mon Sep 17 00:00:00 2001
From: Aviral Goel
Date: Wed, 15 Oct 2025 18:38:42 -0400
Subject: [PATCH 2/2] Update example/ck_tile/38_block_scale_gemm/README.md

Co-authored-by: Christopher Millette <63608002+cgmillette@users.noreply.github.com>
---
 example/ck_tile/38_block_scale_gemm/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/example/ck_tile/38_block_scale_gemm/README.md b/example/ck_tile/38_block_scale_gemm/README.md
index 67857ce7eb2..496697ca323 100644
--- a/example/ck_tile/38_block_scale_gemm/README.md
+++ b/example/ck_tile/38_block_scale_gemm/README.md
@@ -4,7 +4,7 @@ This folder contains examples of quant GEMMs using the ck_tile tile-programming
 
 - AQuant kernel with blocks of A matrix sharing scales: custom GEMM pipeline
 - BQuant kernel with blocks of B matrix sharing scales: custom GEMM pipeline
-- Row and Column-wise scaled: All of the rowwise elements in A Matrix and columwise elements in B Matrix will share the same quantization element and the elementwisde operation will complete in epilogue.
+- Row and Column-wise scaled: All of the row-wise elements in A Matrix and column-wise elements in B Matrix will share the same quantization element and the element-wise operation will complete in epilogue.
 - Tensor-wise scaled: Share the same scalar scale across the whole tensor of A or B
 
 ## Quantization Mode Comparison
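
Reviewer note: the scale shapes listed in the comparison table added by patch 1/2 can be sanity-checked with a small sketch. The helper below is hypothetical and is not part of ck_tile. It only mirrors the "A Scale Shape" and "B Scale Shape" columns, and it assumes K is an exact multiple of GroupSize, which the README does not state.

```python
def scale_shapes(mode, m, n, k, group_size=None):
    """Return (a_scale_shape, b_scale_shape) as listed in the README table.

    Hypothetical helper for illustration only; None means "not quantized".
    """
    if mode in ("AQuant", "BQuant"):
        # Assumption: K divides evenly into groups of group_size elements.
        if not group_size or k % group_size != 0:
            raise ValueError("k must be a positive multiple of group_size")
        groups = k // group_size
        if mode == "AQuant":
            return (m, groups), None   # A scales: [M, K/GroupSize]
        return None, (groups, n)       # B scales: [K/GroupSize, N]
    if mode == "RowColQuant":
        return (m, 1), (1, n)          # one scale per A row, one per B column
    if mode == "TensorQuant":
        return (1,), (1,)              # a single scalar scale per tensor
    raise ValueError(f"unknown quant mode: {mode}")


if __name__ == "__main__":
    for mode in ("AQuant", "BQuant", "RowColQuant", "TensorQuant"):
        print(mode, scale_shapes(mode, m=4, n=8, k=256, group_size=128))
```

With M=4, N=8, K=256 and GroupSize=128, the block-scaled modes carry two scale entries along K per row of A (AQuant) or per column of B (BQuant), while the other two modes do not depend on GroupSize at all.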