[ROCM] Optimized deepseek-r1 fp8 model with + triton_gemm_a8w8 + batch_gemm_a8w8 + fused set_mla_kv_buffer kernel #13617

Merged
HaiShaw merged 6 commits into sgl-project:main from yctseng0211:triton_gemm on Nov 20, 2025

Commits

Commits on Nov 19, 2025

Commits on Nov 20, 2025