[PA] Optimize PA Decode Gluon Performance for BF16/FP16 with KV_BLOCK_SIZE=64 and Fix ROCm 7.0 AOT Compilation #1691

Merged
coderfeli merged 4 commits into main from pa_gluon_opt_bf16 on Dec 23, 2025

Commits

Commits on Dec 19, 2025

Commits on Dec 22, 2025