[Bugfix] Zero-init MLA attention output buffers to prevent NaN from CUDA graph padding #37442

Merged
tlrmchlsmth merged 4 commits into vllm-project:main from elvircrn:zero-init-attn-output-pr on Mar 19, 2026

Commits

Commits on Mar 18, 2026