[Bugfix] Fix KV cache overestimation for hybrid Mamba/attention model…#37124

Closed
swtb3 wants to merge 7 commits into vllm-project:main from swtb3:fix/hybrid-mamba-kv-cache-reporting

Commits