[Bugfix] Fix DeepSeek V3.2 OOM during CG memory profiling by MatthewBonanni · Pull Request #36691 · vllm-project/vllm

MatthewBonanni · 2026-03-10T18:03:33Z

Purpose

The CUDA graph memory profiler added in #30515 did not account for UniformTypeKVCacheSpecs in init_minimal_kv_cache_for_profiling, so page_size was incorrectly multiplied by group_size, producing an allocation that was 61x too large. This PR fixes the calculation and uses the existing num_blocks override mechanism instead of spoofing the available memory, which should be more robust.
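The over-allocation can be illustrated with a minimal, self-contained sketch. It is not the vLLM code: `LayerSpec`, `UniformGroupSpec`, and both sizing functions are hypothetical names. It assumes that a uniform-type spec already reports a page size aggregated over every layer in its group, and it takes the group size of 61 from the 61x figure above.

```python
from dataclasses import dataclass


@dataclass
class LayerSpec:
    # Hypothetical per-layer spec: bytes needed for one KV cache page.
    page_size_bytes: int


@dataclass
class UniformGroupSpec:
    # Hypothetical stand-in for a spec that wraps all layers of one group.
    layers: list

    @property
    def page_size_bytes(self) -> int:
        # Assumption: the group-level page size already sums over its layers.
        return sum(layer.page_size_bytes for layer in self.layers)


def buggy_profiling_bytes(spec: UniformGroupSpec, num_blocks: int) -> int:
    # Multiplies by the group size again, double-counting the layers.
    group_size = len(spec.layers)
    return spec.page_size_bytes * group_size * num_blocks


def fixed_profiling_bytes(spec: UniformGroupSpec, num_blocks: int) -> int:
    # Uses the aggregated page size as-is.
    return spec.page_size_bytes * num_blocks


group = UniformGroupSpec([LayerSpec(page_size_bytes=4096) for _ in range(61)])
buggy = buggy_profiling_bytes(group, num_blocks=1)
fixed = fixed_profiling_bytes(group, num_blocks=1)
print(buggy // fixed)  # over-allocation factor equals the group size
```

Under these assumptions the over-allocation factor is exactly the number of layers in the group, which is consistent with the 61x figure reported here.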

Test Plan

vllm serve deepseek-ai/DeepSeek-V3.2 -tp 8

Test Result

main: OOM during startup
PR: starts up successfully

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>

robertgshaw2-redhat · 2026-03-10T18:08:05Z

do we need a bugfix in 0.17 for this?

MatthewBonanni · 2026-03-10T18:13:04Z

@robertgshaw2-redhat no, it was introduced by #30515, which isn't in 0.17. Updated the PR description to clarify

gemini-code-assist

Code Review

This pull request addresses an out-of-memory issue during CUDA graph memory profiling for DeepSeek V3.2. The fix correctly initializes the minimal KV cache by using the num_gpu_blocks_override mechanism, which is a more robust approach than the previous memory calculation that was incorrect for UniformTypeKVCacheSpecs. The change is sound, but I've suggested an improvement to ensure the configuration is always restored to its original state, even in the case of an exception, by using a try...finally block.
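The try...finally suggestion can be sketched as a small context manager. This is a generic illustration rather than the change that was merged: `temporary_override` is a hypothetical helper, and the `SimpleNamespace` object only stands in for whatever configuration object carries `num_gpu_blocks_override`.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporary_override(config, name, value):
    """Set config.<name> to value, restoring the original on exit."""
    original = getattr(config, name)
    setattr(config, name, value)
    try:
        yield config
    finally:
        # Runs even if the body raises, so the config never stays modified.
        setattr(config, name, original)


# Hypothetical stand-in for a cache config with a block-count override.
cache_config = SimpleNamespace(num_gpu_blocks_override=None)

try:
    with temporary_override(cache_config, "num_gpu_blocks_override", 1):
        inside = cache_config.num_gpu_blocks_override
        raise RuntimeError("simulated failure during profiling")
except RuntimeError:
    pass

restored = cache_config.num_gpu_blocks_override
```

The point of the pattern is that a failure during profiling cannot leave the minimal block count in place for the real KV cache initialization that follows.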

Note: Security Review did not run due to the size of the PR.

vllm/v1/worker/gpu_model_runner.py

njhill

Thanks @MatthewBonanni, maybe just add a short comment?

Fix

8161015

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>

MatthewBonanni requested a review from njhill as a code owner March 10, 2026 18:03

MatthewBonanni added the ready label Mar 10, 2026

mergify bot added the deepseek, v1, and bug labels Mar 10, 2026

gemini-code-assist bot reviewed Mar 10, 2026

vllm/v1/worker/gpu_model_runner.py (review thread resolved)

njhill approved these changes Mar 10, 2026

Add comment

0af8be3

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>

MatthewBonanni enabled auto-merge (squash) March 11, 2026 02:42

MatthewBonanni merged commit 8ab3d74 into vllm-project:main Mar 11, 2026
56 checks passed

huydhn mentioned this pull request Mar 17, 2026

[release/2.11] PyTorch 2.11 x vLLM benchmark OOM deepseek-ai/deepseek-v3.2 pytorch/pytorch#177426

Closed

wendyliu235 pushed a commit to wendyliu235/vllm-public that referenced this pull request Mar 18, 2026

[Bugfix] Fix DeepSeek V3.2 OOM during CG memory profiling (vllm-proje…

cecefe8

…ct#36691) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>

fxdawnn pushed a commit to fxdawnn/vllm that referenced this pull request Mar 19, 2026

[Bugfix] Fix DeepSeek V3.2 OOM during CG memory profiling (vllm-proje…

62a9a9d

…ct#36691) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>

MatthewBonanni merged 2 commits into vllm-project:main from MatthewBonanni:fix_dsv32_oom
