[Utils] Refactor device cache emptying #24861
Merged
Conversation
ByronHsu approved these changes on May 9, 2026.
Collaborator: /tag-and-rerun-ci
Collaborator: /tag-and-rerun-ci
ByronHsu pushed a commit to ByronHsu/sglang that referenced this pull request on May 10, 2026:
Replace direct torch.cuda.empty_cache() / memory_reserved() calls in continue_generation with the empty_device_cache() helper from sgl-project#24861, making the in-place pause resume path work on all device backends. Co-authored-by: Cursor <cursoragent@cursor.com>
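The call-site change that commit describes can be sketched as follows. Both the helper body and the `continue_generation` signature here are illustrative stand-ins, not the actual SGLang code:

```python
import torch


def empty_device_cache() -> None:
    """Hypothetical stand-in for the empty_device_cache() helper the commit
    refers to; the real implementation in sgl-project#24861 may differ."""
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # release unused cached allocator blocks


def continue_generation() -> None:
    # Illustrative call site: the in-place pause/resume path calls the shared
    # helper instead of hard-coding torch.cuda.empty_cache(), so the same
    # code works on non-CUDA device backends.
    empty_device_cache()
```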
Motivation
SGLang has several paths that empty the PyTorch device allocator cache while also using `flush_cache` to clear internal memory pools such as the KV cache and Mamba cache. Some scheduler paths still hard-code `torch.cuda.empty_cache()`, which makes the allocator-emptying behavior CUDA-specific even though SGLang supports other device backends. This PR keeps the existing external API behavior while making the internal distinction clearer:

- `flush_cache` clears SGLang memory pools.
- `empty_device_cache` only releases unused cached blocks from the active device allocator.

Modifications

- Add `empty_device_cache()` as a small common helper for backend allocator cache emptying.
- Use it in `flush_cache` and in idle periodic cache emptying, instead of directly calling `torch.cuda.empty_cache()`.
- Use it in `get_available_gpu_memory` for the CUDA, XPU, NPU, and MUSA empty-cache paths.
- Use it in `flush_cache_after_weight_update`.
- Clarify the `flush_cache` docstring around memory pools such as the KV cache and Mamba cache.

Accuracy Tests
Not applicable. This PR does not change model forward behavior or numerical outputs.
Speed Tests and Profiling
Not applicable. This is a small cache-management refactor and preserves existing defaults.
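As a rough illustration of the modification list above, a backend-agnostic allocator-cache helper might look like the sketch below. This is a guess at the helper's shape, not the code merged in this PR; the `hasattr` guards are an assumption about how out-of-tree backends (NPU, MUSA) register themselves:

```python
import torch


def empty_device_cache() -> None:
    """Sketch of backend-agnostic allocator cache emptying.

    Assumed dispatch order is illustrative only. Out-of-tree packages such
    as torch_npu and torch_musa attach extra modules onto torch, so hasattr
    guards let CPU-only builds fall through to a no-op.
    """
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    elif hasattr(torch, "xpu") and torch.xpu.is_available():
        torch.xpu.empty_cache()
    elif hasattr(torch, "npu") and torch.npu.is_available():
        torch.npu.empty_cache()
    elif hasattr(torch, "musa") and torch.musa.is_available():
        torch.musa.empty_cache()
    # CPU-only builds: no device allocator cache to release.
```

Note the deliberate asymmetry with `flush_cache`: a helper like this never touches SGLang's own memory pools (KV cache, Mamba cache); it only returns unused cached blocks to the device driver.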
Validation
```
python3 -m py_compile \
  python/sglang/srt/utils/common.py \
  python/sglang/srt/managers/scheduler.py \
  python/sglang/srt/managers/scheduler_update_weights_mixin.py \
  python/sglang/srt/managers/io_struct.py
git diff --check
```

Note: local pytest with `uv --directory python --extra test` could not run on macOS arm64 because `sgl-deep-gemm==0.0.1` has no wheel for this platform.