[Core] Optimize expensive deepcopy in GPU model runner by GOavi101 · Pull Request #31723 · vllm-project/vllm

GOavi101 · 2026-01-05T13:04:24Z

Summary

Replace expensive deepcopy() with selective shallow copying for scheduler_output when using async scheduling with speculative decoding. This optimization addresses the TODO comment at line 3108.

Performance Improvement

- 13-37x faster copy operations (measured in unit tests)
- Scales better with more requests (the performance gap widens with larger workloads)
- Reduces memory usage by ~90% (only copies 2 dicts instead of the entire object graph)

Changes

Replaced deepcopy(scheduler_output) with selective copy:
- Shallow copy the SchedulerOutput dataclass
- Only deep copy the 2 dict fields that get modified:
  - num_scheduled_tokens (modified via dict[key] -= value)
  - scheduled_spec_decode_tokens (modified via dict.pop())
- Share read-only fields (memory efficient and safe)
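
The selective copy described in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual diff: `FakeSchedulerOutput`, `read_only_payload`, and `selective_copy` are hypothetical names, and only the two dict field names are taken from the description.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class FakeSchedulerOutput:
    """Hypothetical stand-in for SchedulerOutput (not the real vLLM class)."""

    num_scheduled_tokens: dict[str, int]
    scheduled_spec_decode_tokens: dict[str, list[int]]
    # Assumed placeholder for the fields that are only read after the copy.
    read_only_payload: list[int] = field(default_factory=list)


def selective_copy(out: FakeSchedulerOutput) -> FakeSchedulerOutput:
    """Shallow-copy the dataclass, then deep-copy only the two mutated dicts."""
    new = copy.copy(out)  # new object, every field still shared with `out`
    new.num_scheduled_tokens = copy.deepcopy(out.num_scheduled_tokens)
    new.scheduled_spec_decode_tokens = copy.deepcopy(
        out.scheduled_spec_decode_tokens
    )
    return new


original = FakeSchedulerOutput(
    num_scheduled_tokens={"req-0": 4},
    scheduled_spec_decode_tokens={"req-0": [11, 12, 13]},
    read_only_payload=[1, 2, 3],
)
copied = selective_copy(original)

# The two kinds of mutation named in the description touch only the copy.
copied.num_scheduled_tokens["req-0"] -= 3
copied.scheduled_spec_decode_tokens.pop("req-0")
```

After these mutations the original's two dicts are unchanged, while `read_only_payload` is still the same list object in both instances. That sharing is only safe as long as nothing writes to the shared fields after the copy.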

Testing

- Verified correctness with unit tests comparing the optimized copy against deepcopy
- Performance benchmarks confirm a significant speedup
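
A check of this kind might look like the sketch below. It is an assumption about the shape of such a test, not the PR's actual test code: `FakeSchedulerOutput`, `make_output`, `check_equivalent`, and `time_copies` are all hypothetical, and the timings it produces depend on the machine and are not meant to reproduce the figures quoted above.

```python
import copy
import timeit
from dataclasses import dataclass, field


@dataclass
class FakeSchedulerOutput:
    """Hypothetical stand-in for SchedulerOutput (not the real vLLM class)."""

    num_scheduled_tokens: dict[str, int]
    scheduled_spec_decode_tokens: dict[str, list[int]]
    read_only_payload: list[list[int]] = field(default_factory=list)


def selective_copy(out: FakeSchedulerOutput) -> FakeSchedulerOutput:
    """Shallow-copy the dataclass, then deep-copy only the two mutated dicts."""
    new = copy.copy(out)
    new.num_scheduled_tokens = copy.deepcopy(out.num_scheduled_tokens)
    new.scheduled_spec_decode_tokens = copy.deepcopy(
        out.scheduled_spec_decode_tokens
    )
    return new


def make_output(num_requests: int) -> FakeSchedulerOutput:
    """Build a synthetic output with `num_requests` entries per field."""
    ids = [f"req-{i}" for i in range(num_requests)]
    return FakeSchedulerOutput(
        num_scheduled_tokens={rid: 4 for rid in ids},
        scheduled_spec_decode_tokens={rid: [1, 2, 3] for rid in ids},
        read_only_payload=[list(range(32)) for _ in ids],
    )


def check_equivalent(num_requests: int = 64) -> bool:
    """Both strategies should yield field-by-field equal objects."""
    out = make_output(num_requests)
    return selective_copy(out) == copy.deepcopy(out)


def time_copies(num_requests: int = 64, repeats: int = 200) -> tuple[float, float]:
    """Return total seconds for (deepcopy, selective copy) over `repeats` runs."""
    out = make_output(num_requests)
    deep = timeit.timeit(lambda: copy.deepcopy(out), number=repeats)
    selective = timeit.timeit(lambda: selective_copy(out), number=repeats)
    return deep, selective
```

Calling `check_equivalent()` compares the two copies with the dataclass's generated `__eq__`, and `time_copies()` returns raw timings that can be compared across request counts.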

Code Review

This pull request introduces a significant performance optimization by replacing an expensive deepcopy of the scheduler_output object with a more efficient selective copy. This change is applied when using asynchronous scheduling with speculative decoding. The implementation correctly identifies the mutable fields that are modified and creates shallow copies of them, which is sufficient to prevent side effects while being much faster than a full deep copy. This is a well-executed optimization that should deliver the performance and memory benefits described.

mergify · 2026-01-05T15:26:16Z

Hi @GOavi101, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?

mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

GOavi101 · 2026-01-05T16:46:05Z

Hello @njhill, could you please take a look and review it?

mergify · 2026-01-05T17:50:44Z

Hi @GOavi101, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.


Replace expensive deepcopy() with selective shallow copying for scheduler_output when using async scheduling with speculative decoding.

The optimization:
- Shallow copies the SchedulerOutput dataclass
- Only deep copies the 2 dict fields that get modified:
  * num_scheduled_tokens (modified via dict[key] -= value)
  * scheduled_spec_decode_tokens (modified via dict.pop())
- Shares read-only fields (memory efficient and safe)

Fixes the TODO comment at line 3108.

Signed-off-by: GOavi101 <avishek.official12@gmail.com>

njhill · 2026-01-06T16:47:06Z

Thanks for this, @GOavi101. We're actually already simplifying this as part of #29821, so the code in question is disappearing anyhow.

mergify bot added the v1 label Jan 5, 2026

gemini-code-assist bot reviewed Jan 5, 2026

View reviewed changes

GOavi101 force-pushed the optimize-deepcopy-scheduler-output branch from 99d2968 to f96e8b0 Compare January 5, 2026 16:42

GOavi101 force-pushed the optimize-deepcopy-scheduler-output branch from f96e8b0 to b15c224 Compare January 5, 2026 18:20

GOavi101 closed this Jan 6, 2026

GOavi101 deleted the optimize-deepcopy-scheduler-output branch January 6, 2026 17:04

GOavi101 wants to merge 1 commit into vllm-project:main from GOavi101:optimize-deepcopy-scheduler-output
