[Metrics] Refactor LoRA state tracking #26801

markmc · 2025-10-14T12:02:33Z

LoRARequestStates serves a simple purpose: keeping track, on a per-LoRA basis, of which requests are running and which are waiting.

The current implementation is somewhat sloppy and over-complicated:

- LoRAStats is never freed if a LoRA becomes unused or unloaded (reported by @WoosukKwon)
- We leak if log_stats is disabled and LoRA is enabled (reported by @hidva)
- It's not clear whether finish_request() should be idempotent
- The static methods are a bit weird
- We're adding each request to the waiting set twice
- ...

Hopefully this implementation is clearer and makes it easier to verify that, e.g., there are no leaks. And now with a unit test!
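As an illustration of the properties listed above, here is a minimal sketch of per-LoRA request tracking. It is not the code from this PR: the class names `PerLoRASets` and `LoRATrackerSketch`, the method names, and the use of string request IDs and LoRA names are all assumptions made for the example. It shows one way to make finishing a request idempotent and to free the per-LoRA entry once no requests reference it.

```python
from dataclasses import dataclass, field


@dataclass
class PerLoRASets:
    """Request IDs for one LoRA adapter (hypothetical, for illustration)."""

    waiting: set[str] = field(default_factory=set)
    running: set[str] = field(default_factory=set)

    @property
    def empty(self) -> bool:
        return not (self.waiting or self.running)


class LoRATrackerSketch:
    """Hypothetical tracker; the names here are assumptions, not vLLM's API."""

    def __init__(self) -> None:
        # Maps a LoRA name to the requests currently using it.
        self.by_lora: dict[str, PerLoRASets] = {}

    def request_waiting(self, req_id: str, lora_name: str) -> None:
        self.by_lora.setdefault(lora_name, PerLoRASets()).waiting.add(req_id)

    def request_running(self, req_id: str, lora_name: str) -> None:
        sets = self.by_lora.setdefault(lora_name, PerLoRASets())
        sets.waiting.discard(req_id)
        sets.running.add(req_id)

    def request_finished(self, req_id: str, lora_name: str) -> None:
        sets = self.by_lora.get(lora_name)
        if sets is None:
            return  # Already finished or never tracked: a no-op.
        # discard() does not raise if the ID is absent, so repeats are safe.
        sets.waiting.discard(req_id)
        sets.running.discard(req_id)
        if sets.empty:
            # Free the entry so unused adapters do not accumulate.
            del self.by_lora[lora_name]
```

In this sketch a second `request_finished()` call for the same request does nothing, and the dictionary returns to empty once every request for an adapter has finished, which is the kind of invariant a unit test can assert directly.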

gemini-code-assist

Code Review

This pull request provides a solid refactoring of the LoRA state tracking logic. The changes effectively address several issues, including memory leaks when log_stats is disabled, ambiguity around method idempotency, and overly complex state management. The new implementation is much clearer, more robust, and easier to maintain. The introduction of a comprehensive unit test is a significant improvement that validates the correctness of the new logic and will help prevent future regressions. Overall, this is an excellent contribution that improves the quality and reliability of the codebase.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

vllm/v1/metrics/stats.py

jeejeelee · 2025-10-30T08:57:33Z

@njhill Could you please look at this PR?

Signed-off-by: Mark McLoughlin <[email protected]>

LoRARequestStates serves a simple purpose - keep track on a per-lora basis which requests are running, and which are waiting.

The current implementation is somewhat sloppy and over-complicated:

- LoRAStats is never freed if a LoRA becomes unused or unloaded
- We leak if log_stats is disabled and LoRA is enabled
- It's not clear whether finish_request() should be idempotent
- The static methods are a bit weird
- We're adding each request to the waiting set twice
- ...

Hopefully this implementation is more clear and easier to verify that e.g. there are no leaks.

Signed-off-by: Mark McLoughlin <[email protected]>

SchedulerStats is really the right place for this, just like the regular running/waiting counts. Make sure to call LoRARequestStates.update_scheduler_stats() even when there were no engine core outputs.

Signed-off-by: Mark McLoughlin <[email protected]>
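The point about updating the stats even when an iteration produces no outputs can be sketched as follows. This is a hedged illustration only: `SchedulerStatsSketch`, `update_stats_sketch()`, `process_iteration()`, and the field names are hypothetical stand-ins, not the fields or signatures used in vLLM. The idea shown is that the per-LoRA counts are copied into the snapshot unconditionally, after the loop over outputs, so an empty iteration still reports waiting requests.

```python
from dataclasses import dataclass, field


@dataclass
class SchedulerStatsSketch:
    """Hypothetical per-iteration snapshot (not vLLM's SchedulerStats)."""

    running_per_lora: dict[str, int] = field(default_factory=dict)
    waiting_per_lora: dict[str, int] = field(default_factory=dict)


def update_stats_sketch(
    running: dict[str, set[str]],
    waiting: dict[str, set[str]],
    stats: SchedulerStatsSketch,
) -> None:
    """Copy per-LoRA request counts into the snapshot, skipping empty sets."""
    stats.running_per_lora = {n: len(r) for n, r in running.items() if r}
    stats.waiting_per_lora = {n: len(w) for n, w in waiting.items() if w}


def process_iteration(
    outputs: list[tuple[str, str]],
    running: dict[str, set[str]],
    waiting: dict[str, set[str]],
    stats: SchedulerStatsSketch,
) -> None:
    """Assumption: each output is (req_id, lora_name) for a request that ran."""
    for req_id, lora_name in outputs:
        waiting.get(lora_name, set()).discard(req_id)
        running.setdefault(lora_name, set()).add(req_id)
    # Called outside the loop, so it also runs when `outputs` is empty.
    update_stats_sketch(running, waiting, stats)
```

If the update were placed inside the loop instead, an iteration with no outputs would leave the snapshot stale, and requests that are only waiting would never show up in the reported counts.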

jeejeelee

Let's land this PR first

markmc requested review from WoosukKwon, alexm-redhat, comaniac, njhill, robertgshaw2-redhat and ywang96 as code owners October 14, 2025 12:02

mergify bot added the v1 label Oct 14, 2025

gemini-code-assist bot reviewed Oct 14, 2025

chatgpt-codex-connector bot reviewed Oct 14, 2025

vllm/v1/metrics/stats.py (outdated, resolved)

markmc added the ready label ("ONLY add when PR is ready to merge/full CI is needed") Oct 14, 2025

DarkLight1337 requested a review from jeejeelee October 22, 2025 15:20

markmc force-pushed the lora-state-metrics branch from af4c449 to 3f1ba30 October 30, 2025 07:50

markmc added 2 commits November 6, 2025 08:53

[Metrics] Add unit test for LoRA state tracking

918babf

Signed-off-by: Mark McLoughlin <[email protected]>

markmc force-pushed the lora-state-metrics branch from 3f1ba30 to cae669c November 6, 2025 14:01

markmc force-pushed the lora-state-metrics branch from cae669c to e6aa440 November 6, 2025 14:38

jeejeelee approved these changes Nov 10, 2025

jeejeelee merged commit 6f7de33 into vllm-project:main Nov 10, 2025
47 checks passed

xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Nov 13, 2025

[Metrics] Refactor LoRA state tracking (vllm-project#26801)

ab8edac

Signed-off-by: Mark McLoughlin <[email protected]> Signed-off-by: xuebwang-amd <[email protected]>

devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025

[Metrics] Refactor LoRA state tracking (vllm-project#26801)

3c83ff1

Signed-off-by: Mark McLoughlin <[email protected]>
