@markmc markmc commented Oct 14, 2025

LoRARequestStates serves a simple purpose: keeping track, on a per-LoRA basis, of which requests are running and which are waiting.

The current implementation is somewhat sloppy and over-complicated:

  • LoRAStats is never freed if a LoRA becomes unused or unloaded (reported by @WoosukKwon)
  • We leak if log_stats is disabled and LoRA is enabled (reported by @hidva)
  • It's not clear whether 'finish_request()' should be idempotent
  • The static methods are a bit weird
  • We're adding each request to the waiting set twice
  • ...

Hopefully this implementation is clearer, and it is easier to verify that, e.g., there are no leaks. And now with a unit test!
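
To make the bookkeeping described above concrete, here is a minimal sketch of a per-LoRA tracker. It is not the code from this PR: the class name `LoRARequestTracker`, the helper `_discard`, and the methods `add_request`, `mark_running` and `counts` are hypothetical names chosen for illustration. The sketch only shows the three properties the description asks for: each request sits in at most one set, finishing a request is idempotent, and a LoRA's entry is dropped as soon as its last request goes away.

```python
def _discard(buckets: dict[str, set[str]], lora_name: str, req_id: str) -> None:
    """Remove req_id from a LoRA's set and drop the set once it is empty."""
    reqs = buckets.get(lora_name)
    if reqs is None:
        return
    reqs.discard(req_id)
    if not reqs:
        # Freeing the empty entry is what keeps unused LoRAs from leaking.
        del buckets[lora_name]


class LoRARequestTracker:
    """Hypothetical sketch; names are illustrative, not vLLM's API."""

    def __init__(self) -> None:
        self.waiting: dict[str, set[str]] = {}  # LoRA name -> waiting request ids
        self.running: dict[str, set[str]] = {}  # LoRA name -> running request ids
        self.request_lora: dict[str, str] = {}  # request id -> LoRA name

    def add_request(self, req_id: str, lora_name: str) -> None:
        # New requests start out waiting; sets make a repeated add harmless.
        self.request_lora[req_id] = lora_name
        self.waiting.setdefault(lora_name, set()).add(req_id)

    def mark_running(self, req_id: str) -> None:
        lora_name = self.request_lora.get(req_id)
        if lora_name is None:
            return  # unknown or already finished request
        _discard(self.waiting, lora_name, req_id)
        self.running.setdefault(lora_name, set()).add(req_id)

    def finish_request(self, req_id: str) -> None:
        # pop() with a default makes a second call a no-op (idempotent).
        lora_name = self.request_lora.pop(req_id, None)
        if lora_name is None:
            return
        _discard(self.waiting, lora_name, req_id)
        _discard(self.running, lora_name, req_id)

    def counts(self) -> tuple[dict[str, int], dict[str, int]]:
        """Return (waiting, running) request counts per LoRA."""
        waiting = {name: len(reqs) for name, reqs in self.waiting.items()}
        running = {name: len(reqs) for name, reqs in self.running.items()}
        return waiting, running
```

With this shape, a leak check reduces to asserting that all three dictionaries are empty once every request has finished, which is the kind of property a unit test can verify directly.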

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request provides a solid refactoring of the LoRA state tracking logic. The changes effectively address several issues, including memory leaks when log_stats is disabled, ambiguity around method idempotency, and overly complex state management. The new implementation is much clearer, more robust, and easier to maintain. The introduction of a comprehensive unit test is a significant improvement that validates the correctness of the new logic and will help prevent future regressions. Overall, this is an excellent contribution that improves the quality and reliability of the codebase.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

@markmc markmc added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) on Oct 14, 2025
@markmc

This comment was marked as resolved.

@jeejeelee
Collaborator

@njhill Could you please look at this PR?

LoRARequestStates serves a simple purpose: keeping track, on a
per-LoRA basis, of which requests are running and which are waiting.

The current implementation is somewhat sloppy and over-complicated:

- LoRAStats is never freed if a LoRA becomes unused or unloaded
- We leak if log_stats is disabled and LoRA is enabled
- It's not clear whether finish_request() should be idempotent
- The static methods are a bit weird
- We're adding each request to the waiting set twice
- ...

Hopefully this implementation is clearer, and it is easier to verify
that, e.g., there are no leaks.

Signed-off-by: Mark McLoughlin <[email protected]>
@markmc markmc force-pushed the lora-state-metrics branch from 3f1ba30 to cae669c Compare November 6, 2025 14:01
SchedulerStats is the right place for this really, just like
the regular running/waiting counts.

Make sure to call LoRARequestStates.update_scheduler_stats()
even when there are no engine core outputs.

Signed-off-by: Mark McLoughlin <[email protected]>
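
The commit message above asks that the per-LoRA counts be refreshed even when a step produces no engine core outputs. A rough sketch of that idea follows. Only the names `SchedulerStats` and `update_scheduler_stats()` come from the conversation; everything in the code (`SchedulerStatsSketch`, its fields, `handle_step`, and the function signatures) is a hypothetical stand-in and should not be read as vLLM's actual API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SchedulerStatsSketch:
    """Hypothetical stand-in for a scheduler stats object."""
    waiting_lora_adapters: dict[str, int] = field(default_factory=dict)
    running_lora_adapters: dict[str, int] = field(default_factory=dict)


def update_scheduler_stats(
    waiting: dict[str, set[str]],
    running: dict[str, set[str]],
    stats: Optional[SchedulerStatsSketch],
) -> None:
    """Copy the current per-LoRA request counts into the stats object."""
    if stats is None:
        return  # stats logging disabled: nothing to record, nothing retained
    stats.waiting_lora_adapters = {n: len(r) for n, r in waiting.items()}
    stats.running_lora_adapters = {n: len(r) for n, r in running.items()}


def handle_step(
    outputs: list[tuple[str, str]],
    waiting: dict[str, set[str]],
    running: dict[str, set[str]],
    stats: Optional[SchedulerStatsSketch],
) -> None:
    """Process one step; `outputs` holds (request id, LoRA name) pairs.

    For the sketch, every output is treated as a finished request.
    """
    for req_id, lora_name in outputs:
        reqs = running.get(lora_name)
        if reqs is not None:
            reqs.discard(req_id)
            if not reqs:
                del running[lora_name]
    # Refresh unconditionally, so a step with no outputs still reports
    # the current counts instead of leaving stale or empty values.
    update_scheduler_stats(waiting, running, stats)
```

Calling `handle_step([], waiting, running, stats)` with an empty output list still populates both count dictionaries, which is the behavior the commit message describes.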
@markmc markmc force-pushed the lora-state-metrics branch from cae669c to e6aa440 Compare November 6, 2025 14:38

@jeejeelee jeejeelee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's land this PR first

@jeejeelee jeejeelee merged commit 6f7de33 into vllm-project:main Nov 10, 2025
47 checks passed
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Nov 13, 2025
Signed-off-by: Mark McLoughlin <[email protected]>
Signed-off-by: xuebwang-amd <[email protected]>
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025
Labels

ready (ONLY add when PR is ready to merge/full CI is needed), v1