[Core][Feat] Add max-waiting-queue-time parameter to reject requests #37413

Open
chaunceyjiang wants to merge 2 commits into vllm-project:main from chaunceyjiang:max_waiting_queue_time

Collaborator

@chaunceyjiang chaunceyjiang commented Mar 18, 2026

Purpose

Add a max-waiting-queue-time parameter to reject requests when the server is under high load.

Test Plan

Test Result


Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a max-waiting-queue-time parameter to reject requests when the server is under high load. This is achieved by tracking the average queue time of recent requests using a new QueueTimeTracker class. The feature is well-integrated, with the new parameter added to EngineArgs and propagated down to the serving layers. The logic to check the queue time and reject requests with a 503 error is implemented in OpenAIServing. The QueueTimeTracker itself uses a sliding window with time-based decay, which is a solid approach. My review found one area for improvement in the QueueTimeTracker implementation regarding an unused variable, which I've commented on.
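To make the mechanism described in the review concrete, here is a minimal sketch of a sliding-window queue-time tracker and the rejection check. It is not the PR's code: the constructor parameters (window_size, max_age_s, clock), the method names, and the should_reject helper are all hypothetical, and "time-based decay" is modeled here as simple expiry of samples older than max_age_s, which may differ from what the PR actually does.

```python
import time
from collections import deque
from typing import Callable, Deque, Optional, Tuple


class QueueTimeTracker:
    """Illustrative tracker of recent queue times (hypothetical API)."""

    def __init__(
        self,
        window_size: int = 100,
        max_age_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # Each sample is (timestamp, queue_time_s). maxlen bounds the
        # window, so the oldest sample is dropped once it is full.
        self._samples: Deque[Tuple[float, float]] = deque(maxlen=window_size)
        self._max_age_s = max_age_s
        self._clock = clock

    def record(self, queue_time_s: float) -> None:
        """Store how long a request waited before being scheduled."""
        self._samples.append((self._clock(), queue_time_s))

    def average(self) -> float:
        """Average queue time over samples that have not yet expired."""
        cutoff = self._clock() - self._max_age_s
        # Assumed decay rule: drop samples older than max_age_s so that
        # a past load spike does not keep rejecting requests forever.
        while self._samples and self._samples[0][0] < cutoff:
            self._samples.popleft()
        if not self._samples:
            return 0.0
        return sum(q for _, q in self._samples) / len(self._samples)


def should_reject(
    tracker: QueueTimeTracker, max_waiting_queue_time: Optional[float]
) -> bool:
    """Return True when a new request should be answered with a 503.

    A value of None is assumed to mean the feature is disabled.
    """
    if max_waiting_queue_time is None:
        return False
    return tracker.average() > max_waiting_queue_time
```

In a serving layer, a check like should_reject would run before a request is enqueued, and a True result would map to an HTTP 503 response. Expiring old samples matters because otherwise a server that rejects everything would never record a fresh, lower queue time and could stay stuck in the rejecting state.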
