[Refactor] [6/N] to simplify the vLLM openai chat_completion serving architecture by chaunceyjiang · Pull Request #32240 · vllm-project/vllm

chaunceyjiang · 2026-01-13T07:00:55Z

Purpose

Refactors the OpenAI chat_completion serving architecture:

Split vllm/entrypoints/openai/protocol.py
TODO
[ ] completion_serving
[ ] responses_serving
[ ] transcription_serving
[ ] tests re-org
[ ] compatibility with the previous import of vllm/entrypoints/openai/protocol.py
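
The last TODO item, keeping old imports of the split module working, is commonly handled with a module-level `__getattr__` (PEP 562) that forwards moved names to their new home and emits a deprecation warning. The following is only a minimal sketch of that general technique, not what this PR or vLLM implements. The module names `demo_old_protocol` and `demo_chat_protocol`, the `_MOVED` table, and the placeholder `ChatCompletionRequest` class are all hypothetical stand-ins so the example runs on its own.

```python
import importlib
import sys
import types
import warnings

# Hypothetical stand-in for the new, split-out module.
new_mod = types.ModuleType("demo_chat_protocol")


class ChatCompletionRequest:
    """Placeholder class for the demo; not the real vLLM model."""


new_mod.ChatCompletionRequest = ChatCompletionRequest
sys.modules[new_mod.__name__] = new_mod

# Hypothetical stand-in for the old monolithic module, now empty.
old_mod = types.ModuleType("demo_old_protocol")

# Assumed mapping of moved names to the module that now defines them.
_MOVED = {"ChatCompletionRequest": "demo_chat_protocol"}


def _forward_moved_name(name: str):
    """PEP 562 hook: called only when `name` is not found on the old module."""
    target = _MOVED.get(name)
    if target is None:
        raise AttributeError(
            f"module {old_mod.__name__!r} has no attribute {name!r}"
        )
    warnings.warn(
        f"{old_mod.__name__}.{name} has moved to {target}.{name}",
        DeprecationWarning,
        stacklevel=2,
    )
    return getattr(importlib.import_module(target), name)


# In a real file this would simply be `def __getattr__(name): ...`
# at the top level of the old module.
old_mod.__getattr__ = _forward_moved_name
sys.modules[old_mod.__name__] = old_mod

# Old-style import still resolves, and points at the same class object.
from demo_old_protocol import ChatCompletionRequest as LegacyImport

print(LegacyImport is ChatCompletionRequest)
```

The lazy lookup means the old module does not have to import every new submodule eagerly, which also helps avoid circular imports while the split is still in progress.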

Test Plan

Test Result

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

gemini-code-assist

Code Review

This pull request refactors the OpenAI serving architecture by restructuring files and updating import paths. The changes are mostly mechanical, but I found a couple of critical issues in the newly added vllm/entrypoints/openai/chat_completion/protocol.py file: a syntax error in an import statement and a missing import for FunctionDefinition. These issues will prevent the code from running and need to be addressed.

vllm/entrypoints/openai/chat_completion/protocol.py
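
The two classes of problem named in the review, a syntax error in an import statement and a name used without being imported, can both be caught before runtime by parsing the file. The sketch below uses only the standard-library `ast` module and is a coarse heuristic under stated assumptions: it ignores scoping rules and a few binding forms (for example `except ... as e`), and the function name `find_problems` and the sample sources are hypothetical. A real linter is the more reliable choice; this only illustrates the idea.

```python
import ast
import builtins


def find_problems(source: str) -> list[str]:
    """Report a syntax error, or names that are read but never bound."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]

    # Everything that binds a name somewhere in the file (scope-insensitive).
    defined = set(dir(builtins))
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                defined.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, ast.arg):
            defined.add(node.arg)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            defined.add(node.id)

    # Any name that is read but was never bound above is suspicious.
    problems = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Name)
            and isinstance(node.ctx, ast.Load)
            and node.id not in defined
        ):
            problems.append(
                f"line {node.lineno}: name {node.id!r} is used but never "
                "defined or imported"
            )
    return problems


# Hypothetical sample: an import with an unclosed parenthesis.
broken_import = "from typing import (Any\n"

# Hypothetical sample: FunctionDefinition is referenced but not imported.
missing_import = (
    "from pydantic import BaseModel\n"
    "\n"
    "class Tool(BaseModel):\n"
    "    function: FunctionDefinition\n"
)

for sample in (broken_import, missing_import):
    for problem in find_problems(sample):
        print(problem)
```

Because the source is only parsed and never executed, the check does not need the file's dependencies to be installed.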

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

vllm/entrypoints/openai/chat_completion/api_router.py

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

DarkLight1337

LGTM as long as tests pass

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

…architecture (vllm-project#32240) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

…architecture (vllm-project#32240) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com> Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>

…architecture (vllm-project#32240) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

[Refactor] [6/N] to simplify the vLLM openai serving architecture

a48a6ea

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

mergify bot added the deepseek, frontend, llama, qwen, and gpt-oss labels Jan 13, 2026

github-project-automation bot added this to gpt-oss Issues & Enhancements Jan 13, 2026

mergify bot added the v1 and tool-calling labels Jan 13, 2026

github-project-automation bot moved this to To Triage in gpt-oss Issues & Enhancements Jan 13, 2026

github-project-automation bot added this to Tool Calling Jan 13, 2026

chaunceyjiang changed the title ~~[Refactor] [6/N] to simplify the vLLM openai serving architecture~~ [Refactor] [6/N] to simplify the vLLM openai chat_completion serving architecture Jan 13, 2026

gemini-code-assist bot reviewed Jan 13, 2026

vllm/entrypoints/openai/chat_completion/protocol.py (outdated, resolved)

vllm/entrypoints/openai/chat_completion/protocol.py (resolved)

[Refactor] [6/N] to simplify the vLLM openai serving architecture

fe3771b

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

mergify bot added the multi-modality label Jan 13, 2026

chaunceyjiang added 6 commits January 13, 2026 16:26

[Refactor] [6/N] to simplify the vLLM openai serving architecture

90fbafe

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

[Refactor] [6/N] to simplify the vLLM openai serving architecture

ce634ea

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

[Refactor] [6/N] to simplify the vLLM openai serving architecture

cdc0306

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

[Refactor] [6/N] to simplify the vLLM openai serving architecture

a44ce33

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

[Refactor] [6/N] to simplify the vLLM openai serving architecture

6810609

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

[Refactor] [6/N] to simplify the vLLM openai serving architecture

e228a93

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

chaunceyjiang marked this pull request as ready for review January 13, 2026 09:17

chaunceyjiang requested review from DarkLight1337, NickLucche, aarnphm, noooop and robertgshaw2-redhat as code owners January 13, 2026 09:17

cursor bot reviewed Jan 13, 2026

vllm/entrypoints/openai/chat_completion/api_router.py (resolved)

[Refactor] [6/N] to simplify the vLLM openai serving architecture

d9534eb

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

chaunceyjiang added the ready label Jan 13, 2026

DarkLight1337 approved these changes Jan 13, 2026


github-project-automation bot moved this from To Triage to Ready in gpt-oss Issues & Enhancements Jan 13, 2026

chaunceyjiang added 2 commits January 13, 2026 18:12

[Refactor] [6/N] to simplify the vLLM openai serving architecture

47f24be

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

[Refactor] [6/N] to simplify the vLLM openai serving architecture

5a314e9

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

chaunceyjiang enabled auto-merge (squash) January 13, 2026 11:14

chaunceyjiang disabled auto-merge January 13, 2026 11:14

[Refactor] [6/N] to simplify the vLLM openai serving architecture

93b0678

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

chaunceyjiang enabled auto-merge (squash) January 13, 2026 11:18

chaunceyjiang merged commit fefce49 into vllm-project:main Jan 13, 2026
50 checks passed

github-project-automation bot moved this to Done in Tool Calling Jan 13, 2026

github-project-automation bot moved this from Ready to Done in gpt-oss Issues & Enhancements Jan 13, 2026

chaunceyjiang deleted the vllm_open_refactor branch January 13, 2026 13:06

sammysun0711 pushed a commit to sammysun0711/vllm that referenced this pull request Jan 16, 2026

[Refactor] [6/N] to simplify the vLLM openai chat_completion serving …

058276f

…architecture (vllm-project#32240) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026

[Refactor] [6/N] to simplify the vLLM openai chat_completion serving …

75c2729

…architecture (vllm-project#32240) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

jeffreywang-anyscale mentioned this pull request Jan 31, 2026

[deps][LLM] Upgrade vLLM to 0.15.0 ray-project/ray#60253

Closed

6 tasks

ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026

[Refactor] [6/N] to simplify the vLLM openai chat_completion serving …

e9f428b

…architecture (vllm-project#32240) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

chaunceyjiang merged 12 commits into vllm-project:main from chaunceyjiang:vllm_open_refactor
