
[Frontend] Split OpenAIServingModels into OpenAIModelRegistry + OpenAIServingModels#36536

Merged
vllm-bot merged 5 commits into vllm-project:main from sagearc:split-openai-serving-models
Mar 12, 2026

Conversation

@sagearc
Contributor

@sagearc sagearc commented Mar 9, 2026

Purpose

After #36166, OpenAIServingRender (a CPU-only, engine-free handler) was receiving a bare list[str] of model names and reimplementing model lookup logic (_check_model, _is_model_supported, show_available_models).

This PR extracts OpenAIModelRegistry — a lightweight, engine-free class for base-model verification — and wires it into the render path via composition.

Changes

  • OpenAIModelRegistry (new): read-only base-model registry with no engine/LoRA dependency. Provides check_model, show_available_models, is_base_model, model_name.
  • OpenAIServingModels: composes OpenAIModelRegistry via self.registry; delegates base-model ops to it, layers LoRA adapter CRUD on top.
  • OpenAIServing._check_model: unchanged — retains the full LoRA-aware verification logic (static + runtime resolution).
  • OpenAIServingRender: accepts model_registry: OpenAIModelRegistry instead of served_model_names: list[str]; removes duplicate _check_model, _is_model_supported, and show_available_models.
  • /v1/models endpoint: return type widened to OpenAIModelRegistry | OpenAIServingModels.
  • init_render_app_state: constructs OpenAIModelRegistry directly for the render-only server.

Test Plan

Existing tests cover the affected code paths.

Test Result

Pre-commit checks pass.


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR
  • The test plan
  • The test results
  • (Optional) Documentation update
  • (Optional) Release notes update

@mergify

mergify bot commented Mar 9, 2026

Hi @sagearc, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?
mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The code refactors the OpenAI API entrypoints by introducing an OpenAIModelRegistry class to centralize base model management and checks, with OpenAIServingModels now inheriting from it to handle LoRA-specific logic. This change streamlines model validation and listing across various API components, including api_server, engine/serving, generate/api_router, and render/serving, by delegating these operations to the new model_registry object. A review comment pointed out a potential IndexError in OpenAIModelRegistry.model_name if base_model_paths is empty, suggesting an explicit check for improved robustness.
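The robustness concern raised in that review can be illustrated with a guarded accessor (hypothetical class and message; the actual fix, if any was applied, may differ):

```python
class RegistrySketch:
    """Minimal stand-in showing an explicit empty-list check in model_name."""

    def __init__(self, base_model_paths: list[str]):
        self.base_model_paths = base_model_paths

    @property
    def model_name(self) -> str:
        # An explicit check turns a bare IndexError into an actionable error.
        if not self.base_model_paths:
            raise ValueError(
                "No base models are registered; check the server configuration."
            )
        return self.base_model_paths[0]
```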

)


class OpenAIServingModels(OpenAIModelRegistry):
Member


I prefer composition over inheritance here

Contributor Author


Since OpenAIServingModels requires an engine client, OpenAIServingRender relies on the base OpenAIModelRegistry to stay engine-free. Inheritance allows the overridden check_model to still pick up loras automatically during serving. How would you recommend structuring this dependency with composition?

Member


I mean that OpenAIServingModels can contain an instance of OpenAIModelRegistry. There is no need to change OpenAIServingRender itself.

Contributor Author


I understand using composition there. My main hesitation is how that interacts with the renderer. If my understanding is correct, passing the contained OpenAIModelRegistry to OpenAIServingRender would mean the renderer only checks for base models.

I originally structured it this way to delegate both preprocessing and the model check from the serving layer to the renderer, to avoid having two separate entry points for preprocessing (one with the check, and one without, ref). I might be missing a cleaner way to wire this up though

Member


LoRA doesn't affect Renderer itself, though I understand from the client's perspective they should be able to use the LoRA model for both components. Perhaps we need to integrate this logic even in the engine-less case then.

Contributor Author

@sagearc sagearc Mar 10, 2026


After taking a further look, choosing composition here won't reduce the need for a duplicate check_model in both OpenAIServingRender and the serving layers, since check_model can dynamically load LoRAs.
What if, instead of the current approach, we revert the changes introduced in this PR and allow the engine client to be None in OpenAIServingModels?

Member


Can you open a new PR to show what that looks like?

Contributor Author


Sure

Contributor Author


Something like that?
#36655
@DarkLight1337

@mergify

mergify bot commented Mar 10, 2026

Hi @sagearc, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?
mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

1 similar comment

@mergify

mergify bot commented Mar 11, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @sagearc.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Mar 11, 2026
@sagearc sagearc force-pushed the split-openai-serving-models branch from 48a8a69 to d230fc0 on March 11, 2026 18:58
@mergify mergify bot removed the needs-rebase label Mar 11, 2026
@sagearc sagearc force-pushed the split-openai-serving-models branch 3 times, most recently from 4a5083e to 02dc788 on March 11, 2026 19:10
@sagearc
Contributor Author

sagearc commented Mar 11, 2026

@DarkLight1337 Kept the changes minimal: the registry handles the base models while the LoRA-related logic is left untouched.

@sagearc sagearc requested a review from DarkLight1337 March 11, 2026 19:14
sagearc added 3 commits March 11, 2026 21:57
…IServingModels

Introduce OpenAIModelRegistry as a lightweight, engine-free base class
for model verification (check_model, show_available_models), suitable
for CPU-only / render-only contexts with no LoRA support.

OpenAIServingModels composes an OpenAIModelRegistry and layers LoRA
adapter CRUD on top. OpenAIServing._check_model retains the full
LoRA-aware verification logic (static + runtime resolution).

OpenAIServingRender now accepts model_registry: OpenAIModelRegistry
instead of served_model_names: list[str], removing duplicated model
checking and show_available_models code.

Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
@sagearc sagearc force-pushed the split-openai-serving-models branch from 02dc788 to aa2d318 on March 11, 2026 19:57
Member

@DarkLight1337 DarkLight1337 left a comment


This looks better, thanks

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) March 12, 2026 06:40
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Mar 12, 2026
@vllm-bot vllm-bot merged commit 06e0bc2 into vllm-project:main Mar 12, 2026
42 of 45 checks passed
@sagearc sagearc deleted the split-openai-serving-models branch March 12, 2026 10:56
Lucaskabela pushed a commit to Lucaskabela/vllm that referenced this pull request Mar 17, 2026
…OpenAIServingModels` (vllm-project#36536)

Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
wendyliu235 pushed a commit to wendyliu235/vllm-public that referenced this pull request Mar 18, 2026
…OpenAIServingModels` (vllm-project#36536)

Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
fxdawnn pushed a commit to fxdawnn/vllm that referenced this pull request Mar 19, 2026
…OpenAIServingModels` (vllm-project#36536)

Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
khairulkabir1661 pushed a commit to khairulkabir1661/vllm that referenced this pull request Mar 27, 2026
…OpenAIServingModels` (vllm-project#36536)

Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Monishver11 pushed a commit to Monishver11/vllm that referenced this pull request Mar 27, 2026
…OpenAIServingModels` (vllm-project#36536)

Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
vrdn-23 pushed a commit to vrdn-23/vllm that referenced this pull request Mar 30, 2026
…OpenAIServingModels` (vllm-project#36536)

Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Vinay Damodaran <vrdn@hey.com>
EricccYang pushed a commit to EricccYang/vllm that referenced this pull request Apr 1, 2026
…OpenAIServingModels` (vllm-project#36536)

Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: EricccYang <yangyang4991@gmail.com>

Labels

frontend ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants