[Router][Feat][Bugfix] Add Swagger UI (OpenAPI) support with Pydantic request models + fix rewrite Content-Length + test #672

JaredforReal · 2025-09-02T17:05:06Z

[Router][Feat][Bugfix] Add Swagger UI (OpenAPI) support with Pydantic request models + fix rewrite Content-Length + dev smoke tooling

Fixes #667

Summary

Introduce OpenAPI/Swagger UI documentation and typed (Pydantic) request models for the three OpenAI‑style endpoints:

/v1/chat/completions
/v1/completions
/v1/embeddings
Add a mock backend + smoke test tooling to simplify local and CI verification.
Fix a routing bug where a request body rewrite did not refresh Content-Length, causing truncated JSON and backend 400 responses.
Improve non‑stream responses to return application/json instead of always text/event-stream.

Key Changes

New protocols.py with minimal OpenAI-compatible Pydantic models (ChatCompletionRequest, CompletionRequest, EmbeddingRequest).
main_router.py: endpoints now accept typed models; preserve raw Request for semantic cache, callbacks, and rewriting.
Bugfix in route_general_request:
- Refresh Content-Length after rewrite.
- Dynamic media type: text/event-stream only when stream=true, else application/json.
tests:
- test_swagger_integration.py unit test for swagger ui
- main.py mock engine (chat/completions/embeddings).
- _swagger_smoke_core.py shared smoke logic.
- swagger_smoke.py standalone CLI smoke test.

Bug Fix Details

Before: Request rewrite produced new JSON body but stale Content-Length, backend read partial body → JSONDecodeError → 400.
After: Always call update_content_length() post rewrite; valid 200 responses confirmed.

Testing

Layer	What	Status
Unit	test_swagger_integration.py (Pydantic validation, schema)	Pass
Smoke	swagger_smoke.py (8 checks)	Pass
Manual	Browser docs, “Try it out”	Verified
Curl	Chat/completions/embeddings 200 + 422 invalid	Verified
Regression	Non-stream media type now JSON	Verified

How to Reproduce Locally

# Terminal 1
python examples/mock_backend/main.py --port 8000

# Terminal 2
python -m vllm_router.app \
--service-discovery static \
--static-backends http://localhost:8000,http://localhost:8000 \
--static-models gpt-3.5-turbo,text-embedding-ada-002 \
--routing-logic roundrobin \
--host 0.0.0.0 --port 8080 --log-level debug

# Terminal 3 (smoke)
python scripts/swagger_smoke.py
# or just try it out in http://localhost:8080/doc

Observability / Metrics

No change to metrics emission. Mock backend intentionally omits /metrics (router logs 404—benign).

Backward Compatibility

No change to existing response shapes.
Extra request fields still accepted (extra='allow')—only logged as warnings.
Non‑OpenAPI endpoints untouched (e.g. /tokenize, /score).
Streaming semantics unchanged except for correct content type when not streaming.

Limitations / Deferred

messages elements are plain dict; future enhancement: strict Message model + role Enum.
Response models still untyped (can add response_model= later).
No streaming chunk schema; SSE contract unchanged.
Semantic cache specific fields currently allowed as extra (not declared).

Review Notes

Focus areas:

Ensure rewrite path + Content-Length fix is safe.
Confirm no unintended change to routing selection.
Validate OpenAPI schema suffices for downstream tooling.

Risk Assessment

Low runtime risk: new logic isolated to three endpoints + a small header update after rewrite.
Fallback: disabling Pydantic would require reverting router endpoint signatures (not included; no flag yet).

This is my first PR, just point out what I have done wrong, I will fix it ASAP.

Signed-off-by: JaredforReal <[email protected]>

…er media type Fix truncated JSON causing backend 400 responses by syncing Content-Length after request rewriting. Also return application/json for non-stream requests instead of always text/event-stream. Signed-off-by: JaredforReal <[email protected]>

Add mock backend (examples/mock_backend), CLI swagger_smoke script, shared core, and optional E2E pytest smoke test (RUN_E2E_SWAGGER gated). Replaces ad-hoc root-level scripts; improves local & CI verification workflow. Signed-off-by: JaredforReal <[email protected]>

Signed-off-by: JaredforReal <[email protected]>

YuhanLiu11 · 2025-09-04T05:41:34Z

@JaredforReal Can you fix the CI test errors?

JaredforReal · 2025-09-04T05:55:16Z

@YuhanLiu11 Yeah! There is an import and a requirement error; I'm working on it.

Signed-off-by: JaredforReal <[email protected]>

JaredforReal · 2025-09-04T06:39:17Z

@YuhanLiu11 Thanks for your time! I think this version should pass the CI test. (or I will work on it till it succeeds orz

YuhanLiu11 · 2025-09-04T23:35:45Z

src/vllm_router/services/request_service/request.py

        )


 async def route_general_request(


Why do we need to change this function to add this swagger UI mock requests?

When using Pydantic models, the request body is pre-parsed and serialized. Passing request_body directly avoids re-reading await request.body() (which could fail or duplicate work), ensuring efficiency and correctness in mock tests.
For non-Pydantic endpoints (e.g., /tokenize), the parameter defaults to None, and the function falls back to reading the raw body, maintaining backward compatibility.

YuhanLiu11 · 2025-09-04T23:37:50Z

src/vllm_router/routers/main_router.py


 @main_router.post("/v1/chat/completions")
-async def route_chat_completion(request: Request, background_tasks: BackgroundTasks):
+async def route_chat_completion(


Similarly, I don't get why do we need to change this function to add this swagger UI mock requests?

FastAPI uses the Pydantic model to generate the OpenAPI schema for /v1/chat/completions. This enables Swagger UI to display editable request fields and validate inputs at the API level.
Without this, the endpoint would lack schema details, making mock testing impossible (no "Try it out" functionality or 422 error simulation, mentioned in issue #667 ).

JaredforReal · 2025-09-05T03:54:19Z

@YuhanLiu11 Thanks for your time! I try to make a minimal change to achieve this feature, learning from the VLLM implementation. I am considering making a proposal to refactor route_general_request() without changing the routing logic to take a Pydantic model originally for type safety, if this PR is accepted.
I will try to fix the pre-commit error. Feel free to point out any mistakes I have made, love to learn from the community :)

Signed-off-by: JaredforReal <[email protected]>

davidgao7 · 2025-09-12T06:26:54Z

pyproject.toml

    "pytest>=8.3.4",
-    "pytest-asyncio>=0.25.3"
+    "pytest-asyncio>=0.25.3",
+    "httpx==0.28.1"


Suggested change

"httpx==0.28.1"

"httpx==0.28.1"

Hi, starting from this MR: #589 httpx has been replaced by aiohttp, the reason is huge performance boost, and aiohttp has been part of the package requirements

JaredforReal added 4 commits September 2, 2025 00:06

add protocols form vllm

a1d1a50

Signed-off-by: JaredforReal <[email protected]>

integrate in Router layer && add unit test

56d6c36

Signed-off-by: JaredforReal <[email protected]>

JaredforReal changed the title ~~[Swagger UI~~ [Router][Feat][Bugfix] Add Swagger UI (OpenAPI) support with Pydantic request models + fix rewrite Content-Length + dev smoke tooling Sep 2, 2025

JaredforReal and others added 2 commits September 3, 2025 01:38

Get rid of E2E test to simplify

84adf32

Signed-off-by: JaredforReal <[email protected]>

Merge branch 'main' into swagger-ui

3c7db71

JaredforReal marked this pull request as ready for review September 3, 2025 03:48

JaredforReal added 2 commits September 3, 2025 11:57

pass pre-commit locally

e2b53df

Signed-off-by: JaredforReal <[email protected]>

fix: lint and formatting for swagger integration test

3e73248

Signed-off-by: JaredforReal <[email protected]>

Merge branch 'main' into swagger-ui

2130c38

JaredforReal force-pushed the swagger-ui branch from fc65e3d to 2130c38 Compare September 4, 2025 06:13

[Fix] CI test error: import error and requirement error

324843b

Signed-off-by: JaredforReal <[email protected]>

YuhanLiu11 reviewed Sep 4, 2025

View reviewed changes

[FIX] pre-commit error

00765ec

Signed-off-by: JaredforReal <[email protected]>

davidgao7 reviewed Sep 12, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Router][Feat][Bugfix] Add Swagger UI (OpenAPI) support with Pydantic request models + fix rewrite Content-Length + test #672

[Router][Feat][Bugfix] Add Swagger UI (OpenAPI) support with Pydantic request models + fix rewrite Content-Length + test #672

Uh oh!

JaredforReal commented Sep 2, 2025 •

edited

Loading

Uh oh!

YuhanLiu11 commented Sep 4, 2025

Uh oh!

JaredforReal commented Sep 4, 2025

Uh oh!

JaredforReal commented Sep 4, 2025

Uh oh!

YuhanLiu11 Sep 4, 2025

Uh oh!

JaredforReal Sep 5, 2025

Uh oh!

YuhanLiu11 Sep 4, 2025

Uh oh!

JaredforReal Sep 5, 2025

Uh oh!

JaredforReal commented Sep 5, 2025

Uh oh!

davidgao7 Sep 12, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

[Router][Feat][Bugfix] Add Swagger UI (OpenAPI) support with Pydantic request models + fix rewrite Content-Length + test #672

Are you sure you want to change the base?

[Router][Feat][Bugfix] Add Swagger UI (OpenAPI) support with Pydantic request models + fix rewrite Content-Length + test #672

Uh oh!

Conversation

JaredforReal commented Sep 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Key Changes

Bug Fix Details

Testing

How to Reproduce Locally

Observability / Metrics

Backward Compatibility

Limitations / Deferred

Review Notes

Risk Assessment

Uh oh!

YuhanLiu11 commented Sep 4, 2025

Uh oh!

JaredforReal commented Sep 4, 2025

Uh oh!

JaredforReal commented Sep 4, 2025

Uh oh!

YuhanLiu11 Sep 4, 2025

Choose a reason for hiding this comment

Uh oh!

JaredforReal Sep 5, 2025

Choose a reason for hiding this comment

Uh oh!

YuhanLiu11 Sep 4, 2025

Choose a reason for hiding this comment

Uh oh!

JaredforReal Sep 5, 2025

Choose a reason for hiding this comment

Uh oh!

JaredforReal commented Sep 5, 2025

Uh oh!

davidgao7 Sep 12, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

JaredforReal commented Sep 2, 2025 •

edited

Loading