-
Notifications
You must be signed in to change notification settings - Fork 319
[Router][Feat][Bugfix] Add Swagger UI (OpenAPI) support with Pydantic request models + fix rewrite Content-Length + test #672
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: JaredforReal <[email protected]>
Signed-off-by: JaredforReal <[email protected]>
…er media type Fix truncated JSON causing backend 400 responses by syncing Content-Length after request rewriting. Also return application/json for non-stream requests instead of always text/event-stream. Signed-off-by: JaredforReal <[email protected]>
Add mock backend (examples/mock_backend), CLI swagger_smoke script, shared core, and optional E2E pytest smoke test (RUN_E2E_SWAGGER gated). Replaces ad-hoc root-level scripts; improves local & CI verification workflow. Signed-off-by: JaredforReal <[email protected]>
Signed-off-by: JaredforReal <[email protected]>
Signed-off-by: JaredforReal <[email protected]>
Signed-off-by: JaredforReal <[email protected]>
|
@JaredforReal Can you fix the CI test errors? |
|
@YuhanLiu11 Yeah! There is an import and a requirement error; I'm working on it. |
fc65e3d to
2130c38
Compare
Signed-off-by: JaredforReal <[email protected]>
|
@YuhanLiu11 Thanks for your time! I think this version should pass the CI test. (or I will work on it till it succeeds orz |
| ) | ||
|
|
||
|
|
||
| async def route_general_request( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to change this function to add this swagger UI mock requests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When using Pydantic models, the request body is pre-parsed and serialized. Passing request_body directly avoids re-reading await request.body() (which could fail or duplicate work), ensuring efficiency and correctness in mock tests.
For non-Pydantic endpoints (e.g., /tokenize), the parameter defaults to None, and the function falls back to reading the raw body, maintaining backward compatibility.
|
|
||
| @main_router.post("/v1/chat/completions") | ||
| async def route_chat_completion(request: Request, background_tasks: BackgroundTasks): | ||
| async def route_chat_completion( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, I don't get why do we need to change this function to add this swagger UI mock requests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FastAPI uses the Pydantic model to generate the OpenAPI schema for /v1/chat/completions. This enables Swagger UI to display editable request fields and validate inputs at the API level.
Without this, the endpoint would lack schema details, making mock testing impossible (no "Try it out" functionality or 422 error simulation, mentioned in issue #667 ).
|
@YuhanLiu11 Thanks for your time! I try to make a minimal change to achieve this feature, learning from the VLLM implementation. I am considering making a proposal to refactor |
Signed-off-by: JaredforReal <[email protected]>
| "pytest>=8.3.4", | ||
| "pytest-asyncio>=0.25.3" | ||
| "pytest-asyncio>=0.25.3", | ||
| "httpx==0.28.1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| "httpx==0.28.1" | |
| "httpx==0.28.1" |
Hi, starting from this MR: #589 httpx has been replaced by aiohttp, the reason is huge performance boost, and aiohttp has been part of the package requirements
[Router][Feat][Bugfix] Add Swagger UI (OpenAPI) support with Pydantic request models + fix rewrite Content-Length + dev smoke tooling
Fixes #667
Summary
Introduce OpenAPI/Swagger UI documentation and typed (Pydantic) request models for the three OpenAI‑style endpoints:
/v1/chat/completions/v1/completions/v1/embeddingsAdd a mock backend + smoke test tooling to simplify local and CI verification.
Fix a routing bug where a request body rewrite did not refresh
Content-Length, causing truncated JSON and backend 400 responses.Improve non‑stream responses to return
application/jsoninstead of alwaystext/event-stream.Key Changes
ChatCompletionRequest,CompletionRequest,EmbeddingRequest).Requestfor semantic cache, callbacks, and rewriting.route_general_request:Content-Lengthafter rewrite.text/event-streamonly whenstream=true, elseapplication/json.Bug Fix Details
Before: Request rewrite produced new JSON body but stale
Content-Length, backend read partial body → JSONDecodeError → 400.After: Always call
update_content_length()post rewrite; valid 200 responses confirmed.Testing
How to Reproduce Locally
Observability / Metrics
No change to metrics emission. Mock backend intentionally omits
/metrics(router logs 404—benign).Backward Compatibility
extra='allow')—only logged as warnings./tokenize,/score).Limitations / Deferred
messageselements are plaindict; future enhancement: strictMessagemodel + role Enum.response_model=later).Review Notes
Focus areas:
Risk Assessment
Low runtime risk: new logic isolated to three endpoints + a small header update after rewrite.
Fallback: disabling Pydantic would require reverting router endpoint signatures (not included; no flag yet).
This is my first PR, just point out what I have done wrong, I will fix it ASAP.