fix(mcp): stream SSE responses, fix mount order, remove OAuth root pollution#22072
fix(mcp): stream SSE responses, fix mount order, remove OAuth root pollution#22072JVenberg wants to merge 3 commits intoBerriAI:mainfrom
Conversation
…llution
Three fixes for the MCP server proxy that prevented external clients
(e.g. Claude Code) from properly connecting to upstream MCP servers:
1. dynamic_mcp_route now streams ASGI responses via StreamingResponse
instead of buffering the entire body. This is critical for SSE
(text/event-stream) used by MCP Streamable HTTP transport.
2. Reorder ASGI mounts so /sse is matched before the / catch-all.
Remove broken /mcp and /{mcp_server_name}/mcp mounts that either
mapped to wrong paths or used unsupported path parameters.
3. _resolve_oauth2_server_for_root_endpoints now always returns None.
The previous auto-resolution polluted root OAuth discovery endpoints
with metadata from an unrelated server when exactly one OAuth2 server
was configured, breaking non-OAuth MCP servers.
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Greptile SummaryThis PR fixes three bugs preventing external MCP clients (like Claude Code) from connecting through LiteLLM's MCP server proxy:
Test coverage is thorough: 4 new streaming/mount tests, 4 new OAuth mixed-server tests, and 4 updated existing OAuth tests. Confidence Score: 4/5
|
| Filename | Overview |
|---|---|
| litellm/proxy/proxy_server.py | Replaces buffered ASGI response with queue-based streaming adapter for SSE support; removes unused MCPAuth import. Streaming logic is well-structured with proper sentinel handling and cancellation. |
| litellm/proxy/_experimental/mcp_server/server.py | Fixes mount ordering so /sse is matched before the / catch-all; removes extraneous /mcp and /{mcp_server_name}/mcp mounts that mapped to incorrect paths. |
| litellm/proxy/_experimental/mcp_server/discoverable_endpoints.py | Refines OAuth root auto-resolution to only activate when all configured servers are OAuth2, preventing pollution of non-OAuth server discovery responses. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_dynamic_mcp_route_streaming.py | New test file with 4 tests: SSE streaming, non-streaming JSON, 404 for unknown server, and mount ordering verification. All mocked correctly with no real network calls. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_discoverable_endpoints.py | Updates 4 existing OAuth tests for refined behavior and adds 4 new tests verifying root resolution is skipped when non-OAuth servers are present. All properly mocked. |
Sequence Diagram
sequenceDiagram
participant Client as MCP Client
participant Route as dynamic_mcp_route
participant Task as asyncio.Task (run_handler)
participant ASGI as handle_streamable_http_mcp
participant Queue as body_queue
Client->>Route: POST /{server_name}/mcp
Route->>Route: Validate server exists
Route->>Route: Rewrite scope path to /mcp/{server_name}
Route->>Task: create_task(run_handler)
Task->>ASGI: call ASGI handler
ASGI->>Task: http.response.start (status + headers)
Task->>Route: headers_ready.set()
Route->>Route: Build StreamingResponse
Route-->>Client: StreamingResponse (headers)
loop SSE chunks
ASGI->>Task: http.response.body (chunk, more_body=True)
Task->>Queue: put(chunk)
Queue-->>Client: yield chunk
end
ASGI->>Task: http.response.body (more_body=False)
Task->>Queue: put(None) sentinel
Queue-->>Client: generator ends
Last reviewed commit: de74f85
Additional Comments (2)
The Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time! |
… convenience Instead of always returning None, auto-resolve to the single OAuth2 server only when no non-OAuth servers are configured. When non-OAuth servers are also present, return None to avoid polluting their discovery responses. This preserves the convenience behavior for single-OAuth-only setups while fixing the mixed-server case. Add 4 new tests for the mixed-server scenario (OAuth + non-OAuth) and restore original test expectations for the single-OAuth-only case.
Add body_terminated flag so the run_handler finally block only sends the None sentinel when streaming_send hasn't already sent one on the normal more_body=False path.
Review1. Does this PR fix the issue it describes?
Changes are well-documented with 8 new tests covering streaming, non-streaming, 404s, mount order, and OAuth resolution. 2. Has this issue already been solved elsewhere? ✅ LGTM — comprehensive fix with solid test coverage. The MCP integration is complex and this addresses real issues blocking external clients like Claude Code. |
Relevant issues
Fixes #22073
Fixes #22074
Fixes #22075
Pre-Submission checklist
tests/litellm/directory, Adding at least 1 test is a hard requirement - see detailsmake test-unit@greptileaiand received a Confidence Score of at least 4/5 before requesting a maintainer reviewType
🐛 Bug Fix
Changes
Fixes three bugs in LiteLLM's MCP server proxy that prevent external MCP clients (like Claude Code) from properly connecting to upstream MCP servers via the proxy.
Bug 1: SSE streaming broken in
dynamic_mcp_route(#22073)The
dynamic_mcp_routehandler buffered the entire ASGI response body before returning a singleResponse, breaking SSE streaming. Replaced with aQueue-based streaming adapter that yields body chunks incrementally viaStreamingResponse.Bug 2:
/ssemount unreachable due to ordering (#22074)The
/catch-all mount was registered before/ssein the MCP sub-app, making SSE transport unreachable. Also removed two extraneous mounts (/mcpmapped to wrong external path/mcp/mcp,/{mcp_server_name}/mcpwhich doesn't work with Starlette mounts). Now only/sseand/are mounted, in the correct order.Bug 3: OAuth root endpoint pollution (#22075)
_resolve_oauth2_server_for_root_endpointsauto-resolved to the single OAuth2 server for root-level requests even when non-OAuth servers were also configured, polluting their discovery responses. Changed to only auto-resolve when all configured servers are OAuth2 (single OAuth-only setup). When non-OAuth servers are present, root endpoints return generic responses and clients should use server-specific paths (e.g./{server_name}/authorize).Tests
dynamic_mcp_routestreaming, non-streaming, 404 handling, and mount order