
fix(mcp): stream SSE responses, fix mount order, remove OAuth root pollution #22072

Open
JVenberg wants to merge 3 commits into BerriAI:main from JVenberg:fix-mcp-proxy-streaming-sse-oauth
Conversation

@JVenberg
Contributor

@JVenberg JVenberg commented Feb 25, 2026

Relevant issues

Fixes #22073
Fixes #22074
Fixes #22075

Pre-Submission checklist

  • I have added testing in the tests/litellm/ directory. Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

Fixes three bugs in LiteLLM's MCP server proxy that prevent external MCP clients (like Claude Code) from properly connecting to upstream MCP servers via the proxy.

Bug 1: SSE streaming broken in dynamic_mcp_route (#22073)

The dynamic_mcp_route handler buffered the entire ASGI response body before returning a single Response, breaking SSE streaming. Replaced with a Queue-based streaming adapter that yields body chunks incrementally via StreamingResponse.
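
For readers unfamiliar with the pattern, here is a minimal sketch of a queue-based ASGI streaming adapter, written with plain asyncio so it runs without Starlette. It is not the code from this PR: `fake_sse_handler` and `stream_asgi_response` are hypothetical names, and for brevity the end-of-stream sentinel is sent only from the `finally` block rather than on `more_body=False`. In the real route the returned iterator would presumably be handed to a `StreamingResponse`.

```python
import asyncio
from typing import Any, AsyncIterator, Awaitable, Callable, Dict, List, Optional, Tuple

Message = Dict[str, Any]
Send = Callable[[Message], Awaitable[None]]


async def fake_sse_handler(send: Send) -> None:
    """Hypothetical stand-in for the upstream ASGI handler: emits two SSE events."""
    await send({"type": "http.response.start", "status": 200,
                "headers": [(b"content-type", b"text/event-stream")]})
    await send({"type": "http.response.body", "body": b"data: 1\n\n", "more_body": True})
    await send({"type": "http.response.body", "body": b"data: 2\n\n", "more_body": False})


async def stream_asgi_response(
    handler: Callable[[Send], Awaitable[None]],
) -> Tuple[int, List[Tuple[bytes, bytes]], AsyncIterator[bytes]]:
    """Run the handler in a background task and expose its body as an async iterator."""
    body_queue: "asyncio.Queue[Optional[bytes]]" = asyncio.Queue()
    headers_ready = asyncio.Event()
    response_start: Message = {}

    async def streaming_send(message: Message) -> None:
        if message["type"] == "http.response.start":
            # Capture status and headers, then wake the caller.
            response_start.update(message)
            headers_ready.set()
        elif message["type"] == "http.response.body":
            chunk = message.get("body", b"")
            if chunk:
                await body_queue.put(chunk)  # forward each chunk as it arrives

    async def run_handler() -> None:
        try:
            await handler(streaming_send)
        finally:
            headers_ready.set()          # never leave the caller waiting
            body_queue.put_nowait(None)  # sentinel: end of stream

    task = asyncio.create_task(run_handler())
    await headers_ready.wait()

    async def body_iterator() -> AsyncIterator[bytes]:
        try:
            while True:
                chunk = await body_queue.get()
                if chunk is None:
                    break
                yield chunk
        finally:
            task.cancel()  # no-op if the handler already finished

    status = int(response_start.get("status", 500))
    headers = list(response_start.get("headers", []))
    return status, headers, body_iterator()
```

The key difference from the buffered version is that the caller gets control back as soon as the headers arrive, so each chunk can be delivered while the upstream handler is still running.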

Bug 2: /sse mount unreachable due to ordering (#22074)

The / catch-all mount was registered before /sse in the MCP sub-app, making SSE transport unreachable. Also removed two extraneous mounts (/mcp mapped to wrong external path /mcp/mcp, /{mcp_server_name}/mcp which doesn't work with Starlette mounts). Now only /sse and / are mounted, in the correct order.
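
The ordering problem can be illustrated with a simplified model of first-match prefix routing. This is a sketch, not Starlette's implementation, and the mount names (`streamable_http`, `sse`) are placeholders chosen for illustration.

```python
from typing import List, Optional, Tuple


def first_match(mounts: List[Tuple[str, str]], path: str) -> Optional[str]:
    """Return the name of the first mount whose prefix matches the path."""
    for prefix, name in mounts:
        stripped = prefix.rstrip("/")
        # A "/" mount strips to "" and therefore matches every path.
        if stripped == "" or path == stripped or path.startswith(stripped + "/"):
            return name
    return None


# Old order: the catch-all is checked first and shadows /sse.
broken = [("/", "streamable_http"), ("/sse", "sse")]
# New order: the specific prefix is checked before the catch-all.
fixed = [("/sse", "sse"), ("/", "streamable_http")]

print(first_match(broken, "/sse"))  # streamable_http
print(first_match(fixed, "/sse"))   # sse
```

With the catch-all listed last, every other path still falls through to it, so only requests under /sse change behavior.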

Bug 3: OAuth root endpoint pollution (#22075)

_resolve_oauth2_server_for_root_endpoints auto-resolved to the single OAuth2 server for root-level requests even when non-OAuth servers were also configured, polluting their discovery responses. Changed to only auto-resolve when all configured servers are OAuth2 (single OAuth-only setup). When non-OAuth servers are present, root endpoints return generic responses and clients should use server-specific paths (e.g. /{server_name}/authorize).
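
The refined rule can be sketched as follows. `ServerConfig` and `resolve_root_oauth_server` are hypothetical stand-ins for the real `MCPServer` type and resolver, and returning None when several OAuth2 servers are configured is an assumption based on the "single OAuth-only setup" wording above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServerConfig:
    """Hypothetical stand-in for MCPServer; only the fields the rule needs."""
    name: str
    is_oauth2: bool


def resolve_root_oauth_server(servers: List[ServerConfig]) -> Optional[ServerConfig]:
    """Auto-resolve only when the sole configured server is an OAuth2 server."""
    if any(not s.is_oauth2 for s in servers):
        return None  # mixed setup: root endpoints stay generic
    if len(servers) == 1:
        return servers[0]  # single OAuth-only setup keeps the convenience
    return None  # zero or several OAuth2 servers: nothing unambiguous to pick


github = ServerConfig("github", is_oauth2=True)
local = ServerConfig("local_tools", is_oauth2=False)

print(resolve_root_oauth_server([github]))         # the github server
print(resolve_root_oauth_server([github, local]))  # None
```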

Tests

  • 4 new tests for dynamic_mcp_route streaming, non-streaming, 404 handling, and mount order
  • 4 new tests verifying OAuth root resolution is skipped when non-OAuth servers are present
  • 4 existing OAuth root resolution tests updated to reflect refined behavior
  • All 47 discoverable endpoint + streaming tests pass

…llution

Three fixes for the MCP server proxy that prevented external clients
(e.g. Claude Code) from properly connecting to upstream MCP servers:

1. dynamic_mcp_route now streams ASGI responses via StreamingResponse
   instead of buffering the entire body. This is critical for SSE
   (text/event-stream) used by MCP Streamable HTTP transport.

2. Reorder ASGI mounts so /sse is matched before the / catch-all.
   Remove broken /mcp and /{mcp_server_name}/mcp mounts that either
   mapped to wrong paths or used unsupported path parameters.

3. _resolve_oauth2_server_for_root_endpoints now always returns None.
   The previous auto-resolution polluted root OAuth discovery endpoints
   with metadata from an unrelated server when exactly one OAuth2 server
   was configured, breaking non-OAuth MCP servers.
@vercel

vercel bot commented Feb 25, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| litellm | Ready | Preview, Comment | Feb 25, 2026 8:32am |

@greptile-apps
Contributor

greptile-apps bot commented Feb 25, 2026

Greptile Summary

This PR fixes three bugs preventing external MCP clients (like Claude Code) from connecting through LiteLLM's MCP server proxy:

  • SSE streaming fix: dynamic_mcp_route now uses a queue-based streaming adapter instead of buffering the entire ASGI response body, which was breaking SSE (text/event-stream) incremental delivery. The implementation uses asyncio.Queue, an Event for header synchronization, and proper cancellation/cleanup in the generator.
  • Mount ordering fix: /sse is now mounted before the / catch-all in the MCP sub-app, fixing Starlette's first-match routing. Two extraneous mounts (/mcp and /{mcp_server_name}/mcp) that mapped to incorrect paths were removed.
  • OAuth root pollution fix: _resolve_oauth2_server_for_root_endpoints now only auto-resolves when all configured servers are OAuth2 (single-server convenience), preventing non-OAuth servers' discovery responses from being polluted with OAuth metadata.

Test coverage is thorough: 4 new streaming/mount tests, 4 new OAuth mixed-server tests, and 4 updated existing OAuth tests.

Confidence Score: 4/5

  • This PR is safe to merge — all three fixes are well-scoped with thorough test coverage and no regressions to existing behavior.
  • The streaming adapter is well-implemented with proper synchronization (asyncio.Event, Queue sentinel, body_terminated flag). The mount ordering and OAuth resolution fixes are straightforward and correct. All 8 new tests are properly mocked. The only minor concern is that errors occurring mid-stream (after headers are sent) will result in truncated responses rather than error responses, which is inherent to HTTP streaming.
  • Pay close attention to litellm/proxy/proxy_server.py — the streaming adapter is the most complex change and the critical path for SSE delivery.

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/proxy/proxy_server.py | Replaces buffered ASGI response with queue-based streaming adapter for SSE support; removes unused MCPAuth import. Streaming logic is well-structured with proper sentinel handling and cancellation. |
| litellm/proxy/_experimental/mcp_server/server.py | Fixes mount ordering so /sse is matched before the / catch-all; removes extraneous /mcp and /{mcp_server_name}/mcp mounts that mapped to incorrect paths. |
| litellm/proxy/_experimental/mcp_server/discoverable_endpoints.py | Refines OAuth root auto-resolution to only activate when all configured servers are OAuth2, preventing pollution of non-OAuth server discovery responses. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_dynamic_mcp_route_streaming.py | New test file with 4 tests: SSE streaming, non-streaming JSON, 404 for unknown server, and mount ordering verification. All mocked correctly with no real network calls. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_discoverable_endpoints.py | Updates 4 existing OAuth tests for refined behavior and adds 4 new tests verifying root resolution is skipped when non-OAuth servers are present. All properly mocked. |

Sequence Diagram

sequenceDiagram
    participant Client as MCP Client
    participant Route as dynamic_mcp_route
    participant Task as asyncio.Task (run_handler)
    participant ASGI as handle_streamable_http_mcp
    participant Queue as body_queue

    Client->>Route: POST /{server_name}/mcp
    Route->>Route: Validate server exists
    Route->>Route: Rewrite scope path to /mcp/{server_name}
    Route->>Task: create_task(run_handler)
    Task->>ASGI: call ASGI handler
    ASGI->>Task: http.response.start (status + headers)
    Task->>Route: headers_ready.set()
    Route->>Route: Build StreamingResponse
    Route-->>Client: StreamingResponse (headers)

    loop SSE chunks
        ASGI->>Task: http.response.body (chunk, more_body=True)
        Task->>Queue: put(chunk)
        Queue-->>Client: yield chunk
    end

    ASGI->>Task: http.response.body (more_body=False)
    Task->>Queue: put(None) sentinel
    Queue-->>Client: generator ends

Last reviewed commit: de74f85

Contributor

@greptile-apps greptile-apps bot left a comment

5 files reviewed, 2 comments

@greptile-apps
Contributor

greptile-apps bot commented Feb 25, 2026

Additional Comments (2)

litellm/proxy/_experimental/mcp_server/discoverable_endpoints.py
Unused import after refactor

MCPAuth is imported on this line but is no longer referenced anywhere in the file. It was previously used inside _resolve_oauth2_server_for_root_endpoints, which now just returns None. This will trigger lint warnings.

from litellm.types.mcp_server.mcp_server_manager import MCPServer

litellm/proxy/_experimental/mcp_server/discoverable_endpoints.py
Dead client_ip parameter

The client_ip parameter is now unused since the function body just returns None. Consider removing it to avoid confusion, since all callers already pass no arguments. Alternatively, if the function is meant to be a placeholder that may regain logic later, this is fine as-is.

def _resolve_oauth2_server_for_root_endpoints() -> Optional[MCPServer]:
    """
    Resolve the MCP server for root-level OAuth endpoints (no server name in path).

    Always returns None. Root-level OAuth discovery endpoints should not
    auto-resolve to an arbitrary server because doing so pollutes non-OAuth
    servers' discovery responses when any single OAuth2 server is configured.
    Clients should use server-specific paths instead (e.g. /{server_name}/authorize).
    """
    return None

… convenience

Instead of always returning None, auto-resolve to the single OAuth2
server only when no non-OAuth servers are configured. When non-OAuth
servers are also present, return None to avoid polluting their discovery
responses. This preserves the convenience behavior for single-OAuth-only
setups while fixing the mixed-server case.

Add 4 new tests for the mixed-server scenario (OAuth + non-OAuth) and
restore original test expectations for the single-OAuth-only case.
@JVenberg
Contributor Author

@greptileai

Contributor

@greptile-apps greptile-apps bot left a comment

5 files reviewed, 1 comment

Add body_terminated flag so the run_handler finally block only sends
the None sentinel when streaming_send hasn't already sent one on the
normal more_body=False path.
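
A small sketch of why the flag matters, under simplifying assumptions: `relay` is a hypothetical helper that replays a fixed list of ASGI body messages instead of running a real handler, and `use_flag` toggles the guard so both behaviors can be compared.

```python
import asyncio
from typing import Any, Dict, List, Optional


async def relay(messages: List[Dict[str, Any]], use_flag: bool) -> List[Optional[bytes]]:
    """Replay body messages into a queue and return everything that was enqueued."""
    body_queue: "asyncio.Queue[Optional[bytes]]" = asyncio.Queue()
    body_terminated = False

    async def streaming_send(message: Dict[str, Any]) -> None:
        nonlocal body_terminated
        body = message.get("body", b"")
        if body:
            await body_queue.put(body)
        if not message.get("more_body", False):
            body_terminated = True
            await body_queue.put(None)  # normal end-of-stream sentinel

    async def run_handler() -> None:
        try:
            for message in messages:
                await streaming_send(message)
        finally:
            # Only send a fallback sentinel if the normal path did not.
            if not (use_flag and body_terminated):
                await body_queue.put(None)

    await run_handler()
    drained: List[Optional[bytes]] = []
    while not body_queue.empty():
        drained.append(body_queue.get_nowait())
    return drained


complete = [{"body": b"a", "more_body": True}, {"body": b"b", "more_body": False}]
truncated = [{"body": b"a", "more_body": True}]  # handler exits without a final chunk

print(asyncio.run(relay(complete, use_flag=False)))  # [b'a', b'b', None, None]
print(asyncio.run(relay(complete, use_flag=True)))   # [b'a', b'b', None]
print(asyncio.run(relay(truncated, use_flag=True)))  # [b'a', None]
```

Without the guard a normally completed stream enqueues two sentinels; with it, exactly one sentinel is sent whether the handler finishes cleanly or exits early.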
@JVenberg
Contributor Author

@greptileai

Contributor

@greptile-apps greptile-apps bot left a comment

5 files reviewed, no comments

@shin-bot-litellm
Contributor

Review

1. Does this PR fix the issue it describes?
Yes. Fixes three issues filed by the author:

  1. SSE streaming (#22073, "MCP proxy dynamic_mcp_route buffers SSE responses instead of streaming"): Replaced the buffered Response with a Queue-based StreamingResponse for proper SSE streaming
  2. Mount order (#22074, "MCP sub-app mount order makes /sse transport unreachable"): Fixed /sse being unreachable due to the / catch-all being registered first
  3. OAuth pollution (#22075, "MCP OAuth root endpoint auto-resolution pollutes non-OAuth server discovery"): _resolve_oauth2_server_for_root_endpoints now only auto-resolves when ALL servers are OAuth2

Changes are well-documented with 8 new tests covering streaming, non-streaming, 404s, mount order, and OAuth resolution.

2. Has this issue already been solved elsewhere?
No. These are new MCP server proxy features and the bugs are specific to this experimental code path. No duplicates found.

✅ LGTM — comprehensive fix with solid test coverage. The MCP integration is complex and this addresses real issues blocking external clients like Claude Code.
