fix(mcp): Strip stale mcp-session-id to prevent 400 errors across proxy workers (#21417)
Conversation
Greptile Summary: This PR refactors the stale-session handling.
No functional changes to session routing logic — stale sessions are still stripped on non-DELETE and return 200 on DELETE, same as before. Confidence Score: 4/5
| Filename | Overview |
|---|---|
| litellm/proxy/_experimental/mcp_server/server.py | Refactors _handle_stale_mcp_session with case-insensitive header matching, defensive type checks, improved error handling around _server_instances access, fixed issue reference (#20292→#20992), and lazy log formatting. Adds a spurious blank line in stdlib imports. |
| tests/mcp_tests/test_mcp_server.py | Minor test comment update clarifying that send is passed directly without a wrapper. No behavioral changes. |
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Incoming MCP Request] --> B{Has mcp-session-id header?}
    B -- No --> G[Pass to session manager]
    B -- Yes --> C{Can inspect _server_instances?}
    C -- No --> G
    C -- Yes --> D{Session ID in worker memory?}
    D -- Yes --> G
    D -- Error --> G
    D -- No / Stale --> E{HTTP method?}
    E -- DELETE --> F[Return 200 OK immediately]
    E -- Non-DELETE --> H[Strip mcp-session-id header]
    H --> G
```
Last reviewed commit: 406c468
```diff
 # Check Redis first for multi-worker deployments
 session_exists_in_redis = False
 if _REDIS_SESSION_STORE:
     try:
         session_exists_in_redis = await _REDIS_SESSION_STORE.session_exists(_session_id)
         verbose_logger.debug(
             f"MCP session {_session_id} exists in Redis: {session_exists_in_redis}"
         )
     except Exception as e:
         verbose_logger.error(
             f"Error checking MCP session {_session_id} in Redis: {e}. "
             "Falling back to in-memory check."
         )

 # Fallback to in-memory check
 known_sessions = getattr(mgr, "_server_instances", None)
-if known_sessions is None or _session_id in known_sessions:
+session_exists_in_memory = known_sessions is not None and _session_id in known_sessions

 # Session exists if it's in either Redis or memory
 if session_exists_in_redis or session_exists_in_memory or known_sessions is None:
```
Sessions are never saved to Redis — Redis check is always false
The _handle_stale_mcp_session function calls _REDIS_SESSION_STORE.session_exists(), but save_session() is never called anywhere in the server code. No code path writes sessions to Redis when they are created (e.g., during /mcp/initialize). This means session_exists will always return False, and the multi-worker problem described in #20992 is not actually solved by this PR.
To fix this, you need to call _REDIS_SESSION_STORE.save_session() when a new MCP session is created (likely after the session manager processes an initialize request) and _REDIS_SESSION_STORE.delete_session() when sessions are terminated.
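A minimal sketch of the missing write path, using an in-memory stand-in for the store; the hook names `on_session_created` and `on_session_terminated` are hypothetical placeholders for wherever the session manager processes initialize and DELETE requests:

```python
import asyncio


class InMemorySessionStore:
    """Stand-in for MCPRedisSessionStore; same method names, dict-backed."""

    def __init__(self):
        self._sessions = {}

    async def save_session(self, session_id, metadata):
        self._sessions[session_id] = metadata

    async def delete_session(self, session_id):
        self._sessions.pop(session_id, None)

    async def session_exists(self, session_id):
        return session_id in self._sessions


async def on_session_created(store, session_id):
    # Hypothetical hook: invoke after the session manager handles initialize
    await store.save_session(session_id, {"created": True})


async def on_session_terminated(store, session_id):
    # Hypothetical hook: invoke when a DELETE terminates the session
    await store.delete_session(session_id)


async def demo():
    store = InMemorySessionStore()
    await on_session_created(store, "abc123")
    assert await store.session_exists("abc123") is True
    await on_session_terminated(store, "abc123")
    assert await store.session_exists("abc123") is False


asyncio.run(demo())
```

Without the `save_session` call at creation time, `session_exists` can never return True, which is the gap described above.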
```python
value = json.dumps(metadata_with_ts)

# Save to Redis with TTL
await self.redis_cache.async_set_cache(
    key=key,
    value=value,
    ttl=self.ttl
)
```
Double JSON encoding — value is serialized twice
save_session calls json.dumps(metadata_with_ts) to produce a JSON string, then passes this string to async_set_cache. However, RedisCache.async_set_cache internally calls json.dumps(value) again (see redis_cache.py:487), resulting in double-encoded JSON stored in Redis.
On read, _get_cache_logic does one json.loads, returning the still-JSON-encoded string, and then get_session does another json.loads — so it works by accident. But this is fragile: if the RedisCache internals change, or if anyone reads this key directly from Redis, they'll get a double-encoded string.
Remove the json.dumps here and pass the dict directly to async_set_cache.
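The double-encoding pitfall is easy to demonstrate in isolation; plain `json` here stands in for the serialization that `async_set_cache` performs internally:

```python
import json

metadata = {"worker_pid": "1234", "created_at": 1700000000}

# Caller serializes once...
value = json.dumps(metadata)
# ...and the cache layer serializes again (as async_set_cache does internally).
stored = json.dumps(value)

# A single json.loads does not recover the dict — it yields a string.
first_pass = json.loads(stored)
assert isinstance(first_pass, str)

# Only a second loads recovers the original data ("works by accident").
assert json.loads(first_pass) == metadata

# Passing the dict directly avoids the double encoding.
stored_correctly = json.dumps(metadata)
assert json.loads(stored_correctly) == metadata
```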
```python
try:
    import litellm
    from litellm.caching.redis_cache import RedisCache
```
Inline imports violate codebase convention
Per the project's CLAUDE.md, imports should be placed at module level, not inside methods. import litellm and from litellm.caching.redis_cache import RedisCache should be moved to the top of the file or within the existing if MCP_AVAILABLE: block where other imports live (around line 112-127).
Suggested change:

```diff
-try:
-    import litellm
-    from litellm.caching.redis_cache import RedisCache
+from litellm.caching.redis_cache import RedisCache
```
Context Used: Context from dashboard - CLAUDE.md (source)
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
```python
# Delete from Redis
result = await self.redis_cache.async_delete_cache(key=key)
```
Inconsistent call pattern for delete method
The async_delete_cache in the Redis cache class defines its parameter as a positional arg. Most callers in the codebase pass it positionally rather than as a named argument.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Additional Comments (1)
Both session managers are created with stateless=True.
Greptile Summary: This PR adds a Redis-backed session store. Additionally, the session managers are configured with stateless=True.
Confidence Score: 1/5
| Filename | Overview |
|---|---|
| litellm/proxy/_experimental/mcp_server/server.py | Adds Redis session store initialization and existence check in _handle_stale_mcp_session, but save_session() is never called so the Redis store is always empty. The stateless session manager config also means _server_instances isn't used for session persistence, making the Redis check ineffective. |
| litellm/proxy/_experimental/mcp_server/redis_session_store.py | New Redis-backed session store with save/get/delete/exists methods. Implementation is well-structured with error handling, but the class is never meaningfully used since save_session() is never called from the server code. |
| tests/mcp_tests/test_redis_session_store.py | Unit tests for the Redis session store class. Tests cover CRUD operations and error handling well but only test the isolated class — they don't verify integration with the actual MCP server flow. |
| tests/mcp_tests/test_multi_worker.py | Multi-worker integration tests simulate cross-worker session sharing, but only test the session store in isolation — they don't test the actual multi-worker MCP request flow where the session store is supposedly integrated. |
Flowchart

```mermaid
flowchart TD
    A[Client Request with mcp-session-id] --> B{_handle_stale_mcp_session}
    B --> C{_REDIS_SESSION_STORE exists?}
    C -->|Yes| D[session_exists in Redis]
    C -->|No| F[Check _server_instances in-memory]
    D -->|Always False - save_session never called| E[session_exists_in_redis = False]
    E --> F
    F --> G{session in memory OR known_sessions is None?}
    G -->|Yes| H[Return False - let session manager handle]
    G -->|No - stale session| I{DELETE request?}
    I -->|Yes| J[Return 200 success]
    I -->|No| K[Strip stale header, return False]
    style D fill:#f66,stroke:#333,color:#fff
    style E fill:#f66,stroke:#333,color:#fff
```
Last reviewed commit: e5f2fad
```diff
 # Check Redis first for multi-worker deployments
 session_exists_in_redis = False
 if _REDIS_SESSION_STORE:
     try:
         session_exists_in_redis = await _REDIS_SESSION_STORE.session_exists(_session_id)
         verbose_logger.debug(
             f"MCP session {_session_id} exists in Redis: {session_exists_in_redis}"
         )
     except Exception as e:
         verbose_logger.error(
             f"Error checking MCP session {_session_id} in Redis: {e}. "
             "Falling back to in-memory check."
         )

 # Fallback to in-memory check
 known_sessions = getattr(mgr, "_server_instances", None)
-if known_sessions is None or _session_id in known_sessions:
+session_exists_in_memory = known_sessions is not None and _session_id in known_sessions

 # Session exists if it's in either Redis or memory
 if session_exists_in_redis or session_exists_in_memory or known_sessions is None:
```
save_session() is never called — Redis store is always empty
The _REDIS_SESSION_STORE.session_exists() check will always return False because save_session() is never called anywhere in the codebase. The MCPRedisSessionStore class defines save_session(), get_session(), delete_session(), and session_exists(), but the only integration point with server.py is this session_exists() call in _handle_stale_mcp_session(). No code ever writes session data to Redis.
This means the Redis-backed session store is entirely inert — it will never find a session because no session was ever stored. The multi-worker fix described in the PR cannot work as implemented.
To actually fix the multi-worker issue, sessions would need to be saved to Redis when they're created (e.g., during /mcp/initialize) and the session manager would need to be able to reconstruct server instances from Redis-stored metadata on a different worker.
```python
try:
    import litellm
    from litellm.caching.redis_cache import RedisCache
```
Inline imports violate project style guide
The import litellm and from litellm.caching.redis_cache import RedisCache statements inside the function body violate the project's code style convention. Per CLAUDE.md: "Avoid imports within methods — place all imports at the top of the file (module-level)."
These imports should be moved to the top of the file, inside the if MCP_AVAILABLE: block alongside the other conditional imports.
Context Used: Context from dashboard - CLAUDE.md (source)
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
```python
async def session_exists(self, session_id: str) -> bool:
    """
    Check if a session exists in Redis.

    Args:
        session_id: The MCP session ID to check

    Returns:
        True if session exists, False otherwise
    """
    if not self.redis_cache:
        return False

    try:
        metadata = await self.get_session(session_id)
        return metadata is not None
    except Exception as e:
        verbose_logger.error(
            f"Failed to check MCP session {session_id} existence: {e}"
        )
        return False
```
session_exists performs full GET+deserialize instead of using Redis EXISTS
The session_exists() method calls get_session() which fetches the full value and deserializes it from JSON just to check existence. This is wasteful — Redis has a dedicated EXISTS command that returns a boolean without transferring the value. For a check that runs on every MCP request in the hot path, this adds unnecessary latency and bandwidth.
Consider using a direct Redis EXISTS check or the RedisCache equivalent if available.
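A sketch of the difference, using a stub client rather than the real `RedisCache` (the stub's method names mirror the Redis commands, not litellm's API):

```python
import asyncio
import json


class StubRedis:
    """Minimal stand-in for an async Redis client (illustration only)."""

    def __init__(self):
        self._kv = {}
        self.bytes_transferred = 0

    async def set(self, key, value):
        self._kv[key] = value

    async def get(self, key):
        value = self._kv.get(key)
        if value is not None:
            self.bytes_transferred += len(value)  # GET moves the payload
        return value

    async def exists(self, key):
        # EXISTS returns a count without transferring the value payload
        return 1 if key in self._kv else 0


async def exists_via_get(client, key):
    # Current approach: full GET + JSON deserialize just to test presence
    value = await client.get(key)
    return value is not None and json.loads(value) is not None


async def exists_via_exists(client, key):
    # Suggested approach: O(1) EXISTS, no payload transfer
    return bool(await client.exists(key))


async def demo():
    client = StubRedis()
    await client.set("mcp:session:abc", json.dumps({"meta": "x" * 1024}))
    assert await exists_via_get(client, "mcp:session:abc") is True
    moved = client.bytes_transferred          # payload was transferred
    assert await exists_via_exists(client, "mcp:session:abc") is True
    assert client.bytes_transferred == moved  # EXISTS moved no payload
    return moved


transferred = asyncio.run(demo())
assert transferred > 1024
```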
Additional Comments (1)
The session managers are configured with stateless=True. The core multi-worker issue (sessions stored in per-worker _server_instances) is therefore not in play the way the PR assumes. Adding a Redis existence check on top of a stateless session manager doesn't change the request handling behavior in a meaningful way.
Force-pushed from e5f2fad to ff42117
Force-pushed from ff42117 to 91881ee
@greptileai review
Greptile Summary: This PR adds a Redis-backed session registry (MCPRedisSessionStore).
Confidence Score: 2/5
| Filename | Overview |
|---|---|
| litellm/proxy/_experimental/mcp_server/redis_session_store.py | New Redis-backed session store class. Clean implementation with proper error handling and TTL support, but session_exists() uses a different Redis access path than write operations, and the store's value is questionable with stateless=True mode. |
| litellm/proxy/_experimental/mcp_server/server.py | Core integration of Redis session store into MCP request handling. The _wrap_send_for_session_capture pattern is clever, but the Redis awareness doesn't change session routing behavior in stateless=True mode — it only changes log levels. worker_pid metadata uses id(session_manager) instead of os.getpid(). |
| tests/mcp_tests/test_mcp_server.py | Minor test update to accommodate the send wrapper — correctly verifies that the third argument is callable rather than checking identity. Clean adaptation. |
| tests/mcp_tests/test_multi_worker.py | New integration tests for multi-worker session handling. Tests are well-structured but some re-implement the capture logic manually instead of testing the actual _wrap_send_for_session_capture function, reducing confidence in true end-to-end coverage. |
| tests/mcp_tests/test_redis_session_store.py | Comprehensive unit tests for the Redis session store covering CRUD, error handling, concurrent access, and cross-worker simulation. Good test coverage. |
Flowchart

```mermaid
flowchart TD
    A[Client sends mcp-session-id header] --> B{Header present?}
    B -- No --> Z[Pass to session_manager.handle_request]
    B -- Yes --> C{Session in worker memory?}
    C -- Yes --> Z
    C -- No --> D{Redis available?}
    D -- Yes --> E[Redis EXISTS check]
    D -- No --> G
    E -- Found --> F["Log info: 'exists in Redis'\nStrip header"]
    E -- Not found --> G["Log warning: 'not found'\nStrip header"]
    F --> Z
    G --> Z
    Z --> H[_wrap_send_for_session_capture intercepts response]
    H --> I{mcp-session-id in response?}
    I -- Yes --> J[_persist_captured_session saves to Redis]
    I -- No --> K[No-op]
    J --> L[Response sent to client]
    K --> L
    style F fill:#fff3cd
    style G fill:#fff3cd
    style J fill:#d4edda
```
Last reviewed commit: 91881ee
```python
try:
    await _REDIS_SESSION_STORE.save_session(
        captured_session_id,
        {"worker_pid": str(id(session_manager))},
    )
```
worker_pid stores Python object ID, not OS PID
id(session_manager) returns the memory address of the Python object, not the actual OS process ID. This is misleading metadata — different workers with forked processes will have different os.getpid() values, but id(session_manager) could coincidentally collide across workers since each fork has its own address space. Use os.getpid() instead for meaningful worker identification.
Suggested change:

```diff
-{"worker_pid": str(id(session_manager))},
+{"worker_pid": str(__import__('os').getpid())},
```
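The distinction is easy to verify; the `SessionManager` class below is just a stand-in object for illustration:

```python
import os


class SessionManager:
    """Stand-in object; only used to demonstrate id() vs os.getpid()."""
    pass


manager = SessionManager()

# id() is a CPython memory address — meaningless as a worker identifier,
# and two forked workers can coincidentally report the same value.
object_id = id(manager)

# os.getpid() is the actual OS process ID, unique per worker process.
pid = os.getpid()

assert isinstance(pid, int) and pid > 0
metadata = {"worker_pid": str(pid)}  # meaningful across workers
assert metadata["worker_pid"].isdigit()
```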
```diff
+if session_exists_in_redis:
+    verbose_logger.info(
+        "MCP session '%s' exists in Redis (created by another worker) "
+        "but not in this worker's memory. Stripping header so a "
+        "fresh session can be created on this worker.",
+        _session_id,
+    )
+else:
     # Non-DELETE: strip stale session ID to allow new session creation
     verbose_logger.warning(
-        "MCP session ID '%s' not found in active sessions. "
-        "Stripping stale header to force new session creation.",
+        "MCP session ID '%s' not found in active sessions (memory or "
+        "Redis). Stripping stale header to force new session creation.",
         _session_id,
     )
-scope["headers"] = [
-    (k, v) for k, v in scope["headers"]
-    if k != _mcp_session_header
-]
-return False
+scope["headers"] = [
+    (k, v)
+    for k, v in scope["headers"]
+    if k != _mcp_session_header
+]
+return False
```
Redis awareness has no practical effect in stateless mode
In stateless=True mode (line 158), _server_instances is never populated with session IDs by the session manager — each request creates a fresh transport with mcp_session_id=None. This means:

- When a client sends back a session ID it received from a previous response, that ID will never be in _server_instances on any worker.
- The Redis session_exists check will return True (because _persist_captured_session saved it), but the code still strips the header and creates a fresh session — identical behavior to when the session is not in Redis.
- The only difference between "session in Redis" and "session not in Redis" is whether an info or warning log message is emitted (lines 2045-2056).

The Redis store adds latency to every request (one EXISTS call on inbound + one SET on outbound) but doesn't change session routing behavior. Consider documenting this limitation clearly, or reconsidering whether Redis awareness provides value with stateless=True.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
```python
    key = self.redis_cache.check_and_fix_namespace(
        key=self._make_key(session_id)
    )

    # Use Redis EXISTS directly — O(1), no data transfer
    _client = self.redis_cache.init_async_client()
    result = await _client.exists(key)
    return bool(result)
except Exception as e:
    verbose_logger.error(
        "Failed to check MCP session %s existence: %s",
        session_id,
        e,
    )
```
session_exists bypasses namespace used by save_session
save_session writes via async_set_cache which internally calls check_and_fix_namespace on the key. But session_exists calls check_and_fix_namespace manually and then uses init_async_client().exists(key) directly — bypassing whatever key transformations async_set_cache might apply beyond namespacing. If async_set_cache applies any additional key mutations (e.g., encoding or prefixing beyond check_and_fix_namespace), the EXISTS check could miss keys that were written via save_session.
It would be safer to use the same access path (e.g., async_get_cache) or verify that async_set_cache doesn't transform the key beyond check_and_fix_namespace.
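The risk can be illustrated with a toy namespace transformation; the `check_and_fix_namespace` behavior below is an assumption for illustration, not litellm's actual implementation:

```python
def check_and_fix_namespace(key, namespace="litellm"):
    # Stand-in for RedisCache.check_and_fix_namespace (behavior assumed):
    # prefix the key with the namespace unless it is already prefixed.
    return key if key.startswith(namespace + ":") else f"{namespace}:{key}"


store = {}  # stand-in for Redis


def set_cache(key, value):
    # Write path applies the namespace transformation internally,
    # the way async_set_cache does.
    store[check_and_fix_namespace(key)] = value


def exists_raw(key):
    # Read path that skips the shared transformation misses the key.
    return key in store


def exists_same_path(key):
    # Read path that reuses the same transformation stays consistent.
    return check_and_fix_namespace(key) in store


set_cache("mcp_session:abc", "meta")
assert exists_raw("mcp_session:abc") is False       # misses the namespaced key
assert exists_same_path("mcp_session:abc") is True  # same path, consistent
```

Any divergence between the write-side and read-side key transformations produces exactly this silent miss, which is why reusing `async_get_cache` (or verifying the transformation) is the safer option.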
Force-pushed from 91881ee to f34ac3d
@greptileai review
Greptile Summary: This PR adds a Redis-backed session registry (MCPRedisSessionStore). Key concern: because the MCP session manager runs in stateless=True mode, the Redis check does not change session routing behavior.
Confidence Score: 2/5
| Filename | Overview |
|---|---|
| litellm/proxy/_experimental/mcp_server/redis_session_store.py | New Redis-backed session store class with CRUD operations. Well-structured with proper error handling and graceful fallback. No double JSON encoding. session_exists now uses async_get_cache for consistency. |
| litellm/proxy/_experimental/mcp_server/server.py | Core integration: Redis session store init, stale session handling with Redis awareness, ASGI send wrapper for session capture. Adds Redis GET on every inbound request and Redis SET on every outbound response, but in stateless mode this only changes log verbosity — no behavioral impact on session routing. |
| tests/mcp_tests/test_mcp_server.py | Minor test update to accommodate the new _wrap_send_for_session_capture wrapper — now checks callable instead of exact identity for the send argument. |
| tests/mcp_tests/test_multi_worker.py | 11 integration tests covering stale handler with/without Redis, capture pattern, full multi-worker flow. Tests are well-structured but some re-implement the capture logic inline instead of testing the actual _wrap_send_for_session_capture function. |
| tests/mcp_tests/test_redis_session_store.py | 23 unit tests for MCPRedisSessionStore covering CRUD, error handling, graceful fallback, serialization correctness, TTL, concurrent access, and cross-worker simulation. Thorough coverage. |
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant Worker as Any Worker (handle_streamable_http_mcp)
    participant StaleHandler as _handle_stale_mcp_session
    participant Redis as Redis Session Store
    participant SDK as MCP SDK (stateless=True)
    Client->>Worker: POST /mcp (mcp-session-id: abc123)
    Worker->>StaleHandler: Check session header
    StaleHandler->>Redis: session_exists("abc123")
    Redis-->>StaleHandler: true/false (observability only)
    Note over StaleHandler: _server_instances always empty<br/>in stateless mode
    StaleHandler->>StaleHandler: Strip mcp-session-id header
    StaleHandler-->>Worker: return False (continue)
    Worker->>SDK: handle_request(scope, receive, wrapped_send)
    SDK-->>Worker: Response with new mcp-session-id
    Worker->>Worker: _send_and_capture extracts session ID
    Worker->>Redis: save_session(new_id, {worker_pid})
    Worker-->>Client: Response
```
Last reviewed commit: f34ac3d
```python
wrapped_send, get_captured_id = _wrap_send_for_session_capture(send)

await session_manager.handle_request(scope, receive, wrapped_send)

# Persist the session ID in Redis so other workers know it exists
await _persist_captured_session(get_captured_id())
```
Redis calls on every MCP request add latency with no behavioral effect
In stateless=True mode, every request creates a fresh transport and _server_instances is always empty. This means:

- Inbound: session_exists() (a full Redis GET + deserialize) runs on every request with a session header, but the result only changes the log level (INFO vs WARNING) — the header is stripped either way.
- Outbound: _persist_captured_session() (a Redis SET) runs after every response, writing session metadata that will only ever be used for the log-level distinction above.

This adds ~2 Redis round-trips to every MCP request for observability-only benefit. Per project guidelines, avoid creating new requests in the critical request path (custom rule 0c2a17ad). Consider either:

- Making the Redis calls conditional on a config flag (e.g., MCP_SESSION_REGISTRY_ENABLED)
- Deferring _persist_captured_session to a background task (asyncio.create_task) so it doesn't block the response
- Documenting clearly that this latency is intentional and expected to be useful only when migrating to stateful mode later
Context Used: Rule from dashboard - What: Avoid creating new database requests or Router objects in the critical request path.
Why: Cre... (source)
tests/mcp_tests/test_multi_worker.py (Outdated)
```python
@pytest.mark.asyncio
async def test_send_and_capture_saves_session_to_redis(shared_redis):
    """
    Simulate the _send_and_capture wrapper: when the response includes
    a mcp-session-id header, save_session should be called.
    """
    redis_store = MCPRedisSessionStore(redis_cache=shared_redis, ttl=3600)

    # Simulate what handle_streamable_http_mcp does
    captured_session_id = None

    async def mock_send(message):
        pass

    async def send_and_capture(message):
        nonlocal captured_session_id
        if message.get("type") == "http.response.start":
            for name, value in message.get("headers", []):
                if isinstance(name, bytes) and name.lower() == b"mcp-session-id":
                    captured_session_id = value.decode("utf-8")
                    break
        await mock_send(message)

    # Simulate a response with mcp-session-id header
    await send_and_capture(
        {
            "type": "http.response.start",
            "status": 200,
            "headers": [
                (b"content-type", b"application/json"),
                (b"mcp-session-id", b"new-session-abc123"),
            ],
        }
    )

    assert captured_session_id == "new-session-abc123"

    # Now simulate the post-request save
    if captured_session_id:
        await redis_store.save_session(captured_session_id, {"worker": "1"})

    # Verify the session is in Redis
    assert await redis_store.session_exists("new-session-abc123") is True
    meta = await redis_store.get_session("new-session-abc123")
    assert meta is not None
    assert meta["worker"] == "1"
```
Tests re-implement capture logic instead of testing the actual function
test_send_and_capture_saves_session_to_redis and test_send_and_capture_ignores_no_session_header manually re-implement the header-capture logic inline (lines 238-244, 277-283) rather than importing and testing the actual _wrap_send_for_session_capture function from server.py. This means the tests don't verify the real implementation — they only prove that a separately-written copy of the logic works.
Consider importing _wrap_send_for_session_capture and _persist_captured_session and testing them directly, similar to how _handle_stale_mcp_session is tested above.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
```diff
-        False if the request should continue to the session manager
+        True if the request was fully handled (e.g. DELETE on
+        non-existent session). False if the request should continue
+        to the session manager.

     Fixes https://github.com/BerriAI/litellm/issues/20292
```
Docstring references wrong issue number
The docstring says "Fixes #20292" but the actual issue this PR addresses is #20992. This appears to be a typo from the original code that was carried forward.
Suggested change:

```diff
-Fixes https://github.com/BerriAI/litellm/issues/20292
+Fixes https://github.com/BerriAI/litellm/issues/20992
```
Force-pushed from f34ac3d to cdc4316
@greptileai review
Greptile Summary: This PR adds a Redis-backed session registry (MCPRedisSessionStore).
Confidence Score: 2/5
| Filename | Overview |
|---|---|
| litellm/proxy/_experimental/mcp_server/redis_session_store.py | New Redis session store with CRUD operations. session_exists() uses full GET+deserialize instead of Redis EXISTS as the PR description claims. Overall well-structured with proper error handling and graceful degradation. |
| litellm/proxy/_experimental/mcp_server/server.py | Core integration: adds Redis session awareness, _handle_stale_mcp_session refactored with Redis check, _wrap_send_for_session_capture intercepts response headers, _persist_captured_session fires as untracked background task. Redis adds latency on every MCP request but only provides observability benefit in stateless mode. |
| tests/mcp_tests/test_mcp_server.py | Minor adjustment to existing test to account for the new send wrapper — asserts wrapped send is callable instead of identity check. Correct adaptation. |
| tests/mcp_tests/test_multi_worker.py | 11 integration tests covering stale session handling with Redis, send-and-capture wrapper, full multi-worker flow, and fallback behavior. Good coverage but tests use mocked Redis which may not catch namespace/key transformation mismatches with the real RedisCache. |
| tests/mcp_tests/test_redis_session_store.py | 23 unit tests for MCPRedisSessionStore covering CRUD, error handling, graceful fallback, serialization correctness, TTL, key namespace, concurrent access, and cross-worker simulation. Thorough coverage of the store class. |
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant ASGI as handle_streamable_http_mcp
    participant Stale as _handle_stale_mcp_session
    participant Redis as MCPRedisSessionStore
    participant SDK as session_manager.handle_request
    participant Persist as _persist_captured_session
    Client->>ASGI: POST /mcp (mcp-session-id: abc-123)
    ASGI->>Stale: Check stale session
    Stale->>Redis: session_exists("abc-123") [GET + deserialize]
    Redis-->>Stale: true/false
    Note over Stale: Check _server_instances (always empty in stateless mode)
    alt Session not in memory (always true in stateless)
        Stale->>Stale: Strip mcp-session-id header
        Note over Stale: Log INFO (if in Redis) or WARNING (if not)
    end
    Stale-->>ASGI: False (continue processing)
    ASGI->>ASGI: _wrap_send_for_session_capture(send)
    ASGI->>SDK: handle_request(scope, receive, wrapped_send)
    SDK-->>ASGI: Response with new mcp-session-id header
    Note over ASGI: Capture new session ID from response headers
    ASGI->>Persist: asyncio.create_task (fire-and-forget)
    Persist->>Redis: save_session(new-id, {worker_pid})
    Redis-->>Persist: OK
    ASGI-->>Client: Response (new mcp-session-id)
```
Last reviewed commit: cdc4316
```python
# Persist the session ID in Redis (background task, non-blocking)
captured_id = get_captured_id()
if captured_id and _REDIS_SESSION_STORE:
    asyncio.create_task(_persist_captured_session(captured_id))
```
Untracked asyncio.create_task may be garbage collected
The task returned by asyncio.create_task() is not stored in any variable or set. Per Python docs, the event loop only holds a weak reference to tasks, so an unreferenced task can be garbage collected before it completes — silently dropping the Redis write.
While this pattern exists elsewhere in the codebase, it's worth noting that for a best-effort background save, a silent failure due to GC is hard to debug. Consider storing the task reference (e.g., in a module-level set with a done-callback that removes it), which is the recommended practice:
```python
_background_tasks = set()

task = asyncio.create_task(_persist_captured_session(captured_id))
_background_tasks.add(task)
task.add_done_callback(_background_tasks.discard)
```
```python
async def session_exists(self, session_id: str) -> bool:
    """
    Check if a session exists in Redis.

    Uses the same ``async_get_cache`` access path as ``save_session``
    to guarantee key transformations (namespacing, prefixing) are
    consistent.

    Args:
        session_id: The MCP session ID to check

    Returns:
        True if session exists, False otherwise
    """
    if not self.redis_cache:
        return False

    try:
        key = self._make_key(session_id)
        value = await self.redis_cache.async_get_cache(key=key)
        return value is not None
    except Exception as e:
        verbose_logger.error(
            "Failed to check MCP session %s existence: %s",
            session_id,
            e,
        )
        return False
```
PR description claims O(1) EXISTS but code does full GET
The PR description states "O(1) existence check: session_exists() uses Redis EXISTS command directly (not GET+deserialize)" but the implementation calls async_get_cache which performs a full GET followed by json.loads deserialization. While the docstring here was updated to acknowledge this, the PR description is misleading.
Since session_exists is called on every inbound MCP request with a session header, the difference matters: Redis EXISTS returns a boolean without transferring the value payload, whereas GET transfers the full serialized JSON and then the client deserializes it. For a hot-path existence check, consider using a lighter-weight approach if the RedisCache API supports it, or at minimum update the PR description to reflect the actual implementation.
```python
# Check Redis for multi-worker session awareness
session_exists_in_redis = False
if _REDIS_SESSION_STORE:
    try:
        session_exists_in_redis = await _REDIS_SESSION_STORE.session_exists(
            _session_id
        )
        verbose_logger.debug(
            "MCP session %s exists in Redis: %s",
            _session_id,
            session_exists_in_redis,
        )
    except Exception as e:
        verbose_logger.error(
            "Error checking MCP session %s in Redis: %s. "
            "Falling back to in-memory check.",
            _session_id,
            e,
        )
```
Blocking Redis call on every inbound MCP request in the hot path
session_exists() (which performs a full Redis GET + JSON deserialize) is awaited synchronously on every MCP request that carries an mcp-session-id header. In stateless=True mode, this call only determines the log level (INFO vs WARNING) — it does not change any routing or session behavior (the header is stripped either way).
This adds non-trivial latency to every MCP request for observability-only benefit. Per project custom instruction 0c2a17ad: "Avoid creating new database requests or Router objects in the critical request path." While Redis isn't a database in the traditional sense, the spirit of that rule applies — adding a network round-trip to every request for logging purposes is a concern.
Consider making this conditional on a debug/verbose flag, or moving it to a background task so it doesn't block the request.
Context Used: Rule from dashboard - What: Avoid creating new database requests or Router objects in the critical request path.
Why: Cre... (source)
Force-pushed from cdc4316 to 2e03590
@greptileai review
Greptile Summary

This PR adds a Redis-backed session registry (`MCPRedisSessionStore`).

Confidence Score: 1/5
| Filename | Overview |
|---|---|
| litellm/proxy/_experimental/mcp_server/server.py | Adds Redis session capture/persist on outbound responses and DELETE cleanup, but the inbound `_handle_stale_mcp_session` never consults Redis for routing decisions. In stateless mode, Redis writes fire on every response with no behavioral impact, adding overhead without solving the multi-worker problem. |
| litellm/proxy/_experimental/mcp_server/redis_session_store.py | Well-structured Redis session store with proper error handling, graceful fallback, and TTL support. The implementation itself is clean, but the store is effectively unused for its stated purpose (cross-worker session awareness) since the server never reads from it on inbound requests. |
| tests/mcp_tests/test_redis_session_store.py | Comprehensive unit tests for the Redis session store CRUD operations, error handling, and cross-worker simulation. Tests are well-organized and thorough for the store itself. |
| tests/mcp_tests/test_multi_worker.py | Integration tests for the multi-worker flow, but test_full_multi_worker_flow inadvertently proves Redis has no effect: the outcome is identical with or without Redis. Tests don't verify any behavioral difference enabled by the Redis store. |
| tests/mcp_tests/test_mcp_server.py | Minor update to accommodate the new `_wrap_send_for_session_capture` wrapper; correctly verifies that the third argument to `handle_request` is now a wrapped callable rather than the original `send`. |
| PR_DESCRIPTION.md | PR description file committed to repo root; should be removed as it doesn't belong in the codebase. |
Flowchart

```mermaid
flowchart TD
    A[Client sends mcp-session-id header] --> B{Session in local _server_instances?}
    B -- Yes --> C[Let session manager handle]
    B -- No --> D{_server_instances is None?}
    D -- Yes --> C
    D -- No --> E{Method is DELETE?}
    E -- Yes --> F[Best-effort Redis delete_session]
    F --> G[Return 200 success]
    E -- No --> H[Strip mcp-session-id header]
    H --> I[Session manager creates fresh session]
    I --> J[_wrap_send_for_session_capture intercepts response]
    J --> K{Response has mcp-session-id?}
    K -- Yes --> L[Background task: save_session to Redis]
    K -- No --> M[No-op]
    L --> N["Redis SET (never read on inbound)"]
    style N fill:#f96,stroke:#333
    style L fill:#f96,stroke:#333
```
Last reviewed commit: 2e03590
```python
async def _persist_captured_session(
    captured_session_id: Optional[str],
) -> None:
    """Persist a newly-captured MCP session ID to Redis (best-effort)."""
    if captured_session_id and _REDIS_SESSION_STORE:
        try:
            await _REDIS_SESSION_STORE.save_session(
                captured_session_id,
                {"worker_pid": str(os.getpid())},
            )
        except Exception as exc:
            verbose_logger.error(
                "Failed to save MCP session %s to Redis: %s",
                captured_session_id,
                exc,
            )
```
Redis registry is write-only in stateless mode
_persist_captured_session saves session IDs to Redis on every outbound response, but _handle_stale_mcp_session never reads from Redis to influence routing decisions. The inbound path at lines 1989-2036 only checks _server_instances (local memory), and since stateless=True means _server_instances is never populated by the SDK, the session header is always stripped for any request carrying a session ID.
The Redis store is effectively write-only:

- Outbound: `save_session()` writes on every response (lines 2178-2183)
- DELETE: `delete_session()` cleans up (line 2010)
- Inbound: Redis is never consulted for routing
This means Worker B handling a request originally from Worker A will behave identically whether Redis is configured or not — the stale header is always stripped and a fresh session is created. The Redis writes add overhead (one SET per response) with no observable behavioral difference.
If cross-worker awareness is purely for observability/logging, consider documenting that clearly and making the Redis writes conditional on a verbose/debug flag to avoid unnecessary Redis traffic in production.
Context Used: Rule from dashboard - What: Avoid creating new database requests or Router objects in the critical request path.
Why: Cre... (source)
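One way to act on that suggestion is to gate the lookup on the logger's effective level, so production traffic skips the round-trip entirely. A sketch under assumed names (`verbose_logger` here is a plain stdlib logger, and `store` is any object with a `session_exists` coroutine):

```python
import logging

verbose_logger = logging.getLogger("mcp.session.debug")  # illustrative name

async def maybe_log_session_presence(store, session_id: str) -> bool:
    """Only pay the Redis round-trip when DEBUG logging is actually enabled.

    Returns True if the existence check was performed, False if skipped.
    """
    if store is None or not verbose_logger.isEnabledFor(logging.DEBUG):
        # Skipped: the result would only feed a debug log line anyway
        return False
    exists = await store.session_exists(session_id)
    verbose_logger.debug("MCP session %s exists in Redis: %s", session_id, exists)
    return True
```

The same shape works for a config flag instead of a log level; the point is that the network call happens only when its result is observable.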
PR_DESCRIPTION.md (outdated)

```diff
@@ -0,0 +1,146 @@
+## Relevant Issues
```
PR_DESCRIPTION.md should not be committed to the repo
This file duplicates the PR description and doesn't belong in the repository root. It should be removed from the commit — the PR description belongs in the GitHub PR body, not as a tracked file.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
```python
# Wrap `send` to capture the session ID from response headers
wrapped_send, get_captured_id = _wrap_send_for_session_capture(send)

await session_manager.handle_request(scope, receive, wrapped_send)

# Persist the session ID in Redis (background task, non-blocking)
captured_id = get_captured_id()
if captured_id and _REDIS_SESSION_STORE:
    task = asyncio.create_task(
        _persist_captured_session(captured_id)
    )
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
```
Background Redis write fires on every MCP response
_wrap_send_for_session_capture intercepts every outbound response, and if a mcp-session-id header is present, a background task is created to write to Redis. In stateless=True mode, the MCP SDK generates a new session ID on every request (since each creates a fresh StreamableHTTPServerTransport). This means every single MCP response triggers a Redis SET — writing a session ID that will never be looked up on inbound.
Per custom rule 0c2a17ad: "Avoid creating new database requests or Router objects in the critical request path." While this is technically a background task, it still creates a Redis round-trip per request. Consider gating this behind a configuration flag (e.g., MCP_SESSION_REGISTRY_ENABLED) or only writing when the session ID is new (not already seen).
Context Used: Rule from dashboard - What: Avoid creating new database requests or Router objects in the critical request path.
Why: Cre... (source)
tests/mcp_tests/test_multi_worker.py (outdated)
```python
@pytest.mark.asyncio
async def test_full_multi_worker_flow(shared_redis):
    """
    End-to-end scenario:
    1. Worker A handles initialize → captures session ID → saves to Redis
    2. Worker B receives follow-up with that session ID
    3. _handle_stale_mcp_session finds session in Redis → strips header (good)
    4. Worker B creates fresh session (stateless mode)
    """
    handle = _get_handle_stale()
    redis_store = MCPRedisSessionStore(redis_cache=shared_redis, ttl=3600)

    # Step 1: Worker A creates session and saves to Redis
    await redis_store.save_session(
        "abc123",
        {"origin": "worker_a", "worker_pid": "12345"},
    )

    # Step 2-3: Worker B receives request with session ID from Worker A
    scope = _make_scope(session_id="abc123")
    mgr_b = _mock_session_manager(session_ids=[])  # Worker B has no sessions

    with patch(
        "litellm.proxy._experimental.mcp_server.server._REDIS_SESSION_STORE",
        redis_store,
    ):
        result = await handle(scope, AsyncMock(), AsyncMock(), mgr_b)

    # Should return False (not handled) so request continues to session_manager
    assert result is False
    # Header should be stripped for fresh session creation
    header_names = [h[0] for h in scope["headers"]]
    assert b"mcp-session-id" not in header_names

    # Step 4: Session should still exist in Redis
    assert await redis_store.session_exists("abc123") is True
```
"Full multi-worker flow" test proves Redis has no effect
This test demonstrates the core architectural issue: even when Worker A's session is in Redis (line 311), Worker B still strips the header (line 330) and creates a fresh session. The test asserts result is False and b"mcp-session-id" not in header_names — which is exactly the same behavior as without Redis (see test_stale_handler_session_not_found_strips_header).
The test comment says "finds session in Redis → strips header (good)" but this is the same outcome as when the session is not in Redis. The Redis lookup isn't even happening in _handle_stale_mcp_session — the function doesn't call session_exists() on inbound requests. This confirms the Redis store has no behavioral impact in the current implementation.
Force-pushed 2e03590 to 5728c03 (Compare)

@greptileai review
tests/mcp_tests/test_mcp_server.py (outdated)
```python
# Note: send is wrapped by _wrap_send_for_session_capture, so we
# check scope/receive identity and that the third arg is callable
```
Comment references non-existent _wrap_send_for_session_capture function — this function doesn't exist in server.py. The actual code at line 2075 calls session_manager.handle_request(scope, receive, send) directly without any wrapper.
```diff
 # Check in-memory session tracking
 known_sessions = getattr(mgr, "_server_instances", None)
-if known_sessions is None or _session_id in known_sessions:
-    # Session exists or we can't check - let the session manager handle it
+session_exists_in_memory = (
+    known_sessions is not None and _session_id in known_sessions
+)
+
+# If session exists in this worker's memory, let the manager handle it
+if session_exists_in_memory:
+    return False
+
+# Session doesn't exist - handle based on request method
+# If we cannot inspect known_sessions, let the manager handle it
+if known_sessions is None:
+    return False
```
Early return on line 1960 makes the session_exists_in_memory variable on line 1951 redundant. Since you return False if known_sessions is None, you can simplify:
Suggested change. Current:

```python
# Check in-memory session tracking
known_sessions = getattr(mgr, "_server_instances", None)
session_exists_in_memory = (
    known_sessions is not None and _session_id in known_sessions
)

# If session exists in this worker's memory, let the manager handle it
if session_exists_in_memory:
    return False

# Session doesn't exist - handle based on request method
# If we cannot inspect known_sessions, let the manager handle it
if known_sessions is None:
    return False
```

Suggested:

```python
# Check in-memory session tracking
known_sessions = getattr(mgr, "_server_instances", None)
if known_sessions is None:
    # If we cannot inspect known_sessions, let the manager handle it
    return False

# If session exists in this worker's memory, let the manager handle it
if _session_id in known_sessions:
    return False
```
Force-pushed 5728c03 to 1bb2cad (Compare)

@greptileai review
```diff
 # Check in-memory session tracking
 known_sessions = getattr(mgr, "_server_instances", None)
-if known_sessions is None or _session_id in known_sessions:
-    # Session exists or we can't check - let the session manager handle it
+# If we cannot inspect known_sessions, let the manager handle it
+if known_sessions is None:
+    return False
```
Redundant early return makes the `session_exists_in_memory` variable unnecessary

Since you return early on line 1953 when `known_sessions` is None, the variable assigned on line 1951 is never needed.
```diff
 # Check in-memory session tracking
 known_sessions = getattr(mgr, "_server_instances", None)
-if known_sessions is None or _session_id in known_sessions:
-    # Session exists or we can't check - let the session manager handle it
+# If we cannot inspect known_sessions, let the manager handle it
+if known_sessions is None:
+    return False
```
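The suggested control flow can be checked in isolation with a tiny sketch. `should_defer_to_manager` is a hypothetical helper name (returning True means "let the session manager handle the request"); it mirrors the two early returns proposed above:

```python
def should_defer_to_manager(mgr: object, session_id: str) -> bool:
    """Defer to the session manager when the session map is
    uninspectable or the session is already known to this worker."""
    known_sessions = getattr(mgr, "_server_instances", None)
    if known_sessions is None:
        # Cannot inspect sessions on this manager - let it handle the request
        return True
    # A session known locally is also handled by the manager;
    # anything else is stale and the caller strips the header
    return session_id in known_sessions
```

This keeps one check per condition and drops the intermediate boolean entirely.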
CRITICAL: PR description completely misrepresents the changes

The PR description claims this PR adds:

- `MCPRedisSessionStore` class in `redis_session_store.py`
- 34 test cases (23 unit + 11 integration)
- Redis-backed session registry
- `save_session()`, `get_session()`, `session_exists()`, `delete_session()` methods
- Background task pattern with a `_background_tasks` set

None of these exist in the actual commit. The real changes are:

- Minor docstring/comment updates
- Fixed issue reference from #20292 ([Bug]: MCP Server (stdio transport) fails to connect from VSCode: "Session not found" error) to #20992 (MCP sessions are not shared across workers - multi-worker proxy drops MCP connections randomly)
- Refactored early-return logic in `_handle_stale_mcp_session`
- Updated log message wording
This is either the wrong commit for PR #21417, or the PR description was copy-pasted from a different PR entirely. This makes it impossible to properly review because all the previous review comments reference Redis code that doesn't exist here.
@greptileai review
PR description describes Redis-backed session store with MCPRedisSessionStore class, 34 tests, and multiple new files. The actual commit only contains minor refactoring: fixed issue reference (#20292→#20992), improved docstring clarity, split early return logic, and updated log messages. No Redis code, no new files, no 34 tests.
…orker deployments (BerriAI#20992) In a multi-worker Uvicorn setup, a client that reconnects to a different worker sends an mcp-session-id that the new worker has never seen. The MCP SDK returns 400 because the session is unknown. Fix: add _handle_stale_mcp_session() which inspects the inbound mcp-session-id header before the request reaches the SDK. If the session is not in this worker's _server_instances: - Non-DELETE: strip the header so the SDK creates a fresh session - DELETE: return 200 immediately (idempotent, session already gone) No new dependencies, no Redis, no latency added to the hot path. Fixes BerriAI#20992
Force-pushed 1bb2cad to 406c468 (Compare)

@greptileai review
Additional Comments (1)

@greptile re-review the entire PR
```python
import contextlib

```
Spurious blank line splits stdlib import group
This blank line separates contextlib from traceback, both of which are stdlib imports. PEP 8 and isort convention group all stdlib imports together without blank lines between them. This likely needs to be removed to keep the import block clean.
Suggested change:

```python
import contextlib
import traceback
```
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
@gavksingh the PR doesn't seem to do any Redis work - is this intentional? It seems in conflict with your PR title. @jquinter for monitoring
Hey @krrishdholakia - yes, it is fully intentional! Sorry for the confusing PR title history; I've just updated the title to properly reflect the fix. I initially went with a Redis implementation, but realized it wasn't the right approach here, since the proxy runs the MCP endpoint in stateless mode. The 400 error in #20992 happens simply because a reconnecting client hits a new worker, and that new worker rejects the unknown `mcp-session-id` header.
@greptileai review
@greptile re-review and give the score again
@gavksingh was this tested e2e in a multi-instance env?
@shivamrawat1 Yes, just validated it e2e. Spun up two independent LiteLLM proxy containers via docker-compose, different ports, same config, no shared state between them. Each container has its own process and its own in-memory `_server_instances` map.

This works because the endpoint runs in stateless mode (`stateless=True`). Happy to open a follow-up PR with the e2e test files if you want them in the test suite.
Relevant Issues

Fixes #20992

Root Cause

In a multi-worker Uvicorn deployment, a client that reconnects to a different worker sends an `mcp-session-id` that the new worker has never seen. The MCP SDK returns HTTP 400 because the session ID is unknown.

Fix

Added `_handle_stale_mcp_session()` in `litellm/proxy/_experimental/mcp_server/server.py`, which inspects the inbound `mcp-session-id` header before the request reaches the MCP SDK: if the session is unknown to this worker, non-DELETE requests have the header stripped so the SDK creates a fresh session, and DELETE requests return 200 immediately (idempotent; the session is already gone). No new dependencies. No Redis. No hot-path latency.
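The header-stripping half of the fix can be sketched as a standalone ASGI helper. This is an illustration of the technique, not the exact proxy code; `strip_stale_session_header` is an assumed name:

```python
def strip_stale_session_header(scope: dict) -> None:
    """Remove any mcp-session-id header from an ASGI scope in place,
    matching case-insensitively, so the SDK mints a fresh session.

    ASGI headers are (name, value) byte-string pairs; names are
    conventionally lowercase but the filter tolerates any casing.
    """
    scope["headers"] = [
        (name, value)
        for name, value in scope.get("headers", [])
        if name.lower() != b"mcp-session-id"
    ]
```

Because the scope is mutated before the session manager sees it, the SDK behaves exactly as if the client had never sent a session ID.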
Changes

Modified Files

- `litellm/proxy/_experimental/mcp_server/server.py`: adds `_handle_stale_mcp_session()`, which inspects and strips stale `mcp-session-id` headers; called from `handle_streamable_http_mcp()` before delegating to the session manager
- `tests/mcp_tests/test_mcp_server.py`: updates `test_streamable_http_mcp_handler_mock` and related tests covering this behavior

Testing