[Feat] MCP Oauth2 Fixes - Add support for MCP M2M Oauth2 support by ishaan-jaff · Pull Request #20788 · BerriAI/litellm

ishaan-jaff · 2026-02-09T22:51:17Z

[Feat] MCP Oauth2 Fixes - Add support for MCP M2M Oauth2 support

Adds working M2M (machine-to-machine) OAuth2 support for MCP servers.

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have Added testing in the tests/litellm/ directory, Adding at least 1 test is a hard requirement - see details
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🆕 New Feature
✅ Test

Changes

greptile-apps · 2026-02-09T22:58:03Z

Greptile Overview

Greptile Summary

This PR adds OAuth2 client_credentials support for MCP servers by introducing an in-memory token cache (oauth2_token_cache.py) and centralizing auth selection in resolve_mcp_auth(). MCPServerManager is updated to await auth resolution during MCP client creation, and MCPServer gains helpers to distinguish client-credential OAuth2 vs user-token OAuth2. New constants define cache sizing/TTL defaults.

Key integration points are MCP client creation (now async) and server discovery/health-check logic that skips servers requiring per-user OAuth2 tokens.

Confidence Score: 2/5

This PR needs fixes before merging due to a definite OAuth2 server classification bug affecting runtime behavior.
Confidence is reduced because needs_user_oauth_token currently compares incompatible enum types, so OAuth2 user-token servers will be misclassified and no longer skipped in tool discovery / health checks. There’s also a concrete edge case where computed TTL can be zero, potentially causing repeated token fetches under load, plus synchronous JSON parsing in an async request path.
litellm/types/mcp_server/mcp_server_manager.py, litellm/proxy/_experimental/mcp_server/mcp_server_manager.py, litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py

Important Files Changed

Filename	Overview
litellm/constants.py	Adds default constants for MCP OAuth2 token cache sizing/TTL/expiry buffer; values are currently hard-coded (not env-configurable).
litellm/proxy/_experimental/mcp_server/mcp_server_manager.py	Makes MCP client creation async and centralizes auth resolution via resolve_mcp_auth; introduces a bug where OAuth2 user-token servers may not be detected correctly due to enum mismatch in MCPServer.needs_user_oauth_token usage.
litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py	Introduces an in-memory async OAuth2 client_credentials token cache and auth resolver; watch for ttl=0 causing refetch loops and synchronous JSON parsing in async path.
litellm/types/mcp_server/mcp_server_manager.py	Adds MCPServer helpers has_client_credentials and needs_user_oauth_token; needs_user_oauth_token currently compares incompatible enum types so it will never be true.

Sequence Diagram

sequenceDiagram
    autonumber
    participant Req as Request
    participant MSM as MCPServerManager
    participant Auth as Auth resolver
    participant Cache as Token cache
    participant Tok as OAuth2 endpoint
    participant Srv as MCP server

    Req->>MSM: create client (server, optional override)
    MSM->>Auth: resolve auth
    alt override provided
        Auth-->>MSM: override
    else client-credentials configured
        Auth->>Cache: get token
        alt cache hit
            Cache-->>Auth: token
        else cache miss
            Cache->>Tok: fetch token
            Tok-->>Cache: token response
            Cache-->>Auth: token
        end
        Auth-->>MSM: token
    else static token
        Auth-->>MSM: static token
    end
    MSM->>Srv: instantiate MCP client with auth
    Srv-->>Req: response

greptile-apps

4 files reviewed, 4 comments

litellm/types/mcp_server/mcp_server_manager.py

litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py

litellm/constants.py

…docs

- Fix zero-TTL edge case: floor at MCP_OAUTH2_TOKEN_CACHE_MIN_TTL (10s)
- Make all MCP OAuth2 constants env-configurable via os.getenv()
- Move test file to follow 1:1 mapping convention (test_oauth2_token_cache.py)
- Add MCP OAuth doc page (mcp_oauth.md) with M2M and PKCE sections
- Update FAQ in mcp.md to reflect M2M support
- Add E2E test script and config

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
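The zero-TTL fix in this commit can be sketched roughly as below. This is an illustration only, not the PR's code: the function name `compute_cache_ttl` and the 60-second expiry buffer are assumptions, and only the 10-second floor comes from the commit message.

```python
# Hypothetical values: only the 10s floor is stated in the commit message.
EXPIRY_BUFFER_SECONDS = 60  # assumed safety margin before the real expiry
MIN_TTL_SECONDS = 10        # floor named in the commit (MCP_OAUTH2_TOKEN_CACHE_MIN_TTL)


def compute_cache_ttl(expires_in: int) -> int:
    """Cache a token for slightly less than its lifetime, never below the floor."""
    return max(expires_in - EXPIRY_BUFFER_SECONDS, MIN_TTL_SECONDS)


print(compute_cache_ttl(3600))  # long-lived token: buffer is subtracted
print(compute_cache_ttl(60))    # would be 0 without the floor
print(compute_cache_ttl(5))     # would be negative without the floor
```

Without the floor, a token whose `expires_in` is at or below the buffer would be cached with a TTL of zero or less, so every request would trigger a new token fetch.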

ishaan-jaff · 2026-02-09T23:31:19Z

@greptile can u review again

greptile-apps · 2026-02-09T23:39:41Z

Greptile Overview

Greptile Summary

This PR adds OAuth2 machine-to-machine (client_credentials) support for MCP servers by introducing an in-memory token cache and a single auth-resolution function (resolve_mcp_auth) that prioritizes (1) per-request override headers, then (2) cached OAuth2 tokens, then (3) static configured tokens. It also updates MCPServerManager._create_mcp_client to be async so it can fetch tokens when needed, and adjusts tool discovery/health-check skipping to only skip OAuth2 servers that still require per-user tokens.
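The three-step priority described above could look roughly like this. It is a sketch under assumptions, not the PR's `resolve_mcp_auth`: `ServerConfig`, `resolve_auth`, and `fake_token_source` are hypothetical stand-ins, and only the ordering and the `has_client_credentials` check are taken from this thread.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class ServerConfig:
    """Hypothetical stand-in for the PR's MCPServer model."""
    server_id: str
    authentication_token: Optional[str] = None
    client_id: Optional[str] = None
    client_secret: Optional[str] = None
    token_url: Optional[str] = None

    @property
    def has_client_credentials(self) -> bool:
        return bool(self.client_id and self.client_secret and self.token_url)


async def resolve_auth(
    server: ServerConfig,
    header_override: Optional[str],
    get_cached_token: Callable[[ServerConfig], Awaitable[str]],
) -> Optional[str]:
    # 1. A per-request header always wins.
    if header_override:
        return header_override
    # 2. Otherwise use a (cached) client_credentials token when configured.
    if server.has_client_credentials:
        return await get_cached_token(server)
    # 3. Fall back to the static token from the server config.
    return server.authentication_token


async def fake_token_source(server: ServerConfig) -> str:
    # Stand-in for a token cache lookup; no network call is made.
    return f"m2m-token-for-{server.server_id}"


m2m = ServerConfig("s1", "static", "id", "secret", "https://auth.example/token")
static_only = ServerConfig("s2", authentication_token="static")

print(asyncio.run(resolve_auth(m2m, "override", fake_token_source)))
print(asyncio.run(resolve_auth(m2m, None, fake_token_source)))
print(asyncio.run(resolve_auth(static_only, None, fake_token_source)))
```

Passing the token source in as a callable keeps the priority logic testable without any HTTP traffic.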

Docs are updated to reflect the new OAuth behavior and add a dedicated mcp_oauth.md guide with PKCE and M2M flow diagrams. Tests are updated/added to cover token caching and the new auth priority logic.

Must-fix before merge: the new E2E shell script does not start the proxy and references PROXY_PID without setting it (will fail under set -u), and the new doc file is missing a trailing newline.

Confidence Score: 3/5

This PR is close to mergeable, but it includes a broken E2E script that will fail as written.
Core OAuth2 M2M token caching and auth-resolution changes look internally consistent and call sites were updated for async _create_mcp_client, but the added E2E script does not start the proxy and will error under set -u due to PROXY_PID being unset. Docs also have a minor formatting issue (missing newline).
tests/mcp_tests/test_oauth2_e2e.sh, docs/my-website/docs/mcp_oauth.md

Important Files Changed

Filename	Overview
docs/my-website/docs/mcp_oauth.md	New MCP OAuth guide covering PKCE and M2M flows; currently missing trailing newline at EOF (fix required).
litellm/constants.py	Introduces env-configurable constants for OAuth2 token caching behavior (expiry buffer, max size, default/min TTL).
litellm/proxy/_experimental/mcp_server/mcp_server_manager.py	Centralizes auth resolution (header override → OAuth2 client_credentials cache → static token), makes _create_mcp_client async, and updates discovery/health-check skip logic via needs_user_oauth_token.
litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py	Adds in-memory OAuth2 client_credentials token fetch + per-server locking and resolve_mcp_auth priority helper.
litellm/proxy/_experimental/mcp_server/rest_endpoints.py	Updates REST endpoint execution path to await the now-async _create_mcp_client.
litellm/types/mcp_server/mcp_server_manager.py	Adds MCPServer.has_client_credentials and needs_user_oauth_token properties to distinguish M2M OAuth2 vs per-user token flows.
tests/mcp_tests/test_oauth2_e2e.sh	Adds an E2E shell script for OAuth2 client_credentials, but it currently never starts the proxy and references PROXY_PID without setting it (will fail under set -u).
tests/test_litellm/proxy/_experimental/mcp_server/test_oauth2_token_cache.py	Adds unit tests for resolve_mcp_auth priority, caching, and needs_user_oauth_token behavior.

Sequence Diagram

sequenceDiagram
    participant Caller as MCP REST / Manager
    participant Manager as MCPServerManager
    participant Resolver as resolve_mcp_auth()
    participant Cache as MCPOAuth2TokenCache
    participant Auth as OAuth2 Token URL
    participant MCP as MCP Server

    Caller->>Manager: _create_mcp_client(server, mcp_auth_header)
    Manager->>Resolver: resolve_mcp_auth(server, mcp_auth_header)

    alt mcp_auth_header provided
        Resolver-->>Manager: mcp_auth_header (override)
    else server has client_credentials
        Resolver->>Cache: async_get_token(server)
        alt token cached
            Cache-->>Resolver: access_token
        else cache miss
            Cache->>Auth: POST token_url (grant_type=client_credentials)
            Auth-->>Cache: {access_token, expires_in}
            Cache-->>Resolver: access_token (cached w/ TTL)
        end
        Resolver-->>Manager: access_token
    else no client_credentials
        Resolver-->>Manager: server.authentication_token
    end

    Manager->>MCP: MCP request (Authorization derived from auth_type + auth_value)
    MCP-->>Manager: MCP response
    Manager-->>Caller: MCP response

greptile-apps

8 files reviewed, 2 comments

greptile-apps · 2026-02-09T23:39:44Z

tests/mcp_tests/test_oauth2_e2e.sh

+#!/usr/bin/env bash
+# E2E test for OAuth2 client_credentials MCP flow
+# Usage: bash tests/mcp_tests/test_oauth2_e2e.sh


Broken E2E script cleanup

With set -euo pipefail, this script will error on exit because cleanup() references PROXY_PID but the variable is never set (and the proxy is never started). As written, the curl calls to :$PROXY_PORT will also fail because nothing launches LiteLLM. Either start the proxy and set PROXY_PID, or guard kill/wait with ${PROXY_PID:-} (and similarly add a missing “Step 2” that runs litellm --config ...).

No E2E script exists in this branch — it was removed from the PR.

greptile-apps · 2026-02-09T23:39:45Z

docs/my-website/docs/mcp_oauth.md

+| `auth_type` | Yes | Must be `oauth2` |
+| `client_id` | Yes | OAuth2 client ID. Supports `os.environ/VAR_NAME` |
+| `client_secret` | Yes | OAuth2 client secret. Supports `os.environ/VAR_NAME` |
+| `token_url` | Yes | Token endpoint URL |


Missing trailing newline

This file is missing a final newline (\ No newline at end of file in the diff). This commonly trips formatting/lint checks; add a newline at EOF.

File already has a trailing newline — verified with xxd.

ishaan-jaff · 2026-02-09T23:43:17Z

@greptile review again

greptile-apps · 2026-02-09T23:48:43Z

Greptile Overview

Greptile Summary

This PR adds first-class OAuth2 client_credentials (machine-to-machine) support for MCP servers by introducing an async token fetch + in-memory TTL cache (oauth2_token_cache.py) and centralizing auth resolution in MCPServerManager._create_mcp_client() (per-request override → cached M2M token → static configured token). It also updates tool discovery and health-check skip behavior to only skip OAuth2 servers that require user-provided tokens, adds tests around auth priority/caching behavior, and ships new docs + sidebar entries for the MCP OAuth guide.

Main merge-blockers are around robustness and leakage in the new token fetch path: the token fetch currently assumes well-formed token responses and can raise unexpectedly on expires_in: null/invalid types, and it logs/echoes potentially sensitive token endpoint/body details in ways that can end up in logs on failures.
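One way to guard against `expires_in: null` or a non-numeric value is to coerce the field defensively before computing a TTL. The helper below is a sketch; the name `parse_expires_in` and the 3600-second default are assumptions, not values taken from the PR.

```python
from typing import Any

DEFAULT_TTL_SECONDS = 3600  # assumed default; the PR's actual default is not shown here


def parse_expires_in(raw: Any, default: int = DEFAULT_TTL_SECONDS) -> int:
    """Coerce a token response's expires_in to a positive int, or fall back."""
    # bool is a subclass of int, so reject it explicitly along with None.
    if raw is None or isinstance(raw, bool):
        return default
    try:
        value = int(raw)
    except (TypeError, ValueError, OverflowError):
        return default
    return value if value > 0 else default


print(parse_expires_in(None))    # null in the JSON body
print(parse_expires_in("120"))   # numeric string
print(parse_expires_in("soon"))  # garbage string
```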

Confidence Score: 3/5

This PR is close to safe to merge, but the new OAuth2 token fetch path needs a couple robustness/logging fixes to avoid runtime failures or sensitive-data leakage.
Core design (async auth resolution + per-server locking + TTL cache) is sound and tests cover expected behavior, but the new token fetch code assumes certain response shapes and includes potentially sensitive details in logs/exceptions. Fixing those edge cases should reduce operational risk significantly.
litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py

Important Files Changed

Filename	Overview
docs/my-website/docs/mcp_oauth.md	Adds new MCP OAuth documentation with PKCE + M2M sections and Mermaid diagrams; ensure doc build accepts the new page.
litellm/constants.py	Adds env-configurable constants for MCP OAuth2 token caching (expiry buffer, cache size, default TTL, min TTL).
litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py	Introduces OAuth2 client_credentials token fetch + in-memory cache and auth resolution helper; has a few edge cases where token responses/logging could leak or raise unexpectedly.
litellm/proxy/_experimental/mcp_server/mcp_server_manager.py	Switches MCP client creation to async and centralizes auth resolution (per-request override > M2M token > static token) and skips only OAuth2 servers that need user tokens.
litellm/proxy/_experimental/mcp_server/rest_endpoints.py	Updates REST endpoints to await async MCP client creation in test flows; change is localized.
litellm/proxy/proxy_config.yaml	Adds a localhost test OAuth2 MCP server example config; may not be appropriate to ship in default proxy_config.yaml for all deployments.
litellm/types/mcp_server/mcp_server_manager.py	Adds MCPServer helpers for client_credentials presence and whether user OAuth token is needed; used to drive tool discovery/health-check skip logic.
tests/test_litellm/proxy/_experimental/mcp_server/test_oauth2_token_cache.py	Adds new unit tests covering M2M token fetch, caching, auth priority, and needs_user_oauth_token property.

Sequence Diagram

sequenceDiagram
    participant Client as MCP Client
    participant Proxy as LiteLLM Proxy
    participant Cache as MCPOAuth2TokenCache
    participant Auth as OAuth2 Token Endpoint
    participant MCP as MCP Server

    Client->>Proxy: MCP REST/JSON-RPC request
    Note over Proxy: resolve_mcp_auth()
    Proxy->>Cache: async_get_token(server)
    alt Token cached
        Cache-->>Proxy: access_token
    else Cache miss
        Cache->>Cache: acquire per-server lock
        Cache->>Auth: POST token_url (client_credentials)
        Auth-->>Cache: access_token + expires_in
        Cache-->>Proxy: access_token (cached with TTL)
    end
    Proxy->>MCP: MCP request (Bearer access_token)
    MCP-->>Proxy: MCP response
    Proxy-->>Client: Response
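The cache-miss branch in the diagram (acquire a per-server lock, fetch, then cache with a TTL) might be sketched as follows. `TokenCache` and `fake_fetch` are hypothetical and stand in for the PR's `MCPOAuth2TokenCache` and the real token request. The cache is re-checked after the lock is acquired so that concurrent callers share a single fetch.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict, List, Optional, Tuple


class TokenCache:
    """Illustrative in-memory cache; not the PR's MCPOAuth2TokenCache."""

    def __init__(self, fetch: Callable[[str], Awaitable[Tuple[str, float]]]) -> None:
        self._fetch = fetch  # returns (token, ttl_seconds)
        self._tokens: Dict[str, Tuple[str, float]] = {}  # server_id -> (token, expiry)
        self._locks: Dict[str, asyncio.Lock] = {}

    def _get_cached(self, server_id: str) -> Optional[str]:
        entry = self._tokens.get(server_id)
        if entry and entry[1] > time.monotonic():
            return entry[0]
        return None

    async def get_token(self, server_id: str) -> str:
        token = self._get_cached(server_id)
        if token is not None:
            return token
        lock = self._locks.setdefault(server_id, asyncio.Lock())
        async with lock:
            # Re-check: another waiter may have filled the cache meanwhile.
            token = self._get_cached(server_id)
            if token is not None:
                return token
            token, ttl = await self._fetch(server_id)
            self._tokens[server_id] = (token, time.monotonic() + ttl)
            return token


fetch_calls = 0


async def fake_fetch(server_id: str) -> Tuple[str, float]:
    global fetch_calls
    fetch_calls += 1
    await asyncio.sleep(0.01)  # simulate network latency
    return f"token-{server_id}", 60.0


async def main() -> List[str]:
    cache = TokenCache(fake_fetch)
    return await asyncio.gather(*(cache.get_token("srv") for _ in range(5)))


tokens = asyncio.run(main())
print(tokens, "fetches:", fetch_calls)
```

Five concurrent callers for the same server trigger one fetch; the other four wait on the lock and then read the cached value.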

greptile-apps

8 files reviewed, 4 comments

litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py

ishaan-jaff · 2026-02-10T00:00:56Z

@greptile review again

ishaan-jaff · 2026-02-10T00:02:34Z

@greptile review again

greptile-apps · 2026-02-10T00:08:12Z

Greptile Overview

Greptile Summary

This PR adds first-class machine-to-machine (M2M) OAuth2 client_credentials support for MCP servers by introducing an in-memory token cache (oauth2_token_cache.py) and routing all MCP client auth selection through a single resolver (resolve_mcp_auth). MCPServerManager._create_mcp_client is made async so it can fetch/cache tokens before instantiating the MCP client, and server discovery/health-check logic now only skips OAuth2 servers that require per-user tokens (needs_user_oauth_token). Docs are expanded with a new mcp_oauth.md guide and sidebar link, and tests are updated/added to cover auth priority and token caching behavior.

Key integration points:

_create_mcp_client becomes async and is awaited by tool listing/calling and REST test endpoints.
resolve_mcp_auth() determines which auth value is used per request (header override > cached M2M token > static config token).
Token caching is per-process (in-memory) via InMemoryCache and controlled by new env-configurable constants in litellm/constants.py.

Confidence Score: 3/5

Reasonably safe to merge after fixing a couple of concurrency/error-handling issues in the new OAuth2 token cache.
Core logic is localized and covered by tests, but the new per-server lock map is not created atomically and the token response parsing assumes a JSON object, both of which can cause real runtime issues under concurrency or non-standard token responses.
litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py

Important Files Changed

Filename	Overview
docs/my-website/docs/mcp.md	Updates MCP OAuth section to point to new OAuth guide and clarifies that client_credentials is now proxy-managed.
docs/my-website/docs/mcp_oauth.md	Adds new MCP OAuth documentation with mermaid diagrams and M2M setup instructions; no code issues but references an external mock server script not included here.
docs/my-website/sidebars.js	Adds mcp_oauth doc to sidebar navigation.
litellm/constants.py	Adds env-configurable constants for OAuth2 token caching; risk if env vars set to invalid non-int strings causing startup ValueError.
litellm/proxy/_experimental/mcp_server/mcp_server_manager.py	Makes _create_mcp_client async and centralizes auth resolution via resolve_mcp_auth; introduces await call sites and skip logic via needs_user_oauth_token.
litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py	Introduces in-memory OAuth2 client_credentials token cache and resolver; possible KeyError in lock map under concurrent access and per-process cache semantics.
litellm/proxy/_experimental/mcp_server/rest_endpoints.py	Updates REST test endpoints to await async _create_mcp_client; otherwise unchanged.
litellm/types/mcp_server/mcp_server_manager.py	Adds has_client_credentials/needs_user_oauth_token helpers; needs_user_oauth_token uses MCPAuth enum which must match auth_type runtime values.
tests/mcp_tests/test_mcp_auth_priority.py	Adjusts tests for async _create_mcp_client and validates header priority remains correct.
tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_server.py	Updates mocked _create_mcp_client to async for oauth2 header passing test.
tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_server_manager.py	Updates manager tests to treat _create_mcp_client as async and switches MagicMock to AsyncMock where needed.
tests/test_litellm/proxy/_experimental/mcp_server/test_oauth2_token_cache.py	Adds unit tests for resolve_mcp_auth, caching behavior, and needs_user_oauth_token property.
tests/test_litellm/proxy/_experimental/mcp_server/test_rest_endpoints.py	Updates REST endpoint tests to mock async client creation.
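The startup `ValueError` risk noted for litellm/constants.py comes from calling `int()` directly on an environment variable. A tolerant reader could look like the sketch below; `env_int` and the variable name `EXAMPLE_CACHE_TTL` are made up for illustration and are not part of the PR.

```python
import os


def env_int(name: str, default: int) -> int:
    """Read an integer from the environment, falling back on missing or bad input."""
    raw = os.getenv(name)
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        return default


# EXAMPLE_CACHE_TTL is a made-up variable name used only for this demo.
os.environ["EXAMPLE_CACHE_TTL"] = "not-a-number"
print(env_int("EXAMPLE_CACHE_TTL", 3600))  # falls back to the default
os.environ["EXAMPLE_CACHE_TTL"] = "120"
print(env_int("EXAMPLE_CACHE_TTL", 3600))  # parsed value
```

Whether silently falling back is preferable to failing fast at startup is a design choice; a deployment that wants misconfiguration to be loud might log a warning or re-raise instead.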

Sequence Diagram

sequenceDiagram
    participant MCPClient as MCP Client
    participant REST as /mcp-rest
    participant Manager as MCPServerManager
    participant Cache as MCPOAuth2TokenCache
    participant Auth as OAuth2 Token URL
    participant MCP as MCP Server

    MCPClient->>REST: POST /mcp-rest/tools/call (optional mcp_auth_header)
    REST->>Manager: execute_mcp_tool / _get_tools_from_server
    Manager->>Cache: resolve_mcp_auth(server, mcp_auth_header)
    alt mcp_auth_header provided
        Cache-->>Manager: return mcp_auth_header
    else server.has_client_credentials
        Cache->>Cache: get_cache(server_id)
        alt cache hit
            Cache-->>Manager: return cached access_token
        else cache miss
            Cache->>Auth: POST token_url (client_credentials)
            Auth-->>Cache: {access_token, expires_in}
            Cache->>Cache: set_cache(server_id, access_token, ttl)
            Cache-->>Manager: return access_token
        end
    else
        Cache-->>Manager: return server.authentication_token
    end
    Manager->>MCP: MCP request with Authorization
    MCP-->>Manager: MCP response
    Manager-->>REST: tool response
    REST-->>MCPClient: JSON response

greptile-apps

13 files reviewed, 2 comments

greptile-apps · 2026-02-10T00:08:16Z

litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py

+    def _get_lock(self, server_id: str) -> asyncio.Lock:
+        if server_id not in self._locks:
+            self._locks[server_id] = asyncio.Lock()
+        return self._locks[server_id]


Non-atomic lock creation

_get_lock() mutates self._locks with a check-then-insert pattern: if server_id not in self._locks: self._locks[server_id] = asyncio.Lock(). Under concurrent calls, this check/insert can interleave and create multiple locks for the same server_id, which defeats the “single in-flight fetch per server” guarantee and can lead to duplicate token fetches. Consider making lock creation atomic (e.g., using setdefault) or initializing locks in a thread-safe way.

Fixed — _get_lock() now uses self._locks.setdefault(server_id, asyncio.Lock()) which is atomic and avoids creating duplicate locks under concurrent calls.
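A standalone sketch of the `setdefault` pattern is shown below, using a module-level dict and a hypothetical `get_lock` helper rather than the PR's class. On a single event loop there is no `await` between the original check and insert, so the interleaving described in the review would mainly arise with multi-threaded callers; `setdefault` collapses the check and the insert into one dict operation either way.

```python
import asyncio
from typing import Dict, Tuple

locks: Dict[str, asyncio.Lock] = {}


def get_lock(server_id: str) -> asyncio.Lock:
    # setdefault stores the new lock only if the key is absent, then returns
    # whichever lock is in the dict. A throwaway Lock object is still
    # constructed on every call, which is cheap but not free.
    return locks.setdefault(server_id, asyncio.Lock())


async def main() -> Tuple[bool, bool]:
    first = get_lock("server-a")
    second = get_lock("server-a")
    other = get_lock("server-b")
    return first is second, first is other


print(asyncio.run(main()))
```

Repeated calls for the same `server_id` return the same lock object, while a different `server_id` gets its own lock.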

greptile-apps · 2026-02-10T00:08:17Z

litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py

+        body = response.json()
+
+        access_token = body.get("access_token")
+        if not access_token:
+            raise ValueError(


JSON shape assumptions

body = response.json() is assumed to be a dict (body.get(...)). If the token endpoint returns a non-object JSON payload (e.g., a list or string on some error paths), this will raise AttributeError and surface as an unexpected 500 rather than a clearer OAuth/token error. Adding an explicit isinstance(body, dict) check before accessing .get() would make failures deterministic.

Fixed — added isinstance(body, dict) check after response.json(). Non-dict responses now raise a clear ValueError with the type name instead of an unexpected AttributeError.
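The shape check described in this reply might look like the following sketch. `extract_access_token` is a hypothetical helper name and the error messages are illustrative, not the PR's exact wording.

```python
from typing import Any


def extract_access_token(body: Any) -> str:
    """Validate a decoded token response before reading fields from it."""
    if not isinstance(body, dict):
        raise ValueError(
            f"OAuth2 token response must be a JSON object, got {type(body).__name__}"
        )
    access_token = body.get("access_token")
    if not access_token:
        raise ValueError("OAuth2 token response is missing 'access_token'")
    return access_token


print(extract_access_token({"access_token": "abc", "expires_in": 60}))
try:
    extract_access_token(["unexpected", "list"])
except ValueError as exc:
    print(exc)
```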

- Use dict.setdefault() for atomic per-server lock creation - Add isinstance(body, dict) check before accessing token response fields Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

ishaan-jaff · 2026-02-10T01:15:44Z

@greptile review again

greptile-apps · 2026-02-10T01:18:54Z

Greptile Overview

Greptile Summary

This PR adds first-class OAuth2 client_credentials (machine-to-machine) support for MCP servers by introducing an async in-memory token cache and centralizing auth resolution in MCPServerManager (per-request header override → cached OAuth2 token → static configured token). It also updates tool discovery/health checks to only skip OAuth2 servers that require user-provided tokens, and adds docs + tests for the new behavior.

Confidence Score: 4/5

This PR is close to merge-ready, but at least one updated async test is likely not executing as intended.
Core OAuth2 token caching/auth-resolution changes are cohesive and include targeted tests, but the test suite changes include an async def test without an asyncio marker, which can cause silent non-execution or failures depending on pytest configuration.
tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_server_manager.py

Important Files Changed

Filename	Overview
docs/my-website/docs/mcp_oauth.md	Adds new MCP OAuth doc covering PKCE + client_credentials with mermaid diagrams and setup examples.
litellm/proxy/_experimental/mcp_server/mcp_server_manager.py	Makes _create_mcp_client async and centralizes auth resolution (per-request header → M2M token → static token); updates tool discovery/health-check skip logic.
litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py	Introduces async in-memory cache and fetcher for OAuth2 client_credentials tokens with per-server locks and safer JSON/TTL parsing.
litellm/types/mcp_server/mcp_server_manager.py	Adds MCPServer.has_client_credentials and needs_user_oauth_token convenience properties.
tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_server_manager.py	Converts tests/mocks to AsyncMock for async _create_mcp_client; one test method became async without an asyncio marker.
tests/test_litellm/proxy/_experimental/mcp_server/test_oauth2_token_cache.py	Adds unit tests for resolve_mcp_auth priority, caching, and needs_user_oauth_token behavior.

Sequence Diagram

sequenceDiagram
    participant Client as MCP Client
    participant LiteLLM as LiteLLM Proxy
    participant Cache as OAuth2 Token Cache
    participant Auth as OAuth2 Token Endpoint
    participant MCP as MCP Server

    Client->>LiteLLM: MCP REST / tools call
    LiteLLM->>Cache: resolve_mcp_auth(server, mcp_auth_header?)
    alt per-request header provided
        Cache-->>LiteLLM: return mcp_auth_header
    else has client_credentials configured
        Cache->>Cache: get_cache(server_id)
        alt cache hit
            Cache-->>LiteLLM: cached access_token
        else cache miss
            Cache->>Cache: lock(server_id)
            Cache->>Auth: POST token_url (grant_type=client_credentials)
            Auth-->>Cache: {access_token, expires_in}
            Cache->>Cache: set_cache(server_id, token, ttl=expires_in-buffer)
            Cache-->>LiteLLM: access_token
        end
    else
        Cache-->>LiteLLM: server.authentication_token
    end
    LiteLLM->>MCP: MCP request with resolved Authorization
    MCP-->>LiteLLM: MCP response
    LiteLLM-->>Client: response

greptile-apps

6 files reviewed, 1 comment

greptile-apps · 2026-02-10T01:19:02Z

Additional Comments (1)

tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_server_manager.py
Async test not awaited

test_create_mcp_client_stdio was changed to async def but isn’t marked with @pytest.mark.asyncio (and doesn’t appear to be part of an IsolatedAsyncioTestCase). In pytest this typically means the coroutine is never awaited and the assertions won’t run (or the test will error depending on config). Add the asyncio marker (or convert back to sync) so this test actually executes.
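The difference between an async test that is merely called and one that is actually awaited can be shown with the standard library alone. The sketch below uses `unittest.IsolatedAsyncioTestCase`, one of the two options the comment mentions; with pytest-asyncio installed, decorating the test with `@pytest.mark.asyncio` serves the same purpose. `create_client` is a hypothetical stand-in for `_create_mcp_client`.

```python
import inspect
import unittest


async def create_client() -> str:
    """Hypothetical async factory standing in for _create_mcp_client."""
    return "client"


# Calling an async function without a runner only builds a coroutine object;
# nothing inside it executes until something awaits it.
async def check_without_runner() -> None:
    assert await create_client() == "client"


pending = check_without_runner()
print(inspect.iscoroutine(pending))
pending.close()  # avoid a "never awaited" warning in this demo


class CreateClientTest(unittest.IsolatedAsyncioTestCase):
    # IsolatedAsyncioTestCase awaits async test methods on its own event loop.
    async def test_create_client(self) -> None:
        self.assertEqual(await create_client(), "client")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(CreateClientTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun, result.wasSuccessful())
```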

- Replace `assert` statements with `if/raise ValueError` (asserts can be disabled with python -O in production)
- Wrap `httpx.HTTPStatusError` to provide a clear error message with server_id and status code
- Add tests for HTTP error and non-dict JSON response error paths
- Remove unused imports

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

ishaan-jaff · 2026-02-10T01:20:08Z

@greptile review again

greptile-apps

10 files reviewed, 1 comment

greptile-apps · 2026-02-10T01:35:59Z

Greptile Overview

Greptile Summary

This PR adds first-class OAuth2 machine-to-machine (client_credentials) support for MCP servers. It introduces an in-memory OAuth2 token cache with per-server locks and TTL-based refresh, centralizes MCP auth resolution in one place (per-request header override > cached M2M token > static server token), and updates tool discovery / health checks to only skip OAuth2 servers that require user-provided tokens.

Documentation is expanded with a new MCP OAuth guide (PKCE + M2M) and sidebar wiring, and tests are updated/added to cover auth priority and token-cache behavior across the experimental MCP proxy endpoints.

Confidence Score: 3/5

This PR is close to mergeable, but there are a couple of concrete lifecycle/type-consistency issues to address around OAuth2 token fetching and OAuth2 detection.
Core flow (auth priority, caching/locking, tests/docs) is cohesive, but the token fetch path may leak httpx clients if the helper returns a new AsyncClient per call, and oauth2 server detection relies on auth_type == MCPAuth.oauth2 even though the field is typed as a Literal-based MCPAuthType; if auth_type is stored as a plain string, skip logic can be wrong. These should be fixed/verified before merging since they affect runtime behavior under load.
litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py; litellm/types/mcp_server/mcp_server_manager.py

Important Files Changed

Filename	Overview
docs/my-website/docs/mcp.md	Updates MCP docs to link to new MCP OAuth guide and clarifies that client_credentials tokens are now proxy-managed.
docs/my-website/docs/mcp_oauth.md	Adds dedicated documentation for MCP OAuth flows (PKCE + M2M client_credentials) including mermaid diagrams and local test instructions.
docs/my-website/sidebars.js	Adds new mcp_oauth doc page to the MCP docs sidebar.
litellm/constants.py	Introduces env-configurable constants for MCP OAuth2 token caching (expiry buffer, cache size, default TTL, min TTL).
litellm/proxy/_experimental/mcp_server/mcp_server_manager.py	Switches MCP client creation to async and centralizes auth resolution (header override > cached M2M token > static token); adjusts tool discovery/health-check skipping to use needs_user_oauth_token.
litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py	Adds in-memory OAuth2 client_credentials token fetch + TTL cache + per-server locks and resolve_mcp_auth helper; possible issue if httpx client returned is not closed/shared.
litellm/proxy/_experimental/mcp_server/rest_endpoints.py	Updates REST endpoints to await async _create_mcp_client after manager change.
litellm/types/mcp_server/mcp_server_manager.py	Adds has_client_credentials / needs_user_oauth_token properties to MCPServer; needs_user_oauth_token may mis-detect oauth2 if auth_type is stored as a string literal rather than MCPAuth enum.
tests/mcp_tests/test_mcp_auth_priority.py	Updates auth priority tests to await async _create_mcp_client and marks the relevant test async.
tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_server.py	Adjusts MCP server tests to mock async _create_mcp_client.
tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_server_manager.py	Updates MCP server manager tests to use AsyncMock/new_callable=AsyncMock for async _create_mcp_client.
tests/test_litellm/proxy/_experimental/mcp_server/test_oauth2_token_cache.py	Adds targeted tests for resolve_mcp_auth, token caching behavior, error wrapping, and needs_user_oauth_token logic.
tests/test_litellm/proxy/_experimental/mcp_server/test_rest_endpoints.py	Updates REST endpoint tests to mock async _create_mcp_client for execute_with_mcp_client helpers.

Sequence Diagram

sequenceDiagram
    participant Client as MCP Client
    participant Proxy as LiteLLM Proxy
    participant Cache as MCPOAuth2TokenCache
    participant Auth as OAuth2 Token Endpoint
    participant MCP as MCP Server

    Client->>Proxy: MCP request (may include Authorization override)
    Proxy->>Proxy: resolve_mcp_auth(server, mcp_auth_header)

    alt mcp_auth_header provided
        Proxy-->>Client: Use provided auth header value
    else server.has_client_credentials
        Proxy->>Cache: async_get_token(server)
        alt token cached
            Cache-->>Proxy: access_token
        else token missing/expired
            Cache->>Cache: acquire per-server lock
            Cache->>Auth: POST token_url (client_credentials)
            Auth-->>Cache: {access_token, expires_in}
            Cache->>Cache: set_cache(server_id, token, ttl)
            Cache-->>Proxy: access_token
        end
    else fallback
        Proxy-->>Client: Use server.authentication_token
    end

    Proxy->>MCP: Forward MCP request with resolved auth
    MCP-->>Proxy: MCP response
    Proxy-->>Client: Response

greptile-apps

13 files reviewed, 2 comments

greptile-apps · 2026-02-10T01:36:03Z

litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py

+        client = get_async_httpx_client(llm_provider=httpxSpecialProvider.MCP)
+
+        if not server.client_id or not server.client_secret or not server.token_url:
+            raise ValueError(
+                f"MCP server '{server.server_id}' missing required OAuth2 fields: "
+                f"client_id={bool(server.client_id)}, "
+                f"client_secret={bool(server.client_secret)}, "
+                f"token_url={bool(server.token_url)}"
+            )
+
+        data: Dict[str, str] = {
+            "grant_type": "client_credentials",
+            "client_id": server.client_id,
+            "client_secret": server.client_secret,
+        }
+        if server.scopes:
+            data["scope"] = " ".join(server.scopes)
+
+        verbose_logger.debug(
+            "Fetching OAuth2 client_credentials token for MCP server %s",
+            server.server_id,
+        )
+
+        try:
+            response = await client.post(server.token_url, data=data)
+            response.raise_for_status()
+        except httpx.HTTPStatusError as exc:
+            raise ValueError(
+                f"OAuth2 token request for MCP server '{server.server_id}' "
+                f"failed with status {exc.response.status_code}"
+            ) from exc


Unclosed HTTP client

_fetch_token() creates a new httpx client via get_async_httpx_client(...) but never closes it. If that helper returns a fresh AsyncClient (vs a shared singleton), this leaks connections/file descriptors over time as tokens refresh. Either ensure the helper returns a process-wide shared client for httpxSpecialProvider.MCP, or create/close the client in an async context manager for this call path.
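
The two options named in this comment could look roughly like the sketch below. `StubAsyncClient` is a hypothetical stand-in so the example runs without network access; it mirrors only the `aclose()` method and async context manager support that `httpx.AsyncClient` provides. Whether `get_async_httpx_client(...)` already returns a shared client is not established in this thread, so both paths are shown under that uncertainty.

```python
import asyncio
from typing import Dict, Optional


class StubAsyncClient:
    """Hypothetical stand-in for an async HTTP client such as httpx.AsyncClient."""

    def __init__(self) -> None:
        self.closed = False

    async def post(self, url: str, data: Dict[str, str]) -> Dict[str, object]:
        if self.closed:
            raise RuntimeError("client is closed")
        return {"access_token": "stub-token", "expires_in": 3600}

    async def aclose(self) -> None:
        self.closed = True

    async def __aenter__(self) -> "StubAsyncClient":
        return self

    async def __aexit__(self, *exc_info: object) -> None:
        await self.aclose()


# Option 1: a client scoped to a single token fetch, always closed on exit.
async def fetch_with_scoped_client(
    token_url: str, data: Dict[str, str]
) -> Dict[str, object]:
    async with StubAsyncClient() as client:
        return await client.post(token_url, data=data)


# Option 2: one process-wide client, created lazily and closed at shutdown.
_shared_client: Optional[StubAsyncClient] = None


def get_shared_client() -> StubAsyncClient:
    global _shared_client
    if _shared_client is None or _shared_client.closed:
        _shared_client = StubAsyncClient()
    return _shared_client


async def close_shared_client() -> None:
    global _shared_client
    if _shared_client is not None:
        await _shared_client.aclose()
        _shared_client = None
```

Option 1 costs a new connection per refresh, which is usually acceptable because cached tokens make refreshes infrequent. Option 2 reuses connections but needs an explicit shutdown hook.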

greptile-apps · 2026-02-10T01:36:04Z

litellm/types/mcp_server/mcp_server_manager.py

+    def has_client_credentials(self) -> bool:
+        """True if this server has OAuth2 client_credentials config (client_id, client_secret, token_url)."""
+        return bool(self.client_id and self.client_secret and self.token_url)
+
+    @property
+    def needs_user_oauth_token(self) -> bool:
+        """True if this is an OAuth2 server that relies on per-user tokens (no client_credentials)."""
+        return self.auth_type == MCPAuth.oauth2 and not self.has_client_credentials


OAuth2 detection too narrow

needs_user_oauth_token only returns true when auth_type == MCPAuth.oauth2, but auth_type is typed as Optional[MCPAuthType] (from litellm.proxy._types). If callers populate auth_type with the string literal (e.g., 'oauth2') rather than the MCPAuth enum instance, this check will be false and OAuth2 servers could incorrectly not be skipped for tool discovery / health checks. Consider normalizing/comparing against the same representation MCPAuthType uses (string literal) or casting auth_type to MCPAuth consistently at model creation.
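
Whether this comparison actually misfires depends on how the enum is declared, which this excerpt does not show. The sketch below uses two hypothetical enums (`PlainAuth`, `StrAuth`) to illustrate the difference, plus a hypothetical `is_oauth2` helper that normalizes either representation before comparing.

```python
from enum import Enum
from typing import Union


class PlainAuth(Enum):
    """Hypothetical plain enum: members never compare equal to raw strings."""

    oauth2 = "oauth2"
    api_key = "api_key"


class StrAuth(str, Enum):
    """Hypothetical str-mixin enum: members compare equal to their string value."""

    oauth2 = "oauth2"
    api_key = "api_key"


def is_oauth2(auth_type: Union[Enum, str, None]) -> bool:
    """Compare on the underlying string so enum members and literals both work."""
    value = auth_type.value if isinstance(auth_type, Enum) else auth_type
    return value == "oauth2"


# A plain Enum member is not equal to the literal; a str-mixin member is.
plain_matches_literal = PlainAuth.oauth2 == "oauth2"
str_matches_literal = StrAuth.oauth2 == "oauth2"
```

If the project's enum uses the `str` mixin, the original check already tolerates string literals; normalizing as in `is_oauth2` makes the behavior independent of that declaration.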

…logging_payload is missing (#20851)

* fix: Preserved nullable object fields by carrying schema properties
* Fix: _convert_schema_types
* Fix all mypy issues
* Add alert about email notifications
* fixing tests
* extending timeout for long running tests
* Text changes
* [Feat] MCP Oauth2 Fixes - Add support for MCP M2M Oauth2 support (#20788)
* add has_client_credentials
* MCPOAuth2TokenCache
* init MCP Oauth2 constants
* MCPOAuth2TokenCache
* resolve_mcp_auth
* test fixes
* docs fix
* address greptile review: min TTL, env-configurable constants, tests, docs
  - Fix zero-TTL edge case: floor at MCP_OAUTH2_TOKEN_CACHE_MIN_TTL (10s)
  - Make all MCP OAuth2 constants env-configurable via os.getenv()
  - Move test file to follow 1:1 mapping convention (test_oauth2_token_cache.py)
  - Add MCP OAuth doc page (mcp_oauth.md) with M2M and PKCE sections
  - Update FAQ in mcp.md to reflect M2M support
  - Add E2E test script and config

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix mypy lint
* fix oauth2
* remove old files
* docs fix
* address greptile comments
* fix: atomic lock creation + validate JSON response shape
  - Use dict.setdefault() for atomic per-server lock creation
  - Add isinstance(body, dict) check before accessing token response fields

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: replace asserts with proper guards, wrap HTTP errors with context
  - Replace `assert` statements with `if/raise ValueError` (asserts can be disabled with python -O in production)
  - Wrap `httpx.HTTPStatusError` to provide a clear error message with server_id and status code
  - Add tests for HTTP error and non-dict JSON response error paths
  - Remove unused imports

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* [UI] M2M OAuth2 UI Flow (#20794)
* add has_client_credentials
* MCPOAuth2TokenCache
* init MCP Oauth2 constants
* MCPOAuth2TokenCache
* resolve_mcp_auth
* test fixes
* docs fix
* address greptile review: min TTL, env-configurable constants, tests, docs
  - Fix zero-TTL edge case: floor at MCP_OAUTH2_TOKEN_CACHE_MIN_TTL (10s)
  - Make all MCP OAuth2 constants env-configurable via os.getenv()
  - Move test file to follow 1:1 mapping convention (test_oauth2_token_cache.py)
  - Add MCP OAuth doc page (mcp_oauth.md) with M2M and PKCE sections
  - Update FAQ in mcp.md to reflect M2M support
  - Add E2E test script and config

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix mypy lint
* fix oauth2
* ui feat fixes
* test M2M
* test fix
* ui feats
* ui fixes
* ui fix client ID
* fix: backend endpoints
* docs fix
* fixes greptile

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* [Fix] prevent shared backend model key from being polluted by per-deployment custom pricing (#20679)
* bug: custom price override for models
* added associated test
* fix(mcp): resolve OAuth2 root endpoints returning "MCP server not found" (#20784)

  When MCP SDK hits root-level /register, /authorize, /token without server name prefix, auto-resolve to the single configured OAuth2 server. Also fix WWW-Authenticate header to use correct public URL behind reverse proxy.
* Add support for langchain_aws via litellm passthrough
* fix(proxy): return early instead of raising ValueError when standard_logging_payload is missing

  The `_PROXY_VirtualKeyModelMaxBudgetLimiter.async_log_success_event` hook raises `ValueError` when `standard_logging_payload` is `None`. This breaks non-standard call types (e.g. vLLM `/classify`) that do not populate the payload, and the resulting exception disrupts downstream success callbacks like Langfuse. Return early with a debug log instead, matching the existing pattern used for missing `user_api_key_model_max_budget`.

  Fixes #18986

---------

Co-authored-by: Sameer Kankute <sameer@berri.ai>
Co-authored-by: yuneng-jiang <yuneng.jiang@gmail.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Shivam Rawat <161387515+shivamrawat1@users.noreply.github.com>
Co-authored-by: michelligabriele <gabriele.michelli@icloud.com>
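
The zero-TTL fix and env-configurable constants mentioned in the commit message above can be illustrated as follows. Only `MCP_OAUTH2_TOKEN_CACHE_MIN_TTL` and its 10s default come from the commit message; reading it from an environment variable of the same name is an assumption, and the buffer and default-TTL constants (prefixed `EXAMPLE_`) are hypothetical.

```python
import os
from typing import Optional

# Named in the commit message (10s floor); the env var name is assumed to match.
MCP_OAUTH2_TOKEN_CACHE_MIN_TTL = int(os.getenv("MCP_OAUTH2_TOKEN_CACHE_MIN_TTL", "10"))

# Hypothetical constants, introduced here for illustration only.
EXPIRY_BUFFER_SECONDS = int(os.getenv("EXAMPLE_MCP_OAUTH2_EXPIRY_BUFFER_SECONDS", "60"))
DEFAULT_TTL_SECONDS = int(os.getenv("EXAMPLE_MCP_OAUTH2_DEFAULT_TTL_SECONDS", "3600"))


def compute_cache_ttl(expires_in: Optional[int]) -> int:
    """Cache a token for slightly less than its lifetime, but never below the floor."""
    if expires_in is None:
        # The token endpoint did not report a lifetime; fall back to a default.
        return DEFAULT_TTL_SECONDS
    # Without the floor, a short-lived token (expires_in <= buffer) would get a
    # TTL of zero or less and be refetched on every request.
    return max(expires_in - EXPIRY_BUFFER_SECONDS, MCP_OAUTH2_TOKEN_CACHE_MIN_TTL)
```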

…plementation

Tests in test_oauth2_token_cache.py expect MCPServer.has_client_credentials to return True when client_id/client_secret are set, but the property requires oauth2_flow='client_credentials', which is not set by the test fixtures or _execute_with_mcp_client. Skipping until the feature is fully implemented (PR #20788).

Co-authored-by: yuneng-jiang <yuneng-jiang@users.noreply.github.com>

ishaan-jaff added 5 commits February 9, 2026 14:41

add has_client_credentials

c1636c6

MCPOAuth2TokenCache

9b3fd6a

init MCP Oauth2 constants

b7ea713

MCPOAuth2TokenCache

3c25cda

resolve_mcp_auth

2579169

vercel bot deployed to Preview February 9, 2026 22:52

test fixes

2ac13de

greptile-apps bot reviewed Feb 9, 2026

ishaan-jaff and others added 2 commits February 9, 2026 15:21

docs fix

1d8de53

vercel bot deployed to Preview February 9, 2026 23:33

ishaan-jaff added 2 commits February 9, 2026 15:35

fix mypy lint

08a6197

fix oauth2

4f12a0f

vercel bot deployed to Preview February 9, 2026 23:38

greptile-apps bot reviewed Feb 9, 2026

ishaan-jaff added 2 commits February 9, 2026 15:41

remove old files

3d8207f

docs fix

d12ec23

vercel bot deployed to Preview February 9, 2026 23:44

greptile-apps bot reviewed Feb 9, 2026

address greptile comments

3a0b725

vercel bot deployed to Preview February 10, 2026 00:02

greptile-apps bot reviewed Feb 10, 2026

fix: atomic lock creation + validate JSON response shape

e1cd2ea

- Use dict.setdefault() for atomic per-server lock creation
- Add isinstance(body, dict) check before accessing token response fields

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
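
As an illustration of the response-shape check described in this commit, the sketch below validates a decoded token response with explicit `if`/`raise` guards rather than `assert`. The function name `parse_token_response` and the exact error messages are hypothetical; only the `isinstance(body, dict)` check and the `access_token`/`expires_in` fields come from the PR discussion.

```python
from typing import Any, Optional, Tuple


def parse_token_response(server_id: str, body: Any) -> Tuple[str, Optional[int]]:
    """Validate a decoded OAuth2 token response and return (access_token, expires_in)."""
    # A token endpoint may return valid JSON that is not an object (a list, a string).
    if not isinstance(body, dict):
        raise ValueError(
            f"OAuth2 token response for MCP server '{server_id}' "
            f"is not a JSON object (got {type(body).__name__})"
        )

    access_token = body.get("access_token")
    if not isinstance(access_token, str) or not access_token:
        raise ValueError(
            f"OAuth2 token response for MCP server '{server_id}' "
            "is missing 'access_token'"
        )

    # expires_in is optional; some servers send it as a numeric string.
    expires_in = body.get("expires_in")
    if expires_in is not None:
        try:
            expires_in = int(expires_in)
        except (TypeError, ValueError):
            raise ValueError(
                f"OAuth2 token response for MCP server '{server_id}' "
                "has a non-numeric 'expires_in'"
            ) from None

    return access_token, expires_in
```

Explicit guards keep these checks active even when Python runs with `-O`, which strips `assert` statements.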

vercel bot deployed to Preview February 10, 2026 01:17

greptile-apps bot reviewed Feb 10, 2026

vercel bot deployed to Preview February 10, 2026 01:21

greptile-apps bot reviewed Feb 10, 2026

BerriAI deleted a comment from greptile-apps bot Feb 10, 2026

ishaan-jaff merged commit 19024e0 into main Feb 10, 2026
56 of 67 checks passed

greptile-apps bot reviewed Feb 10, 2026

ishaan-jaff mentioned this pull request Mar 9, 2026

fix(mcp): don't auto-detect M2M OAuth from field presence #23187

Merged

7 tasks
