
[Feat] MCP Oauth2 Fixes - Add support for MCP M2M Oauth2 support#20788

Merged
ishaan-jaff merged 15 commits into main from litellm_mcp_oauth2_m2m
Feb 10, 2026

Conversation


@ishaan-jaff ishaan-jaff commented Feb 9, 2026

[Feat] MCP Oauth2 Fixes - Add support for MCP M2M Oauth2 support

Working M2M OAuth2 support for MCPs.

Screenshot 2026-02-09 at 3 08 08 PM

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory. Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature
✅ Test

Changes



greptile-apps bot commented Feb 9, 2026

Greptile Overview

Greptile Summary

This PR adds OAuth2 client_credentials support for MCP servers by introducing an in-memory token cache (oauth2_token_cache.py) and centralizing auth selection in resolve_mcp_auth(). MCPServerManager is updated to await auth resolution during MCP client creation, and MCPServer gains helpers to distinguish client-credential OAuth2 vs user-token OAuth2. New constants define cache sizing/TTL defaults.

Key integration points are MCP client creation (now async) and server discovery/health-check logic that skips servers requiring per-user OAuth2 tokens.

Confidence Score: 2/5

  • This PR needs fixes before merging due to a definite OAuth2 server classification bug affecting runtime behavior.
  • Confidence is reduced because needs_user_oauth_token currently compares incompatible enum types, so OAuth2 user-token servers will be misclassified and no longer skipped in tool discovery / health checks. There’s also a concrete edge case where computed TTL can be zero, potentially causing repeated token fetches under load, plus synchronous JSON parsing in an async request path.
  • litellm/types/mcp_server/mcp_server_manager.py, litellm/proxy/_experimental/mcp_server/mcp_server_manager.py, litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py
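
The enum-mismatch concern above can be illustrated with a minimal sketch. The two enum classes and both helper functions below are hypothetical stand-ins, not LiteLLM's actual types; the sketch only assumes plain `Enum` classes (an enum that also subclasses `str` compares differently).

```python
from enum import Enum


class AuthTypeA(Enum):
    """Hypothetical enum used for the stored auth_type field."""

    oauth2 = "oauth2"


class AuthTypeB(Enum):
    """Hypothetical second enum that happens to share the same value."""

    oauth2 = "oauth2"


def needs_user_token_buggy(auth_type: AuthTypeA) -> bool:
    # Members of different plain Enum classes never compare equal,
    # even when their values match, so this is always False.
    return auth_type == AuthTypeB.oauth2


def needs_user_token_fixed(auth_type: AuthTypeA) -> bool:
    # Comparing against a member of the same enum class works as intended.
    return auth_type == AuthTypeA.oauth2
```

With this shape, a server flagged as `AuthTypeA.oauth2` would never be skipped by the buggy helper, which matches the misclassification the review describes.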

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/constants.py | Adds default constants for MCP OAuth2 token cache sizing/TTL/expiry buffer; values are currently hard-coded (not env-configurable). |
| litellm/proxy/_experimental/mcp_server/mcp_server_manager.py | Makes MCP client creation async and centralizes auth resolution via resolve_mcp_auth; introduces a bug where OAuth2 user-token servers may not be detected correctly due to enum mismatch in MCPServer.needs_user_oauth_token usage. |
| litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py | Introduces an in-memory async OAuth2 client_credentials token cache and auth resolver; watch for ttl=0 causing refetch loops and synchronous JSON parsing in async path. |
| litellm/types/mcp_server/mcp_server_manager.py | Adds MCPServer helpers has_client_credentials and needs_user_oauth_token; needs_user_oauth_token currently compares incompatible enum types so it will never be true. |

Sequence Diagram

sequenceDiagram
    autonumber
    participant Req as Request
    participant MSM as MCPServerManager
    participant Auth as Auth resolver
    participant Cache as Token cache
    participant Tok as OAuth2 endpoint
    participant Srv as MCP server

    Req->>MSM: create client (server, optional override)
    MSM->>Auth: resolve auth
    alt override provided
        Auth-->>MSM: override
    else client-credentials configured
        Auth->>Cache: get token
        alt cache hit
            Cache-->>Auth: token
        else cache miss
            Cache->>Tok: fetch token
            Tok-->>Cache: token response
            Cache-->>Auth: token
        end
        Auth-->>MSM: token
    else static token
        Auth-->>MSM: static token
    end
    MSM->>Srv: instantiate MCP client with auth
    Srv-->>Req: response

@greptile-apps greptile-apps bot left a comment

4 files reviewed, 4 comments

ishaan-jaff and others added 2 commits February 9, 2026 15:21
…docs

- Fix zero-TTL edge case: floor at MCP_OAUTH2_TOKEN_CACHE_MIN_TTL (10s)
- Make all MCP OAuth2 constants env-configurable via os.getenv()
- Move test file to follow 1:1 mapping convention (test_oauth2_token_cache.py)
- Add MCP OAuth doc page (mcp_oauth.md) with M2M and PKCE sections
- Update FAQ in mcp.md to reflect M2M support
- Add E2E test script and config

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
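
The zero-TTL fix in this commit message can be sketched as follows. Only the 10-second floor comes from the commit message; the buffer value, the constant names, and the `compute_cache_ttl` helper are assumptions for illustration.

```python
# Assumed values for illustration; only the 10s floor is stated in the commit message.
EXPIRY_BUFFER_SECONDS = 60
MIN_TTL_SECONDS = 10


def compute_cache_ttl(
    expires_in: int,
    buffer: int = EXPIRY_BUFFER_SECONDS,
    min_ttl: int = MIN_TTL_SECONDS,
) -> int:
    # Subtract a safety buffer so the token is refreshed before it expires,
    # but never let the TTL drop to zero or below: a zero TTL would make
    # every request miss the cache and refetch the token.
    return max(expires_in - buffer, min_ttl)
```

For example, a token with `expires_in=60` and a 60-second buffer would otherwise get a TTL of 0; with the floor it is cached for 10 seconds.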
@ishaan-jaff
Member Author

@greptile can u review again


greptile-apps bot commented Feb 9, 2026

Greptile Overview

Greptile Summary

This PR adds OAuth2 machine-to-machine (client_credentials) support for MCP servers by introducing an in-memory token cache and a single auth-resolution function (resolve_mcp_auth) that prioritizes (1) per-request override headers, then (2) cached OAuth2 tokens, then (3) static configured tokens. It also updates MCPServerManager._create_mcp_client to be async so it can fetch tokens when needed, and adjusts tool discovery/health-check skipping to only skip OAuth2 servers that still require per-user tokens.
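
The priority order described above (per-request override, then an OAuth2 client_credentials token, then the static token) can be sketched roughly as below. `ServerConfig`, `resolve_auth`, and the injected `fetch_token` callable are hypothetical names for illustration and are not the PR's actual signatures.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class ServerConfig:
    """Hypothetical stand-in for an MCP server config entry."""

    server_id: str
    client_id: Optional[str] = None
    client_secret: Optional[str] = None
    token_url: Optional[str] = None
    authentication_token: Optional[str] = None

    @property
    def has_client_credentials(self) -> bool:
        # All three fields are needed for a client_credentials grant.
        return bool(self.client_id and self.client_secret and self.token_url)


async def resolve_auth(
    server: ServerConfig,
    override: Optional[str],
    fetch_token: Callable[[ServerConfig], Awaitable[str]],
) -> Optional[str]:
    # 1. A per-request header override always wins.
    if override:
        return override
    # 2. Otherwise use a (cached or freshly fetched) M2M token.
    if server.has_client_credentials:
        return await fetch_token(server)
    # 3. Fall back to the statically configured token, if any.
    return server.authentication_token
```

Passing the token fetcher in as a parameter keeps the sketch testable without a real token endpoint.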

Docs are updated to reflect the new OAuth behavior and add a dedicated mcp_oauth.md guide with PKCE and M2M flow diagrams. Tests are updated/added to cover token caching and the new auth priority logic.

Must-fix before merge: the new E2E shell script does not start the proxy and references PROXY_PID without setting it (will fail under set -u), and the new doc file is missing a trailing newline.

Confidence Score: 3/5

  • This PR is close to mergeable, but it includes a broken E2E script that will fail as written.
  • Core OAuth2 M2M token caching and auth-resolution changes look internally consistent and call sites were updated for async _create_mcp_client, but the added E2E script does not start the proxy and will error under set -u due to PROXY_PID being unset. Docs also have a minor formatting issue (missing newline).
  • tests/mcp_tests/test_oauth2_e2e.sh, docs/my-website/docs/mcp_oauth.md

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/my-website/docs/mcp_oauth.md | New MCP OAuth guide covering PKCE and M2M flows; currently missing trailing newline at EOF (fix required). |
| litellm/constants.py | Introduces env-configurable constants for OAuth2 token caching behavior (expiry buffer, max size, default/min TTL). |
| litellm/proxy/_experimental/mcp_server/mcp_server_manager.py | Centralizes auth resolution (header override → OAuth2 client_credentials cache → static token), makes _create_mcp_client async, and updates discovery/health-check skip logic via needs_user_oauth_token. |
| litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py | Adds in-memory OAuth2 client_credentials token fetch + per-server locking and resolve_mcp_auth priority helper. |
| litellm/proxy/_experimental/mcp_server/rest_endpoints.py | Updates REST endpoint execution path to await the now-async _create_mcp_client. |
| litellm/types/mcp_server/mcp_server_manager.py | Adds MCPServer.has_client_credentials and needs_user_oauth_token properties to distinguish M2M OAuth2 vs per-user token flows. |
| tests/mcp_tests/test_oauth2_e2e.sh | Adds an E2E shell script for OAuth2 client_credentials, but it currently never starts the proxy and references PROXY_PID without setting it (will fail under set -u). |
| tests/test_litellm/proxy/_experimental/mcp_server/test_oauth2_token_cache.py | Adds unit tests for resolve_mcp_auth priority, caching, and needs_user_oauth_token behavior. |

Sequence Diagram

sequenceDiagram
    participant Caller as MCP REST / Manager
    participant Manager as MCPServerManager
    participant Resolver as resolve_mcp_auth()
    participant Cache as MCPOAuth2TokenCache
    participant Auth as OAuth2 Token URL
    participant MCP as MCP Server

    Caller->>Manager: _create_mcp_client(server, mcp_auth_header)
    Manager->>Resolver: resolve_mcp_auth(server, mcp_auth_header)

    alt mcp_auth_header provided
        Resolver-->>Manager: mcp_auth_header (override)
    else server has client_credentials
        Resolver->>Cache: async_get_token(server)
        alt token cached
            Cache-->>Resolver: access_token
        else cache miss
            Cache->>Auth: POST token_url (grant_type=client_credentials)
            Auth-->>Cache: {access_token, expires_in}
            Cache-->>Resolver: access_token (cached w/ TTL)
        end
        Resolver-->>Manager: access_token
    else no client_credentials
        Resolver-->>Manager: server.authentication_token
    end

    Manager->>MCP: MCP request (Authorization derived from auth_type + auth_value)
    MCP-->>Manager: MCP response
    Manager-->>Caller: MCP response

@greptile-apps greptile-apps bot left a comment

8 files reviewed, 2 comments

Comment on lines +1 to +3
#!/usr/bin/env bash
# E2E test for OAuth2 client_credentials MCP flow
# Usage: bash tests/mcp_tests/test_oauth2_e2e.sh
Contributor

Broken E2E script cleanup

With set -euo pipefail, this script will error on exit because cleanup() references PROXY_PID but the variable is never set (and the proxy is never started). As written, the curl calls to :$PROXY_PORT will also fail because nothing launches LiteLLM. Either start the proxy and set PROXY_PID, or guard kill/wait with ${PROXY_PID:-} (and similarly add a missing “Step 2” that runs litellm --config ...).

Member Author

No E2E script exists in this branch — it was removed from the PR.

Comment on lines +188 to +191
| `auth_type` | Yes | Must be `oauth2` |
| `client_id` | Yes | OAuth2 client ID. Supports `os.environ/VAR_NAME` |
| `client_secret` | Yes | OAuth2 client secret. Supports `os.environ/VAR_NAME` |
| `token_url` | Yes | Token endpoint URL |
Contributor

Missing trailing newline

This file is missing a final newline (\ No newline at end of file in the diff). This commonly trips formatting/lint checks; add a newline at EOF.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Member Author

File already has a trailing newline — verified with xxd.

@ishaan-jaff
Member Author

@greptile review again


greptile-apps bot commented Feb 9, 2026

Greptile Overview

Greptile Summary

This PR adds first-class OAuth2 client_credentials (machine-to-machine) support for MCP servers by introducing an async token fetch + in-memory TTL cache (oauth2_token_cache.py) and centralizing auth resolution in MCPServerManager._create_mcp_client() (per-request override → cached M2M token → static configured token). It also updates tool discovery and health-check skip behavior to only skip OAuth2 servers that require user-provided tokens, adds tests around auth priority/caching behavior, and ships new docs + sidebar entries for the MCP OAuth guide.

Main merge-blockers are around robustness and leakage in the new token fetch path: the token fetch currently assumes well-formed token responses and can raise unexpectedly on expires_in: null/invalid types, and it logs/echoes potentially sensitive token endpoint/body details in ways that can end up in logs on failures.
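
One way to tolerate `expires_in: null` or an invalid type, as flagged above, is to coerce the value defensively and fall back to a default. This is only a sketch: `parse_expires_in` and the default of 3600 seconds are assumptions, not the PR's code.

```python
from typing import Any

DEFAULT_TTL_SECONDS = 3600  # assumed default for illustration


def parse_expires_in(raw: Any, default: int = DEFAULT_TTL_SECONDS) -> int:
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(raw, bool):
        return default
    try:
        value = int(raw)
    except (TypeError, ValueError, OverflowError):
        # None, lists, non-numeric strings, NaN and infinity all land here.
        return default
    # A non-positive lifetime is not usable as a cache TTL.
    return value if value > 0 else default
```

Falling back to a default rather than raising keeps a slightly non-standard token endpoint from turning every MCP request into an error.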

Confidence Score: 3/5

  • This PR is close to safe to merge, but the new OAuth2 token fetch path needs a couple robustness/logging fixes to avoid runtime failures or sensitive-data leakage.
  • Core design (async auth resolution + per-server locking + TTL cache) is sound and tests cover expected behavior, but the new token fetch code assumes certain response shapes and includes potentially sensitive details in logs/exceptions. Fixing those edge cases should reduce operational risk significantly.
  • litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/my-website/docs/mcp_oauth.md | Adds new MCP OAuth documentation with PKCE + M2M sections and Mermaid diagrams; ensure doc build accepts the new page. |
| litellm/constants.py | Adds env-configurable constants for MCP OAuth2 token caching (expiry buffer, cache size, default TTL, min TTL). |
| litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py | Introduces OAuth2 client_credentials token fetch + in-memory cache and auth resolution helper; has a few edge cases where token responses/logging could leak or raise unexpectedly. |
| litellm/proxy/_experimental/mcp_server/mcp_server_manager.py | Switches MCP client creation to async and centralizes auth resolution (per-request override > M2M token > static token) and skips only OAuth2 servers that need user tokens. |
| litellm/proxy/_experimental/mcp_server/rest_endpoints.py | Updates REST endpoints to await async MCP client creation in test flows; change is localized. |
| litellm/proxy/proxy_config.yaml | Adds a localhost test OAuth2 MCP server example config; may not be appropriate to ship in default proxy_config.yaml for all deployments. |
| litellm/types/mcp_server/mcp_server_manager.py | Adds MCPServer helpers for client_credentials presence and whether user OAuth token is needed; used to drive tool discovery/health-check skip logic. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_oauth2_token_cache.py | Adds new unit tests covering M2M token fetch, caching, auth priority, and needs_user_oauth_token property. |

Sequence Diagram

sequenceDiagram
    participant Client as MCP Client
    participant Proxy as LiteLLM Proxy
    participant Cache as MCPOAuth2TokenCache
    participant Auth as OAuth2 Token Endpoint
    participant MCP as MCP Server

    Client->>Proxy: MCP REST/JSON-RPC request
    Note over Proxy: resolve_mcp_auth()
    Proxy->>Cache: async_get_token(server)
    alt Token cached
        Cache-->>Proxy: access_token
    else Cache miss
        Cache->>Cache: acquire per-server lock
        Cache->>Auth: POST token_url (client_credentials)
        Auth-->>Cache: access_token + expires_in
        Cache-->>Proxy: access_token (cached with TTL)
    end
    Proxy->>MCP: MCP request (Bearer access_token)
    MCP-->>Proxy: MCP response
    Proxy-->>Client: Response

@greptile-apps greptile-apps bot left a comment

8 files reviewed, 4 comments

@ishaan-jaff
Member Author

@greptile review again

@ishaan-jaff
Member Author

@greptile review again


greptile-apps bot commented Feb 10, 2026

Greptile Overview

Greptile Summary

This PR adds first-class machine-to-machine (M2M) OAuth2 client_credentials support for MCP servers by introducing an in-memory token cache (oauth2_token_cache.py) and routing all MCP client auth selection through a single resolver (resolve_mcp_auth). MCPServerManager._create_mcp_client is made async so it can fetch/cache tokens before instantiating the MCP client, and server discovery/health-check logic now only skips OAuth2 servers that require per-user tokens (needs_user_oauth_token). Docs are expanded with a new mcp_oauth.md guide and sidebar link, and tests are updated/added to cover auth priority and token caching behavior.

Key integration points:

  • _create_mcp_client becomes async and is awaited by tool listing/calling and REST test endpoints.
  • resolve_mcp_auth() determines which auth value is used per request (header override > cached M2M token > static config token).
  • Token caching is per-process (in-memory) via InMemoryCache and controlled by new env-configurable constants in litellm/constants.py.
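
The per-process, in-memory caching mentioned in the last bullet can be pictured with a small TTL cache. This is a sketch, not LiteLLM's `InMemoryCache`: the `TTLTokenCache` class and its injectable clock are assumptions made so the expiry behavior is easy to follow.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLTokenCache:
    """Hypothetical minimal in-memory cache with per-entry expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # Maps server_id -> (token, absolute expiry time on the clock).
        self._entries: Dict[str, Tuple[str, float]] = {}

    def set(self, server_id: str, token: str, ttl: float) -> None:
        self._entries[server_id] = (token, self._clock() + ttl)

    def get(self, server_id: str) -> Optional[str]:
        entry = self._entries.get(server_id)
        if entry is None:
            return None
        token, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller fetches a new token.
            del self._entries[server_id]
            return None
        return token
```

Because the dictionary lives inside one process, each proxy worker would hold its own copy and fetch its own token, which is the per-process semantics the summary refers to.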

Confidence Score: 3/5

  • Reasonably safe to merge after fixing a couple of concurrency/error-handling issues in the new OAuth2 token cache.
  • Core logic is localized and covered by tests, but the new per-server lock map is not created atomically and the token response parsing assumes a JSON object, both of which can cause real runtime issues under concurrency or non-standard token responses.
  • litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/my-website/docs/mcp.md | Updates MCP OAuth section to point to new OAuth guide and clarifies that client_credentials is now proxy-managed. |
| docs/my-website/docs/mcp_oauth.md | Adds new MCP OAuth documentation with mermaid diagrams and M2M setup instructions; no code issues but references an external mock server script not included here. |
| docs/my-website/sidebars.js | Adds mcp_oauth doc to sidebar navigation. |
| litellm/constants.py | Adds env-configurable constants for OAuth2 token caching; risk if env vars set to invalid non-int strings causing startup ValueError. |
| litellm/proxy/_experimental/mcp_server/mcp_server_manager.py | Makes _create_mcp_client async and centralizes auth resolution via resolve_mcp_auth; introduces await call sites and skip logic via needs_user_oauth_token. |
| litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py | Introduces in-memory OAuth2 client_credentials token cache and resolver; possible KeyError in lock map under concurrent access and per-process cache semantics. |
| litellm/proxy/_experimental/mcp_server/rest_endpoints.py | Updates REST test endpoints to await async _create_mcp_client; otherwise unchanged. |
| litellm/types/mcp_server/mcp_server_manager.py | Adds has_client_credentials/needs_user_oauth_token helpers; needs_user_oauth_token uses MCPAuth enum which must match auth_type runtime values. |
| tests/mcp_tests/test_mcp_auth_priority.py | Adjusts tests for async _create_mcp_client and validates header priority remains correct. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_server.py | Updates mocked _create_mcp_client to async for oauth2 header passing test. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_server_manager.py | Updates manager tests to treat _create_mcp_client as async and switches MagicMock to AsyncMock where needed. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_oauth2_token_cache.py | Adds unit tests for resolve_mcp_auth, caching behavior, and needs_user_oauth_token property. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_rest_endpoints.py | Updates REST endpoint tests to mock async client creation. |
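
The litellm/constants.py entry above notes that a non-integer environment value could raise `ValueError` at startup. One possible guard is sketched below; the `env_int` helper and the variable name used in the usage note are hypothetical and not part of the PR.

```python
import os


def env_int(name: str, default: int) -> int:
    # Read an integer setting from the environment, falling back to the
    # default when the variable is unset, empty, or not a valid integer.
    raw = os.getenv(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        return int(raw)
    except ValueError:
        return default
```

A call such as `env_int("EXAMPLE_TOKEN_CACHE_TTL", 3600)` would then return 3600 for a value like `"abc"` instead of failing at import time. Whether silently falling back is preferable to failing fast is a design choice the PR thread does not settle.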

Sequence Diagram

sequenceDiagram
    participant MCPClient as MCP Client
    participant REST as /mcp-rest
    participant Manager as MCPServerManager
    participant Cache as MCPOAuth2TokenCache
    participant Auth as OAuth2 Token URL
    participant MCP as MCP Server

    MCPClient->>REST: POST /mcp-rest/tools/call (optional mcp_auth_header)
    REST->>Manager: execute_mcp_tool / _get_tools_from_server
    Manager->>Cache: resolve_mcp_auth(server, mcp_auth_header)
    alt mcp_auth_header provided
        Cache-->>Manager: return mcp_auth_header
    else server.has_client_credentials
        Cache->>Cache: get_cache(server_id)
        alt cache hit
            Cache-->>Manager: return cached access_token
        else cache miss
            Cache->>Auth: POST token_url (client_credentials)
            Auth-->>Cache: {access_token, expires_in}
            Cache->>Cache: set_cache(server_id, access_token, ttl)
            Cache-->>Manager: return access_token
        end
    else
        Cache-->>Manager: return server.authentication_token
    end
    Manager->>MCP: MCP request with Authorization
    MCP-->>Manager: MCP response
    Manager-->>REST: tool response
    REST-->>MCPClient: JSON response

@greptile-apps greptile-apps bot left a comment

13 files reviewed, 2 comments

Comment on lines +41 to +44
def _get_lock(self, server_id: str) -> asyncio.Lock:
    if server_id not in self._locks:
        self._locks[server_id] = asyncio.Lock()
    return self._locks[server_id]
Contributor

Non-atomic lock creation

_get_lock() mutates self._locks with an if server_id not in self._locks: self._locks[server_id] = asyncio.Lock() check. Under concurrent calls, this check/insert can interleave and create multiple locks for the same server_id, which defeats the “single in-flight fetch per server” guarantee and can lead to duplicate token fetches. Consider making lock creation atomic (e.g., using setdefault) or initializing locks in a thread-safe way.

Member Author

Fixed — _get_lock() now uses self._locks.setdefault(server_id, asyncio.Lock()) which is atomic and avoids creating duplicate locks under concurrent calls.
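
The "single in-flight fetch per server" idea discussed in this thread can be sketched with `setdefault` plus a second cache check inside the lock. `SingleFlightTokens` and its injected `fetch` callable are hypothetical names; the real cache class in the PR may be structured differently.

```python
import asyncio
from typing import Awaitable, Callable, Dict


class SingleFlightTokens:
    """Hypothetical sketch: at most one token fetch per server at a time."""

    def __init__(self, fetch: Callable[[str], Awaitable[str]]) -> None:
        self._fetch = fetch
        self._locks: Dict[str, asyncio.Lock] = {}
        self._tokens: Dict[str, str] = {}

    def _get_lock(self, server_id: str) -> asyncio.Lock:
        # setdefault performs the lookup and insert in a single dict call.
        return self._locks.setdefault(server_id, asyncio.Lock())

    async def get(self, server_id: str) -> str:
        # Fast path: no locking needed when the token is already cached.
        if server_id in self._tokens:
            return self._tokens[server_id]
        async with self._get_lock(server_id):
            # Re-check after acquiring the lock: another task may have
            # fetched the token while this one was waiting.
            if server_id in self._tokens:
                return self._tokens[server_id]
            token = await self._fetch(server_id)
            self._tokens[server_id] = token
            return token
```

Note that `setdefault` still constructs a throwaway `asyncio.Lock()` on every call because the argument is evaluated eagerly; that is cheap, but it is a trade-off of this form.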

Comment on lines +98 to +102
body = response.json()

access_token = body.get("access_token")
if not access_token:
    raise ValueError(
Contributor

JSON shape assumptions

body = response.json() is assumed to be a dict (body.get(...)). If the token endpoint returns a non-object JSON payload (e.g., a list or string on some error paths), this will raise AttributeError and surface as an unexpected 500 rather than a clearer OAuth/token error. Adding an explicit isinstance(body, dict) check before accessing .get() would make failures deterministic.

Member Author

Fixed — added isinstance(body, dict) check after response.json(). Non-dict responses now raise a clear ValueError with the type name instead of an unexpected AttributeError.
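
A minimal sketch of the shape check described in this reply might look like the following. The function name and the exact error wording are assumptions; only the `isinstance(body, dict)` guard and the `ValueError` carrying the type name come from the thread.

```python
from typing import Any


def extract_access_token(body: Any) -> str:
    # Reject lists, strings, and other non-object JSON payloads up front so
    # the failure is a clear ValueError rather than an AttributeError.
    if not isinstance(body, dict):
        raise ValueError(
            f"OAuth2 token response must be a JSON object, got {type(body).__name__}"
        )
    token = body.get("access_token")
    if not isinstance(token, str) or not token:
        raise ValueError("OAuth2 token response is missing 'access_token'")
    return token
```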

- Use dict.setdefault() for atomic per-server lock creation
- Add isinstance(body, dict) check before accessing token response fields

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ishaan-jaff
Member Author

@greptile review again


greptile-apps bot commented Feb 10, 2026

Greptile Overview

Greptile Summary

This PR adds first-class OAuth2 client_credentials (machine-to-machine) support for MCP servers by introducing an async in-memory token cache and centralizing auth resolution in MCPServerManager (per-request header override → cached OAuth2 token → static configured token). It also updates tool discovery/health checks to only skip OAuth2 servers that require user-provided tokens, and adds docs + tests for the new behavior.

Confidence Score: 4/5

  • This PR is close to merge-ready, but at least one updated async test is likely not executing as intended.
  • Core OAuth2 token caching/auth-resolution changes are cohesive and include targeted tests, but the test suite changes include an async def test without an asyncio marker, which can cause silent non-execution or failures depending on pytest configuration.
  • tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_server_manager.py

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/my-website/docs/mcp_oauth.md | Adds new MCP OAuth doc covering PKCE + client_credentials with mermaid diagrams and setup examples. |
| litellm/proxy/_experimental/mcp_server/mcp_server_manager.py | Makes _create_mcp_client async and centralizes auth resolution (per-request header → M2M token → static token); updates tool discovery/health-check skip logic. |
| litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py | Introduces async in-memory cache and fetcher for OAuth2 client_credentials tokens with per-server locks and safer JSON/TTL parsing. |
| litellm/types/mcp_server/mcp_server_manager.py | Adds MCPServer.has_client_credentials and needs_user_oauth_token convenience properties. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_server_manager.py | Converts tests/mocks to AsyncMock for async _create_mcp_client; one test method became async without an asyncio marker. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_oauth2_token_cache.py | Adds unit tests for resolve_mcp_auth priority, caching, and needs_user_oauth_token behavior. |

Sequence Diagram

sequenceDiagram
    participant Client as MCP Client
    participant LiteLLM as LiteLLM Proxy
    participant Cache as OAuth2 Token Cache
    participant Auth as OAuth2 Token Endpoint
    participant MCP as MCP Server

    Client->>LiteLLM: MCP REST / tools call
    LiteLLM->>Cache: resolve_mcp_auth(server, mcp_auth_header?)
    alt per-request header provided
        Cache-->>LiteLLM: return mcp_auth_header
    else has client_credentials configured
        Cache->>Cache: get_cache(server_id)
        alt cache hit
            Cache-->>LiteLLM: cached access_token
        else cache miss
            Cache->>Cache: lock(server_id)
            Cache->>Auth: POST token_url (grant_type=client_credentials)
            Auth-->>Cache: {access_token, expires_in}
            Cache->>Cache: set_cache(server_id, token, ttl=expires_in-buffer)
            Cache-->>LiteLLM: access_token
        end
    else
        Cache-->>LiteLLM: server.authentication_token
    end
    LiteLLM->>MCP: MCP request with resolved Authorization
    MCP-->>LiteLLM: MCP response
    LiteLLM-->>Client: response

@greptile-apps greptile-apps bot left a comment

6 files reviewed, 1 comment


greptile-apps bot commented Feb 10, 2026

Additional Comments (1)

tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_server_manager.py
Async test not awaited

test_create_mcp_client_stdio was changed to async def but isn’t marked with @pytest.mark.asyncio (and doesn’t appear to be part of an IsolatedAsyncioTestCase). In pytest this typically means the coroutine is never awaited and the assertions won’t run (or the test will error depending on config). Add the asyncio marker (or convert back to sync) so this test actually executes.
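
Why an unmarked `async def` test may never run its assertions can be seen with plain Python, independent of pytest. The `demo` function below is a hypothetical illustration: calling a coroutine function only creates a coroutine object, and the body executes only once something awaits or runs it.

```python
import asyncio
import inspect
from typing import List, Tuple


def demo() -> Tuple[bool, List[str], List[str]]:
    executed: List[str] = []

    async def test_body() -> None:
        # Stands in for the assertions of an async test.
        executed.append("ran")

    # Calling the function does not run the body; it returns a coroutine.
    coro = test_body()
    before = list(executed)

    # Only when an event loop drives the coroutine does the body execute.
    asyncio.run(coro)
    after = list(executed)
    return inspect.iscoroutine(coro), before, after
```

In a pytest suite, the `@pytest.mark.asyncio` marker mentioned above (or an equivalent async test mode) is what supplies the event loop that drives the coroutine.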

- Replace `assert` statements with `if/raise ValueError` (asserts can be
  disabled with python -O in production)
- Wrap `httpx.HTTPStatusError` to provide a clear error message with
  server_id and status code
- Add tests for HTTP error and non-dict JSON response error paths
- Remove unused imports

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
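
The first two items of this commit message can be sketched without any HTTP library. Everything here is an assumption for illustration: `TokenEndpointError` stands in for `httpx.HTTPStatusError`, and both helper names and the message wording are hypothetical.

```python
from typing import Optional


class TokenEndpointError(Exception):
    """Hypothetical stand-in for an HTTP client's status error."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def validate_oauth_fields(
    server_id: str,
    client_id: Optional[str],
    client_secret: Optional[str],
    token_url: Optional[str],
) -> None:
    # An explicit if/raise stays active under `python -O`, which strips asserts.
    fields = {
        "client_id": client_id,
        "client_secret": client_secret,
        "token_url": token_url,
    }
    missing = [name for name, value in fields.items() if not value]
    if missing:
        raise ValueError(
            f"MCP server '{server_id}' missing required OAuth2 fields: "
            + ", ".join(missing)
        )


def wrap_status_error(server_id: str, exc: TokenEndpointError) -> ValueError:
    # Build an error that names the server and status code, keeping the
    # original exception attached as the cause for debugging.
    error = ValueError(
        f"OAuth2 token request for MCP server '{server_id}' "
        f"failed with status {exc.status_code}"
    )
    error.__cause__ = exc
    return error
```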
@ishaan-jaff
Member Author

@greptile review again

@greptile-apps greptile-apps bot left a comment

10 files reviewed, 1 comment

@BerriAI BerriAI deleted a comment from greptile-apps bot Feb 10, 2026
@BerriAI BerriAI deleted a comment from greptile-apps bot Feb 10, 2026
@ishaan-jaff ishaan-jaff merged commit 19024e0 into main Feb 10, 2026
56 of 67 checks passed

greptile-apps bot commented Feb 10, 2026

Greptile Overview

Greptile Summary

This PR adds first-class OAuth2 machine-to-machine (client_credentials) support for MCP servers. It introduces an in-memory OAuth2 token cache with per-server locks and TTL-based refresh, centralizes MCP auth resolution in one place (per-request header override > cached M2M token > static server token), and updates tool discovery / health checks to only skip OAuth2 servers that require user-provided tokens.

Documentation is expanded with a new MCP OAuth guide (PKCE + M2M) and sidebar wiring, and tests are updated/added to cover auth priority and token-cache behavior across the experimental MCP proxy endpoints.

Confidence Score: 3/5

  • This PR is close to mergeable, but there are a couple of concrete lifecycle/type-consistency issues to address around OAuth2 token fetching and OAuth2 detection.
  • Core flow (auth priority, caching/locking, tests/docs) is cohesive, but the token fetch path may leak httpx clients if the helper returns a new AsyncClient per call, and oauth2 server detection relies on auth_type == MCPAuth.oauth2 even though the field is typed as a Literal-based MCPAuthType; if auth_type is stored as a plain string, skip logic can be wrong. These should be fixed/verified before merging since they affect runtime behavior under load.
  • litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py; litellm/types/mcp_server/mcp_server_manager.py
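
For the type-consistency concern above, one defensive option is to normalize `auth_type` before comparing, so a plain string and an enum member are treated alike. The `AuthKind` enum and `is_oauth2` helper are hypothetical; whether LiteLLM's `MCPAuth` actually needs this depends on how that enum is defined, which this thread does not show.

```python
from enum import Enum
from typing import Union


class AuthKind(Enum):
    """Hypothetical plain enum used only for this illustration."""

    oauth2 = "oauth2"
    api_key = "api_key"


def is_oauth2(auth_type: Union[AuthKind, str, None]) -> bool:
    # Reduce enum members to their underlying value so that both
    # AuthKind.oauth2 and the raw string "oauth2" are recognized.
    value = auth_type.value if isinstance(auth_type, Enum) else auth_type
    return value == AuthKind.oauth2.value
```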

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/my-website/docs/mcp.md | Updates MCP docs to link to new MCP OAuth guide and clarifies that client_credentials tokens are now proxy-managed. |
| docs/my-website/docs/mcp_oauth.md | Adds dedicated documentation for MCP OAuth flows (PKCE + M2M client_credentials) including mermaid diagrams and local test instructions. |
| docs/my-website/sidebars.js | Adds new mcp_oauth doc page to the MCP docs sidebar. |
| litellm/constants.py | Introduces env-configurable constants for MCP OAuth2 token caching (expiry buffer, cache size, default TTL, min TTL). |
| litellm/proxy/_experimental/mcp_server/mcp_server_manager.py | Switches MCP client creation to async and centralizes auth resolution (header override > cached M2M token > static token); adjusts tool discovery/health-check skipping to use needs_user_oauth_token. |
| litellm/proxy/_experimental/mcp_server/oauth2_token_cache.py | Adds in-memory OAuth2 client_credentials token fetch + TTL cache + per-server locks and resolve_mcp_auth helper; possible issue if httpx client returned is not closed/shared. |
| litellm/proxy/_experimental/mcp_server/rest_endpoints.py | Updates REST endpoints to await async _create_mcp_client after manager change. |
| litellm/types/mcp_server/mcp_server_manager.py | Adds has_client_credentials / needs_user_oauth_token properties to MCPServer; needs_user_oauth_token may mis-detect oauth2 if auth_type is stored as a string literal rather than MCPAuth enum. |
| tests/mcp_tests/test_mcp_auth_priority.py | Updates auth priority tests to await async _create_mcp_client and marks the relevant test async. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_server.py | Adjusts MCP server tests to mock async _create_mcp_client. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_server_manager.py | Updates MCP server manager tests to use AsyncMock/new_callable=AsyncMock for async _create_mcp_client. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_oauth2_token_cache.py | Adds targeted tests for resolve_mcp_auth, token caching behavior, error wrapping, and needs_user_oauth_token logic. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_rest_endpoints.py | Updates REST endpoint tests to mock async _create_mcp_client for execute_with_mcp_client helpers. |

Sequence Diagram

sequenceDiagram
    participant Client as MCP Client
    participant Proxy as LiteLLM Proxy
    participant Cache as MCPOAuth2TokenCache
    participant Auth as OAuth2 Token Endpoint
    participant MCP as MCP Server

    Client->>Proxy: MCP request (may include Authorization override)
    Proxy->>Proxy: resolve_mcp_auth(server, mcp_auth_header)

    alt mcp_auth_header provided
        Proxy-->>Client: Use provided auth header value
    else server.has_client_credentials
        Proxy->>Cache: async_get_token(server)
        alt token cached
            Cache-->>Proxy: access_token
        else token missing/expired
            Cache->>Cache: acquire per-server lock
            Cache->>Auth: POST token_url (client_credentials)
            Auth-->>Cache: {access_token, expires_in}
            Cache->>Cache: set_cache(server_id, token, ttl)
            Cache-->>Proxy: access_token
        end
    else fallback
        Proxy-->>Client: Use server.authentication_token
    end

    Proxy->>MCP: Forward MCP request with resolved auth
    MCP-->>Proxy: MCP response
    Proxy-->>Client: Response
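
The flow in the diagram can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `Server`, `TokenCache`, `resolve_auth`, and the injected `fetch` callable are hypothetical stand-ins, and the token request is replaced by whatever coroutine the caller passes in. It shows the priority order (header override, then cached M2M token, then static token) and a per-server lock with a second cache check so that concurrent requests trigger only one token fetch.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, Optional, Tuple


@dataclass
class Server:
    """Hypothetical stand-in for MCPServer with only the fields used here."""

    server_id: str
    authentication_token: Optional[str] = None
    client_id: Optional[str] = None
    client_secret: Optional[str] = None
    token_url: Optional[str] = None

    @property
    def has_client_credentials(self) -> bool:
        return bool(self.client_id and self.client_secret and self.token_url)


# A fetcher returns (access_token, ttl_seconds); in the PR this step is an HTTP POST.
Fetcher = Callable[[Server], Awaitable[Tuple[str, float]]]


class TokenCache:
    def __init__(self, fetch: Fetcher) -> None:
        self._fetch = fetch
        self._tokens: Dict[str, Tuple[str, float]] = {}  # server_id -> (token, expiry)
        self._locks: Dict[str, asyncio.Lock] = {}

    def _cached(self, server_id: str) -> Optional[str]:
        entry = self._tokens.get(server_id)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]
        return None

    async def async_get_token(self, server: Server) -> str:
        token = self._cached(server.server_id)
        if token is not None:
            return token
        lock = self._locks.setdefault(server.server_id, asyncio.Lock())
        async with lock:
            # Re-check: another task may have refreshed the token while we waited.
            token = self._cached(server.server_id)
            if token is not None:
                return token
            token, ttl = await self._fetch(server)
            self._tokens[server.server_id] = (token, time.monotonic() + ttl)
            return token


async def resolve_auth(
    server: Server, mcp_auth_header: Optional[str], cache: TokenCache
) -> Optional[str]:
    if mcp_auth_header:  # 1. caller-supplied override wins
        return mcp_auth_header
    if server.has_client_credentials:  # 2. proxy-managed M2M token
        return await cache.async_get_token(server)
    return server.authentication_token  # 3. static token fallback
```

Injecting the fetcher keeps the sketch free of network calls and makes the "one fetch per expiry window" property easy to check in a test.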

@greptile-apps greptile-apps bot left a comment

13 files reviewed, 2 comments

Comment on lines +77 to +107
client = get_async_httpx_client(llm_provider=httpxSpecialProvider.MCP)

if not server.client_id or not server.client_secret or not server.token_url:
    raise ValueError(
        f"MCP server '{server.server_id}' missing required OAuth2 fields: "
        f"client_id={bool(server.client_id)}, "
        f"client_secret={bool(server.client_secret)}, "
        f"token_url={bool(server.token_url)}"
    )

data: Dict[str, str] = {
    "grant_type": "client_credentials",
    "client_id": server.client_id,
    "client_secret": server.client_secret,
}
if server.scopes:
    data["scope"] = " ".join(server.scopes)

verbose_logger.debug(
    "Fetching OAuth2 client_credentials token for MCP server %s",
    server.server_id,
)

try:
    response = await client.post(server.token_url, data=data)
    response.raise_for_status()
except httpx.HTTPStatusError as exc:
    raise ValueError(
        f"OAuth2 token request for MCP server '{server.server_id}' "
        f"failed with status {exc.response.status_code}"
    ) from exc

Unclosed HTTP client

_fetch_token() creates a new httpx client via get_async_httpx_client(...) but never closes it. If that helper returns a fresh AsyncClient (vs a shared singleton), this leaks connections/file descriptors over time as tokens refresh. Either ensure the helper returns a process-wide shared client for httpxSpecialProvider.MCP, or create/close the client in an async context manager for this call path.
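
The two remedies named in this comment can be contrasted with a small sketch. Everything here is illustrative: `StubAsyncClient` is a hypothetical stand-in for an async HTTP client such as `httpx.AsyncClient`, and `get_shared_client` is not the real `get_async_httpx_client` helper, whose caching behavior this thread does not state. The class-level counter only exists to make leaked clients visible.

```python
import asyncio
from typing import Dict


class StubAsyncClient:
    """Hypothetical stand-in for an async HTTP client (for example httpx.AsyncClient)."""

    open_count = 0  # number of clients currently holding resources

    def __init__(self) -> None:
        StubAsyncClient.open_count += 1
        self.closed = False

    async def aclose(self) -> None:
        if not self.closed:
            self.closed = True
            StubAsyncClient.open_count -= 1

    async def __aenter__(self) -> "StubAsyncClient":
        return self

    async def __aexit__(self, *exc_info: object) -> None:
        await self.aclose()


_shared_clients: Dict[str, StubAsyncClient] = {}


def get_shared_client(provider: str) -> StubAsyncClient:
    """Option 1: one long-lived client per provider key, reused by every call."""
    client = _shared_clients.get(provider)
    if client is None:
        client = _shared_clients[provider] = StubAsyncClient()
    return client


async def fetch_with_scoped_client() -> None:
    """Option 2: a client that lives only for this call and is always closed."""
    async with StubAsyncClient() as client:
        await asyncio.sleep(0)  # placeholder for `await client.post(...)`
```

With either option the number of open clients stays bounded no matter how many times a token is refreshed, which is the property the comment asks for.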

Comment on lines +60 to +67
def has_client_credentials(self) -> bool:
    """True if this server has OAuth2 client_credentials config (client_id, client_secret, token_url)."""
    return bool(self.client_id and self.client_secret and self.token_url)

@property
def needs_user_oauth_token(self) -> bool:
    """True if this is an OAuth2 server that relies on per-user tokens (no client_credentials)."""
    return self.auth_type == MCPAuth.oauth2 and not self.has_client_credentials

OAuth2 detection too narrow

needs_user_oauth_token only returns true when auth_type == MCPAuth.oauth2, but auth_type is typed as Optional[MCPAuthType] (from litellm.proxy._types). If callers populate auth_type with the string literal (e.g., 'oauth2') rather than the MCPAuth enum instance, this check will be false and OAuth2 servers could incorrectly not be skipped for tool discovery / health checks. Consider normalizing/comparing against the same representation MCPAuthType uses (string literal) or casting auth_type to MCPAuth consistently at model creation.
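
Whether the mismatch described here can actually occur depends on how the enum is declared, which this comment does not show. The sketch below uses two hypothetical enums, `PlainAuth` and `StrAuth`, to illustrate the difference, plus a hypothetical helper `is_oauth2` that compares on the underlying value so that both representations are accepted.

```python
from enum import Enum
from typing import Optional, Union


class PlainAuth(Enum):
    """Hypothetical plain enum: members never compare equal to raw strings."""

    oauth2 = "oauth2"


class StrAuth(str, Enum):
    """Hypothetical str-mixin enum: members compare equal to their string value."""

    oauth2 = "oauth2"


def is_oauth2(auth_type: Optional[Union[str, Enum]]) -> bool:
    """Compare on the underlying value so enum members and raw strings both match."""
    value = auth_type.value if isinstance(auth_type, Enum) else auth_type
    return value == "oauth2"
```

If the real enum mixes in `str`, the existing `==` check already tolerates string literals; if it does not, a normalizing comparison like `is_oauth2` (or casting at model creation, as the comment suggests) closes the gap.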

krrishdholakia pushed a commit that referenced this pull request Feb 10, 2026
…logging_payload is missing (#20851)

* fix: Preserved nullable object fields by carrying schema properties

* Fix: _convert_schema_types

* Fix all mypy issues

* Add alert about email notifications

* fixing tests

* extending timeout for long running tests

* Text changes

* [Feat] MCP Oauth2 Fixes - Add support for MCP M2M Oauth2 support (#20788)

* add has_client_credentials

* MCPOAuth2TokenCache

* init MCP Oauth2 constants

* MCPOAuth2TokenCache

* resolve_mcp_auth

* test fixes

* docs fix

* address greptile review: min TTL, env-configurable constants, tests, docs

- Fix zero-TTL edge case: floor at MCP_OAUTH2_TOKEN_CACHE_MIN_TTL (10s)
- Make all MCP OAuth2 constants env-configurable via os.getenv()
- Move test file to follow 1:1 mapping convention (test_oauth2_token_cache.py)
- Add MCP OAuth doc page (mcp_oauth.md) with M2M and PKCE sections
- Update FAQ in mcp.md to reflect M2M support
- Add E2E test script and config
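
A rough sketch of the first two bullets, under stated assumptions: only `MCP_OAUTH2_TOKEN_CACHE_MIN_TTL` and its 10-second floor come from the commit message. The other constant names, their default values, and the `compute_cache_ttl` helper are hypothetical. The idea is to subtract an expiry buffer from the token lifetime and clamp the result so a short-lived token never yields a zero or negative cache TTL.

```python
import os
from typing import Optional

# Only the 10-second floor is stated in the commit message. The other names and
# default values below are assumptions made for this sketch.
MCP_OAUTH2_TOKEN_EXPIRY_BUFFER_SECONDS = int(
    os.getenv("MCP_OAUTH2_TOKEN_EXPIRY_BUFFER_SECONDS", "60")
)
MCP_OAUTH2_TOKEN_CACHE_DEFAULT_TTL = int(
    os.getenv("MCP_OAUTH2_TOKEN_CACHE_DEFAULT_TTL", "3600")
)
MCP_OAUTH2_TOKEN_CACHE_MIN_TTL = int(os.getenv("MCP_OAUTH2_TOKEN_CACHE_MIN_TTL", "10"))


def compute_cache_ttl(
    expires_in: Optional[int],
    *,
    buffer: int = MCP_OAUTH2_TOKEN_EXPIRY_BUFFER_SECONDS,
    default_ttl: int = MCP_OAUTH2_TOKEN_CACHE_DEFAULT_TTL,
    min_ttl: int = MCP_OAUTH2_TOKEN_CACHE_MIN_TTL,
) -> int:
    """Return how long to cache a token, never less than min_ttl seconds."""
    # Fall back to the default lifetime when the token response omits expires_in.
    lifetime = default_ttl if expires_in is None else int(expires_in)
    # Refresh slightly early, but floor the result to avoid a zero or negative TTL.
    return max(lifetime - buffer, min_ttl)
```

For example, with a 60-second buffer a token reporting `expires_in=60` would otherwise be cached for 0 seconds; the floor turns that into 10.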

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix mypy lint

* fix oauth2

* remove old files

* docs fix

* address greptile comments

* fix: atomic lock creation + validate JSON response shape

- Use dict.setdefault() for atomic per-server lock creation
- Add isinstance(body, dict) check before accessing token response fields
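
The response-shape check in the second bullet might look something like the sketch below. The function name `parse_token_response` and its exact error messages are hypothetical; the field names `access_token` and `expires_in` are the ones shown in the sequence diagram for this PR. The point is that a token endpoint can return valid JSON that is not an object (a list, a string, `null`), so the type is checked before any key lookup.

```python
from typing import Any, Optional, Tuple


def parse_token_response(server_id: str, body: Any) -> Tuple[str, Optional[int]]:
    """Validate a decoded JSON token response and return (access_token, expires_in)."""
    if not isinstance(body, dict):
        raise ValueError(
            f"OAuth2 token response for MCP server '{server_id}' "
            f"is not a JSON object (got {type(body).__name__})"
        )

    access_token = body.get("access_token")
    if not isinstance(access_token, str) or not access_token:
        raise ValueError(
            f"OAuth2 token response for MCP server '{server_id}' "
            "is missing 'access_token'"
        )

    expires_in = body.get("expires_in")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(expires_in, bool) or not isinstance(expires_in, (int, float)):
        return access_token, None
    return access_token, int(expires_in)
```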

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: replace asserts with proper guards, wrap HTTP errors with context

- Replace `assert` statements with `if/raise ValueError` (asserts can be
  disabled with python -O in production)
- Wrap `httpx.HTTPStatusError` to provide a clear error message with
  server_id and status code
- Add tests for HTTP error and non-dict JSON response error paths
- Remove unused imports

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* [UI] M2M OAuth2 UI Flow  (#20794)

* add has_client_credentials

* MCPOAuth2TokenCache

* init MCP Oauth2 constants

* MCPOAuth2TokenCache

* resolve_mcp_auth

* test fixes

* docs fix

* address greptile review: min TTL, env-configurable constants, tests, docs

- Fix zero-TTL edge case: floor at MCP_OAUTH2_TOKEN_CACHE_MIN_TTL (10s)
- Make all MCP OAuth2 constants env-configurable via os.getenv()
- Move test file to follow 1:1 mapping convention (test_oauth2_token_cache.py)
- Add MCP OAuth doc page (mcp_oauth.md) with M2M and PKCE sections
- Update FAQ in mcp.md to reflect M2M support
- Add E2E test script and config

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix mypy lint

* fix oauth2

* ui feat fixes

* test M2M

* test fix

* ui feats

* ui fixes

* ui fix client ID

* fix: backend endpoints

* docs fix

* fixes greptile

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* [Fix] prevent shared backend model key from being polluted by per-deployment custom pricing (#20679)

* bug: custom price override for models

* added associated test

* fix(mcp): resolve OAuth2 root endpoints returning "MCP server not found" (#20784)

When MCP SDK hits root-level /register, /authorize, /token without
server name prefix, auto-resolve to the single configured OAuth2
server. Also fix WWW-Authenticate header to use correct public URL
behind reverse proxy.
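
The auto-resolution rule described in this commit message could be sketched as below. This is an assumption-laden illustration: `resolve_server_name` is a hypothetical helper, the mapping of server names to URLs is invented for the example, and returning `None` when zero or several OAuth2 servers are configured is a guess, since the message only describes the single-server case.

```python
from typing import Dict, Optional


def resolve_server_name(
    requested: Optional[str], oauth2_servers: Dict[str, str]
) -> Optional[str]:
    """Pick the OAuth2 server for a request to /register, /authorize, or /token."""
    if requested:
        # A server name prefix was supplied: use it only if it is configured.
        return requested if requested in oauth2_servers else None
    if len(oauth2_servers) == 1:
        # Root-level request with exactly one OAuth2 server: resolve to it.
        return next(iter(oauth2_servers))
    # Ambiguous (or nothing configured): the caller reports "MCP server not found".
    return None
```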

* Add support for langchain_aws via litellm passthrough

* fix(proxy): return early instead of raising ValueError when standard_logging_payload is missing

The `_PROXY_VirtualKeyModelMaxBudgetLimiter.async_log_success_event` hook
raises `ValueError` when `standard_logging_payload` is `None`.  This breaks
non-standard call types (e.g. vLLM `/classify`) that do not populate the
payload, and the resulting exception disrupts downstream success callbacks
like Langfuse.

Return early with a debug log instead, matching the existing pattern used
for missing `user_api_key_model_max_budget`.
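
A minimal sketch of the early-return pattern described above, with hypothetical names: this `async_log_success_event` is a simplified free function rather than the real hook, and the `recorded` list and the `response_cost` key are placeholders for whatever budget tracking the real limiter performs.

```python
import asyncio
import logging
from typing import Any, Dict, List

logger = logging.getLogger("budget_limiter_sketch")


async def async_log_success_event(kwargs: Dict[str, Any], recorded: List[float]) -> None:
    """Track spend for a successful call, skipping calls without a logging payload."""
    payload = kwargs.get("standard_logging_payload")
    if payload is None:
        # Non-standard call types may not populate the payload. Returning here
        # (instead of raising) lets later success callbacks still run.
        logger.debug("standard_logging_payload missing; skipping budget tracking")
        return
    recorded.append(float(payload.get("response_cost", 0.0)))
```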

Fixes #18986

---------

Co-authored-by: Sameer Kankute <sameer@berri.ai>
Co-authored-by: yuneng-jiang <yuneng.jiang@gmail.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Shivam Rawat <161387515+shivamrawat1@users.noreply.github.com>
Co-authored-by: michelligabriele <gabriele.michelli@icloud.com>
Sameerlite added a commit that referenced this pull request Feb 11, 2026
…logging_payload is missing (#20851)
cursor bot pushed a commit that referenced this pull request Mar 12, 2026
…plementation

Tests in test_oauth2_token_cache.py expect MCPServer.has_client_credentials
to return True when client_id/client_secret are set, but the property
requires oauth2_flow='client_credentials' which is not set by the test
fixtures or _execute_with_mcp_client. Skipping until the feature is
fully implemented (PR #20788).

Co-authored-by: yuneng-jiang <yuneng-jiang@users.noreply.github.com>