feat(guardrails): MCPJWTSigner - complete zero trust MCP JWT signing (FR-5/9/10/12/13/14/15)#23912
ishaan-jaff wants to merge 15 commits into main
… headers. Update tests to validate argument mutation and header injection behavior, including warnings for OpenAPI-backed servers when headers are present.
… OpenAPI-backed servers. Update tests to reflect this change, ensuring proper exception handling instead of logging warnings.
…MCP auth
Signs outbound MCP tool calls with a LiteLLM-issued RS256 JWT so MCP servers
can trust a single signing authority instead of every upstream IdP.
Enable in config.yaml:
  guardrails:
    - guardrail_name: mcp-jwt-signer
      litellm_params:
        guardrail: mcp_jwt_signer
        mode: pre_mcp_call
        default_on: true
JWT carries sub (user_id), act.sub (team_id, RFC 8693), tool-level scope, iss,
aud, iat/exp/nbf. RSA-2048 keypair auto-generated at startup unless
MCP_JWT_SIGNING_KEY env var is set.
Adds /.well-known/jwks.json endpoint and jwks_uri to /.well-known/openid-configuration
so MCP servers can verify LiteLLM-issued tokens via OIDC discovery.
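As a rough illustration of the claim set listed above, the following minimal sketch assembles `sub`, `act.sub`, a tool-level `scope`, `iss`, `aud`, and `iat`/`exp`/`nbf` into a plain dict. The function name `build_claims`, the 300-second default TTL, and the example issuer and audience values are assumptions made for illustration; the scope string follows the `mcp:tools/{tool}:call` pattern quoted elsewhere in this PR. Signing with RS256 is deliberately left out.

```python
import time
from typing import Any, Dict, Optional


def build_claims(
    user_id: str,
    team_id: Optional[str],
    tool_name: str,
    issuer: str,
    audience: str,
    ttl_seconds: int = 300,  # assumed default, not taken from the PR
    now: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble the claims an outbound MCP JWT would carry."""
    issued_at = int(time.time()) if now is None else now
    claims: Dict[str, Any] = {
        "iss": issuer,
        "aud": audience,
        "sub": user_id,
        "scope": f"mcp:tools/{tool_name}:call",
        "iat": issued_at,
        "nbf": issued_at,
        "exp": issued_at + ttl_seconds,
    }
    if team_id:
        # RFC 8693 actor claim: the party acting on behalf of `sub`.
        claims["act"] = {"sub": team_id}
    return claims


example_claims = build_claims(
    "user-123",
    "team-42",
    "get_weather",
    issuer="https://litellm.example.com",  # placeholder issuer
    audience="mcp",  # placeholder audience
    now=1_700_000_000,
)
```

An MCP server would then check `iss`, `aud`, `exp`, and the signature against the key published at `/.well-known/jwks.json`.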
…or extra headers in OpenAPI-backed servers. Adjust tests to verify the correct status code and exception message.
- OpenAPI servers: warn + skip header injection instead of 500
- JWKS Cache-Control: 5min for auto-generated keys, 1h for persistent
- sub claim: fallback to apikey:{token_hash} for anonymous callers
- ttl_seconds: validate > 0 at init time
Keep warn+skip behavior for OpenAPI servers (not 400 raise). Both test suites pass (45 tests).
- mcp_server_manager: warn when hook Authorization overwrites existing header
- __init__: remove _mcp_jwt_signer_instance from __all__ (private internal)
- discoverable_endpoints: copy dict instead of mutating in-place on OIDC augmentation
- test docstring: reflect warn-and-continue behavior for OpenAPI servers
- test: update scope assertions for least-privilege (no mcp:tools/list on tool-call JWTs)
- initialize_guardrail: validate mode='pre_mcp_call' at init time; a misconfigured mode silently bypasses JWT injection, which is a zero-trust bypass
- _build_claims: remove duplicate inline 'import re' (module-level import already present)
- _types.py: add TODO comment explaining jwt_claims is forward-compat plumbing for a follow-up PR that will forward upstream IdP claims into outbound MCP JWTs
… FR parity
Merges PR #23897 and adds missing pieces from the JWT signer scoping doc:
- FR-5 (verify + re-sign): Uses upstream jwt_claims from already-validated incoming JWT when available (set via UserAPIKeyAuth.jwt_claims). When access_token_discovery_uri is configured, the signer operates in re-sign mode using the upstream IdP claims rather than generating identity from the LiteLLM user profile.
- FR-12 (end-user identity mapping): Configurable end_user_claim_sources ordered list. Tries each source as a JWT claim name, then as a raw request header (case-insensitive), then as a UserAPIKeyAuth field. First non-empty wins. Defaults to ["sub", "preferred_username", "email", "user_id"].
- FR-13 (claim operations): add_claims, set_claims, remove_claims config. Applied in Kong order: add (if not present) → set (override) → remove (strip).
- FR-14 (two-token model): channel_token_header (default: X-Channel-Token), channel_token_discovery_uri, channel_token_jwks_uri. Channel token is read from raw request headers passed through pre_call_tool_check → pre_hook_kwargs → mcp_raw_headers in synthetic LLM data. When present its sub/client_id becomes act.sub per RFC 8693. Verified via OIDC discovery or JWKS URI when configured.
- FR-15 (claim validation): required_claims / optional_claims. Rejects requests where required JWT claims are absent or where no JWT was used at all.
- FR-9 (debug headers): x-litellm-mcp-debug on outbound MCP requests (default on, disable with debug_header: false). JSON payload with signer, kid, issuer, sub, act, mode (sign vs re-sign), channel_token flag.
- FR-10 (configurable scope): allowed_tools list for admin-defined fine-grained tool control. When set, scope is built from the explicit list only (no overpermission of tools/list during tool calls). Empty list falls back to auto-generated scope.
Also fixes raw_headers plumbing for channel token: pre_call_tool_check now accepts raw_headers, passes it through pre_hook_kwargs, and _convert_mcp_to_llm_format includes it as mcp_raw_headers in synthetic data. Tests: 46 tests (was 15), covering all new FRs.
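The FR-12 lookup order in the commit message above can be sketched roughly as follows. This is one reading of the description, not the code from the PR: the function name `resolve_end_user` and its plain-mapping parameters are hypothetical stand-ins for the signer's real inputs (validated JWT claims, raw request headers, and UserAPIKeyAuth fields).

```python
from typing import Any, List, Mapping, Optional

# Default order quoted in the commit message.
DEFAULT_SOURCES = ["sub", "preferred_username", "email", "user_id"]


def resolve_end_user(
    sources: List[str],
    jwt_claims: Optional[Mapping[str, Any]],
    raw_headers: Optional[Mapping[str, str]],
    user_fields: Optional[Mapping[str, Any]],
) -> Optional[str]:
    """Hypothetical sketch: return the first non-empty identity value."""
    # Header names are matched case-insensitively.
    headers_lower = {k.lower(): v for k, v in (raw_headers or {}).items()}
    for source in sources:
        candidates = (
            (jwt_claims or {}).get(source),     # 1. JWT claim
            headers_lower.get(source.lower()),  # 2. raw request header
            (user_fields or {}).get(source),    # 3. user/key profile field
        )
        for value in candidates:
            if value:  # empty strings and None are skipped
                return str(value)
    return None
```

For example, with the default sources, a request that carries only an `email` claim resolves to that email, while a request authenticated with a virtual key and no JWT falls through to the `user_id` field.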
Greptile Summary
This PR completes the …
Key changes:
Issues found:
Confidence Score: 3/5
| Filename | Overview |
|---|---|
| litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py | Core JWT signer implementation (758 lines, all new). Implements FR-5/9/10/12/13/14/15. Has 3 issues: PyJWKClient instantiated per-request when using direct JWKS URI (NFR-2 violation); per-tool :list scopes always added even during active tool calls (FR-10 logic gap); and _OIDCDiscoveryCache has no TTL for discovery documents. |
| `litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/__init__.py` | Guardrail initializer: passes all new config params (FR-5/9/10/12/13/14/15) through to MCPJWTSigner. Clean, straightforward wiring with mode validation. No issues found. |
| litellm/proxy/_experimental/mcp_server/mcp_server_manager.py | Extended pre_call_tool_check to accept/return raw_headers and extra_headers. Added hook_extra_headers parameter to _call_regular_mcp_tool with correct merge-after-static priority. Correctly warns (but does not block) when hook headers are returned for OpenAPI-backed servers. |
| litellm/proxy/auth/user_api_key_auth.py | Propagates upstream JWT claims into UserAPIKeyAuth.jwt_claims in both the virtual-key fast path and the standard JWT auth path. jwt_claims is in scope at both assignment sites. No issues found. |
| litellm/proxy/_types.py | Adds optional jwt_claims: Optional[Dict] field to UserAPIKeyAuth. Backward-compatible (defaults to None). The TODO comment acknowledges current partial use. |
| litellm/proxy/utils.py | Two changes: fixes a pre-existing F402 flake8 warning (inline import moved out of loop), and adds mcp_raw_headers to synthetic LLM data so guardrail hooks can read inbound headers. Also extends _convert_mcp_hook_response_to_kwargs to extract extra_headers from hook responses. Clean. |
| litellm/proxy/_experimental/mcp_server/discoverable_endpoints.py | Augments /.well-known/openid-configuration with JWKS fields when MCPJWTSigner is active, and adds the new /.well-known/jwks.json endpoint. oauth_authorization_server_mcp always returns a dict, so the isinstance(response, dict) guard works correctly. |
| litellm/types/guardrails.py | Adds MCP_JWT_SIGNER = "mcp_jwt_signer" to SupportedGuardrailIntegrations enum. Trivial, no issues. |
| tests/test_litellm/proxy/guardrails/test_mcp_jwt_signer.py | 46 tests covering all new FRs. All tests use mocks — no real network calls. Singleton reset pattern (mod._mcp_jwt_signer_instance = None) prevents cross-test pollution for the signer instance. The module-level _oidc_cache is not reset between tests, but no tests exercise its fetch path with real URLs. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_hook_extra_headers.py | Tests for the extra_headers plumbing from hook → pre_call_tool_check → call_tool → _call_regular_mcp_tool. All mocked, good coverage of priority/merge behaviour and OpenAPI fallback warning. |
Sequence Diagram
sequenceDiagram
participant Client
participant LiteLLM
participant MCPJWTSigner as MCPJWTSigner<br/>(pre_mcp_call hook)
participant IdP as Upstream IdP<br/>(OIDC / JWKS)
participant MCP as MCP Server
participant JWKS as LiteLLM JWKS<br/>/.well-known/jwks.json
Client->>LiteLLM: MCP tool call (Bearer API key OR JWT)
Note over LiteLLM: user_api_key_auth extracts<br/>jwt_claims → UserAPIKeyAuth.jwt_claims
LiteLLM->>MCPJWTSigner: async_pre_call_hook(data, user_api_key_dict)
Note over MCPJWTSigner: FR-15: validate required_claims<br/>FR-14: read X-Channel-Token header
opt channel_token present
MCPJWTSigner->>IdP: GET /.well-known/openid-config (cached)<br/>+ get_signing_key_from_jwt
IdP-->>MCPJWTSigner: channel token claims (act.sub)
end
Note over MCPJWTSigner: FR-5/12: resolve sub from<br/>jwt_claims → header → UserAPIKeyAuth<br/>FR-13: add/set/remove claims<br/>FR-10: build tool-scoped scope<br/>FR-9: build x-litellm-mcp-debug
MCPJWTSigner-->>LiteLLM: data + extra_headers{Authorization, x-litellm-mcp-debug}
LiteLLM->>MCP: call_tool(args)<br/>Authorization: Bearer signed-RS256-JWT<br/>x-litellm-mcp-debug: {...}
MCP->>JWKS: GET /.well-known/jwks.json
JWKS-->>MCP: RSA public key
MCP->>MCP: verify JWT (iss/aud/exp/sig)
MCP-->>LiteLLM: tool result
LiteLLM-->>Client: response
Comments Outside Diff (3)
- litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py, line 1066-1076

  New PyJWKClient created on every MCP tool call

  When channel_token_jwks_uri is configured (instead of channel_token_discovery_uri), a brand-new PyJWKClient is instantiated on every invocation of _verify_channel_token. This directly contradicts the NFR-2 goal stated in the _OIDCDiscoveryCache docstring ("Avoids per-request OIDC discovery + JWKS fetches") and will cause a fresh JWKS HTTP fetch on every MCP tool call.

  The OIDC discovery path properly uses the module-level _oidc_cache to cache clients, but the direct JWKS URI path bypasses caching entirely:

  ```python
  # Current (bad): new client per request
  if self.channel_token_jwks_uri:
      jwks_client = PyJWKClient(self.channel_token_jwks_uri)
      ...

  # Should mirror the discovery path: cache the PyJWKClient
  if self.channel_token_jwks_uri:
      if self.channel_token_jwks_uri not in _oidc_cache._jwks_clients:
          _oidc_cache._jwks_clients[self.channel_token_jwks_uri] = PyJWKClient(self.channel_token_jwks_uri)
      jwks_client = _oidc_cache._jwks_clients[self.channel_token_jwks_uri]
      ...
  ```

  Or alternatively, add a get_jwks_client_by_uri(uri) method to _OIDCDiscoveryCache that handles both discovery and direct JWKS URI cases so the caching logic is in one place.

- litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py, line 1241-1245

  Redundant try/except block: dead code

  This try/except block catches ValueError only to immediately re-raise it, making it completely equivalent to calling _validate_required_claims without any error handling. The comment "# Propagate to block the MCP call" is the correct intent, but the exception would propagate naturally without the try/except.

- litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py, line 1094-1104

  allowed_tools scope includes per-tool :list scopes unconditionally (potentially overpermissive)

  For every tool in allowed_tools, both mcp:tools/{tool}:call and mcp:tools/{tool}:list are added to the scope, even during an active tool call (when tool_name is set). The PR description for FR-10 states "no overpermission of tools/list during active tool calls", but this only prevents the top-level mcp:tools/list; the per-tool :list scopes are always granted.

  This is inconsistent with the original least-privilege behaviour (which only grants mcp:tools/list when no specific tool is being called) and could allow a tool-scoped JWT to perform listing operations it shouldn't be able to.

  Consider only adding the per-tool :list scope when not tool_name:

  ```python
  if self.allowed_tools:
      scope_parts = []
      for allowed_tool in self.allowed_tools:
          sanitized = re.sub(r"[^a-zA-Z0-9_\-]", "_", allowed_tool)
          scope_parts.append(f"mcp:tools/{sanitized}:call")
          if not tool_name:
              scope_parts.append(f"mcp:tools/{sanitized}:list")
      if not tool_name:
          scope_parts.append("mcp:tools/list")
      return " ".join(sorted(set(scope_parts)))
  ```
Last reviewed commit: "feat(guardrails): MC..."
```python
outbound_headers: Dict[str, str] = {
    **existing_headers,
    "Authorization": f"Bearer {signed_token}",
}

# FR-9: Debug header - tells downstream what auth resolution was used.
if self.debug_header:
    debug_info: Dict[str, Any] = {
        "signer": "mcp_jwt_signer",
        "kid": self._kid,
        "issuer": self.issuer,
        "sub": claims.get("sub"),
        "act": claims.get("act"),
        "mode": "re-sign" if jwt_claims else "sign",
        "channel_token": channel_token_claims is not None,
    }
    outbound_headers["x-litellm-mcp-debug"] = json.dumps(
```
OIDC discovery documents cached indefinitely — no TTL
The _discovery_docs dict caches discovery documents forever (no eviction or TTL). The docstring says "Discovery docs are cached indefinitely (they rarely change)", but OIDC providers do occasionally update their discovery documents — most importantly when rotating their JWKS URI.
If an IdP changes its jwks_uri in the discovery doc (e.g., as part of a security incident rotation), the _oidc_cache would continue to serve the old PyJWKClient pointing at the stale JWKS endpoint indefinitely. A LiteLLM restart would be required to pick up the change.
Consider adding a TTL (e.g., 24 hours) to discovery doc cache entries, or at minimum documenting that a restart is needed when IdP discovery URIs change.
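One way the suggested TTL could look is sketched below. This is a hedged illustration, not the PR's `_OIDCDiscoveryCache`: the class name `TTLDiscoveryCache`, the injected `fetch` callable, and the injectable clock are all assumptions chosen so the expiry logic can be exercised without network access. The 24-hour default mirrors the figure suggested in the comment above.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLDiscoveryCache:
    """Hypothetical cache for OIDC discovery documents with per-entry expiry."""

    def __init__(
        self,
        fetch: Callable[[str], Dict[str, Any]],
        ttl_seconds: float = 24 * 60 * 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        # discovery URI -> (time fetched, discovery document)
        self._entries: Dict[str, Tuple[float, Dict[str, Any]]] = {}

    def get(self, discovery_uri: str) -> Dict[str, Any]:
        now = self._clock()
        entry = self._entries.get(discovery_uri)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh
        # Missing or expired: refetch so a rotated jwks_uri is picked up.
        doc = self._fetch(discovery_uri)
        self._entries[discovery_uri] = (now, doc)
        return doc
```

Any JWKS client derived from a cached document would need to be rebuilt when the document is refreshed, otherwise the stale `jwks_uri` problem described above would persist.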
Relevant issues
Picks up #23897 and adds missing pieces from the JWT signer scoping doc.
What this does
Builds on the MCPJWTSigner guardrail from #23897 with all the FRs the original PR left out:
- FR-5 – Verify + re-sign: When access_token_discovery_uri is set, the signer enters re-sign mode. Incoming JWT claims (already validated by LiteLLM's JWT auth handler and stored in UserAPIKeyAuth.jwt_claims) are used as the source of truth for end-user identity rather than the LiteLLM virtual key profile.
- FR-12 – Configurable end-user identity mapping: New end_user_claim_sources config. Ordered list; first non-empty value wins. Each source is tried as (1) a JWT claim, (2) a raw request header (case-insensitive), (3) a known UserAPIKeyAuth field. Defaults to ["sub", "preferred_username", "email", "user_id"].
- FR-13 – Claim operations: add_claims, set_claims, remove_claims in Kong order (add → set → remove).
- FR-14 – Two-token model: channel_token_header (default: X-Channel-Token), channel_token_discovery_uri, channel_token_jwks_uri. Channel token read from raw request headers, verified via OIDC discovery or JWKS URI when configured, decoded claims used as act.sub / act.client_id per RFC 8693.
  Raw headers are now threaded all the way from call_tool → pre_call_tool_check → pre_hook_kwargs → mcp_raw_headers in synthetic LLM data, so guardrail hooks can read inbound headers.
- FR-15 – Claim validation: required_claims / optional_claims. Rejects MCP tool calls where required JWT claims are missing from the incoming token.
- FR-9 – Debug headers: x-litellm-mcp-debug on outbound MCP requests (default on, disable with debug_header: false). JSON payload: {signer, kid, issuer, sub, act, mode, channel_token}.
- FR-10 – Configurable scope: allowed_tools list for admin-defined fine-grained scope. When set, only the listed tools get scope entries; no overpermission of tools/list during active tool calls.
P1 fixes (from original PR):
- extra_headers merges rather than replaces (preserved from feat(guardrails): MCPJWTSigner - built-in guardrail for zero trust MCP auth #23897)
Pre-Submission checklist
- make test-unit passes locally
Type
New feature (extension of #23897)
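The FR-13 ordering (add → set → remove) described under "What this does" can be illustrated with a small sketch. The function name `apply_claim_operations` is hypothetical, and the semantics (add only fills missing keys, set always overrides, remove strips last) follow the wording of the commit message rather than the PR's actual code.

```python
from typing import Any, Dict, Iterable, Mapping, Optional


def apply_claim_operations(
    claims: Mapping[str, Any],
    add_claims: Optional[Mapping[str, Any]] = None,
    set_claims: Optional[Mapping[str, Any]] = None,
    remove_claims: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical sketch: apply add, then set, then remove to a copy of `claims`."""
    result = dict(claims)  # never mutate the caller's dict
    for key, value in (add_claims or {}).items():
        result.setdefault(key, value)  # add: only if not already present
    result.update(set_claims or {})    # set: always override
    for key in remove_claims or ():
        result.pop(key, None)          # remove: strip, ignoring missing keys
    return result
```

Because remove runs last, a claim listed in both set_claims and remove_claims ends up absent from the outbound token.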
Changes
- mcp_jwt_signer.py: rewritten with all FR features; new _OIDCDiscoveryCache, _validate_required_claims, _resolve_end_user_identity helpers
- mcp_jwt_signer/__init__.py: passes all new config params through initialize_guardrail
- mcp_server_manager.py: pre_call_tool_check now accepts raw_headers; passed from call_tool for channel token support
- proxy/utils.py: _convert_mcp_to_llm_format includes mcp_raw_headers in synthetic data; also fixed pre-existing F402 flake8 warning (inline import inside loop)
- test_mcp_jwt_signer.py: 31 new tests covering FR-5/9/10/12/13/14/15
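For the FR-15 helper named in the list above, a minimal sketch of the described behaviour might look like the following. The standalone function `validate_required_claims` and its error messages are hypothetical; the only details taken from this PR are that required claims must be present, that a request with no JWT at all is rejected when claims are required, and that the failure surfaces as a ValueError.

```python
from typing import Any, List, Mapping, Optional


def validate_required_claims(
    jwt_claims: Optional[Mapping[str, Any]],
    required_claims: Optional[List[str]],
) -> None:
    """Hypothetical sketch: raise ValueError when a required claim is absent."""
    if not required_claims:
        return  # nothing configured, nothing to enforce
    if not jwt_claims:
        # No JWT was used at all (for example, a plain API key request).
        raise ValueError("required claims configured but no JWT claims available")
    missing = [name for name in required_claims if name not in jwt_claims]
    if missing:
        raise ValueError(f"missing required JWT claims: {', '.join(missing)}")
```

Letting the ValueError propagate from the hook is what blocks the MCP call, so no surrounding try/except is needed.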