
feat(guardrails): MCPJWTSigner - complete zero trust MCP JWT signing (FR-5/9/10/12/13/14/15)#23912

Closed
ishaan-jaff wants to merge 15 commits into main from worktree-soft-roaming-stream

Conversation

@ishaan-jaff
Contributor

Relevant issues

Picks up #23897 and adds missing pieces from the JWT signer scoping doc.

What this does

Builds on the MCPJWTSigner guardrail from #23897 with all the FRs the original PR left out:

FR-5 – Verify + re-sign: When access_token_discovery_uri is set, the signer enters re-sign mode. Incoming JWT claims (already validated by LiteLLM's JWT auth handler and stored in UserAPIKeyAuth.jwt_claims) are used as the source of truth for end-user identity rather than the LiteLLM virtual key profile.

FR-12 – Configurable end-user identity mapping: New end_user_claim_sources config. Ordered list; first non-empty value wins. Each source is tried as (1) a JWT claim, (2) a raw request header (case-insensitive), (3) a known UserAPIKeyAuth field. Defaults to ["sub", "preferred_username", "email", "user_id"].
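
The lookup order described for FR-12 can be illustrated with a minimal sketch. The function name, its signature, and the plain-dict stand-in for UserAPIKeyAuth fields are hypothetical and are not taken from the PR's `_resolve_end_user_identity` helper.

```python
from typing import Any, Mapping, Optional, Sequence

# Default order quoted in the PR description.
DEFAULT_SOURCES = ("sub", "preferred_username", "email", "user_id")


def resolve_end_user_identity(
    sources: Sequence[str],
    jwt_claims: Optional[Mapping[str, Any]],
    raw_headers: Optional[Mapping[str, str]],
    auth_fields: Mapping[str, Any],
) -> Optional[str]:
    """Return the first non-empty value found, walking the sources in order.

    Hypothetical helper: auth_fields stands in for UserAPIKeyAuth attributes.
    """
    claims = jwt_claims or {}
    # Header names are matched case-insensitively, so normalise them once.
    headers = {name.lower(): value for name, value in (raw_headers or {}).items()}
    for source in sources:
        candidates = (
            claims.get(source),           # (1) JWT claim
            headers.get(source.lower()),  # (2) raw request header
            auth_fields.get(source),      # (3) known auth-object field
        )
        for candidate in candidates:
            if candidate:
                return str(candidate)
    return None
```

Note that the source list is the outer loop: an earlier source found in a header wins over a later source found in a JWT claim.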

FR-13 – Claim operations: add_claims, set_claims, remove_claims in Kong order (add → set → remove).
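
As a rough illustration of the add → set → remove ordering, the sketch below applies the three operations to a claims dict. The function name and signature are assumptions made for this example; only the ordering and the "add only if absent, set overrides" semantics come from the PR text.

```python
from typing import Any, Dict, Iterable, Mapping, Optional


def apply_claim_operations(
    claims: Mapping[str, Any],
    add_claims: Optional[Mapping[str, Any]] = None,
    set_claims: Optional[Mapping[str, Any]] = None,
    remove_claims: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Apply claim operations in Kong order without mutating the input."""
    result = dict(claims)
    # add: only fills in claims that are not already present.
    for key, value in (add_claims or {}).items():
        result.setdefault(key, value)
    # set: always overrides.
    result.update(set_claims or {})
    # remove: strips claims last, so it wins over add and set.
    for key in remove_claims or ():
        result.pop(key, None)
    return result
```

Because remove runs last, a claim listed in both set_claims and remove_claims ends up absent from the outbound token.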

FR-14 – Two-token model: channel_token_header (default: X-Channel-Token), channel_token_discovery_uri, channel_token_jwks_uri. Channel token read from raw request headers, verified via OIDC discovery or JWKS URI when configured, decoded claims used as act.sub/act.client_id per RFC 8693.

Raw headers are now threaded all the way from call_tool → pre_call_tool_check → pre_hook_kwargs → mcp_raw_headers in synthetic LLM data, so guardrail hooks can read inbound headers.
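
A minimal sketch of the two-token flow, assuming the channel token has already been verified and decoded elsewhere. Both helper names are hypothetical, and the case-insensitive header match is an assumption carried over from FR-12 rather than something the PR states for channel tokens.

```python
from typing import Any, Dict, Mapping, Optional


def read_channel_token(
    raw_headers: Optional[Mapping[str, str]],
    header_name: str = "X-Channel-Token",
) -> Optional[str]:
    """Find the channel token in the inbound headers (assumed case-insensitive)."""
    wanted = header_name.lower()
    for name, value in (raw_headers or {}).items():
        if name.lower() == wanted and value:
            return value
    return None


def build_act_claim(channel_claims: Optional[Mapping[str, Any]]) -> Optional[Dict[str, Any]]:
    """Map decoded channel-token claims onto an RFC 8693 style `act` claim.

    Signature verification (OIDC discovery or JWKS) is deliberately out of scope.
    """
    if not channel_claims:
        return None
    act = {key: channel_claims[key] for key in ("sub", "client_id") if channel_claims.get(key)}
    return act or None
```

Usage would look like `build_act_claim(decoded)` after verifying the value returned by `read_channel_token(mcp_raw_headers)`.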

FR-15 – Claim validation: required_claims / optional_claims. Rejects MCP tool calls where required JWT claims are missing from the incoming token.
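
The rejection rule can be sketched as follows. The signature is hypothetical; raising ValueError matches what the review comments say about `_validate_required_claims`, and treating a missing JWT as a failure follows the commit message ("or where no JWT was used at all").

```python
from typing import Any, Mapping, Optional, Sequence


def validate_required_claims(
    jwt_claims: Optional[Mapping[str, Any]],
    required_claims: Optional[Sequence[str]],
) -> None:
    """Raise ValueError when any required claim is missing from the incoming JWT."""
    if not required_claims:
        return  # nothing configured, nothing to enforce
    if not jwt_claims:
        raise ValueError("required_claims is configured but the request carried no JWT claims")
    missing = [name for name in required_claims if name not in jwt_claims]
    if missing:
        raise ValueError(f"incoming JWT is missing required claims: {', '.join(missing)}")
```

Letting the ValueError propagate out of the hook is what blocks the MCP tool call.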

FR-9 – Debug headers: x-litellm-mcp-debug on outbound MCP requests (default on, disable with debug_header: false). JSON payload: {signer, kid, issuer, sub, act, mode, channel_token}.
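
For illustration, the debug payload could be produced and consumed roughly like this. The two helper names are hypothetical; the header name, the payload keys, and the "sign" / "re-sign" mode values come from the PR text.

```python
import json
from typing import Any, Dict, Mapping, Optional

DEBUG_HEADER = "x-litellm-mcp-debug"


def build_debug_header(
    kid: str,
    issuer: str,
    claims: Mapping[str, Any],
    resigned: bool,
    has_channel_token: bool,
) -> str:
    """Serialise the auth-resolution summary as a JSON header value."""
    payload = {
        "signer": "mcp_jwt_signer",
        "kid": kid,
        "issuer": issuer,
        "sub": claims.get("sub"),
        "act": claims.get("act"),
        "mode": "re-sign" if resigned else "sign",
        "channel_token": has_channel_token,
    }
    return json.dumps(payload)


def parse_debug_header(headers: Mapping[str, str]) -> Optional[Dict[str, Any]]:
    """Downstream side: decode the header if present (case-insensitive lookup)."""
    for name, value in headers.items():
        if name.lower() == DEBUG_HEADER:
            return json.loads(value)
    return None
```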

FR-10 – Configurable scope: allowed_tools list for admin-defined fine-grained scope. When set, only the listed tools get scope entries — no overpermission of tools/list during active tool calls.

P1 fixes (from original PR):

Pre-Submission checklist

  • Tests added — 46 tests (was 15), covering all new FRs
  • make test-unit passes locally
  • No breaking changes — purely additive

Type

New feature (extension of #23897)

Changes

  • mcp_jwt_signer.py — rewritten with all FR features; new _OIDCDiscoveryCache, _validate_required_claims, _resolve_end_user_identity helpers
  • mcp_jwt_signer/__init__.py — passes all new config params through initialize_guardrail
  • mcp_server_manager.py: pre_call_tool_check now accepts raw_headers; passed from call_tool for channel token support
  • proxy/utils.py: _convert_mcp_to_llm_format includes mcp_raw_headers in synthetic data; also fixed pre-existing F402 flake8 warning (inline import inside loop)
  • test_mcp_jwt_signer.py — 31 new tests covering FR-5/9/10/12/13/14/15

noahnistler and others added 15 commits March 17, 2026 14:31
… headers. Update tests to validate argument mutation and header injection behavior, including warnings for OpenAPI-backed servers when headers are present.
… OpenAPI-backed servers. Update tests to reflect this change, ensuring proper exception handling instead of logging warnings.
… headers. Update tests to validate argument mutation and header injection behavior, including warnings for OpenAPI-backed servers when headers are present.
… OpenAPI-backed servers. Update tests to reflect this change, ensuring proper exception handling instead of logging warnings.
…MCP auth

Signs outbound MCP tool calls with a LiteLLM-issued RS256 JWT so MCP servers
can trust a single signing authority instead of every upstream IdP.

Enable in config.yaml:
  guardrails:
    - guardrail_name: mcp-jwt-signer
      litellm_params:
        guardrail: mcp_jwt_signer
        mode: pre_mcp_call
        default_on: true

JWT carries sub (user_id), act.sub (team_id, RFC 8693), tool-level scope, iss,
aud, iat/exp/nbf. RSA-2048 keypair auto-generated at startup unless
MCP_JWT_SIGNING_KEY env var is set.

Adds /.well-known/jwks.json endpoint and jwks_uri to /.well-known/openid-configuration
so MCP servers can verify LiteLLM-issued tokens via OIDC discovery.
…or extra headers in OpenAPI-backed servers. Adjust tests to verify the correct status code and exception message.
- OpenAPI servers: warn + skip header injection instead of 500
- JWKS Cache-Control: 5min for auto-generated keys, 1h for persistent
- sub claim: fallback to apikey:{token_hash} for anonymous callers
- ttl_seconds: validate > 0 at init time
Keep warn+skip behavior for OpenAPI servers (not 400 raise).
Both test suites pass (45 tests).
- mcp_server_manager: warn when hook Authorization overwrites existing header
- __init__: remove _mcp_jwt_signer_instance from __all__ (private internal)
- discoverable_endpoints: copy dict instead of mutating in-place on OIDC augmentation
- test docstring: reflect warn-and-continue behavior for OpenAPI servers
- test: update scope assertions for least-privilege (no mcp:tools/list on tool-call JWTs)
- initialize_guardrail: validate mode='pre_mcp_call' at init time — misconfigured
  mode silently bypasses JWT injection, which is a zero-trust bypass
- _build_claims: remove duplicate inline 'import re' (module-level import already present)
- _types.py: add TODO comment explaining jwt_claims is forward-compat plumbing
  for a follow-up PR that will forward upstream IdP claims into outbound MCP JWTs
… FR parity

Merges PR #23897 and adds missing pieces from the JWT signer scoping doc:

FR-5 (verify + re-sign): Uses upstream jwt_claims from already-validated
incoming JWT when available (set via UserAPIKeyAuth.jwt_claims). When
access_token_discovery_uri is configured, the signer operates in re-sign mode
using the upstream IdP claims rather than generating identity from the LiteLLM
user profile.

FR-12 (end-user identity mapping): Configurable end_user_claim_sources ordered
list. Tries each source as a JWT claim name, then as a raw request header
(case-insensitive), then as a UserAPIKeyAuth field. First non-empty wins.
Defaults to ["sub", "preferred_username", "email", "user_id"].

FR-13 (claim operations): add_claims, set_claims, remove_claims config. Applied
in Kong order: add (if not present) → set (override) → remove (strip).

FR-14 (two-token model): channel_token_header (default: X-Channel-Token),
channel_token_discovery_uri, channel_token_jwks_uri. Channel token is read from
raw request headers passed through pre_call_tool_check → pre_hook_kwargs →
mcp_raw_headers in synthetic LLM data. When present its sub/client_id becomes
act.sub per RFC 8693. Verified via OIDC discovery or JWKS URI when configured.

FR-15 (claim validation): required_claims / optional_claims. Rejects requests
where required JWT claims are absent or where no JWT was used at all.

FR-9 (debug headers): x-litellm-mcp-debug on outbound MCP requests (default on,
disable with debug_header: false). JSON payload with signer, kid, issuer, sub,
act, mode (sign vs re-sign), channel_token flag.

FR-10 (configurable scope): allowed_tools list for admin-defined fine-grained
tool control. When set, scope is built from the explicit list only (no overpermission
of tools/list during tool calls). Empty list falls back to auto-generated scope.

Also fixes raw_headers plumbing for channel token: pre_call_tool_check now
accepts raw_headers, passes it through pre_hook_kwargs, and
_convert_mcp_to_llm_format includes it as mcp_raw_headers in synthetic data.

Tests: 46 tests (was 15), covering all new FRs.
@vercel

vercel bot commented Mar 17, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| litellm | Ready | Preview, Comment | Mar 17, 2026 10:37pm |


@codspeed-hq
Contributor

codspeed-hq bot commented Mar 17, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing worktree-soft-roaming-stream (1d02d2b) with main (ef9cc33)


@greptile-apps
Contributor

greptile-apps bot commented Mar 17, 2026

Greptile Summary

This PR completes the MCPJWTSigner zero-trust guardrail by implementing the remaining functional requirements (FR-5/9/10/12/13/14/15) from the MCP JWT scoping document. It extends the RS256 JWT signing introduced in #23897 with upstream token re-signing, configurable end-user identity mapping, claim operations (add/set/remove), a two-token model (access + channel token), required/optional claim validation, debug headers, and fine-grained tool scoping. Supporting infrastructure changes thread raw inbound headers through the MCP call stack, expose a /.well-known/jwks.json endpoint, and propagate upstream JWT claims into UserAPIKeyAuth.

Key changes:

  • mcp_jwt_signer.py — new 758-line implementation with _OIDCDiscoveryCache, _validate_required_claims, _resolve_end_user_identity helpers and the full MCPJWTSigner class
  • mcp_server_manager.py: pre_call_tool_check now returns a Dict (was None) carrying extra_headers/arguments; _call_regular_mcp_tool gains hook_extra_headers param with correct merge priority
  • user_api_key_auth.py — upstream JWT claims stored on UserAPIKeyAuth.jwt_claims in both auth fast paths
  • discoverable_endpoints.py: /.well-known/jwks.json endpoint + OIDC discovery doc augmentation
  • 46 mock-only unit tests added across two new test files

Issues found:

  • When channel_token_jwks_uri is configured (direct JWKS URI, not discovery), a new PyJWKClient is instantiated on every MCP tool call request — contradicting the stated NFR-2 goal of avoiding per-request JWKS fetches. The OIDC discovery path correctly caches via _oidc_cache, but the direct URI path bypasses caching entirely.
  • The _build_scope method always adds mcp:tools/{tool}:list scopes for every entry in allowed_tools, even during an active tool call where tool_name is set. This contradicts FR-10's stated goal of preventing tools/list overpermission during active tool calls.
  • The try/except ValueError: raise exc block in async_pre_call_hook is dead code — the exception propagates identically without the wrapper.
  • _OIDCDiscoveryCache._discovery_docs has no TTL — discovery documents are cached for the lifetime of the process, meaning an IdP JWKS URI rotation in the discovery doc would require a LiteLLM restart to take effect.

Confidence Score: 3/5

  • PR is largely well-implemented and additive, but two logic issues in the JWT signer's scope building and JWKS client caching should be addressed before merge.
  • The architecture is sound and the plumbing changes are careful. The two P1 issues are: (1) PyJWKClient created per-request when channel_token_jwks_uri is used directly — this causes a network fetch on every MCP tool call and violates the stated NFR-2 goal; (2) _build_scope with allowed_tools grants per-tool :list scopes unconditionally (even during active tool calls), contradicting FR-10's least-privilege intent. Neither is a security hole that would allow auth bypass, but the JWKS client issue is a correctness/performance regression and the scope issue quietly over-grants permissions. The test coverage is thorough (46 tests, all mocked), jwt_claims propagation is correct, and no breaking changes were introduced.
  • litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py — specifically _verify_channel_token (JWKS caching) and _build_scope (allowed_tools :list scopes)

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py | Core JWT signer implementation (758 lines, all new). Implements FR-5/9/10/12/13/14/15. Has 3 issues: PyJWKClient instantiated per-request when using direct JWKS URI (NFR-2 violation); per-tool :list scopes always added even during active tool calls (FR-10 logic gap); and _OIDCDiscoveryCache has no TTL for discovery documents. |
| litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/__init__.py | Guardrail initializer — passes all new config params (FR-5/9/10/12/13/14/15) through to MCPJWTSigner. Clean, straightforward wiring with mode validation. No issues found. |
| litellm/proxy/_experimental/mcp_server/mcp_server_manager.py | Extended pre_call_tool_check to accept/return raw_headers and extra_headers. Added hook_extra_headers parameter to _call_regular_mcp_tool with correct merge-after-static priority. Correctly warns (but does not block) when hook headers are returned for OpenAPI-backed servers. |
| litellm/proxy/auth/user_api_key_auth.py | Propagates upstream JWT claims into UserAPIKeyAuth.jwt_claims in both the virtual-key fast path and the standard JWT auth path. jwt_claims is in scope at both assignment sites. No issues found. |
| litellm/proxy/_types.py | Adds optional jwt_claims: Optional[Dict] field to UserAPIKeyAuth. Backward-compatible (defaults to None). The TODO comment acknowledges current partial use. |
| litellm/proxy/utils.py | Two changes: fixes a pre-existing F402 flake8 warning (inline import moved out of loop), and adds mcp_raw_headers to synthetic LLM data so guardrail hooks can read inbound headers. Also extends _convert_mcp_hook_response_to_kwargs to extract extra_headers from hook responses. Clean. |
| litellm/proxy/_experimental/mcp_server/discoverable_endpoints.py | Augments /.well-known/openid-configuration with JWKS fields when MCPJWTSigner is active, and adds the new /.well-known/jwks.json endpoint. oauth_authorization_server_mcp always returns a dict, so the isinstance(response, dict) guard works correctly. |
| litellm/types/guardrails.py | Adds MCP_JWT_SIGNER = "mcp_jwt_signer" to SupportedGuardrailIntegrations enum. Trivial, no issues. |
| tests/test_litellm/proxy/guardrails/test_mcp_jwt_signer.py | 46 tests covering all new FRs. All tests use mocks — no real network calls. Singleton reset pattern (mod._mcp_jwt_signer_instance = None) prevents cross-test pollution for the signer instance. The module-level _oidc_cache is not reset between tests, but no tests exercise its fetch path with real URLs. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_hook_extra_headers.py | Tests for the extra_headers plumbing from hook → pre_call_tool_check → call_tool → _call_regular_mcp_tool. All mocked, good coverage of priority/merge behaviour and OpenAPI fallback warning. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant LiteLLM
    participant MCPJWTSigner as MCPJWTSigner<br/>(pre_mcp_call hook)
    participant IdP as Upstream IdP<br/>(OIDC / JWKS)
    participant MCP as MCP Server
    participant JWKS as LiteLLM JWKS<br/>/.well-known/jwks.json

    Client->>LiteLLM: MCP tool call (Bearer API key OR JWT)
    Note over LiteLLM: user_api_key_auth extracts<br/>jwt_claims → UserAPIKeyAuth.jwt_claims

    LiteLLM->>MCPJWTSigner: async_pre_call_hook(data, user_api_key_dict)

    Note over MCPJWTSigner: FR-15: validate required_claims<br/>FR-14: read X-Channel-Token header

    opt channel_token present
        MCPJWTSigner->>IdP: GET /.well-known/openid-config (cached)<br/>+ get_signing_key_from_jwt
        IdP-->>MCPJWTSigner: channel token claims (act.sub)
    end

    Note over MCPJWTSigner: FR-5/12: resolve sub from<br/>jwt_claims → header → UserAPIKeyAuth<br/>FR-13: add/set/remove claims<br/>FR-10: build tool-scoped scope<br/>FR-9: build x-litellm-mcp-debug

    MCPJWTSigner-->>LiteLLM: data + extra_headers{Authorization, x-litellm-mcp-debug}

    LiteLLM->>MCP: call_tool(args)<br/>Authorization: Bearer signed-RS256-JWT<br/>x-litellm-mcp-debug: {...}
    MCP->>JWKS: GET /.well-known/jwks.json
    JWKS-->>MCP: RSA public key
    MCP->>MCP: verify JWT (iss/aud/exp/sig)
    MCP-->>LiteLLM: tool result
    LiteLLM-->>Client: response

Comments Outside Diff (3)

  1. litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py, line 1066-1076 (link)

    P1 New PyJWKClient created on every MCP tool call

    When channel_token_jwks_uri is configured (instead of channel_token_discovery_uri), a brand-new PyJWKClient is instantiated on every invocation of _verify_channel_token. This directly contradicts the NFR-2 goal stated in the _OIDCDiscoveryCache docstring ("Avoids per-request OIDC discovery + JWKS fetches") and will cause a fresh JWKS HTTP fetch on every MCP tool call.

    The OIDC discovery path properly uses the module-level _oidc_cache to cache clients, but the direct JWKS URI path bypasses caching entirely:

    # Current (bad): new client per request
    if self.channel_token_jwks_uri:
        jwks_client = PyJWKClient(self.channel_token_jwks_uri)
        ...
    
    # Should mirror the discovery path — cache the PyJWKClient
    if self.channel_token_jwks_uri:
        if self.channel_token_jwks_uri not in _oidc_cache._jwks_clients:
            _oidc_cache._jwks_clients[self.channel_token_jwks_uri] = PyJWKClient(self.channel_token_jwks_uri)
        jwks_client = _oidc_cache._jwks_clients[self.channel_token_jwks_uri]
        ...

    Or alternatively, add a get_jwks_client_by_uri(uri) method to _OIDCDiscoveryCache that handles both discovery and direct JWKS URI cases so the caching logic is in one place.

  2. litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py, line 1241-1245 (link)

    P2 Redundant try/except block — dead code

    This try/except block catches ValueError only to immediately re-raise it, making it completely equivalent to calling _validate_required_claims without any error handling. The comment # Propagate to block the MCP call is the correct intent, but the exception would propagate naturally without the try/except.

  3. litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py, line 1094-1104 (link)

    P1 allowed_tools scope includes per-tool :list scopes unconditionally — potentially overpermissive

    For every tool in allowed_tools, both mcp:tools/{tool}:call and mcp:tools/{tool}:list are added to the scope, even during an active tool call (when tool_name is set). The PR description for FR-10 states "no overpermission of tools/list during active tool calls", but this only prevents the top-level mcp:tools/list — the per-tool :list scopes are always granted.

    This is inconsistent with the original least-privilege behaviour (which only grants mcp:tools/list when no specific tool is being called) and could allow a tool-scoped JWT to perform listing operations it shouldn't be able to.

    Consider only adding the per-tool :list scope when not tool_name:

    if self.allowed_tools:
        scope_parts = []
        for allowed_tool in self.allowed_tools:
            sanitized = re.sub(r"[^a-zA-Z0-9_\-]", "_", allowed_tool)
            scope_parts.append(f"mcp:tools/{sanitized}:call")
            if not tool_name:
                scope_parts.append(f"mcp:tools/{sanitized}:list")
        if not tool_name:
            scope_parts.append("mcp:tools/list")
        return " ".join(sorted(set(scope_parts)))
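
The suggestion above can be turned into a standalone sketch for experimentation. `build_scope` is a hypothetical free function, not the PR's `_build_scope` method, and the fallback branch for an empty allowed_tools list is an assumption about the auto-generated scope (call scope for the active tool, mcp:tools/list only when no tool is being called).

```python
import re
from typing import Optional, Sequence


def _sanitize(tool: str) -> str:
    # Same character whitelist as the snippet in the review comment.
    return re.sub(r"[^a-zA-Z0-9_\-]", "_", tool)


def build_scope(allowed_tools: Optional[Sequence[str]], tool_name: Optional[str] = None) -> str:
    """Build a space-separated scope string with least-privilege list scopes."""
    parts = set()
    if allowed_tools:
        for tool in allowed_tools:
            parts.add(f"mcp:tools/{_sanitize(tool)}:call")
            if not tool_name:
                # Per-tool list scope only outside an active tool call.
                parts.add(f"mcp:tools/{_sanitize(tool)}:list")
    elif tool_name:
        # Assumed fallback when no explicit allow-list is configured.
        parts.add(f"mcp:tools/{_sanitize(tool_name)}:call")
    if not tool_name:
        parts.add("mcp:tools/list")
    return " ".join(sorted(parts))
```

With this shape, a JWT minted during an active tool call never carries any list scope, whether top-level or per-tool.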

Last reviewed commit: "feat(guardrails): MC..."

Comment on lines +727 to +743
outbound_headers: Dict[str, str] = {
    **existing_headers,
    "Authorization": f"Bearer {signed_token}",
}

# FR-9: Debug header — tells downstream what auth resolution was used.
if self.debug_header:
    debug_info: Dict[str, Any] = {
        "signer": "mcp_jwt_signer",
        "kid": self._kid,
        "issuer": self.issuer,
        "sub": claims.get("sub"),
        "act": claims.get("act"),
        "mode": "re-sign" if jwt_claims else "sign",
        "channel_token": channel_token_claims is not None,
    }
    outbound_headers["x-litellm-mcp-debug"] = json.dumps(

P2 OIDC discovery documents cached indefinitely — no TTL

The _discovery_docs dict caches discovery documents forever (no eviction or TTL). The docstring says "Discovery docs are cached indefinitely (they rarely change)", but OIDC providers do occasionally update their discovery documents — most importantly when rotating their JWKS URI.

If an IdP changes its jwks_uri in the discovery doc (e.g., as part of a security incident rotation), the _oidc_cache would continue to serve the old PyJWKClient pointing at the stale JWKS endpoint indefinitely. A LiteLLM restart would be required to pick up the change.

Consider adding a TTL (e.g., 24 hours) to discovery doc cache entries, or at minimum documenting that a restart is needed when IdP discovery URIs change.
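
One possible shape for the caching fixes raised in this review (a TTL on discovery documents, and reuse of a JWKS client per URI) is a small keyed cache with expiry. This is a sketch under stated assumptions: `TTLCache` and `get_or_create` are hypothetical names, not part of `_OIDCDiscoveryCache`; the factory is injected so the example performs no network I/O; and the class is not thread-safe.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Keyed cache whose entries are rebuilt once they are older than ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        if ttl_seconds <= 0:
            raise ValueError("ttl_seconds must be > 0")
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_create(self, key: str, factory: Callable[[str], Any]) -> Any:
        """Return the cached value for key, calling factory(key) on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = factory(key)
        self._entries[key] = (now, value)
        return value


# Hypothetical wiring: one cache for discovery documents, one for JWKS clients.
discovery_docs = TTLCache(ttl_seconds=24 * 60 * 60)
jwks_clients = TTLCache(ttl_seconds=24 * 60 * 60)
```

A direct JWKS URI could then be served with something like `jwks_clients.get_or_create(uri, PyJWKClient)`, so that a client is built once per URI rather than once per tool call, while an expired discovery entry would be refetched without a restart.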
