feat(guardrails): MCPJWTSigner - built-in guardrail for zero trust MCP auth by ishaan-jaff · Pull Request #23897 · BerriAI/litellm

ishaan-jaff · 2026-03-17T21:02:43Z

Relevant issues

Closes #TODO

What this does

Adds MCPJWTSigner as a first-class built-in LiteLLM guardrail. When enabled, it signs every outbound MCP tool call with a LiteLLM-issued RS256 JWT, so MCP servers (e.g. AWS Bedrock AgentCore Gateway) can trust a single signing authority instead of validating each upstream IdP.

Enable in config.yaml — same pattern as grayswan / litellm_content_filter:

guardrails:
  - guardrail_name: mcp-jwt-signer
    litellm_params:
      guardrail: mcp_jwt_signer
      mode: pre_mcp_call
      default_on: true
      issuer: "https://my-litellm.example.com"  # optional
      audience: "mcp"                            # optional
      ttl_seconds: 300                           # optional

The signed JWT carries sub (user_id), act.sub (team_id, RFC 8693 delegation), tool-level scope (mcp:tools/call, mcp:tools/list, mcp:tools/<tool>:call), and standard timing claims.
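As a rough illustration of the claim set described above, the sketch below assembles such a payload using only the standard library. The helper name `build_claims`, its parameters, and the exact claim layout are assumptions made for illustration and are not the code in this PR; signing the payload with RS256 is left to a JWT library.

```python
import time
from typing import Any, Dict, Optional


def build_claims(
    user_id: str,
    team_id: Optional[str],
    tool_name: Optional[str],
    issuer: str = "https://my-litellm.example.com",
    audience: str = "mcp",
    ttl_seconds: int = 300,
    now: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble the claims the PR description lists."""
    issued_at = int(time.time()) if now is None else now

    # Base scopes plus an optional tool-specific scope, space-delimited.
    scopes = ["mcp:tools/call", "mcp:tools/list"]
    if tool_name:
        scopes.append(f"mcp:tools/{tool_name}:call")

    claims: Dict[str, Any] = {
        "iss": issuer,
        "aud": audience,
        "sub": user_id,
        "iat": issued_at,
        "nbf": issued_at,
        "exp": issued_at + ttl_seconds,
        "scope": " ".join(scopes),
    }
    # RFC 8693 delegation: the acting party goes under act.sub.
    if team_id:
        claims["act"] = {"sub": team_id}
    return claims


example = build_claims("user-123", "team-9", "get_weather", now=1_700_000_000)
```

Passing `now` explicitly keeps the payload deterministic, which is convenient in tests.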

MCP servers verify tokens via OIDC discovery:

GET /.well-known/openid-configuration → now includes jwks_uri
GET /.well-known/jwks.json → RSA public key when guardrail is active
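For readers unfamiliar with JWKS, the sketch below shows how an RSA public key's modulus and exponent are conventionally encoded into a JWK entry (RFC 7517 / RFC 7518). The helper names, the toy modulus, and the `kid` value are placeholders, and the exact set of fields LiteLLM returns may differ.

```python
import base64
from typing import Dict, List


def _b64url_uint(value: int) -> str:
    """Base64url-encode a non-negative integer as unpadded big-endian bytes."""
    length = max(1, (value.bit_length() + 7) // 8)
    raw = value.to_bytes(length, "big")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def rsa_public_jwks(n: int, e: int, kid: str) -> Dict[str, List[Dict[str, str]]]:
    """Hypothetical helper: wrap one RSA public key in a JWKS document."""
    return {
        "keys": [
            {
                "kty": "RSA",
                "alg": "RS256",
                "use": "sig",
                "kid": kid,
                "n": _b64url_uint(n),
                "e": _b64url_uint(e),
            }
        ]
    }


# Toy modulus for illustration only; a real RSA-2048 modulus is 256 bytes long.
jwks = rsa_public_jwks(n=0xC0FFEE, e=65537, kid="example-kid")
```

A verifier selects the entry whose `kid` matches the `kid` in the JWT header and rebuilds the public key from `n` and `e`.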

RSA-2048 keypair is auto-generated at startup. Set MCP_JWT_SIGNING_KEY env var (PEM string or file:///path) to use your own key.
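One way the two accepted forms of `MCP_JWT_SIGNING_KEY` could be told apart is sketched below. The function name `resolve_signing_key_pem` and the convention of returning `None` to mean "generate an ephemeral keypair" are assumptions for illustration, not the PR's implementation.

```python
import os
from typing import Optional
from urllib.parse import urlparse
from urllib.request import url2pathname


def resolve_signing_key_pem(value: Optional[str]) -> Optional[str]:
    """Hypothetical helper: return PEM text from an inline string or a file:// URL.

    Returns None when the variable is unset or empty, signalling that the
    caller should fall back to an auto-generated keypair.
    """
    if not value:
        return None
    if value.startswith("file://"):
        # Convert the URL path to a local filesystem path and read the key.
        path = url2pathname(urlparse(value).path)
        with open(path, "r", encoding="utf-8") as fh:
            return fh.read()
    # Anything else is treated as the PEM content itself.
    return value


pem = resolve_signing_key_pem(os.environ.get("MCP_JWT_SIGNING_KEY"))
```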

Depends on the pre_mcp_call hook mechanism (cherry-picked from #23889).

Pre-Submission checklist

Tests added (tests/test_litellm/proxy/guardrails/test_mcp_jwt_signer.py, 15 tests, all passing)
No breaking changes — purely additive
make test-unit passes locally

Type

Bug fix
New feature
Refactor
Documentation

Changes

litellm/types/guardrails.py — add MCP_JWT_SIGNER = "mcp_jwt_signer" to SupportedGuardrailIntegrations
litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/ — new guardrail package (MCPJWTSigner class + initialize_guardrail)
litellm/proxy/_experimental/mcp_server/discoverable_endpoints.py — add jwks_uri to OIDC doc; add GET /.well-known/jwks.json
Cherry-picks from Allow pre_mcp_call guardrail hooks to mutate outbound MCP headers #23889: hook_extra_headers in _call_regular_mcp_tool, pre_mcp_call event hook support in proxy/utils.py

… headers. Update tests to validate argument mutation and header injection behavior, including warnings for OpenAPI-backed servers when headers are present.

… OpenAPI-backed servers. Update tests to reflect this change, ensuring proper exception handling instead of logging warnings.


…MCP auth

Signs outbound MCP tool calls with a LiteLLM-issued RS256 JWT so MCP servers can trust a single signing authority instead of every upstream IdP.

Enable in config.yaml:

guardrails:
  - guardrail_name: mcp-jwt-signer
    litellm_params:
      guardrail: mcp_jwt_signer
      mode: pre_mcp_call
      default_on: true

JWT carries sub (user_id), act.sub (team_id, RFC 8693), tool-level scope, iss, aud, iat/exp/nbf.

RSA-2048 keypair auto-generated at startup unless MCP_JWT_SIGNING_KEY env var is set.

Adds /.well-known/jwks.json endpoint and jwks_uri to /.well-known/openid-configuration so MCP servers can verify LiteLLM-issued tokens via OIDC discovery.

vercel · 2026-03-17T21:02:48Z


Project	Deployment	Actions	Updated (UTC)
litellm	Error		Mar 18, 2026 0:29am

codspeed-hq · 2026-03-17T21:04:38Z

Merging this PR will not alter performance

✅ 16 untouched benchmarks

Comparing worktree-fluttering-sleeping-cookie (4761382) with main (ef9cc33)

greptile-apps · 2026-03-17T21:07:04Z

Greptile Summary

This PR introduces MCPJWTSigner as a new first-class built-in guardrail that signs every outbound MCP tool call with a LiteLLM-issued RS256 JWT, enabling zero-trust authentication for MCP servers (e.g. AWS Bedrock AgentCore Gateway) against a single trusted signing authority. It also adds supporting OIDC discovery infrastructure (/.well-known/jwks.json, augmented /.well-known/openid-configuration), wires up a new pre_mcp_call hook mechanism, and propagates upstream jwt_claims through the auth pipeline into outbound tokens.

Changes:

MCPJWTSigner guardrail with RS256 signing, OIDC re-sign (FR-5), configurable scopes (FR-10), end-user identity mapping (FR-12), claim operations (FR-13), two-token model (FR-14), and incoming claim validation (FR-15)
JWKS and OIDC discovery endpoints added; cache TTL correctly differentiated (300s ephemeral vs 3600s persistent)
pre_mcp_call hook propagates extra_headers and incoming_bearer_token through the MCP call path; OpenAPI-backed servers correctly warn and skip header injection
jwt_claims field added to UserAPIKeyAuth and populated in both JWT auth fast paths

Issues found:

_validate_required_claims uses a falsy-value check (not get(c)) instead of a key-presence check — claims with values of 0, False, or "" are incorrectly treated as missing, causing spurious 403 errors for valid tokens
_introspect_opaque_token does not validate verify_issuer/verify_audience — when opaque token introspection is used, the configured issuer and audience restrictions are silently skipped, creating a security gap compared to the JWT verification path
remove_claims can strip security-critical JWT claims (exp, iss, aud) without any warning, producing non-expiring or unverifiable tokens on misconfiguration
channel_token_ttl is not bounds-checked — unlike ttl_seconds, a value of 0 or negative passes through, producing immediately-expired channel tokens

Confidence Score: 3/5

Not safe to merge as-is — two security logic bugs (falsy required_claims check and missing iss/aud validation on opaque token introspection) need fixing before production use.
The overall architecture is sound and many previously identified issues were addressed (JWKS cache TTL, ttl_seconds validation, tool name sanitization, scope over-granting, extra_headers merging, OpenAPI server graceful degradation). However, two remaining P1 bugs could cause incorrect 403 rejections for valid tokens and silently bypass issuer/audience restrictions on opaque token flows — both directly relevant to the zero-trust security guarantees this feature is meant to provide. The remove_claims / channel_token_ttl gaps are lower severity but worth fixing before GA.
Pay close attention to litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py — specifically _validate_required_claims (line 526), _introspect_opaque_token (lines 477–507), _apply_claim_operations (lines 632–646), and the channel_token_ttl init block (lines 313–316).

Important Files Changed

Filename	Overview
litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py	Core MCPJWTSigner guardrail class: signs outbound MCP calls with RS256 JWTs. Several issues found: `_validate_required_claims` uses falsy-value check (incorrectly rejects present claims with values `0`/`False`/`""`), `_introspect_opaque_token` ignores `verify_issuer`/`verify_audience`, `remove_claims` can silently strip critical JWT claims (`exp`, `iss`, `aud`), and `channel_token_ttl` lacks the same `<= 0` validation guard applied to `ttl_seconds`.
litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/__init__.py	Package init: exposes `MCPJWTSigner`, `initialize_guardrail`, and `get_mcp_jwt_signer`. The private `_mcp_jwt_signer_instance` singleton is correctly NOT included in `__all__` or re-exported. Guardrail initializer registry wired correctly.
litellm/proxy/_experimental/mcp_server/discoverable_endpoints.py	Adds `/.well-known/jwks.json` endpoint and augments `/.well-known/openid-configuration` with `jwks_uri`. JWKS cache TTL is correctly differentiated: 300s for ephemeral auto-generated keys, 3600s for persistent env-var keys via `signer.jwks_max_age`. OIDC response is safely shallow-copied before mutation.
litellm/proxy/_experimental/mcp_server/mcp_server_manager.py	Adds `hook_extra_headers` support to `_call_regular_mcp_tool` and wires up `pre_call_tool_check` to return hook results including extra headers. OpenAPI-backed servers correctly log a warning and skip header injection instead of raising. Authorization header conflicts emit warnings rather than silently overwriting.
litellm/proxy/utils.py	Extends `_convert_mcp_hook_response_to_kwargs` to propagate `extra_headers` from hook responses, and surfaces `incoming_bearer_token` in the synthetic MCP data dict. Header merging logic correctly uses `{**existing, **new}` to preserve prior-guardrail headers.
litellm/proxy/auth/user_api_key_auth.py	Propagates `jwt_claims` from both the fast virtual-key JWT path and the standard JWT auth builder path into `UserAPIKeyAuth`. The variable is always in scope at each assignment site.
litellm/proxy/_types.py	Adds optional `jwt_claims: Optional[Dict]` field to `UserAPIKeyAuth` for downstream claim forwarding. Backward-compatible addition.
litellm/types/guardrails.py	Registers `MCP_JWT_SIGNER = "mcp_jwt_signer"` in `SupportedGuardrailIntegrations`. Minimal, correct addition.
tests/test_litellm/proxy/guardrails/test_mcp_jwt_signer.py	15 mock-only unit tests covering key generation, JWKS format, claim building, scope generation, TTL validation, and singleton behavior. All tests correctly use mocking — no real network calls. Missing coverage for `_introspect_opaque_token` issuer/audience validation gap and the falsy-value `required_claims` edge case.
tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_hook_extra_headers.py	Mock-based tests for the pre_mcp_call hook header injection pipeline. Docstring accurately describes the current behavior (OpenAPI servers warn and continue). Tests are thorough for the happy path and backward compatibility.

Sequence Diagram

sequenceDiagram
    participant Client
    participant LiteLLM Proxy
    participant MCPJWTSigner Guardrail
    participant IdP (optional)
    participant MCP Server

    Client->>LiteLLM Proxy: MCP tool call + Bearer token (optional)
    LiteLLM Proxy->>LiteLLM Proxy: user_api_key_auth (propagates jwt_claims)
    LiteLLM Proxy->>MCPJWTSigner Guardrail: async_pre_call_hook(call_type=call_mcp_tool)

    alt access_token_discovery_uri configured
        MCPJWTSigner Guardrail->>IdP (optional): Fetch JWKS / introspect opaque token
        IdP (optional)-->>MCPJWTSigner Guardrail: Verified JWT claims / introspection response
        MCPJWTSigner Guardrail->>MCPJWTSigner Guardrail: validate required_claims
    end

    MCPJWTSigner Guardrail->>MCPJWTSigner Guardrail: _resolve_end_user_identity → sub
    MCPJWTSigner Guardrail->>MCPJWTSigner Guardrail: _build_scope (tool-specific, least-privilege)
    MCPJWTSigner Guardrail->>MCPJWTSigner Guardrail: _apply_claim_operations (add/set/remove)
    MCPJWTSigner Guardrail->>MCPJWTSigner Guardrail: jwt.encode(RS256, kid=KID)
    MCPJWTSigner Guardrail-->>LiteLLM Proxy: data[extra_headers][Authorization] = Bearer JWT

    LiteLLM Proxy->>MCP Server: Tool call + Authorization: Bearer <LiteLLM JWT>
    MCP Server->>MCP Server: GET /.well-known/jwks.json (cached)
    MCP Server->>MCP Server: Verify RS256 JWT signature
    MCP Server-->>LiteLLM Proxy: Tool result
    LiteLLM Proxy-->>Client: Tool result
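The verification step at the end of the diagram usually involves more than the signature. Below is a minimal sketch of the claim checks an MCP server might run after a JWT library has already verified the RS256 signature against the JWKS. The function name, its parameters, and the use of `PermissionError` are illustrative assumptions, not part of this PR.

```python
import time
from typing import Any, Mapping, Optional


def check_verified_claims(
    claims: Mapping[str, Any],
    expected_issuer: str,
    expected_audience: str,
    required_scope: str,
    now: Optional[float] = None,
) -> None:
    """Hypothetical post-signature checks; raises PermissionError on failure."""
    current = time.time() if now is None else now

    if claims.get("iss") != expected_issuer:
        raise PermissionError("unexpected issuer")

    # aud may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if expected_audience not in audiences:
        raise PermissionError("audience mismatch")

    exp = claims.get("exp")
    if exp is None or current >= exp:
        raise PermissionError("token expired or missing exp")

    nbf = claims.get("nbf")
    if nbf is not None and current < nbf:
        raise PermissionError("token not yet valid")

    # scope is a space-delimited string; require an exact match on one entry.
    if required_scope not in str(claims.get("scope", "")).split():
        raise PermissionError("missing required scope")
```

A production verifier would normally also allow a small clock-skew leeway on `exp` and `nbf`.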

Comments Outside Diff (4)

litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py, line 526

Falsy-value check incorrectly marks present claims as missing

(jwt_claims or {}).get(c) returns a falsy value for claims whose value is 0, False, or "" — all of which are valid claim values in JWT (e.g., a numeric role ID of 0, a boolean flag, or an empty string). The list comprehension would then include such claims in missing, raising an erroneous 403 even though the claim IS present in the token.

The check should test for key presence, not value truthiness:
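A minimal sketch of a presence-based check, assuming the required claim names are given as a list of strings. `find_missing_claims` is a hypothetical stand-in for the logic in `_validate_required_claims`, not the PR's code.

```python
from typing import Any, Iterable, List, Mapping, Optional


def find_missing_claims(
    jwt_claims: Optional[Mapping[str, Any]],
    required: Iterable[str],
) -> List[str]:
    """Return required claim names that are absent as keys, ignoring their values."""
    claims = jwt_claims or {}
    return [name for name in required if name not in claims]


# Falsy but present values must not be reported as missing.
claims = {"role_id": 0, "is_admin": False, "nickname": ""}
missing = find_missing_claims(claims, ["role_id", "is_admin", "nickname", "email"])
```

Whether a claim that is present with a `None` value should also count as missing is a policy decision this sketch leaves open; as written, it counts as present.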

litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py, lines 477-507

verify_issuer / verify_audience are ignored during opaque token introspection

When access_token_discovery_uri is set alongside token_introspection_endpoint, the _verify_incoming_jwt path correctly enforces verify_issuer and verify_audience against the JWT payload. However, _introspect_opaque_token only checks active=true and returns the raw introspection response without applying either validation.

RFC 7662 introspection responses often include iss and aud claims. An attacker who obtains an active token from a different trusted introspection endpoint (or a permissive endpoint that returns active=true for any token) can bypass issuer/audience restrictions entirely when using opaque tokens.

After the active check, validate the claims that are present in the response:
```
if self.verify_issuer and result.get("iss") and result["iss"] != self.verify_issuer:
    raise jwt.exceptions.InvalidIssuerError(
        f"MCPJWTSigner: introspection iss {result['iss']!r} != expected {self.verify_issuer!r}"
    )
if self.verify_audience:
    aud = result.get("aud")
    if aud and self.verify_audience not in (
        [aud] if isinstance(aud, str) else aud
    ):
        raise jwt.exceptions.InvalidAudienceError(
            f"MCPJWTSigner: introspection aud {aud!r} does not contain {self.verify_audience!r}"
        )
```

litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py, lines 632-646

remove_claims can silently strip security-critical JWT claims

The remove_claims config option is applied after all other claim-building steps with no guard against removing exp, iss, aud, or nbf. Removing exp produces a non-expiring token; removing iss or aud causes verification failures at MCP servers. Since these are admin-supplied values, a misconfiguration would produce silently broken or dangerously permissive JWTs with no startup warning.

Consider logging a warning (or raising at __init__ time) when remove_claims contains security-critical claim names:

_SECURITY_CLAIMS = {"exp", "iss", "aud", "nbf", "iat"}

def _apply_claim_operations(self, claims: Dict[str, Any]) -> Dict[str, Any]:
    # ...existing add/set logic...

    # remove_claims: delete listed keys
    for k in self.remove_claims:
        if k in _SECURITY_CLAIMS:
            verbose_proxy_logger.warning(
                "MCPJWTSigner: removing security-critical claim %r via remove_claims — "
                "this may cause JWT verification failures downstream.",
                k,
            )
        claims.pop(k, None)

    return claims

litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py, lines 313-316

channel_token_ttl accepts zero/negative values unlike ttl_seconds

ttl_seconds has a <= 0 guard that raises ValueError at init time. channel_token_ttl, however, is accepted without bounds validation — a value of 0 or -60 produces channel tokens that are expired on creation, silently breaking the two-token model.

Apply the same guard:
```
if self.channel_token_ttl <= 0:
    raise ValueError(
        f"MCPJWTSigner: channel_token_ttl must be > 0, got {self.channel_token_ttl}"
    )
```

Last reviewed commit: "fix(mcp_jwt_signer):..."

…or extra headers in OpenAPI-backed servers. Adjust tests to verify the correct status code and exception message.

greptile-apps · 2026-03-17T21:07:08Z

litellm/proxy/_experimental/mcp_server/mcp_server_manager.py

+            if hook_result.get("extra_headers"):
+                raise HTTPException(
+                    status_code=500,
+                    detail={
+                        "error": (
+                            "pre_mcp_call hook returned extra_headers for an "
+                            "OpenAPI-backed MCP server, which does not support "
+                            "hook header injection. Use a regular MCP server "
+                            "(SSE/HTTP transport) for hook header support."
+                        )
+                    },
+                )


Breaking 500 error for OpenAPI-backed MCP servers when guardrail is active

When MCPJWTSigner is configured with default_on: true, it will inject extra_headers for every call_mcp_tool invocation. Any user with an OpenAPI-backed MCP server (spec_path set) will immediately hit a 500 Internal Server Error simply because the guardrail is enabled. This is a breaking change for existing OpenAPI MCP server users who add this guardrail.

The PR description claims "No breaking changes — purely additive", but this contradicts that: any deployment that enables the guardrail AND has at least one OpenAPI-backed server will see tool calls fail.

A more resilient approach would be to silently skip header injection for OpenAPI servers (log a warning instead of raising), matching the "graceful degradation" principle for guardrails:

if hook_result.get("extra_headers"):
    if mcp_server.spec_path:
        verbose_logger.warning(
            "MCPJWTSigner: OpenAPI-backed server '%s' does not support hook header "
            "injection — skipping JWT signing for this call.",
            mcp_server.server_name,
        )
    else:
        raise HTTPException(...)

greptile-apps · 2026-03-17T21:07:08Z

litellm/proxy/_experimental/mcp_server/discoverable_endpoints.py

+            return JSONResponse(
+                content=signer.get_jwks(),
+                headers={"Cache-Control": "public, max-age=3600"},
+            )


JWKS cache duration too long for auto-generated ephemeral keys

The endpoint returns Cache-Control: public, max-age=3600 regardless of whether the key is a user-supplied persistent key or an auto-generated ephemeral keypair. When the server restarts (which regenerates the keypair), downstream MCP servers and gateways will continue using the old cached JWKS for up to one hour, causing all JWT verifications to fail during that window.

For auto-generated keys, either:

Reduce the cache TTL significantly (e.g., max-age=60), or

Use Cache-Control: no-store (or private, no-cache) for ephemeral keys and only apply long-lived caching when a persistent key from MCP_JWT_SIGNING_KEY env var is used.

signer = get_mcp_jwt_signer()
if signer is not None:
    key_is_ephemeral = not os.environ.get(signer.SIGNING_KEY_ENV)
    cache_ttl = 60 if key_is_ephemeral else 3600
    return JSONResponse(
        content=signer.get_jwks(),
        headers={"Cache-Control": f"public, max-age={cache_ttl}"},
    )

greptile-apps · 2026-03-17T21:07:09Z

litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py

+        self.ttl_seconds: int = int(
+            ttl_seconds
+            if ttl_seconds is not None
+            else os.environ.get("MCP_JWT_TTL_SECONDS", str(self.DEFAULT_TTL))
+        )


No validation of ttl_seconds — zero or negative produces instantly-expired JWTs

ttl_seconds is cast to int directly from config/env without bounds checking. A value of 0 or any negative integer will produce JWTs where exp <= iat, meaning every signed token will be immediately expired on receipt. This would silently break all MCP authentication without any startup warning.

Add a validation guard:

self.ttl_seconds: int = int(
    ttl_seconds
    if ttl_seconds is not None
    else os.environ.get("MCP_JWT_TTL_SECONDS", str(self.DEFAULT_TTL))
)
if self.ttl_seconds <= 0:
    raise ValueError(
        f"MCPJWTSigner: ttl_seconds must be a positive integer, got {self.ttl_seconds}"
    )

greptile-apps · 2026-03-17T21:07:10Z

litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py

+        tool_name: str = data.get("mcp_tool_name", "")
+        scopes = ["mcp:tools/call", "mcp:tools/list"]
+        if tool_name:
+            scopes.append(f"mcp:tools/{tool_name}:call")
+        claims["scope"] = " ".join(scopes)


Tool name not sanitized before embedding in scope string

tool_name comes directly from data.get("mcp_tool_name", "") without any validation. If the tool name contains spaces, special characters, or path-traversal sequences (e.g., ../../admin), the resulting scope value (mcp:tools/../../admin:call) could be malformed or semantically misleading to MCP servers that parse the scope string.

Consider stripping or rejecting tool names that contain characters outside [A-Za-z0-9_-]:

tool_name: str = data.get("mcp_tool_name", "")
if tool_name and not re.match(r'^[A-Za-z0-9_\-\.]+$', tool_name):
    verbose_proxy_logger.warning(
        "MCPJWTSigner: tool_name '%s' contains unexpected characters; omitting tool-specific scope",
        tool_name,
    )
    tool_name = ""

greptile-apps · 2026-03-17T21:07:11Z

litellm/proxy/_experimental/mcp_server/discoverable_endpoints.py

+        signer = get_mcp_jwt_signer()
+        if signer is not None:
+            request_base_url = get_request_base_url(request)
+            if isinstance(response, dict):
+                response["jwks_uri"] = f"{request_base_url}/.well-known/jwks.json"
+                response["id_token_signing_alg_values_supported"] = ["RS256"]


In-place mutation of a potentially shared response object

oauth_authorization_server_mcp(request) returns its result via response, and this code then mutates it in-place with response["jwks_uri"] = .... If the underlying function ever returns a cached or shared dict object (e.g., from a module-level constant or cached response), this mutation could permanently alter the base OIDC document for all subsequent requests — even after the MCPJWTSigner singleton is unregistered.

Use a defensive copy:

if isinstance(response, dict):
    response = {**response}  # shallow copy to avoid mutating a shared object
    response["jwks_uri"] = f"{request_base_url}/.well-known/jwks.json"
    response["id_token_signing_alg_values_supported"] = ["RS256"]

greptile-apps · 2026-03-17T21:07:12Z

litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py

+
+        # sub: End-user identity (RFC 8693)
+        user_id = getattr(user_api_key_dict, "user_id", None)
+        if user_id:
+            claims["sub"] = user_id


Missing sub claim for anonymous / service-account calls

When user_api_key_dict.user_id is None (common for service-account API keys or unauthenticated calls), the JWT is issued without a sub claim. RFC 7519 marks sub as optional, but many JWT consumers — including OIDC-compliant validators and AWS Bedrock AgentCore Gateway — require or expect sub to be present and will reject the token. Without a sub, the token also provides no useful end-user identity, which defeats part of the zero-trust purpose.

A defensive fallback should be added so sub is always populated:

user_id = getattr(user_api_key_dict, "user_id", None)
if user_id:
    claims["sub"] = user_id
else:
    # Fallback: use a sentinel so sub is always present in the token
    claims["sub"] = "litellm-proxy-service-account"

- OpenAPI servers: warn + skip header injection instead of 500
- JWKS Cache-Control: 5min for auto-generated keys, 1h for persistent
- sub claim: fallback to apikey:{token_hash} for anonymous callers
- ttl_seconds: validate > 0 at init time

ishaan-jaff · 2026-03-17T21:15:24Z

@greptile review

greptile-apps · 2026-03-17T21:22:39Z

tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_hook_extra_headers.py

+"""
+Tests for pre_mcp_call guardrail hook header mutation support.
+
+Validates that:
+1. _convert_mcp_hook_response_to_kwargs extracts extra_headers from hook response
+2. pre_call_tool_check returns hook-provided extra_headers AND modified arguments
+3. call_tool flows hook headers and modified arguments downstream
+4. Hook-provided headers take highest priority (merge after static_headers)


Module docstring contradicts actual test behavior

Line 5 of the module docstring says:

5. OpenAPI-backed servers raise HTTPException when hook headers are present

But the actual test on lines 491–519 (test_openapi_server_warns_and_continues_on_hook_headers) verifies the opposite behavior: the server logs a warning and continues rather than raising an exception. The docstring documents the old broken behavior that was deliberately fixed (silent drop with a warning instead of a hard 500).

This mismatch will mislead future contributors reviewing this file about what the expected contract is.

Suggested change

-"""
-Tests for pre_mcp_call guardrail hook header mutation support.
-
-Validates that:
-1. _convert_mcp_hook_response_to_kwargs extracts extra_headers from hook response
-2. pre_call_tool_check returns hook-provided extra_headers AND modified arguments
-3. call_tool flows hook headers and modified arguments downstream
-4. Hook-provided headers take highest priority (merge after static_headers)
+"""
+Tests for pre_mcp_call guardrail hook header mutation support.
+
+Validates that:
+1. _convert_mcp_hook_response_to_kwargs extracts extra_headers from hook response
+2. pre_call_tool_check returns hook-provided extra_headers AND modified arguments
+3. call_tool flows hook headers and modified arguments downstream
+4. Hook-provided headers take highest priority (merge after static_headers)
+5. OpenAPI-backed servers log a warning and skip header injection (graceful degradation)
+6. JWT claims are propagated in both standard and virtual-key fast paths
+7. Backward compatibility: hooks without extra_headers continue to work
+"""

Rule used: Code Review Rule: Mock Test Integrity

greptile-apps · 2026-03-17T21:22:40Z

litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py

+        self.ttl_seconds: int = resolved_ttl
+
+        # Register singleton so the JWKS endpoint can access it.
+        global _mcp_jwt_signer_instance


Duplicate initialization replaces JWKS singleton, silently breaking first signer's tokens

The constructor unconditionally assigns _mcp_jwt_signer_instance = self. If a user accidentally configures two mcp_jwt_signer guardrails in config.yaml (or if the guardrail is initialized twice during reload), the second instance replaces the singleton, so /.well-known/jwks.json returns only the second instance's public key. However, both MCPJWTSigner instances remain active in the callback list and continue signing JWTs — meaning JWTs from the first signer can no longer be verified (the matching public key has been evicted from JWKS), causing silent authentication failures downstream.

Add a guard that raises an error on duplicate initialization:

# Register singleton so the JWKS endpoint can access it.
global _mcp_jwt_signer_instance
if _mcp_jwt_signer_instance is not None:
    # ValueError does not apply %-style arguments, so format the message directly.
    raise ValueError(
        "MCPJWTSigner: only one instance may be active at a time. "
        f"Found an existing signer (kid={_mcp_jwt_signer_instance._kid}). "
        "Remove duplicate guardrail config."
    )
_mcp_jwt_signer_instance = self

greptile-apps · 2026-03-17T21:22:41Z

litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/__init__.py

+
+from litellm.types.guardrails import SupportedGuardrailIntegrations
+
+from .mcp_jwt_signer import MCPJWTSigner, _mcp_jwt_signer_instance, get_mcp_jwt_signer


Private module variable _mcp_jwt_signer_instance exported as public API

_mcp_jwt_signer_instance is a private implementation detail (the leading _ convention signals this). Re-exporting it in __all__ makes it part of the package's public API, which may cause unintended external usage or direct mutation of the singleton from outside the module.

External callers should only use get_mcp_jwt_signer() (the public accessor). Remove _mcp_jwt_signer_instance from the import and __all__:

from .mcp_jwt_signer import MCPJWTSigner, get_mcp_jwt_signer

Suggested change

from .mcp_jwt_signer import MCPJWTSigner, _mcp_jwt_signer_instance, get_mcp_jwt_signer

__all__ = [
    "MCPJWTSigner",
    "initialize_guardrail",
    "get_mcp_jwt_signer",
]

greptile-apps · 2026-03-17T21:22:42Z

litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py

+        # end_user_id (if set separately from user_id)
+        end_user_id = getattr(user_api_key_dict, "end_user_id", None)
+        if end_user_id:
+            claims["end_user_id"] = end_user_id
+


mcp:tools/list scope always granted even for specific tool calls

The scope claim is built as:

scopes = ["mcp:tools/call", "mcp:tools/list"]
if tool_name:
    scopes.append(f"mcp:tools/{tool_name}:call")

This means every JWT issued for a tool call (call_mcp_tool) unconditionally includes mcp:tools/list — a broader permission that lets the token holder enumerate available tools. For a zero-trust model, a token authorizing a single tool call should not also authorize list.

MCP servers that enforce fine-grained scope checks (e.g., AWS Bedrock AgentCore Gateway) could reject or mishandle this. Consider restricting the base scopes to only what's needed for the actual call type:

scopes = ["mcp:tools/call"]
if tool_name:
    scopes.append(f"mcp:tools/{tool_name}:call")
# Only add list scope if this is an explicit list call, not a tool call

Keep warn+skip behavior for OpenAPI servers (not 400 raise). Both test suites pass (45 tests).

greptile-apps · 2026-03-17T21:35:17Z

litellm/proxy/guardrails/guardrail_hooks/mcp_jwt_signer/mcp_jwt_signer.py

+        data["extra_headers"] = {
+            "Authorization": f"Bearer {signed_token}",
+        }


extra_headers assignment silently discards headers from prior hooks

data["extra_headers"] = {...} unconditionally replaces any extra_headers value that an earlier guardrail in the chain may have injected. In a multi-guardrail configuration where another hook also contributes headers (e.g., a custom guardrail that injects tracing or auth headers before MCPJWTSigner runs), those headers are silently dropped.

The assignment should merge into any existing dict rather than replace it:

Suggested change

-        data["extra_headers"] = {
-            "Authorization": f"Bearer {signed_token}",
-        }
+        existing = data.get("extra_headers") or {}
+        data["extra_headers"] = {
+            **existing,
+            "Authorization": f"Bearer {signed_token}",
+        }

greptile-apps · 2026-03-17T21:35:18Z

litellm/proxy/_experimental/mcp_server/mcp_server_manager.py

+        if hook_extra_headers:
+            if extra_headers is None:
+                extra_headers = {}
+            extra_headers.update(hook_extra_headers)


Hook JWT silently overwrites existing OAuth2/static Authorization header on regular MCP servers

hook_extra_headers is merged last (highest priority). When MCPJWTSigner is enabled with default_on: true, its Authorization: Bearer <jwt> overwrites any Authorization header already set by OAuth2 exchange or mcp_server_auth_headers — silently breaking authentication for any regular (non-OpenAPI) MCP server that already has its own auth configured.

The OpenAPI code path received a warning (verbose_logger.warning(...)) for this case, but the regular MCP code path has no equivalent guard. Users enabling this guardrail against servers with existing OAuth2 or API-key auth will see silent authentication failures.

At minimum, add a warning similar to the OpenAPI path when the hook header would override an existing Authorization:

if hook_extra_headers:
    if extra_headers is None:
        extra_headers = {}
    if "Authorization" in hook_extra_headers and "Authorization" in extra_headers:
        verbose_logger.warning(
            "MCPJWTSigner: overriding existing Authorization header for "
            "MCP server '%s'. Disable MCPJWTSigner or remove server-level "
            "auth if this is unintended.",
            mcp_server.server_name,
        )
    extra_headers.update(hook_extra_headers)

- mcp_server_manager: warn when hook Authorization overwrites existing header
- __init__: remove _mcp_jwt_signer_instance from __all__ (private internal)
- discoverable_endpoints: copy dict instead of mutating in-place on OIDC augmentation
- test docstring: reflect warn-and-continue behavior for OpenAPI servers
- test: update scope assertions for least-privilege (no mcp:tools/list on tool-call JWTs)

ishaan-jaff · 2026-03-17T21:44:43Z

@greptile review

- initialize_guardrail: validate mode='pre_mcp_call' at init time; a misconfigured mode silently bypasses JWT injection, which is a zero-trust bypass
- _build_claims: remove duplicate inline 'import re' (module-level import already present)
- _types.py: add TODO comment explaining jwt_claims is forward-compat plumbing for a follow-up PR that will forward upstream IdP claims into outbound MCP JWTs

ishaan-jaff · 2026-03-17T21:56:58Z

@greptile review

… FR parity

Merges PR #23897 and adds missing pieces from the JWT signer scoping doc:

- FR-5 (verify + re-sign): Uses upstream jwt_claims from already-validated incoming JWT when available (set via UserAPIKeyAuth.jwt_claims). When access_token_discovery_uri is configured, the signer operates in re-sign mode using the upstream IdP claims rather than generating identity from the LiteLLM user profile.
- FR-12 (end-user identity mapping): Configurable end_user_claim_sources ordered list. Tries each source as a JWT claim name, then as a raw request header (case-insensitive), then as a UserAPIKeyAuth field. First non-empty wins. Defaults to ["sub", "preferred_username", "email", "user_id"].
- FR-13 (claim operations): add_claims, set_claims, remove_claims config. Applied in Kong order: add (if not present) → set (override) → remove (strip).
- FR-14 (two-token model): channel_token_header (default: X-Channel-Token), channel_token_discovery_uri, channel_token_jwks_uri. Channel token is read from raw request headers passed through pre_call_tool_check → pre_hook_kwargs → mcp_raw_headers in synthetic LLM data. When present its sub/client_id becomes act.sub per RFC 8693. Verified via OIDC discovery or JWKS URI when configured.
- FR-15 (claim validation): required_claims / optional_claims. Rejects requests where required JWT claims are absent or where no JWT was used at all.
- FR-9 (debug headers): x-litellm-mcp-debug on outbound MCP requests (default on, disable with debug_header: false). JSON payload with signer, kid, issuer, sub, act, mode (sign vs re-sign), channel_token flag.
- FR-10 (configurable scope): allowed_tools list for admin-defined fine-grained tool control. When set, scope is built from the explicit list only (no overpermission of tools/list during tool calls). Empty list falls back to auto-generated scope.

Also fixes raw_headers plumbing for channel token: pre_call_tool_check now accepts raw_headers, passes it through pre_hook_kwargs, and _convert_mcp_to_llm_format includes it as mcp_raw_headers in synthetic data.

Tests: 46 tests (was 15), covering all new FRs.
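The add → set → remove ordering from FR-13 can be sketched as a pure function. The name apply_claim_operations is hypothetical; only the three config keys and their order come from the commit message.

```python
from typing import Any, Dict, Iterable, Optional


def apply_claim_operations(
    claims: Dict[str, Any],
    add_claims: Optional[Dict[str, Any]] = None,
    set_claims: Optional[Dict[str, Any]] = None,
    remove_claims: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Apply claim operations in order: add (if absent), set, remove."""
    result = dict(claims)  # never mutate the caller's dict
    for name, value in (add_claims or {}).items():
        result.setdefault(name, value)  # add only when not already present
    result.update(set_claims or {})  # set always overrides
    for name in remove_claims or []:
        result.pop(name, None)  # remove is a no-op for missing claims
    return result
```

Because remove runs last, a claim listed in both set_claims and remove_claims ends up absent from the outbound token.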

… configurable scopes

Addresses all missing pieces from the scoping doc review:

- FR-5 (Verify + re-sign): MCPJWTSigner now accepts access_token_discovery_uri and token_introspection_endpoint. When set, the incoming Bearer token is extracted from raw_headers (threaded through pre_call_tool_check), verified against the IdP's JWKS (JWT) or introspected (opaque), and only re-signed if valid. Falls back to user_api_key_dict.jwt_claims for LiteLLM JWT-auth mode.
- FR-12 (Configurable end-user identity mapping): end_user_claim_sources ordered list drives sub resolution; sources: token:<claim>, litellm:user_id, litellm:email, litellm:end_user_id, litellm:team_id.
- FR-13 (Claim operations): add_claims (insert-if-absent), set_claims (always override), remove_claims (delete) applied in that order.
- FR-14 (Two-token model): channel_token_audience + channel_token_ttl issue a second JWT injected as x-mcp-channel-token: Bearer <token>.
- FR-15 (Incoming claim validation): required_claims raises HTTP 403 when any listed claim is absent; optional_claims passes listed claims from verified token into the outbound JWT.
- FR-9 (Debug headers): debug_headers: true emits x-litellm-mcp-debug with kid, sub, iss, exp, scope.
- FR-10 (Configurable scopes): allowed_scopes replaces auto-generation. Also fixed: tool-call JWTs no longer grant mcp:tools/list (overpermission).

P1 fixes:

- proxy/utils.py: _convert_mcp_hook_response_to_kwargs merges rather than replaces extra_headers, preserving headers from prior guardrails.
- mcp_server_manager.py: warns when hook injects Authorization alongside a server-configured authentication_token (previously silent).
- mcp_server_manager.py: pre_call_tool_check now accepts raw_headers and extracts incoming_bearer_token so FR-5 verification has the raw token.
- proxy/utils.py: remove stray inline import inspect inside loop (pre-existing lint error, now cleaned up).

Tests: 43 passing (28 new tests covering all FR flags + P1 fixes).
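A minimal sketch of the ordered sub resolution described under FR-12, using the token:<claim> and litellm:<field> source syntax from this commit message. The function name resolve_subject and the plain-dict inputs are assumptions for illustration; the real signer reads these values from the verified token and from UserAPIKeyAuth.

```python
from typing import Any, Dict, List, Optional


def resolve_subject(
    sources: List[str],
    token_claims: Optional[Dict[str, Any]],
    litellm_fields: Dict[str, Any],
) -> Optional[str]:
    """Return the first non-empty value found by walking sources in order."""
    for source in sources:
        prefix, _, key = source.partition(":")
        if prefix == "token":
            value = (token_claims or {}).get(key)
        elif prefix == "litellm":
            value = litellm_fields.get(key)
        else:
            value = None  # unknown prefixes are skipped in this sketch
        if value:
            return str(value)
    return None
```

For example, with sources ["token:email", "litellm:user_id"] and a token that has no email claim, the LiteLLM user_id is used as sub.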

… configurable scopes (core)

Remaining files from the FR implementation:

mcp_jwt_signer.py: full rewrite with all new params:
- FR-5: access_token_discovery_uri, token_introspection_endpoint, verify_issuer, verify_audience + _verify_incoming_jwt(), _introspect_opaque_token()
- FR-12: end_user_claim_sources ordered resolution chain
- FR-13: add_claims, set_claims, remove_claims
- FR-14: channel_token_audience, channel_token_ttl → x-mcp-channel-token
- FR-15: required_claims (raises 403), optional_claims (passthrough)
- FR-9: debug_headers → x-litellm-mcp-debug
- FR-10: allowed_scopes; tool-call JWTs no longer over-grant tools/list

mcp_server_manager.py:
- pre_call_tool_check gains raw_headers param to extract incoming_bearer_token
- Silent Authorization override warning fixed: now fires when server has authentication_token AND hook injects Authorization

tests/test_mcp_jwt_signer.py: 28 new tests covering all FR flags + P1 fixes (43 total, all passing)
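The FR-15 behavior (required_claims raising 403, optional_claims passed through) might be sketched as below. ClaimValidationError is a hypothetical stand-in for the HTTP exception the proxy raises, and validate_and_select_claims is an illustrative name rather than a LiteLLM function.

```python
from typing import Any, Dict, Iterable, Optional


class ClaimValidationError(Exception):
    """Hypothetical stand-in for an HTTP error carrying a status code."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def validate_and_select_claims(
    token_claims: Optional[Dict[str, Any]],
    required_claims: Iterable[str],
    optional_claims: Iterable[str],
) -> Dict[str, Any]:
    """Reject with 403 if a required claim is absent; return passthrough claims."""
    claims = token_claims or {}  # no incoming JWT means every required claim is missing
    missing = [name for name in required_claims if name not in claims]
    if missing:
        raise ClaimValidationError(403, "missing required claims: " + ", ".join(missing))
    # Only optional claims that are actually present are forwarded.
    return {name: claims[name] for name in optional_claims if name in claims}
```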

- Remove stale TODO comment on UserAPIKeyAuth.jwt_claims; the field is already populated and consumed by MCPJWTSigner in the same PR
- Fix _get_oidc_discovery to only cache the OIDC discovery doc when jwks_uri is present; a malformed/empty doc now retries on the next request instead of being permanently cached until proxy restart
- Add FR-5 test coverage for _fetch_jwks (cache hit/miss), _get_oidc_discovery (cache/no-cache on bad doc), _verify_incoming_jwt (valid token, expired token), _introspect_opaque_token (active, inactive, no endpoint), and the end-to-end 401 hook path; 53 tests total, all passing
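The cache-only-when-valid rule for the discovery document can be illustrated with a small wrapper. DiscoveryCache and its injected fetch callable are hypothetical; the real _get_oidc_discovery performs an HTTP request, which this sketch leaves out.

```python
from typing import Any, Callable, Dict, Optional


class DiscoveryCache:
    """Cache an OIDC discovery document only if it contains jwks_uri."""

    def __init__(self, fetch: Callable[[], Dict[str, Any]]) -> None:
        self._fetch = fetch  # assumption: returns the parsed discovery JSON
        self._doc: Optional[Dict[str, Any]] = None

    def get(self) -> Dict[str, Any]:
        if self._doc is not None:
            return self._doc
        doc = self._fetch()
        # A malformed or empty document is returned but not stored,
        # so the next call fetches again instead of reusing bad data.
        if isinstance(doc, dict) and doc.get("jwks_uri"):
            self._doc = doc
        return doc
```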

ishaan-jaff · 2026-03-18T00:09:46Z

@greptile review again

…signer features

Add scenario-driven sections for each new config area:
- Verify+re-sign with Okta/Azure AD (access_token_discovery_uri, end_user_claim_sources, token_introspection_endpoint)
- Enforcing caller attributes with required_claims / optional_claims
- Adding metadata via add_claims / set_claims / remove_claims
- Two-token model for AWS Bedrock AgentCore Gateway (channel_token_audience / channel_token_ttl)
- Controlling scopes with allowed_scopes
- Debugging JWT rejections with debug_headers

Update JWT claims table to reflect configurable sub (end_user_claim_sources)

…uardrail

The factory was only passing issuer/audience/ttl_seconds to MCPJWTSigner. All FR-5/9/10/12/13/14/15 params (access_token_discovery_uri, end_user_claim_sources, add/set/remove_claims, channel_token_audience, required/optional_claims, debug_headers, allowed_scopes, etc.) were silently dropped, making every advertised advanced feature non-functional when loaded from config.yaml.

Add regression test that asserts every param is wired through correctly.

- Lead with the problem (unsigned direct calls bypass access controls)
- Shorter statement section headers instead of question-form headers
- Move diagram/OIDC discovery block after the reader is bought in
- Add 'read further only if you need to' callout after basic setup
- Two-token section now opens from the user problem not product jargon
- Add concrete 403 error response example in required_claims section
- Debug section opens from the symptom (MCP server returning 401)
- Lowercase claims reference header for consistency

…ery 24h TTL

- Remove alg from unverified JWT header; use signing_jwk.algorithm_name from JWKS key instead. Reading alg from attacker-controlled headers enables alg:none / HS256 confusion attacks.
- Add _oidc_discovery_fetched_at timestamp and _OIDC_DISCOVERY_TTL = 86400 (24h). Without a TTL the cached discovery doc never refreshes, so IdP key rotation is invisible.
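The algorithm-confusion fix amounts to deriving the verification algorithm from the trusted key rather than from the token. The sketch below is a stdlib-only illustration of that idea, not the PyJWT-based code in the PR; the allowlist, the per-key-type defaults, and the function name algorithm_for_key are all assumptions.

```python
from typing import Any, Dict, Optional

# Assumption: only asymmetric algorithms are acceptable for IdP-issued tokens.
ALLOWED_ALGORITHMS = frozenset({"RS256", "RS384", "RS512", "ES256", "ES384"})
DEFAULT_BY_KEY_TYPE = {"RSA": "RS256", "EC": "ES256"}


def algorithm_for_key(
    jwk: Dict[str, Any],
    unverified_token_header: Optional[Dict[str, Any]] = None,
) -> str:
    """Pick the verification algorithm from the JWKS entry only.

    unverified_token_header is accepted to make the point explicit:
    it is attacker-controlled and deliberately never consulted.
    """
    alg = jwk.get("alg") or DEFAULT_BY_KEY_TYPE.get(jwk.get("kty", ""))
    if alg not in ALLOWED_ALGORITHMS:
        raise ValueError(f"unsupported or missing key algorithm: {alg!r}")
    return alg
```

A token whose header claims alg "none" or "HS256" is therefore still verified with the algorithm bound to the IdP's published key.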

* feat(xai): add grok-4.20 beta 2 models with pricing (#23900)

  Add three grok-4.20 beta 2 model variants from xAI:
  - grok-4.20-multi-agent-beta-0309 (reasoning + multi-agent)
  - grok-4.20-beta-0309-reasoning (reasoning)
  - grok-4.20-beta-0309-non-reasoning

  Pricing (from https://docs.x.ai/docs/models):
  - Input: $2.00/1M tokens ($0.20/1M cached)
  - Output: $6.00/1M tokens
  - Context: 2M tokens

  All variants support vision, function calling, tool choice, and web search.

  Closes LIT-2171

* docs: add Quick Install section for litellm --setup wizard (#23905)
  * docs: add Quick Install section for litellm --setup wizard
  * docs: clarify setup wizard is for local/beginner use

* feat(setup): interactive setup wizard + install.sh (#23644)
  * feat(setup): add interactive setup wizard + install.sh

    Adds `litellm --setup`, a Claude Code-style TUI onboarding wizard that guides users through provider selection, API key entry, and proxy config generation, then optionally starts the proxy immediately.
    - litellm/setup_wizard.py: wizard with ASCII art, numbered provider menu (OpenAI, Anthropic, Azure, Gemini, Bedrock, Ollama), API key prompts, port/master-key config, and litellm_config.yaml generation
    - litellm/proxy/proxy_cli.py: adds --setup flag that invokes the wizard
    - scripts/install.sh: curl-installable script (detect OS/Python, pip install litellm[proxy], launch wizard)

    Usage:
      curl -fsSL https://raw.githubusercontent.com/BerriAI/litellm/main/scripts/install.sh | sh
      litellm --setup
  * fix(install.sh): remove orange color, add LITELLM_BRANCH env var for branch installs
  * fix(install.sh): install from git branch so --setup is available for QA
  * fix(install.sh): remove stale LITELLM_BRANCH reference that caused unbound variable error
  * fix(install.sh): force-reinstall from git to bypass cached PyPI version
  * fix(install.sh): show pip progress bar during install
  * fix(install.sh): always launch wizard via $PYTHON_BIN -m litellm, not PATH binary
  * fix(install.sh): use litellm.proxy.proxy_cli module (no __main__.py exists)
  * fix(install.sh): suppress RuntimeWarning from module invocation
  * fix(install.sh): use Python bin-dir litellm binary to avoid CWD sys.path shadowing
  * fix(install.sh): use sysconfig.get_path('scripts') to find pip-installed litellm binary
  * fix(install.sh): redirect stdin from /dev/tty on exec so wizard gets terminal, not exhausted pipe
  * fix(install.sh): warn about git clone duration, drop --no-cache-dir so re-runs are faster
  * feat(setup_wizard): arrow-key selector, updated model names
  * fix(setup_wizard): use sysconfig binary to start proxy, not python -m litellm
  * feat(setup_wizard): credential validation after key entry + clear next-steps after proxy start
  * style(install.sh): show git clone warning in blue
  * refactor(setup_wizard): class with static methods, use check_valid_key from litellm.utils
  * address greptile review: fix yaml escaping, port validation, display name collisions, tests
    - setup_wizard.py: add _yaml_escape() for safe YAML embedding of API keys
    - setup_wizard.py: add _styled_input() with readline ANSI ignore markers
    - setup_wizard.py: change DIVIDER to _divider() fn to avoid import-time color capture
    - setup_wizard.py: validate port range 1-65535, initialize before loop
    - setup_wizard.py: qualify azure display names (azure-gpt-4o) to avoid collision with openai
    - setup_wizard.py: work on env_copy in _build_config to avoid mutating caller's dict
    - setup_wizard.py: skip model_list entries for providers with no credentials
    - setup_wizard.py: prompt for azure deployment name
    - setup_wizard.py: wrap os.execlp in try/except with friendly fallback
    - setup_wizard.py: wrap config write in try/except OSError
    - setup_wizard.py: fix _validate_and_report to use two print lines (no \r overwrite)
    - setup_wizard.py: add .gitignore tip next to key storage notice
    - setup_wizard.py: fix run_setup_wizard() return type annotation to None
    - scripts/install.sh: drop pipefail (not supported by dash on Ubuntu when invoked as sh)
    - scripts/install.sh: use litellm[proxy] from PyPI (not hardcoded dev branch)
    - scripts/install.sh: guard /dev/tty read with -r check for Docker/CI compat
    - scripts/install.sh: remove --force-reinstall to avoid downgrading dependencies
    - tests/test_litellm/test_setup_wizard.py: 13 unit tests for _build_config and _yaml_escape
  * style: black format setup_wizard.py
  * fix: address remaining greptile issues - Windows compat, YAML quoting, credential flow
    - guard termios/tty imports with try/except ImportError for Windows compat
    - quote master_key as YAML double-quoted scalar (same as env vars)
    - remove unused port param from _build_config signature
    - _validate_and_report now returns the final key so re-entered creds are stored
    - add test for master_key YAML quoting
  * fix: add --port to suggested command, guard /dev/tty exec in install.sh
  * fix: quote api_base in YAML, skip azure if no deployment, only redraw on state change
  * fix: address greptile review comments
    - _yaml_escape: add control character escaping (\n, \r, \t)
    - test: fix tautological assertion in test_build_config_azure_no_deployment_skipped
    - test: add tests for control character escaping in _yaml_escape

* feat(ui): remove Chat UI page link and banner from sidebar and playground (#23908)

* feat(guardrails): MCPJWTSigner - built-in guardrail for zero trust MCP auth (#23897)
  * Allow pre_mcp_call guardrail hooks to mutate outbound MCP headers
  * Enhance MCPServerManager to support hook-modified arguments and extra headers. Update tests to validate argument mutation and header injection behavior, including warnings for OpenAPI-backed servers when headers are present.
  * Refactor MCPServerManager to raise HTTPException for extra headers in OpenAPI-backed servers. Update tests to reflect this change, ensuring proper exception handling instead of logging warnings.
  * Allow pre_mcp_call guardrail hooks to mutate outbound MCP headers
  * Enhance MCPServerManager to support hook-modified arguments and extra headers. Update tests to validate argument mutation and header injection behavior, including warnings for OpenAPI-backed servers when headers are present.
  * Refactor MCPServerManager to raise HTTPException for extra headers in OpenAPI-backed servers. Update tests to reflect this change, ensuring proper exception handling instead of logging warnings.
  * feat(guardrails): add MCPJWTSigner built-in guardrail for zero trust MCP auth

    Signs outbound MCP tool calls with a LiteLLM-issued RS256 JWT so MCP servers can trust a single signing authority instead of every upstream IdP.

    Enable in config.yaml:

    guardrails:
      - guardrail_name: mcp-jwt-signer
        litellm_params:
          guardrail: mcp_jwt_signer
          mode: pre_mcp_call
          default_on: true

    JWT carries sub (user_id), act.sub (team_id, RFC 8693), tool-level scope, iss, aud, iat/exp/nbf. RSA-2048 keypair auto-generated at startup unless MCP_JWT_SIGNING_KEY env var is set.

    Adds /.well-known/jwks.json endpoint and jwks_uri to /.well-known/openid-configuration so MCP servers can verify LiteLLM-issued tokens via OIDC discovery.
  * Update MCPServerManager to raise HTTPException with status code 400 for extra headers in OpenAPI-backed servers. Adjust tests to verify the correct status code and exception message.
  * fix: address P1 issues in MCPJWTSigner
  * docs: add MCP zero trust auth guide with architecture diagram
  * docs: add FastMCP JWT verification guide to zero trust doc
  * fix: address remaining Greptile review issues (round 2)
  * fix: address Greptile round 3 feedback
  * feat(mcp_jwt_signer): add verify+re-sign, claim ops, two-token model, configurable scopes
  * feat(mcp_jwt_signer): add verify+re-sign, claim ops, two-token model, configurable scopes (core)
  * fix(mcp_jwt_signer): address pre-landing review issues
  * docs(mcp_zero_trust): rewrite as use-case guide covering all new JWT signer features
  * fix(mcp_jwt_signer): wire all config.yaml params through initialize_guardrail
  * docs(mcp_zero_trust): add hero image
  * docs(mcp_zero_trust): apply Linear-style edits
  * fix(mcp_jwt_signer): fix algorithm confusion attack + add OIDC discovery 24h TTL

  ---------

  Co-authored-by: Noah Nistler <60981020+noahnistler@users.noreply.github.com>

* fix(ci): stabilize CI - formatting, type errors, test polling, security CVEs, router bug, batch resolution

  Fix 1: Run Black formatter on 35 files

  Fix 2: Fix MyPy type errors:
  - setup_wizard.py: add type annotation for 'selected' set variable
  - user_api_key_auth.py: remove redundant type annotation on jwt_claims reassignment

  Fix 3: Fix spend accuracy test burst 2 polling to wait for expected total spend instead of just 'any increase' from burst 2

  Fix 4: Bump Next.js 16.1.6 -> 16.1.7 to fix CVE-2026-27978, CVE-2026-27979, CVE-2026-27980, CVE-2026-29057

  Fix 5: Fix router _pre_call_checks model variable being overwritten inside loop, causing wrong model lookups on subsequent deployments. Use local _deployment_model variable instead.

  Fix 6: Add missing resolve_output_file_ids_to_unified call in batch retrieve non-terminal-to-terminal path (matching the terminal path behavior)

  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* chore: regenerate poetry.lock to sync with pyproject.toml

  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: format merged files from main and regenerate poetry.lock

  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(mypy): annotate jwt_claims as Optional[dict] to fix type incompatibility

  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(ci): update router region test to use gpt-4.1-mini (fix flaky model lookup)

  Replace deprecated gpt-3.5-turbo-1106 with gpt-4.1-mini + mock_response in test_router_region_pre_call_check, following the same pattern used in commit 717d37c for test_router_context_window_check_pre_call_check_out_group.

  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* ci: retry flaky logging_testing (async event loop race condition)

  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(ci): aggregate all mock calls in langfuse e2e test to fix race condition

  The _verify_langfuse_call helper only inspected the last mock call (mock_post.call_args), but the Langfuse SDK may split trace-create and generation-create events across separate HTTP flush cycles. This caused an IndexError when the last call's batch contained only one event type.

  Fix: iterate over mock_post.call_args_list to collect batch items from ALL calls. Also add a safety assertion after filtering by trace_id and mark all langfuse e2e tests with @pytest.mark.flaky(retries=3) as an extra safety net for any residual timing issues.

  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(ci): black formatting + update OpenAPI compliance tests for spec changes
  - Apply Black 26.x formatting to litellm_logging.py (parenthesized style)
  - Update test_input_types_match_spec to follow $ref to InteractionsInput schema (Google updated their OpenAPI spec to use $ref instead of inline oneOf)
  - Update test_content_schema_uses_discriminator to handle discriminator without explicit mapping (Google removed the mapping key from Content discriminator)

  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* revert: undo incorrect Black 26.x formatting on litellm_logging.py

  The file was correctly formatted for Black 23.12.1 (the version pinned in pyproject.toml). The previous commit applied Black 26.x formatting which was incompatible with the CI's Black version.

  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(ci): deduplicate and sort langfuse batch events after aggregation

  The Langfuse SDK may send the same event (e.g., trace-create) in multiple flush cycles, causing duplicates when we aggregate from all mock calls. After filtering by trace_id, deduplicate by keeping only the first event of each type, then sort to ensure trace-create is at index 0 and generation-create at index 1.

  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

---------

Co-authored-by: Noah Nistler <60981020+noahnistler@users.noreply.github.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

- Add xai/grok-4.20-beta-0309-reasoning (3rd xAI model, was missing)
- Update New Model count 11 → 12
- Fix supports_minimal_reasoning_effort description (full gpt-5.x series)
- Add Akto guardrail integration (BerriAI#23250)
- Add MCP JWT Signer guardrail (BerriAI#23897)
- Add pre_mcp_call header mutation (BerriAI#23889)
- Add litellm --setup wizard (BerriAI#23644)
- Fix ### Bug Fixes → #### Bugs under New Models
- Add missing Documentation Updates section
- Rename Diff Summary "AI Integrations" → "Logging / Guardrail / Prompt Management Integrations"

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

noahnistler and others added 7 commits March 17, 2026 14:31

Allow pre_mcp_call guardrail hooks to mutate outbound MCP headers

89a43cf

Enhance MCPServerManager to support hook-modified arguments and extra…

94da7e6

… headers. Update tests to validate argument mutation and header injection behavior, including warnings for OpenAPI-backed servers when headers are present.

Refactor MCPServerManager to raise HTTPException for extra headers in…

4af352a

… OpenAPI-backed servers. Update tests to reflect this change, ensuring proper exception handling instead of logging warnings.

Allow pre_mcp_call guardrail hooks to mutate outbound MCP headers

f612533

Enhance MCPServerManager to support hook-modified arguments and extra…

8574094

… headers. Update tests to validate argument mutation and header injection behavior, including warnings for OpenAPI-backed servers when headers are present.

Refactor MCPServerManager to raise HTTPException for extra headers in…

296f382

… OpenAPI-backed servers. Update tests to reflect this change, ensuring proper exception handling instead of logging warnings.

vercel bot deployed to Preview March 17, 2026 21:04 View deployment

Update MCPServerManager to raise HTTPException with status code 400 f…

9f443a3

…or extra headers in OpenAPI-backed servers. Adjust tests to verify the correct status code and exception message.

greptile-apps bot reviewed Mar 17, 2026

fix: address P1 issues in MCPJWTSigner

8574543

- OpenAPI servers: warn + skip header injection instead of 500
- JWKS Cache-Control: 5min for auto-generated keys, 1h for persistent
- sub claim: fallback to apikey:{token_hash} for anonymous callers
- ttl_seconds: validate > 0 at init time
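Three of these P1 fixes are small enough to sketch together. All function names below are hypothetical, and the exact Cache-Control string (including the public directive) is an assumption; only the 5 minute / 1 hour lifetimes, the apikey:{token_hash} fallback, and the ttl_seconds > 0 rule come from the commit message.

```python
from typing import Optional


def jwks_cache_control(key_is_persistent: bool) -> str:
    """Short cache for auto-generated keys, longer for a configured key."""
    max_age = 3600 if key_is_persistent else 300
    return f"public, max-age={max_age}"


def resolve_sub(user_id: Optional[str], token_hash: str) -> str:
    """Fall back to an API-key-derived subject for anonymous callers."""
    return user_id if user_id else f"apikey:{token_hash}"


def validate_ttl(ttl_seconds: int) -> int:
    """Reject non-positive lifetimes at init time rather than at signing time."""
    if ttl_seconds <= 0:
        raise ValueError(f"ttl_seconds must be > 0, got {ttl_seconds}")
    return ttl_seconds
```

The shorter lifetime for auto-generated keys fits the fact that such a key is regenerated at startup, so verifiers should not hold on to it for long.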

docs: add MCP zero trust auth guide with architecture diagram

bb6a9aa

vercel bot deployed to Preview March 17, 2026 21:18 View deployment

greptile-apps bot reviewed Mar 17, 2026

merge: resolve conflicts from PR #23889

5253b6c

Keep warn+skip behavior for OpenAPI servers (not 400 raise). Both test suites pass (45 tests).

vercel bot deployed to Preview March 17, 2026 21:32 View deployment

greptile-apps bot reviewed Mar 17, 2026

docs: add FastMCP JWT verification guide to zero trust doc

18fc306

vercel bot deployed to Preview March 17, 2026 21:38 View deployment

vercel bot deployed to Preview March 17, 2026 21:46 View deployment

vercel bot deployed to Preview March 17, 2026 21:58 View deployment

ishaan-jaff mentioned this pull request Mar 17, 2026

feat(guardrails): MCPJWTSigner - complete zero trust MCP JWT signing (FR-5/9/10/12/13/14/15) #23912

Closed

3 tasks

ishaan-jaff added 2 commits March 17, 2026 15:48

vercel bot deployed to Preview March 17, 2026 22:50 View deployment

vercel bot deployed to Preview March 18, 2026 00:08 View deployment

ishaan-jaff changed the base branch from main to litellm_ishaan_march_17 March 18, 2026 00:09

vercel bot deployed to Preview March 18, 2026 00:17 View deployment

vercel bot deployed to Preview March 18, 2026 00:20 View deployment

ishaan-jaff added 3 commits March 17, 2026 17:23

docs(mcp_zero_trust): add hero image

a906052

vercel bot had a problem deploying to Preview March 18, 2026 00:29 Failure

ishaan-jaff merged commit d9a6036 into litellm_ishaan_march_17 Mar 18, 2026
3 of 5 checks passed

ishaan-jaff mentioned this pull request Mar 18, 2026

docs(mcp_zero_trust): add MCP zero trust auth guide #23918

Merged

5 tasks

joereyna mentioned this pull request Mar 24, 2026

docs(release-notes): add v1.82.6.rc.1 release notes #24452

Open

3 tasks


		from litellm.types.guardrails import SupportedGuardrailIntegrations

		from .mcp_jwt_signer import MCPJWTSigner, _mcp_jwt_signer_instance, get_mcp_jwt_signer

-from .mcp_jwt_signer import MCPJWTSigner, _mcp_jwt_signer_instance, get_mcp_jwt_signer
+__all__ = [
+    "MCPJWTSigner",
+    "initialize_guardrail",
+    "get_mcp_jwt_signer",
+]
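To see what listing a name in __all__ changes, the sketch below builds a throwaway in-memory module and star-imports from it. The module name sketch_signer_pkg and the stub attributes are hypothetical; the point is only that a leading-underscore name is exported by a star import when, and only when, __all__ lists it.

```python
import sys
import types
from typing import Iterable, List


def star_exports(all_names: Iterable[str]) -> List[str]:
    """Return the names a star import would bind for a given __all__."""
    module = types.ModuleType("sketch_signer_pkg")  # hypothetical stand-in package
    module.MCPJWTSigner = type("MCPJWTSigner", (), {})
    module.initialize_guardrail = lambda: None
    module.get_mcp_jwt_signer = lambda: None
    module._mcp_jwt_signer_instance = None
    module.__all__ = list(all_names)
    namespace: dict = {}
    sys.modules[module.__name__] = module
    try:
        exec("from sketch_signer_pkg import *", namespace)
    finally:
        del sys.modules[module.__name__]  # leave no trace of the fake module
    return sorted(name for name in namespace if name != "__builtins__")


public_only = star_exports(["MCPJWTSigner", "initialize_guardrail", "get_mcp_jwt_signer"])
with_private = star_exports(
    ["MCPJWTSigner", "initialize_guardrail", "get_mcp_jwt_signer", "_mcp_jwt_signer_instance"]
)
```

Here with_private contains _mcp_jwt_signer_instance while public_only does not, which is why dropping the private singleton from __all__ keeps it out of the package's advertised surface.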

codspeed-hq bot commented Mar 17, 2026

Merging this PR will not alter performance

greptile-apps bot commented Mar 17, 2026

Greptile Summary

Confidence Score: 3/5

Comments Outside Diff (4)

2 participants
