Bitropy patches on v1.82.3-stable: passthrough credential priority + header security#1
Closed
The test was missing mocks for extract_mcp_auth_context and set_auth_context, causing the handler to fail silently in the except block instead of reaching session_manager.handle_request. This mirrors the fix already applied to the sibling test_sse_mcp_handler_mock. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
… tests The test_anthropic_messages_openai_model_streaming_cost_injection test fails because the OpenAI Responses API returns 400 for requests routed through the Anthropic Messages endpoint. Setting LITELLM_USE_CHAT_COMPLETIONS_URL_FOR_ANTHROPIC_MESSAGES=true routes OpenAI models through the stable chat completions path instead. Cost injection still works since it happens at the proxy level. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
1. custom_auth_basic.py: Add user_role='proxy_admin' so the custom auth user can access management endpoints like /key/generate. The test test_assemblyai_transcribe_with_non_admin_key was hidden behind an earlier -x failure and was never reached before. 2. test_router_utils.py: Add flaky(retries=3) and increase sleep from 1s to 2s for test_router_get_model_group_usage_wildcard_routes. The async callback needs time to write usage to cache, and 1s is insufficient on slower CI hardware. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
…auth_basic Fixes mypy error: Argument 'user_role' has incompatible type 'str'; expected 'LitellmUserRoles | None' Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
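The mypy fix amounts to passing the enum member instead of a raw string. A minimal sketch (the enum shape here is an assumption for illustration, not LiteLLM's actual definition):

```python
from enum import Enum

# Assumed minimal shape of litellm's LitellmUserRoles enum, for illustration only.
class LitellmUserRoles(str, Enum):
    PROXY_ADMIN = "proxy_admin"

# Before (mypy error: str is incompatible with LitellmUserRoles | None):
#   user_role = "proxy_admin"
# After: pass the enum member; the str mixin keeps it serializing identically.
user_role = LitellmUserRoles.PROXY_ADMIN
assert user_role == "proxy_admin"  # str-backed enum compares equal to its value
```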
…#22926)

* fix: don't close HTTP/SDK clients on LLMClientCache eviction

  Remove the _remove_key override that eagerly called aclose()/close() on evicted clients. Evicted clients may still be held by in-flight streaming requests; closing them causes: RuntimeError: Cannot send a request, as the client has been closed. This is a regression from commit fb72979. Clients that are no longer referenced will be garbage-collected naturally. Explicit shutdown cleanup happens via close_litellm_async_clients(). Fixes production crashes after the 1-hour cache TTL expires.

* test: update LLMClientCache unit tests for no-close-on-eviction behavior

  Flip the assertions: evicted clients must NOT be closed. Replace test_remove_key_closes_async_client → test_remove_key_does_not_close_async_client and equivalents for sync/eviction paths. Add test_remove_key_removes_plain_values for non-client cache entries. Remove test_background_tasks_cleaned_up_after_completion (no more _background_tasks). Remove the test_remove_key_no_event_loop variant that depended on the old behavior.

* test: add e2e tests for OpenAI SDK client surviving cache eviction

  Add two new e2e tests using real AsyncOpenAI clients:
  - test_evicted_openai_sdk_client_stays_usable: verifies size-based eviction doesn't close the client
  - test_ttl_expired_openai_sdk_client_stays_usable: verifies TTL-expiry eviction doesn't close the client

  Both tests sleep after eviction so any create_task()-based close would have time to run, making the regression detectable. Also expand the module docstring to explain why the sleep is required.

* docs(AGENTS.md): add rule — never close HTTP/SDK clients on cache eviction

* docs(CLAUDE.md): add HTTP client cache safety guideline
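The eviction behavior described above can be sketched with a minimal cache that drops its reference on eviction but never closes the client (class names are illustrative, not LiteLLM's actual implementation):

```python
from collections import OrderedDict

class FakeAsyncClient:
    """Stand-in for an SDK/HTTP client; records whether close() was called."""
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True

class ClientCache:
    """LRU-style cache that evicts by dropping references only.

    Evicted clients may still be used by in-flight streaming requests,
    so eviction must NOT call close(); unreferenced clients are simply
    garbage-collected once the last holder releases them.
    """
    def __init__(self, max_size=2):
        self.max_size = max_size
        self._store = OrderedDict()

    def set(self, key, client):
        self._store[key] = client
        self._store.move_to_end(key)
        while len(self._store) > self.max_size:
            self._store.popitem(last=False)  # drop the reference, no close()

    def get(self, key):
        return self._store.get(key)

cache = ClientCache(max_size=1)
held = FakeAsyncClient()
cache.set("a", held)                # caller still holds `held` (in-flight request)
cache.set("b", FakeAsyncClient())   # evicts "a"
assert cache.get("a") is None       # gone from the cache...
assert held.closed is False         # ...but the evicted client remains usable
```

Calling `close()` inside eviction is exactly what produced the "Cannot send a request, as the client has been closed" crashes once the 1-hour TTL expired mid-stream.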
The security_scans.sh script uses `column` to format vulnerability output, but the package wasn't installed in the CI environment. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When callbacks are configured as a plain string (e.g., `callbacks: "my_callback"`) instead of a list, the proxy crashes on startup with: TypeError: can only concatenate str (not "list") to str Normalize each callback setting to a list before concatenating. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
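A minimal sketch of the normalization (the helper name is hypothetical; the real fix lives in the proxy's config loading):

```python
def normalize_callbacks(setting):
    """Accept either "my_callback" or ["cb1", "cb2"] and always return a list.

    Prevents `TypeError: can only concatenate str (not "list") to str`
    when callback settings are concatenated later during startup.
    """
    if setting is None:
        return []
    if isinstance(setting, str):
        return [setting]
    return list(setting)

combined = normalize_callbacks("my_callback") + normalize_callbacks(["other_cb"])
assert combined == ["my_callback", "other_cb"]
```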
…nforcement The --enforce_prisma_migration_check flag is now required to trigger sys.exit(1) on DB migration failure, after BerriAI#23675 flipped the default behavior to warn-and-continue. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…hen router_model_id has no pricing When custom pricing is passed as per-request kwargs (input_cost_per_token/output_cost_per_token), completion() registers pricing under the model name, but _select_model_name_for_cost_calc was selecting the router deployment hash (which has no pricing data), causing response_cost to be 0.0. Now checks whether the router_model_id entry actually has pricing before preferring it. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
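The selection logic can be sketched as: prefer the router deployment id only when its registered entry actually carries pricing. Function and registry shapes below are hypothetical stand-ins for the real `_select_model_name_for_cost_calc` internals:

```python
# Hypothetical pricing registry: model name -> pricing entry.
MODEL_COST = {
    "my-custom-model": {"input_cost_per_token": 1e-6, "output_cost_per_token": 2e-6},
    "deployment-hash-abc123": {},  # router id registered without pricing data
}

def has_pricing(entry):
    """True only if the entry carries at least one per-token cost."""
    return bool(entry) and (
        entry.get("input_cost_per_token") is not None
        or entry.get("output_cost_per_token") is not None
    )

def select_model_name_for_cost_calc(router_model_id, model_name):
    """Prefer the router deployment id only if it has pricing registered."""
    if router_model_id and has_pricing(MODEL_COST.get(router_model_id)):
        return router_model_id
    return model_name

# Before the fix, the bare deployment hash was chosen and response_cost was 0.0.
chosen = select_model_name_for_cost_calc("deployment-hash-abc123", "my-custom-model")
assert chosen == "my-custom-model"
```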
…ream providers Prevent x-litellm-api-key (LiteLLM's virtual key) from being leaked to upstream providers when _forward_headers=True is used in passthrough endpoints.
Client-provided credentials now take precedence over server credentials in the /anthropic/ passthrough endpoint. This enables mixed mode where: 1. Client sends x-api-key → forwarded as-is (user pays via own API key) 2. Client sends Authorization → forwarded as-is (user pays via OAuth/Max) 3. No client credentials + server ANTHROPIC_API_KEY → server key used 4. No client credentials + no server key → no credentials forwarded Previously the server always sent x-api-key (even literal "None" when unconfigured), overwriting any client-provided credentials and breaking Claude Code Max (OAuth) and BYOK scenarios. Supersedes the simpler one-liner from d742c76 on v1.81.12-stable-patched. Based on the approach from PR BerriAI#20429 (closed) and reverted PR BerriAI#14821.
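The four-step priority order above can be sketched as a pure function over incoming headers (names are illustrative, not the actual LiteLLM passthrough code):

```python
def resolve_anthropic_credentials(client_headers, server_api_key):
    """Return the credential headers to forward upstream.

    Priority: client x-api-key > client Authorization > server key > none.
    """
    headers = {k.lower(): v for k, v in client_headers.items()}
    if "x-api-key" in headers:
        return {"x-api-key": headers["x-api-key"]}          # 1. BYOK
    if "authorization" in headers:
        return {"authorization": headers["authorization"]}  # 2. OAuth / Max
    if server_api_key:
        return {"x-api-key": server_api_key}                # 3. server-configured key
    return {}                                               # 4. never send a literal "None"

assert resolve_anthropic_credentials({"x-api-key": "sk-user"}, "sk-server") == {"x-api-key": "sk-user"}
assert resolve_anthropic_credentials({}, None) == {}
```

The key design change from the previous behavior is step 4: when nothing is configured, no credential header is emitted at all, instead of `x-api-key: None`.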
What this is
2 patches on top of upstream `v1.82.3-stable` that fix the Anthropic passthrough endpoint for Claude Code Max (OAuth) and BYOK scenarios.

Full documentation: PASSTHROUGH_PATCHES.md

Patches
1. Strip x-litellm-api-key from forwarded headers (security)
File: `litellm/passthrough/utils.py` (+1 line)

Without this, the `x-litellm-api-key` proxy auth header is forwarded to Anthropic, leaking credentials.

Upstream PR: BerriAI#20432 (open, unreviewed since Feb 2026)
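The patch amounts to filtering the proxy's own auth header out of the forwarded set. A one-line sketch (the exact upstream code may differ):

```python
def filter_forward_headers(request_headers):
    """Drop LiteLLM's virtual-key header before forwarding upstream.

    Without this, x-litellm-api-key leaks to the provider whenever
    _forward_headers=True is used on a passthrough endpoint.
    """
    return {
        k: v for k, v in request_headers.items()
        if k.lower() != "x-litellm-api-key"
    }

fwd = filter_forward_headers(
    {"x-litellm-api-key": "sk-virtual", "accept": "application/json"}
)
assert "x-litellm-api-key" not in fwd
```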
2. Credential priority for Anthropic passthrough (critical)
File: `litellm/proxy/pass_through_endpoints/llm_passthrough_endpoints.py` (+19/-6 lines)

The upstream code always sends `x-api-key: "{}".format(anthropic_api_key)` as a custom header, which sends a literal `x-api-key: None` when no server `ANTHROPIC_API_KEY` is set and overwrites any client-provided credentials.

New behavior (credential priority):

1. Client sends `x-api-key` → forwarded as-is (user pays via own API key)
2. Client sends `Authorization` → forwarded as-is (user pays via OAuth/Max)
3. No client credentials + server `ANTHROPIC_API_KEY` → server key used
4. No client credentials + no server key → no credentials forwarded
Background
The `/anthropic/` route has auth included — no premium license or custom `pass_through_endpoints` config needed.

Testing
Tested locally on v1.82.4 (main-stable) with header inspection (request catcher) and against the real Anthropic API.
Deploy
Claude Code config
```json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://ai.demo.internal.bitropy.io/anthropic",
    "ANTHROPIC_CUSTOM_HEADERS": "x-litellm-api-key: sk-<your-virtual-key>"
  }
}
```

🤖 Generated with Claude Code