
Bitropy patches on v1.82.3-stable: passthrough credential priority + header security#1

Closed
pkieszcz wants to merge 14 commits into main from bitropy/v1.82.3-stable-patched

Conversation

@pkieszcz

What this is

2 patches on top of upstream v1.82.3-stable that fix the Anthropic passthrough endpoint for Claude Code Max (OAuth) and BYOK scenarios.

Send this link to anyone who needs context. The diff tab shows exactly what we changed.

Full documentation: PASSTHROUGH_PATCHES.md

Patches

1. Strip x-litellm-api-key from forwarded headers (security)

File: litellm/passthrough/utils.py (+1 line)

Without this, the x-litellm-api-key proxy auth header is forwarded to Anthropic, leaking credentials.

Upstream PR: BerriAI#20432 (open, unreviewed since Feb 2026)
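The header-stripping behavior can be sketched in a few lines. This is an illustrative sketch, not the literal upstream patch: the real one-line change lives in `litellm/passthrough/utils.py`, and the name `forward_safe_headers` and the exact filter set are assumptions.

```python
# Hypothetical sketch of the fix; the actual patch in litellm/passthrough/utils.py
# may filter a different set of names.
STRIPPED_HEADERS = {"x-litellm-api-key"}  # proxy auth header that must never leave the proxy

def forward_safe_headers(incoming: dict) -> dict:
    """Drop proxy-auth headers before forwarding a passthrough request upstream."""
    return {k: v for k, v in incoming.items() if k.lower() not in STRIPPED_HEADERS}

headers = {"x-litellm-api-key": "sk-virtual", "x-api-key": "sk-ant-real"}
print(forward_safe_headers(headers))  # only the Anthropic credential survives
```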

2. Credential priority for Anthropic passthrough (critical)

File: litellm/proxy/pass_through_endpoints/llm_passthrough_endpoints.py (+19/-6 lines)

The upstream code always sends x-api-key: "{}".format(anthropic_api_key) as a custom header, which:

  • Sends literal x-api-key: None when no server ANTHROPIC_API_KEY is set
  • Overwrites client-provided credentials (OAuth tokens, BYOK keys) even when server key IS set

New behavior (credential priority):

| Client sends | Server has key | Result |
| --- | --- | --- |
| OAuth token (Claude Max) | Any | Client pays (Max subscription) |
| Own `x-api-key` (BYOK) | Any | Client pays (own API key) |
| Nothing | Yes | Company pays (server key) |
| Nothing | No | No credentials (Anthropic rejects) |
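The priority order above can be sketched as follows. This is an illustrative Python sketch of the decision rule, not the literal diff in `llm_passthrough_endpoints.py`; the function name and exact header handling are assumptions.

```python
from typing import Optional

def resolve_anthropic_credentials(
    client_headers: dict, server_api_key: Optional[str]
) -> dict:
    """Sketch of the patched priority: client credentials win over the server key."""
    headers = {k.lower(): v for k, v in client_headers.items()}
    if "x-api-key" in headers:        # BYOK: client pays with own API key
        return {"x-api-key": headers["x-api-key"]}
    if "authorization" in headers:    # OAuth (Claude Code Max): client pays
        return {"authorization": headers["authorization"]}
    if server_api_key:                # fallback: company pays via server key
        return {"x-api-key": server_api_key}
    return {}                         # nothing: forward no credentials, Anthropic rejects
```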

Testing

Tested locally on v1.82.4 (main-stable) with header inspection (request catcher) and real Anthropic API:

  • ✅ Claude Code Max (OAuth) passthrough
  • ✅ BYOK (client x-api-key) passthrough
  • ✅ Mixed mode (server key fallback)
  • ✅ Auth: wrong/missing litellm key rejected
  • ✅ x-litellm-api-key stripped from upstream requests
  • ✅ Spend tracking for streaming passthrough (v1.82.x fix)

Deploy

Image: europe-central2-docker.pkg.dev/bitropy-management/images/litellm:v1.82.3-stable-patched

Claude Code config

```json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://ai.demo.internal.bitropy.io/anthropic",
    "ANTHROPIC_CUSTOM_HEADERS": "x-litellm-api-key: sk-<your-virtual-key>"
  }
}
```

⚠️ Use ANTHROPIC_CUSTOM_HEADERS, NOT ANTHROPIC_API_KEY. See PASSTHROUGH_PATCHES.md for why.

🤖 Generated with Claude Code

cursoragent and others added 14 commits March 1, 2026 00:17
The test was missing mocks for extract_mcp_auth_context and set_auth_context,
causing the handler to fail silently in the except block instead of reaching
session_manager.handle_request. This mirrors the fix already applied to the
sibling test_sse_mcp_handler_mock.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
… tests

The test_anthropic_messages_openai_model_streaming_cost_injection test fails
because the OpenAI Responses API returns 400 for requests routed through the
Anthropic Messages endpoint. Setting LITELLM_USE_CHAT_COMPLETIONS_URL_FOR_ANTHROPIC_MESSAGES=true
routes OpenAI models through the stable chat completions path instead.
Cost injection still works since it happens at the proxy level.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
1. custom_auth_basic.py: Add user_role='proxy_admin' so the custom auth
   user can access management endpoints like /key/generate. The test
   test_assemblyai_transcribe_with_non_admin_key was hidden behind an
   earlier -x failure and was never reached before.

2. test_router_utils.py: Add flaky(retries=3) and increase sleep from 1s
   to 2s for test_router_get_model_group_usage_wildcard_routes. The async
   callback needs time to write usage to cache, and 1s is insufficient on
   slower CI hardware.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
…auth_basic

Fixes mypy error: Argument 'user_role' has incompatible type 'str'; expected 'LitellmUserRoles | None'

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
…#22926)

* fix: don't close HTTP/SDK clients on LLMClientCache eviction

Remove the _remove_key override that eagerly called aclose()/close()
on evicted clients. Evicted clients may still be held by in-flight
streaming requests; closing them causes:

  RuntimeError: Cannot send a request, as the client has been closed.

This is a regression from commit fb72979. Clients that are no longer
referenced will be garbage-collected naturally. Explicit shutdown cleanup
happens via close_litellm_async_clients().

Fixes production crashes after the 1-hour cache TTL expires.

* test: update LLMClientCache unit tests for no-close-on-eviction behavior

Flip the assertions: evicted clients must NOT be closed. Replace
test_remove_key_closes_async_client → test_remove_key_does_not_close_async_client
and equivalents for sync/eviction paths.

Add test_remove_key_removes_plain_values for non-client cache entries.
Remove test_background_tasks_cleaned_up_after_completion (no more _background_tasks).
Remove test_remove_key_no_event_loop variant that depended on old behavior.

* test: add e2e tests for OpenAI SDK client surviving cache eviction

Add two new e2e tests using real AsyncOpenAI clients:
- test_evicted_openai_sdk_client_stays_usable: verifies size-based eviction
  doesn't close the client
- test_ttl_expired_openai_sdk_client_stays_usable: verifies TTL expiry
  eviction doesn't close the client

Both tests sleep after eviction so any create_task()-based close would
have time to run, making the regression detectable.

Also expand the module docstring to explain why the sleep is required.

* docs(AGENTS.md): add rule — never close HTTP/SDK clients on cache eviction

* docs(CLAUDE.md): add HTTP client cache safety guideline
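The eviction hazard described in this commit can be reproduced with a minimal sketch. This is an assumed stand-in, not LiteLLM's actual `LLMClientCache`: eviction drops the cache's reference but must not close the client, because a caller that fetched it earlier may still be streaming with it.

```python
# Minimal sketch (assumed, not LiteLLM's real cache) of eviction without close().
from collections import OrderedDict

class ClientCache:
    def __init__(self, max_size: int = 2):
        self._store = OrderedDict()
        self._max_size = max_size

    def set(self, key: str, client: object) -> None:
        self._store[key] = client
        if len(self._store) > self._max_size:
            # Evict the oldest entry WITHOUT calling close(): in-flight
            # requests may still hold a reference to the evicted client.
            self._store.popitem(last=False)

class FakeClient:
    closed = False  # a real client would flip this in close()/aclose()

cache = ClientCache(max_size=1)
held = FakeClient()
cache.set("a", held)
cache.set("b", FakeClient())  # evicts "a", but leaves `held` usable
print(held.closed)            # False: the evicted client was not closed
```

Unreferenced evicted clients are then reclaimed by garbage collection, matching the commit's rationale.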
The security_scans.sh script uses `column` to format vulnerability
output, but the package wasn't installed in the CI environment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When callbacks are configured as a plain string (e.g., `callbacks: "my_callback"`)
instead of a list, the proxy crashes on startup with:
  TypeError: can only concatenate str (not "list") to str

Normalize each callback setting to a list before concatenating.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
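The normalization this commit describes amounts to coercing the setting to a list before any concatenation. A minimal sketch, assuming a helper name (`normalize_callbacks`) that is not necessarily what the patch uses:

```python
def normalize_callbacks(value):
    """Accept `callbacks: "my_callback"` (a plain string) as well as a list,
    so later list concatenation cannot raise TypeError."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)

print(normalize_callbacks("my_callback") + ["langfuse"])  # concatenation now works
```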
…nforcement

The --enforce_prisma_migration_check flag is now required to trigger
sys.exit(1) on DB migration failure, after BerriAI#23675 flipped the default
behavior to warn-and-continue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…hen router_model_id has no pricing

When custom pricing is passed as per-request kwargs (input_cost_per_token/output_cost_per_token),
completion() registers pricing under the model name, but _select_model_name_for_cost_calc was
selecting the router deployment hash (which has no pricing data), causing response_cost to be 0.0.

Now checks whether the router_model_id entry actually has pricing before preferring it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
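The selection rule this commit describes can be sketched as follows. Names like `pricing_map` and `select_model_for_cost_calc` are hypothetical stand-ins, not LiteLLM's internal API; the point is the guard: prefer the router deployment id only when it actually carries pricing.

```python
# Hypothetical sketch of the cost-calc model selection described above.
pricing_map = {
    # per-request custom pricing was registered under the model name:
    "gpt-4o-custom": {"input_cost_per_token": 1e-6, "output_cost_per_token": 2e-6},
    # the router deployment hash exists but has no pricing data:
    "deployment-hash-abc123": {},
}

def select_model_for_cost_calc(router_model_id: str, model_name: str) -> str:
    """Prefer router_model_id only when it actually has pricing; otherwise
    fall back to the model name, so response_cost is not computed as 0.0."""
    if pricing_map.get(router_model_id):  # empty dict means no pricing registered
        return router_model_id
    return model_name

print(select_model_for_cost_calc("deployment-hash-abc123", "gpt-4o-custom"))
# falls back to the entry that carries the per-request custom pricing
```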
…ream providers

Prevent x-litellm-api-key (LiteLLM's virtual key) from being leaked
to upstream providers when _forward_headers=True is used in passthrough
endpoints.
Client-provided credentials now take precedence over server credentials
in the /anthropic/ passthrough endpoint. This enables mixed mode where:

1. Client sends x-api-key → forwarded as-is (user pays via own API key)
2. Client sends Authorization → forwarded as-is (user pays via OAuth/Max)
3. No client credentials + server ANTHROPIC_API_KEY → server key used
4. No client credentials + no server key → no credentials forwarded

Previously the server always sent x-api-key (even literal "None" when
unconfigured), overwriting any client-provided credentials and breaking
Claude Code Max (OAuth) and BYOK scenarios.

Supersedes the simpler one-liner from d742c76 on v1.81.12-stable-patched.
Based on the approach from PR BerriAI#20429 (closed) and reverted PR BerriAI#14821.
