LiteLLM update-0218: rebase on upstream + fixes by Quentin-M · Pull Request #1 · Quentin-M/litellm

Quentin-M · 2026-02-18T15:32:02Z

Summary

Rebase on BerriAI/litellm main (Feb 18, 2026) with the following fixes:

Websearch Interception

Cherry-pick updated PR Fix websearch interception with extended thinking mode support BerriAI/litellm#20488 — thinking block preservation through websearch agentic loop
Load api_key/api_base from router's search_tools config (fixes "TAVILY_API_KEY is not set")
Auto-adjust max_tokens when <= thinking.budget_tokens (Anthropic requires max_tokens > budget_tokens)

Bedrock

Centralized beta header filtering with version-based support (replaces inconsistent per-API filtering)
Fix version extraction regex in beta headers config
Remove invalid :0 suffix from Claude Opus 4.6 model ID
Strip context_management from request body for all Bedrock APIs (Invoke Messages, Invoke Chat, Converse)

Thinking

Drop thinking param when assistant messages have text without thinking blocks
Recognize adaptive thinking type in is_thinking_enabled (Opus 4.6)

Test Plan

All websearch interception tests pass (51 passed)
All bedrock beta header tests pass (69 passed)
All thinking tests pass (8 passed)
Docker image build + deploy

🤖 Generated with Claude Code

…tection (BerriAI#21342) * Add 6 new EU PII patterns for GDPR compliance - fr_nir: French Social Security Number (NIR/INSEE) with validation - eu_iban_enhanced: Enhanced IBAN detection with specific format - fr_phone: French phone numbers (+33, 0033, 0 formats) - eu_vat: EU VAT identification numbers (all 27 member states) - eu_passport_generic: Generic EU passport format - fr_postal_code: French postal codes with contextual keywords * Add GDPR Art. 32 EU PII Protection policy template - Comprehensive GDPR Article 32 compliance policy - 4 guardrail groups: National IDs, Financial, Contact Info, Business IDs - Masks French NIR/INSEE, EU IBANs, French phones, EU VAT numbers - Includes EU passport numbers and email addresses - Medium complexity template with indigo icon * Add comprehensive tests for EU PII patterns - Test French NIR validation (sex digit, month range) - Test enhanced IBAN detection (French, German) - Test French phone number formats - Test EU VAT numbers - Test generic EU passport format - Test French postal code pattern * Add EU pattern loading and category validation tests - Verify all 6 EU PII patterns are loaded correctly - Verify patterns are categorized as 'EU PII Patterns' - Ensure pattern loading consistency * Add end-to-end tests for GDPR policy template - 4 tests for PII that should be masked (NIR, IBAN, phone, VAT) - 4 tests for text that should pass through (invalid patterns, no PII) - 1 bonus test for multiple PII types in same message - All tests verify correct masking behavior * Add region field to policy templates - Added region field to all 6 templates (EU, AU, Global) - Updated both main and backup JSON files - Enables region-based filtering in UI * Add region filter to policy templates UI - Added Radio.Group filter for regions (All, AU, EU, Global) - Efficient filtering with useMemo hooks - Clean button-based UI matching existing design - Defaults missing regions to Global * feat: add EU AI Act Article 5 policy template Add policy template for detecting EU AI Act Article 5 prohibited practices using conditional keyword matching. Coverage: - Article 5.1.c: Social scoring systems - Article 5.1.f: Emotion recognition in workplace/education - Article 5.1.h: Biometric categorization of protected characteristics - Article 5.1.a: Harmful manipulation techniques - Article 5.1.b: Vulnerability exploitation Implementation: - Uses proven conditional matching pattern (identifier + block words) - 10 always-block keywords for explicit violations - 8 exceptions for research/compliance/entertainment - Zero cost (<5ms), no external APIs, 100% private * feat: add EU AI Act guardrail config example Example configuration showing how to enable EU AI Act Article 5 guardrail. * test: add 40 test cases for EU AI Act Article 5 Comprehensive test coverage: - 10 always-block keywords (explicit violations) - 15 conditional matches (identifier + block word) - 8 exceptions (research, compliance, entertainment) - 7 no-match cases (legitimate uses) Tests validate correct blocking/allowing behavior for Article 5 prohibited practices. * Fix: support standalone conditional matching without inherit_from - Updated loading logic to activate conditional matching when either: 1. identifier_words + inherit_from (existing pattern) 2. identifier_words + additional_block_words (new standalone pattern) - Modified _load_conditional_category to handle standalone templates - EU AI Act template now works properly without inherit_from - All 45 tests passing Fixes Greptile feedback: conditional matching now activates for templates that define additional_block_words without requiring inherit_from * fix: address Greptile code review feedback (2/5 score) - patterns.json: add keyword_pattern to eu_vat and eu_passport_generic - patterns.json: fix fr_phone pattern with leading word boundary - patterns.json: fix eu_iban_enhanced regex efficiency - policy_templates.json: remove country-specific passport patterns from GDPR template - policy_templates_backup.json: sync with main templates file - test_gdpr_policy_e2e.py: update test setup and fix VAT test text All tests now pass. Keyword guards prevent false positives. * Fix: address Greptile pattern feedback - Fix fr_phone: use negative lookbehind (?<!\d) to prevent false matches in digit strings - Add keyword_pattern to eu_passport_generic to reduce false positives - Add keyword_pattern to eu_vat for contextual matching All pattern tests passing * Update litellm/proxy/guardrails/guardrail_hooks/litellm_content_filter/patterns.json Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> --------- Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

- Fix HTTPException swallowed by broad except block in get_user_daily_activity and get_user_daily_activity_aggregated: re-raise HTTPException before the generic handler so 403 status codes propagate correctly - Add status_code assertions in non-admin access tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Keep both structured output tests (ours) and min thinking budget tests (staging). Accept staging poetry.lock.

- Default user_id to caller's own ID for non-admins instead of 403 when omitted, preserving backward compatibility for API consumers - Apply same fix to aggregated endpoint - Update test to verify defaulting behavior instead of expecting 403 - Add useEffect to sync selectedUserId when auth state settles in UsagePageView to handle async auth initialization Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…llm into litellm_gh_server_root_test

…_by_user [Feature] UI - Usage: Allow Filtering by User

* fix: SSO PKCE support fails in multi-pod Kubernetes deployments * fix: virutal key grace period from env/UI * fix: refactor, race condition handle, fstring sql injection * fix: add async call to avoid server pauses * Update tests/test_litellm/proxy/management_endpoints/test_ui_sso.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * fix: add await in tests * add modify test to perform async run * Update tests/test_litellm/proxy/management_endpoints/test_ui_sso.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Update tests/test_litellm/proxy/management_endpoints/test_ui_sso.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * fix grace period with better error handling on frontend and as per best practices * Update tests/test_litellm/proxy/management_endpoints/test_ui_sso.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * fix: as per request changes * Update litellm/proxy/utils.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Fix errors when callbacks are invoked for file delete operations: * Fix errors when callbacks are invoked for file operations * Fix: pass deployment credentials to afile_retrieve in managed_files post-call hook * Fix: bypass managed files access check in batch polling by calling afile_content directly * Update tests/test_litellm/proxy/management_endpoints/test_ui_sso.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * fix: afile_retrieve returns unified ID for batch output files * fix: batch retrieve returns unified input_file_id * fix(chatgpt): drop unsupported responses params for Codex Co-authored-by: Cursor <cursoragent@cursor.com> * test(chatgpt): ensure Codex request filters unsupported params Co-authored-by: Cursor <cursoragent@cursor.com> * Fix deleted managed files returning 403 instead of 404 * Add comments * Update litellm/proxy/utils.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * fix: thread deployment model_info through batch cost calculation batch_cost_calculator only checked the global cost map, ignoring deployment-level custom pricing (input_cost_per_token_batches etc.). Add optional model_info param through the batch cost chain and pass it from CheckBatchCost. * fix(deps): add pytest-postgresql for db schema migration tests The test_db_schema_migration.py test requires pytest-postgresql but it was missing from dependencies, causing import errors: ModuleNotFoundError: No module named 'pytest_postgresql' Added pytest-postgresql ^6.0.0 to dev dependencies to fix test collection errors in proxy_unit_tests. This is a pre-existing issue, not related to PR BerriAI#21277. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * fix(test): replace caplog with custom handler for parallel execution The cost calculation log level tests were failing when run with pytest-xdist parallel execution because caplog doesn't work reliably across worker processes. This causes "ValueError: I/O operation on closed file" errors. Solution: Replace caplog fixture with a custom LogRecordHandler that directly attaches to the logger. This approach works correctly in parallel execution because each worker process has its own handler instance. Fixes test failures in PR BerriAI#21277 when running with --dist=loadscope. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * fix(test): correct async mock for video generation logging test The test was failing with AuthenticationError because the mock wasn't intercepting the actual HTTP handler calls. This caused real API calls with no API key, resulting in 401 errors. Root cause: The test was patching the wrong target using string path 'litellm.videos.main.base_llm_http_handler' instead of using patch.object on the actual handler instance. Additionally, it was mocking the sync method instead of async_video_generation_handler. Solution: Use patch.object with side_effect pattern on the correct async handler method, following the same pattern used in test_video_generation_async(). Fixes test failure in PR BerriAI#21277 when running with --dist=loadscope. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * fix(test): add cleanup fixture and no_parallel mark for MCP tests Two MCP server tests were failing when run with pytest-xdist parallel execution (--dist=loadscope): - test_mcp_routing_with_conflicting_alias_and_group_name - test_oauth2_headers_passed_to_mcp_client Both tests showed assertion failures where mocks weren't being called (0 times instead of expected 1 time). Root cause: These tests rely on global_mcp_server_manager singleton state and complex async mocking that doesn't work reliably with parallel execution. Each worker process can have different state and patches may not apply correctly. Solution: 1. Added autouse fixture to clean up global_mcp_server_manager registry before and after each test for better isolation 2. Added @pytest.mark.no_parallel to these specific tests to ensure they run sequentially, avoiding parallel execution issues This approach maintains test reliability while allowing other tests in the file to still benefit from parallelization. Fixes test failures exposed by PR BerriAI#21277. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Regenerate poetry.lock with Poetry 2.3.2 Updated lock file to use Poetry 2.3.2 (matching main branch standard). This addresses Greptile feedback about Poetry version mismatch. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Remove unused pytest import and add trailing newline - Removed unused pytest import (caplog fixture was removed) - Added missing trailing newline at end of file Addresses Greptile feedback (minor style issues). Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Remove redundant import inside test method The module litellm.videos.main is already imported at the top of the file (line 21), so the import inside the test method is redundant. Addresses Greptile feedback (minor style issue). Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Fix converse anthropic usage object according to v1/messages specs * Add routing based on if reasoning is supported or not * add fireworks_ai/accounts/fireworks/models/kimi-k2p5 in model map * Removed stray .md file * fix(bedrock): clamp thinking.budget_tokens to minimum 1024 Bedrock rejects thinking.budget_tokens values below 1024 with a 400 error. This adds automatic clamping in the LiteLLM transformation layer so callers (e.g. router with reasoning_effort="low") don't need to know about the provider-specific minimum. Fixes BerriAI#21297 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: improve Langfuse test isolation to prevent flaky failures (BerriAI#21093) The test was creating fresh mocks but not fully isolating from setUp state, causing intermittent CI failures with 'Expected generation to be called once. Called 0 times.' Instead of creating fresh mocks, properly reset the existing setUp mocks to ensure clean state while maintaining proper mock chain configuration. * feat(s3): add support for virtual-hosted-style URLs (BerriAI#21094) Add s3_use_virtual_hosted_style parameter to support AWS S3 virtual-hosted-style URL format (bucket.endpoint/key) alongside the existing path-style format (endpoint/bucket/key). This enables compatibility with S3-compatible services like MinIO and aligns with AWS S3 official terminology. * Addressed greptile comments to extract common helpers and return 404 * Allow effort="max" for Claude Opus 4.6 (BerriAI#21112) * fix(aiohttp): prevent closing shared ClientSession in AiohttpTransport (BerriAI#21117) When a shared ClientSession is passed to LiteLLMAiohttpTransport, calling aclose() on the transport would close the shared session, breaking other clients still using it. Add owns_session parameter (default True for backwards compatibility) to AiohttpTransport and LiteLLMAiohttpTransport. When a shared session is provided in http_handler.py, owns_session=False is set to prevent the transport from closing a session it does not own. This aligns AiohttpTransport with the ownership pattern already used in AiohttpHandler (aiohttp_handler.py). * perf(spend): avoid duplicate daily agent transaction computation (BerriAI#21187) * fix: proxy/batches_endpoints/endpoints.py:309:11: PLR0915 Too many statements (54 > 50) * fix mypy * Add doc for OpenAI Agents SDK with LiteLLM * Add doc for OpenAI Agents SDK with LiteLLM * Update docs/my-website/sidebars.js Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * fix mypy * Update tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_server.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Add blog fffor Managing Anthropic Beta Headers * Add blog fffor Managing Anthropic Beta Headers * correct the time * Fix: Exclude tool params for models without function calling support (BerriAI#21125) (BerriAI#21244) * Fix tool params reported as supported for models without function calling (BerriAI#21125) JSON-configured providers (e.g. PublicAI) inherited all OpenAI params including tools, tool_choice, function_call, and functions — even for models that don't support function calling. This caused an inconsistency where get_supported_openai_params included "tools" but supports_function_calling returned False. The fix checks supports_function_calling in the dynamic config's get_supported_openai_params and removes tool-related params when the model doesn't support it. Follows the same pattern used by OVHCloud and Fireworks AI providers. * Style: move verbose_logger to module-level import, remove redundant try/except Address review feedback from Greptile bot: - Move verbose_logger import to top-level (matches project convention) - Remove redundant try/except around supports_function_calling() since it already handles exceptions internally via _supports_factory() * fix(index.md): cleanup str * fix(proxy): handle missing DATABASE_URL in append_query_params (BerriAI#21239) * fix: handle missing database url in append_query_params * Update litellm/proxy/proxy_cli.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> --------- Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * fix(migrations): Make vector stores migration idempotent with IF NOT EXISTS - Add IF NOT EXISTS to ALTER TABLE ADD COLUMN statements - Add IF NOT EXISTS to CREATE INDEX statements - Prevents migration failures when columns/indexes already exist from manual fixes - Follows PostgreSQL best practices for idempotent migrations --------- Co-authored-by: Harshit Jain <harshitjain0562@gmail.com> Co-authored-by: Harshit Jain <48647625+Harshit28j@users.noreply.github.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: Ephrim Stanley <ephrim.stanley@point72.com> Co-authored-by: Jay Prajapati <79649559+jayy-77@users.noreply.github.com> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Julio Quinteros Pro <jquinter@gmail.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com> Co-authored-by: Sameer Kankute <sameer@berri.ai> Co-authored-by: mjkam <mjkam@naver.com> Co-authored-by: Fly <48186978+tuzkiyoung@users.noreply.github.com> Co-authored-by: Kristoffer Arlind <13228507+KristofferArlind@users.noreply.github.com> Co-authored-by: Constantine <Runixer@gmail.com> Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com> Co-authored-by: Atharva Jaiswal <92455570+AtharvaJaiswal005@users.noreply.github.com> Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com> Co-authored-by: Vincent Koc <vincentkoc@ieee.org>

…erriAI#21349) * feat: add GuardrailTracingDetail TypedDict and tracing fields to StandardLoggingGuardrailInformation * feat: add policy_template field to Guardrail config TypedDict * feat: accept GuardrailTracingDetail in base guardrail logging method * feat: populate tracing fields in content filter guardrail * test: add tracing fields tests for custom guardrail base class * test: add tracing fields e2e tests for content filter guardrail * feat: add guardrail tracing UI - policy badges, match details, timeline * feat: redesign GuardrailViewer to Guardrails & Policy Compliance layout Two-column layout with request lifecycle timeline on the left and compact evaluation detail cards on the right. Header shows guardrail count, pass/fail status, total overhead, policy info, and an export button. * feat: add clickable guardrail link in metrics + show policy names * feat: add risk_score field to StandardLoggingGuardrailInformation * feat: compute risk_score in content filter guardrail * feat: display backend risk_score badge on evaluation cards * fix: fallback to frontend risk score when backend doesn't provide one

…-structured-outputs feat(bedrock): support native structured outputs API (outputConfig.textFormat)

…_code Fix: Add blog as incident report

…ude-opus-4.6-fast (BerriAI#21316) Add missing GitHub Copilot model entries for gpt-5.3-codex (GA) and claude-opus-4.6-fast (Public Preview) to both the root and backup model pricing JSON files.

…proxy layer (BerriAI#19912) PR BerriAI#21039 fixed OAuth token handling at the LLM layer (Authorization: Bearer instead of x-api-key), but the proxy layer still strips the Authorization header in clean_headers() before it reaches the Anthropic code. This breaks OAuth for proxy users (e.g., Claude Code Max through LiteLLM proxy). Changes: - Add is_anthropic_oauth_key() helper to detect OAuth tokens (sk-ant-oat*) - Preserve OAuth Authorization headers in clean_headers() instead of stripping - Forward OAuth Authorization via ProviderSpecificHeader in add_provider_specific_headers_to_request() so tokens only reach Anthropic-compatible providers (anthropic, bedrock, vertex_ai) Fixes BerriAI#19618 Co-authored-by: Adam Reed <iamadamreed@users.noreply.github.com>

* feat: Add IBM watsonx.ai rerank support * feat: added unit tests * fix docstring * added documentataion * Update litellm/llms/watsonx/rerank/transformation.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Update litellm/rerank_api/main.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Update litellm/llms/watsonx/rerank/transformation.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * update validate_environment signature * fix ruff check and mypy * fix CR * CR fix * CR fix --------- Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

…_managed_access Add File deletion criteria with batch references

…_vllm Incident Report: vLLM Embeddings Broken by encoding_format Parameter

[Feat]Add day 0 claude sonnet 4.6 feat support

Fix mock test

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

…pired-key-cleanup fix(tests): restore proxy_server module attrs after test_proxy_admin_expired_key_from_cache

…rix-ci fix(ci): add prisma generate step to matrix CI workflow

The model_prices_and_context_window_backup.json file has 'inference_geo' fields (e.g. on 'us/claude-sonnet-4-6') for geo-prefixed Anthropic models used in cost calculation, but the JSON schema validator in test_utils.py did not include 'inference_geo' as an allowed property. This caused test_aaamodel_prices_and_context_window_json_is_valid to fail with: Additional properties are not allowed ('inference_geo' was unexpected) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ation.py The file had two unresolved git merge conflict markers from a merge of litellm_oss_staging_02_17_2026 into main, causing a SyntaxError when pytest tried to collect the test module. Kept the instance-level mocking approach (from litellm_oss_staging) for test_get_complete_url and test_validate_environment, which is consistent with the rest of the file and avoids class-reference issues caused by importlib.reload(litellm) in conftest.py. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…_tags feat(datadog): add 'team' tag to logs, metrics, and cost management

…ate_many Prisma's create_many() requires JSON fields to be wrapped in prisma.Json() not passed as raw JSON strings. Lines 3099-3100 were using safe_dumps() (which returns str) instead of prisma.Json(), causing Prisma validation errors during master key rotation. This is consistent with the existing pattern in the same file (line 3134 already uses prisma.Json for litellm_config env vars). The regression test test_rotate_master_key_model_data_valid_for_prisma was already correctly asserting isinstance(..., prisma.Json) — the test exposed the mismatch. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…-conflict fix(tests): resolve merge conflict in test_vertex_ai_rerank_transformation.py

…many fix(proxy): use prisma.Json for JSON fields in _rotate_master_key create_many()

…o-property fix(tests): add inference_geo to model prices JSON schema validator

When Anthropic's extended thinking is enabled, assistant messages must start with thinking blocks before tool_use blocks. The agentic loop was creating follow-up messages with only tool_use blocks, causing validation errors. This change ensures thinking blocks from the original response are preserved and included at the start of follow-up assistant messages. - Created `TransformRequestResult` NamedTuple to capture both tool_calls and thinking_blocks from `transform_request()`, making the contract explicit and extensible - Modified `transform_request()` to extract and return thinking/redacted_thinking blocks alongside tool calls - Updated `transform_response()` to accept thinking_blocks and prepend them to follow-up assistant messages - Passed thinking_blocks through the agentic loop chain: detection → execution → message transformation - Fixed `transform_request()` to return full kwargs (not just tools) to preserve other request parameters - Used `filter_internal_params()` utility instead of manual filtering for consistency This change fixes websearch interception when extended thinking mode is enabled. **Problem**: When Anthropic's extended thinking is enabled, assistant messages must start with thinking blocks before tool_use blocks. The agentic loop was creating follow-up messages with only tool_use blocks, causing the error: `messages.1.content.0.type: Expected 'thinking' or 'redacted_thinking', but found 'tool_use'` **Solution**: Modified `transform_request()` to capture thinking/redacted_thinking blocks from the original response, and `transform_response()` to include them at the start of the assistant message in follow-up requests. **Testing**: Successfully tested end-to-end with Claude Code → LiteLLM Proxy → AWS Bedrock → Claude Opus 4.5. ```yaml model_list: - model_name: claude-opus-4-5-20251101 litellm_params: model: bedrock/us.anthropic.claude-opus-4-5-20251101-v1:0 aws_region_name: us-west-2 model_info: supports_web_search: true litellm_settings: callbacks: ["websearch_interception"] websearch_interception_params: enabled_providers: ["bedrock"] search_tool_name: "searxng-search" search_tools: - search_tool_name: searxng-search litellm_params: search_provider: searxng api_base: "https://searxng.example.com" ``` **Note**: Uses `bedrock/` (not `bedrock/converse/`) to route through `anthropic_messages_handler()` which supports agentic hooks.

Fixes issue where websearch interception failed with "TAVILY_API_KEY is not set" error when using search providers that require API keys. Changes: - Extract api_key and api_base from router search_tools configuration - Pass credentials to litellm.asearch() when available - Falls back to environment variables when credentials not in config - Maintains backward compatibility with existing configurations Root cause: Handler was only extracting search_provider from router config, but not the associated api_key and api_base fields. This caused litellm.asearch() to fall back to environment variables, which failed when keys weren't set in env. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Fixes websearch interception failures when thinking.budget_tokens is set and requests violate Anthropic's requirement: max_tokens > budget_tokens. Changes: - Validate max_tokens against thinking.budget_tokens when extended thinking is enabled - Automatically adjust max_tokens to budget_tokens + DEFAULT_MAX_TOKENS (4096) when insufficient - Follows the same pattern as base transformation classes in LiteLLM This prevents the error: "max_tokens must be greater than thinking.budget_tokens" when using extended thinking with websearch interception. Related issue: BerriAI#14194 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

…pport Standardize anthropic-beta header handling across all Bedrock APIs (Invoke Chat, Converse, Messages) using a centralized whitelist-based filter with version-based model support. - Inconsistent filtering: Invoke Chat used whitelist (safe), Converse/Messages used blacklist (allows unsupported headers through) - Production risk: unsupported headers could cause AWS API errors - Maintenance burden: adding new Claude models required updating multiple hardcoded lists - Centralized BedrockBetaHeaderFilter with whitelist approach - Version-based filtering (e.g., "requires 4.5+") instead of model lists - Family restrictions (opus/sonnet/haiku) when needed - Automatic header translation for backward compatibility - Add `litellm/llms/bedrock/beta_headers_config.py` - BedrockBetaHeaderFilter class - Whitelist of 11 supported beta headers - Version/family restriction logic - Debug logging support - Invoke Chat: Replace local whitelist with centralized filter - Converse: Remove blacklist (30 lines), use whitelist filter - Messages: Remove complex filter (55 lines), preserve translation - Add `tests/test_litellm/llms/bedrock/test_beta_headers_config.py` - 40+ unit tests for filter logic - Extend `tests/test_litellm/llms/bedrock/test_anthropic_beta_support.py` - 13 integration tests for API transformations - Verify filtering, version restrictions, translations - Add `litellm/llms/bedrock/README.md` - Maintenance guide for adding new headers/models - Enhanced module docstrings with examples - Production safety: only whitelisted headers reach AWS - Zero maintenance for new Claude models (Opus 5, Sonnet 5, etc.) - Consistent filtering across all 3 APIs - Preserved backward compatibility (advanced-tool-use translation) ```bash poetry run pytest tests/test_litellm/llms/bedrock/ -v ``` Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

AWS Bedrock does not recognize anthropic.claude-opus-4-6-v1:0 as a valid model identifier. Unlike other Claude models, Opus 4.6 requires the model ID without the :0 version suffix: anthropic.claude-opus-4-6-v1. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ock APIs Bedrock doesn't support context_management as a request body parameter. The feature is enabled via the anthropic-beta header (context-management-2025-06-27) which was already handled correctly. Leaving context_management in the body causes: "context_management: Extra inputs are not permitted" Strip the parameter from all 3 Bedrock API paths: - Invoke Messages API - Invoke Chat API - Converse API (additionalModelRequestFields) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…without thinking blocks Follow-up to a494503f4b which fixed thinking + tool_use. That fix only detected missing thinking blocks on assistant messages with tool_calls. When the last assistant message has plain text content (no tool_calls), the check returned False and thinking was not dropped, causing: "Expected thinking or redacted_thinking, but found text" Add last_assistant_message_has_no_thinking_blocks() to detect any assistant message with content but no thinking blocks. Extract shared _message_has_thinking_blocks() helper that checks both the thinking_blocks field and content array for thinking/redacted_thinking blocks. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Upstream only checks for type="enabled" but Opus 4.6 uses type="adaptive". Without this fix, max_tokens auto-adjustment doesn't trigger for adaptive thinking, causing API errors.

The MODEL_VERSION_PATTERN regex had double-escaped backslashes (\\d instead of \d) in a raw string, causing it to never match any model ID. Also constrained minor version capture to a single digit followed by a hyphen to avoid capturing the 8-digit date suffix as the minor version. Additionally added computer-use-2025-11-24 to the whitelist (used by upstream for Opus 4.5+) and updated integration tests to use model IDs compatible with the version-gated beta headers they test. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

ishaan-jaff and others added 30 commits February 16, 2026 15:33

allow filtering by user in global usage

0d2aac6

add server root path test to github actions

0e736a7

Update .github/workflows/test_server_root_path.yml

8ab2c91

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

fix: resolve merge conflicts with staging branch

a946cc4

Keep both structured output tests (ours) and min thinking budget tests (staging). Accept staging poetry.lock.

fixing syntax

bec06c0

Merge branch 'litellm_gh_server_root_test' of github.com:BerriAI/lite…

56133f8

…llm into litellm_gh_server_root_test

remove artifacts

330802e

Merge pull request BerriAI#21351 from BerriAI/litellm_ui_usage_filter…

100a5a1

…_by_user [Feature] UI - Usage: Allow Filtering by User

Merge pull request BerriAI#21222 from ndgigliotti/feat/bedrock-native…

3a34b63

…-structured-outputs feat(bedrock): support native structured outputs API (outputConfig.textFormat)

passing in masster key for api calls

a3ff903

Fix: Add blog as incident report

5e02844

Fix: Add blog as incident report

53728b4

remove timeline

6acf63f

Merge pull request BerriAI#21356 from BerriAI/litellm_incident_claude…

4936aab

…_code Fix: Add blog as incident report

feat(models): add github_copilot/gpt-5.3-codex and github_copilot/cla…

757acb4

…ude-opus-4.6-fast (BerriAI#21316) Add missing GitHub Copilot model entries for gpt-5.3-codex (GA) and claude-opus-4.6-fast (Public Preview) to both the root and backup model pricing JSON files.

only tests for /ui

5ac3430

bump: version 1.81.12 → 1.81.13

c4f0fc9

Fixing mapped tests

349e3da

fixing no_config test

efe8477

fixing container tests

23219d9

fixing test_basic_openai_responses_api

45e6440

Adding bedrock thinking budget tokens to docs

aa7bc6a

fixing regen key tests

3669d54

Sameerlite and others added 27 commits February 18, 2026 18:39

Merge pull request BerriAI#21456 from BerriAI/litellm_fix_delete_file…

6f82a3e

…_managed_access Add File deletion criteria with batch references

Merge branch 'main' into litellm_sonnet_4_6_feat

f6d3919

Merge pull request BerriAI#21474 from BerriAI/litellm_incident_report…

3e0a723

…_vllm Incident Report: vLLM Embeddings Broken by encoding_format Parameter

Merge pull request BerriAI#21448 from BerriAI/litellm_sonnet_4_6_feat

c369b0f

[Feat]Add day 0 claude sonnet 4.6 feat support

fix(datadog): use .get() for safe team tag extraction

5755483

Fix _rotate_master_key

ee7e543

Fix:test_get_key_object_loads_object_permission

610bf00

Merge pull request BerriAI#21475 from BerriAI/litellm_fix_mock_test

a358b7a

Fix mock test

Update tests/test_litellm/proxy/auth/test_user_api_key_auth.py

0d11720

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

Merge pull request BerriAI#21473 from BerriAI/fix/test-proxy-admin-ex…

34f06c4

…pired-key-cleanup fix(tests): restore proxy_server module attrs after test_proxy_admin_expired_key_from_cache

Merge pull request BerriAI#21436 from BerriAI/fix/prisma-generate-mat…

65a79d5

…rix-ci fix(ci): add prisma generate step to matrix CI workflow

Merge pull request BerriAI#21449 from Harshit28j/litellm_feat_dataDog…

c760318

…_tags feat(datadog): add 'team' tag to logs, metrics, and cost management

Merge pull request BerriAI#21478 from BerriAI/fix/vertex-rerank-merge…

63783db

…-conflict fix(tests): resolve merge conflict in test_vertex_ai_rerank_transformation.py

Merge pull request BerriAI#21479 from BerriAI/fix/prisma-json-create-…

2395734

…many fix(proxy): use prisma.Json for JSON fields in _rotate_master_key create_many()

Merge pull request BerriAI#21477 from BerriAI/fix/schema-inference-ge…

45d3d1a

…o-property fix(tests): add inference_geo to model prices JSON schema validator

fix(thinking): recognize adaptive thinking type in is_thinking_enabled

1800774

Upstream only checks for type="enabled" but Opus 4.6 uses type="adaptive". Without this fix, max_tokens auto-adjustment doesn't trigger for adaptive thinking, causing API errors.

Quentin-M closed this Feb 18, 2026

Quentin-M deleted the update-0213 branch February 18, 2026 15:34

Quentin-M pushed a commit that referenced this pull request Mar 13, 2026

chore: regenerate poetry.lock to match pyproject.toml (#1)

c58aea4

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LiteLLM update-0218: rebase on upstream + fixes#1

LiteLLM update-0218: rebase on upstream + fixes#1
Quentin-M wants to merge 1141 commits intomainfrom
update-0213

Quentin-M commented Feb 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

20 participants

Conversation

Quentin-M commented Feb 18, 2026

Summary

Websearch Interception

Bedrock

Thinking

Test Plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

20 participants