main into ii-main #4

Merged

dominicfallows merged 179 commits into ii-main from main
Feb 2, 2026
Conversation

@dominicfallows
No description provided.

jayy-77 and others added 30 commits January 28, 2026 00:49
Add litellm.disable_default_user_agent global flag to control whether
the automatic User-Agent header is injected into HTTP requests.
Modify http_handler.py and httpx_handler.py to check the
disable_default_user_agent flag and return empty headers when disabled.
This allows users to override the User-Agent header completely.
Add 8 tests covering:
- Default User-Agent behavior
- Disabling default User-Agent
- Custom User-Agent via extra_headers
- Environment variable support
- Async handler support
- Override without disabling
- Claude Code use case
- Backwards compatibility
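The flag described above can be sketched as a simple conditional in the header-building step. This is a minimal illustration, not LiteLLM's actual implementation: `get_default_headers` is a hypothetical helper name, and the version string is a placeholder.

```python
# Mirrors the idea of litellm.disable_default_user_agent: when set, the
# handler returns no default headers, so a caller-supplied User-Agent in
# extra_headers is the only one sent.
def get_default_headers(disable_default_user_agent: bool) -> dict:
    """Return default HTTP headers; empty when the User-Agent is disabled."""
    if disable_default_user_agent:
        return {}
    return {"User-Agent": "litellm"}  # placeholder value
```

With the flag off, requests carry the library's User-Agent; with it on, the dict is empty and whatever the client passes through `extra_headers` wins.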
…hropic (BerriAI#19896)

* fix(vertex_ai): convert image URLs to base64 in tool messages for Anthropic

Fixes BerriAI#19891

Vertex AI Anthropic models don't support URL sources for images. LiteLLM
already converted image URLs to base64 for user messages, but not for tool
messages (role='tool'). This caused errors when using ToolOutputImage with
image_url in tool outputs.

Changes:
- Add force_base64 parameter to convert_to_anthropic_tool_result()
- Pass force_base64 to create_anthropic_image_param() for tool message images
- Calculate force_base64 in anthropic_messages_pt() based on llm_provider
- Add unit tests for tool message image handling
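The `force_base64` path can be sketched as follows. This is a simplified stand-in, not the real `create_anthropic_image_param`: `fetch_bytes` is a hypothetical stub for the HTTP download, and the media type is hard-coded for illustration.

```python
import base64

def fetch_bytes(url: str) -> bytes:
    # Stand-in for an HTTP GET; a real implementation downloads the image.
    return b"\x89PNG fake image bytes"

def create_image_param(image_url: str, force_base64: bool) -> dict:
    # When force_base64 is set (Vertex AI Anthropic rejects URL sources),
    # embed the image content as base64; otherwise pass the URL through.
    if not force_base64:
        return {"type": "url", "url": image_url}
    return {
        "type": "base64",
        "media_type": "image/png",
        "data": base64.b64encode(fetch_bytes(image_url)).decode("ascii"),
    }
```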

* chore: remove extra comment from test file header
* fix(proxy_server): pass search_tools to Router during DB-triggered initialization

* fix search tools from db

* add missing statement to handle from db

* fix import issues to pass lint errors
…rriAI#19654)

Fixes BerriAI#19478

The stream_chunk_builder function was not handling image chunks from
models like gemini-2.5-flash-image. When streaming responses were
reconstructed (e.g., for caching), images in delta.images were lost.

This adds handling for image_chunks similar to how audio, annotations,
and other delta fields are handled.
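The accumulation the fix adds can be sketched like this. The chunk shape is assumed for illustration; the real builder operates on LiteLLM's streaming objects.

```python
def rebuild_images(chunks: list) -> list:
    # Accumulate delta-level images across streamed chunks so they survive
    # response reconstruction, mirroring how audio and annotations are merged.
    images = []
    for chunk in chunks:
        delta = chunk.get("delta") or {}
        images.extend(delta.get("images") or [])
    return images
```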
…sing (BerriAI#19776)

Fixes BerriAI#16920 for users of the stable release images.

The previous fix (PR BerriAI#18092) added libsndfile to docker/Dockerfile.alpine,
but stable releases are built from the main Dockerfile (Wolfi-based),
not the Alpine variant.
… list (BerriAI#19952)

The /health/services endpoint rejected datadog_llm_observability as an
unknown service, even though it was registered in the core callback
registry and __init__.py. Added it to both the Literal type hint and
the hardcoded validation list in the health endpoint.
* fix(proxy): prevent provider-prefixed model leaks

Proxy clients should not see LiteLLM internal provider prefixes (e.g. hosted_vllm/...) in the OpenAI-compatible response model field.

This patch sanitizes the client-facing model name for both:
- Non-streaming responses returned from base_process_llm_request
- Streaming SSE chunks emitted by async_data_generator

Adds regression tests covering vLLM-style hosted_vllm routing for both streaming and non-streaming paths.

* chore(lint): suppress PLR0915 in proxy handler

Ruff started flagging ProxyBaseLLMRequestProcessing.base_process_llm_request() for too many statements after the hotpatch changes.

Add an explicit '# noqa: PLR0915' on the function definition to avoid a large refactor in a hotpatch.

* refactor(proxy): make model restamp explicit

Replace silent try/except/pass and type ignores with explicit model restamping.

- Logs an error when the downstream response model differs from the client-requested model
- Overwrites the OpenAI `model` field to the client-requested value to avoid leaking internal provider-prefixed identifiers
- Applies the same behavior to streaming chunks, logging the mismatch only once per stream
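The explicit restamp described above amounts to the following. Shapes are simplified and the function name is hypothetical; the real code operates on OpenAI-compatible response objects and uses the proxy's logger.

```python
def restamp_model(response: dict, client_model: str, log=print) -> dict:
    # Overwrite the OpenAI-compatible "model" field with the value the
    # client requested, so internal provider-prefixed identifiers such as
    # "hosted_vllm/..." never reach proxy clients. A mismatch is logged as
    # a signal that an internal identifier was about to leak.
    if response.get("model") != client_model:
        log(f"model mismatch: {response.get('model')!r} != {client_model!r}")
        response["model"] = client_model
    return response
```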

* chore(lint): drop PLR0915 suppression

The model restamping bugfix made `base_process_llm_request()` slightly exceed Ruff's
PLR0915 (too-many-statements) threshold, requiring a `# noqa` suppression.

Collapse consecutive `hidden_params` extractions into tuple unpacking so the
function falls back under the lint limit and remove the suppression.

No functional change intended; this keeps the proxy model-field bugfix intact
while aligning with project linting rules.

* chore(proxy): log model mismatches as warnings

These model-restamping logs are intentionally verbose: a mismatch is a useful signal
that an internal provider/deployment identifier may be leaking into the public
OpenAI response `model` field.

- Downgrade model mismatch logs from error -> warning
- Keep error logs only for cases where the proxy cannot read/override the model

* fix(proxy): preserve client model for streaming aliasing

Pre-call processing can rewrite request_data['model'] via model alias maps.

Our streaming SSE generator was using the rewritten value when restamping chunk.model, which caused the public 'model' field to differ between streaming and non-streaming responses for alias-based requests.

Stash the original client model in request_data as _litellm_client_requested_model after the model has been routed, and prefer it when overriding the outgoing chunk model. Add a regression test for the alias-mapping case.
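The stash-and-prefer logic can be sketched as two small helpers. The key name `_litellm_client_requested_model` comes from the commit above; everything else is a simplified assumption.

```python
def stash_client_model(request_data: dict) -> None:
    # Record the model name the client originally sent, before alias maps
    # can rewrite request_data["model"].
    request_data.setdefault("_litellm_client_requested_model",
                            request_data.get("model"))

def chunk_model(request_data: dict) -> str:
    # Prefer the stashed client model when restamping streaming chunks, so
    # streaming and non-streaming responses report the same "model" value.
    return (request_data.get("_litellm_client_requested_model")
            or request_data.get("model", ""))
```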

* chore(lint): satisfy PLR0915 in streaming generator

Ruff started flagging async_data_generator() for too many statements after adding model restamping logic.

Extract the client-model selection + chunk restamping into small helpers to keep behavior unchanged while meeting the project's PLR0915 threshold.
…verify (BerriAI#19893)

* fix(hosted_vllm): route through base_llm_http_handler to support ssl_verify

The hosted_vllm provider was falling through to the OpenAI catch-all path
which doesn't pass ssl_verify to the HTTP client. This adds an explicit
elif branch that routes hosted_vllm through base_llm_http_handler.completion()
which properly passes ssl_verify to the httpx client.

- Add explicit hosted_vllm branch in main.py completion()
- Add ssl_verify tests for sync and async completion
- Update existing audio_url test to mock httpx instead of OpenAI client

* feat(hosted_vllm): add embedding support with ssl_verify

- Add HostedVLLMEmbeddingConfig for embedding transformations
- Register hosted_vllm embedding config in utils.py
- Add lazy import for embedding transformation module
- Add unit test for ssl_verify parameter handling
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
yuneng-jiang and others added 28 commits January 31, 2026 15:20
…checks (BerriAI#20196)

## Problem
Tests using mocked HTTP clients were hitting real APIs because:
1. HTTP client cache was returning previously cached real clients
2. isinstance checks failed due to module identity issues from sys.path

### Tests affected:
- test_send_email_missing_api_key
- test_send_email_multiple_recipients (resend & sendgrid)
- test_search_uses_registry_credentials
- test_vector_store_create_with_simple_provider_name
- test_vector_store_create_with_provider_api_type
- test_vector_store_create_with_ragflow_provider
- test_image_edit_merges_headers_and_extra_headers
- test_retrieve_container_basic (container API tests)

## Solution
1. Add clear_client_cache fixture (autouse=True) to clear
   litellm.in_memory_llm_clients_cache before each test
2. Fix isinstance checks to use type name comparison
   (avoids module identity issues from sys.path.insert)
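The autouse fixture in step 1 follows this shape. The cache here is a plain dict standing in for `litellm.in_memory_llm_clients_cache`; the real cache object's clearing API may differ.

```python
import pytest

CLIENT_CACHE: dict = {}  # stand-in for litellm.in_memory_llm_clients_cache

def reset_client_cache() -> None:
    # dict.clear() mirrors the intent of flushing every cached HTTP client;
    # the real cache exposes its own clearing method.
    CLIENT_CACHE.clear()

@pytest.fixture(autouse=True)
def clear_client_cache():
    # autouse=True runs this before every test in scope, so a real client
    # cached by an earlier test can never silently bypass this test's mocks.
    reset_client_cache()
    yield
```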

## Why not disable_aiohttp_transport
The default transport is aiohttp, so tests should work with it.
Clearing the cache ensures mocks are used instead of cached real clients.

## Regression
PR BerriAI#19829 (commit f95572e) added @respx.mock but cached clients
from earlier tests were being reused, bypassing the mocks.

Co-authored-by: shin-bot-litellm <shin-bot-litellm@users.noreply.github.com>
The user_id field 'default_user_id' is masked to '*******_user_id'
in Prometheus metrics for privacy. Updated the test expectations to match
the actual behavior.

Co-authored-by: Cursor <cursoragent@cursor.com>
- Fix /docs/search/index -> /docs/search (404 error)
- Fix /cookbook/ -> GitHub cookbook URL (404 error)

Co-authored-by: shin-bot-litellm <shin-bot-litellm@users.noreply.github.com>
- Update expected user_id from 'default_user_id' to '*******_user_id' (PII masking)
- Add missing client_ip, user_agent, model_id labels (from PRs BerriAI#19717, BerriAI#19678)
- Update label order to match Prometheus alphabetical sorting

Co-authored-by: Cursor <cursoragent@cursor.com>
…issues (BerriAI#20209)

* litellm_fix_mapped_tests_core: fix test isolation and mock injection issues

## Problem
Four tests in litellm_mapped_tests_core were failing:
1. test_register_model_with_scientific_notation - KeyError due to test isolation issues
2. test_search_uses_registry_credentials - Mock not being called due to incorrect patch path
3. test_send_email_missing_api_key - Real API calls despite mocking
4. test_stream_transformation_error_sync - Mock not effective, real API called

## Solution

### test_register_model_with_scientific_notation
- Use unique model name to avoid conflicts with other tests
- Clear LRU caches before test to prevent stale data
- Clean up model_cost entry after test

### test_search_uses_registry_credentials
- Use patch.object() on the actual base_llm_http_handler instance
- String-based patching for instance methods can fail; direct object patching is more reliable

### test_send_email_missing_api_key
- Directly inject mock HTTP client into logger instance
- This bypasses any caching issues that could cause the fixture mock to be ineffective

### test_stream_transformation_error_sync
- Patch litellm.completion directly instead of the handler module's litellm reference
- This ensures the mock is effective regardless of import order

## Regression
These tests were affected by LRU caching added in BerriAI#19606 and HTTP client caching.

* fix(test): use patch.object for container API tests to fix mock injection

## Problem
test_retrieve_container_basic tests were failing because mocks weren't
being applied correctly. The tests used string-based patching:
  patch('litellm.containers.main.base_llm_http_handler')

But base_llm_http_handler is imported at module level, so the mock wasn't
intercepting the actual handler calls, resulting in real HTTP requests
to OpenAI API.

## Solution
Use patch.object() to directly mock methods on the imported handler
instance. Import base_llm_http_handler in the test file and patch like:
  patch.object(base_llm_http_handler, 'container_retrieve_handler', ...)

This ensures the mock is applied to the actual object being used,
regardless of import order or caching.
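The difference between the two patching styles can be shown in miniature. The class and function names below are illustrative stand-ins, not LiteLLM's real ones.

```python
from unittest import mock

class Handler:
    def container_retrieve_handler(self):
        return "real HTTP call"

# Module-level instance, analogous to litellm's base_llm_http_handler.
base_llm_http_handler = Handler()

def retrieve_container():
    # In LiteLLM this call sits in another module that imported the
    # instance, which is exactly why string-path patching can miss it.
    return base_llm_http_handler.container_retrieve_handler()

def demo() -> str:
    # patch.object targets the instance itself, so the mock applies no
    # matter which module-level name refers to that object.
    with mock.patch.object(base_llm_http_handler, "container_retrieve_handler",
                           return_value="mocked"):
        return retrieve_container()
```

Inside the context manager the call returns the mocked value; once it exits, the real method is restored.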

* fix(test): add missing Prometheus metric labels to test_proxy_failure_metrics

Add client_ip, user_agent, model_id labels to expected metric patterns.
These labels were added in PRs BerriAI#19717 and BerriAI#19678, but the test wasn't updated.

* fix(test_resend_email): use direct mock injection for all email tests

Extend the mock injection pattern used in test_send_email_missing_api_key
to all other tests in the file:
- test_send_email_success
- test_send_email_multiple_recipients

Instead of relying on fixture-based patching and respx mocks which can
fail due to import order and caching issues, directly inject the mock
HTTP client into the logger instance. This ensures mocks are always used
regardless of test execution order.
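The direct-injection pattern can be sketched like this. The logger class, attribute, and method names are assumptions for illustration, not LiteLLM's real ones.

```python
from unittest import mock

class RealClient:
    def post(self, path, json=None):
        raise RuntimeError("would hit the real email API")

class EmailLogger:
    # Simplified stand-in for the email logger under test.
    def __init__(self):
        self.http_client = RealClient()

    def send(self, payload: dict):
        return self.http_client.post("/emails", json=payload)

logger = EmailLogger()
# Direct injection: replace the attribute on the instance itself rather
# than patching an import path, so import order and client caching are
# irrelevant to whether the mock takes effect.
logger.http_client = mock.Mock()
logger.http_client.post.return_value = {"status": "queued"}
assert logger.send({"to": "a@example.com"}) == {"status": "queued"}
```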

* fix(test): use patch.object for image_edit and vector_store tests

- test_image_edit_merges_headers_and_extra_headers: import base_llm_http_handler
  and use patch.object instead of string path patching
- test_search_uses_registry_credentials: import module and patch via
  module.base_llm_http_handler to ensure we patch the right instance

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
…m_routing_3"

This reverts commit ae26d8e, reversing
changes made to 864e8c6.
…-level metrics

- Check for both litellm_proxy_failed_requests_metric_total and the deprecated litellm_llm_api_failed_requests_metric_total
- The proxy-level failure hook may not always be called depending on where the exception occurs
- Simplify total_requests check to only verify key fields

Co-authored-by: Cursor <cursoragent@cursor.com>
…Tracing (BerriAI#20225)

- Updated title to highlight Logs v2 feature
- Simplified Key Highlights to focus on Logs v2 / tool call tracing
- Rewrote Logs v2 description with improved language style
- Removed Claude Agents SDK and RAG API from key highlights section
- TODO: Add image (logs_v2_tool_tracing.png)

Co-authored-by: shin-bot-litellm <shin-bot-litellm@users.noreply.github.com>
…cohere-embed-v4

feat: Support dimensions param for Cohere embed v4
…zation-issue-19017

feat: add User-Agent customization support
…ered-caching-cost

feat(bedrock): add 1hr tiered caching costs for long-context models (BerriAI#18988)
Update Vertex AI Text to Speech doc to show use of audio
Copilot AI review requested due to automatic review settings February 2, 2026 16:52
dominicfallows merged commit 4de793a into ii-main on Feb 2, 2026
4 checks passed