feat(router): tag_regex routing — route by User-Agent regex without per-developer tag config by ishaan-jaff · Pull Request #23594 · BerriAI/litellm

ishaan-jaff · 2026-03-14T00:48:02Z

Relevant issues

Pre-Submission checklist

I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement - see details
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review


Type

🆕 New Feature

Changes

Adds a tag_regex field to litellm_params in config.yaml. Regex patterns are matched against "Header-Name: value" strings built from request metadata (currently User-Agent). This lets operators route specific client traffic to dedicated deployments without asking each developer to configure a tag.

Config example — route all Claude Code traffic automatically:

model_list:
  - model_name: claude-sonnet
    litellm_params:
      model: bedrock/converse/anthropic-claude-sonnet-4-6
      tag_regex:
        - "^User-Agent: claude-code\\/"

  - model_name: claude-sonnet
    litellm_params:
      model: openai/gpt-4o
      tags:
        - default

With enable_tag_filtering: true and tag_filtering_match_any: true, requests from claude-code/x.y.z route to the first deployment; everything else falls through to the default.
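As a minimal sketch of the matching step, assuming the header string is built as `User-Agent: <value>` and searched with `re.search` (as the diff excerpts later in this PR suggest), the pattern from the config above behaves as follows. `build_header_strings` and `first_matching_pattern` are hypothetical names used for illustration, not LiteLLM functions:

```python
import re
from typing import List, Optional

# In YAML, "^User-Agent: claude-code\\/" inside double quotes decodes to the
# regex ^User-Agent: claude-code\/ (an escaped, literal forward slash).
CLAUDE_CODE_PATTERN = r"^User-Agent: claude-code\/"


def build_header_strings(user_agent: str) -> List[str]:
    """Build "Header-Name: value" strings; only User-Agent is covered here."""
    return [f"User-Agent: {user_agent}"] if user_agent else []


def first_matching_pattern(
    patterns: List[str], header_strings: List[str]
) -> Optional[str]:
    """Return the first pattern that matches any header string, else None."""
    for pattern in patterns:
        compiled = re.compile(pattern)
        for header in header_strings:
            if compiled.search(header):
                return pattern
    return None


claude_match = first_matching_pattern(
    [CLAUDE_CODE_PATTERN], build_header_strings("claude-code/2.0.0")
)
curl_match = first_matching_pattern(
    [CLAUDE_CODE_PATTERN], build_header_strings("curl/8.4.0")
)
empty_match = first_matching_pattern(
    [CLAUDE_CODE_PATTERN], build_header_strings("")
)
```

A request with no User-Agent produces no header strings, so no pattern can match it and it falls through to the default path.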

What changed:

litellm/types/router.py — added tag_regex: Optional[List[str]] to GenericLiteLLMParams and LiteLLMParamsTypedDict
litellm/router_strategy/tag_based_routing.py — new _is_valid_deployment_tag_regex() helper; get_deployments_for_tag() builds header_strings from metadata["user_agent"] and runs regex matching; writes metadata["tag_routing"] provenance block (matched_via, matched_value, user_agent) for observability
litellm/router.py — validates tag_regex patterns at startup so bad regexes fail fast; also fixes a pre-existing Pyright error (asyncio.current_task() can return None)
tests/test_litellm/router_strategy/test_router_tag_regex_routing.py — 12 unit tests
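The fail-fast validation mentioned for litellm/router.py can be illustrated in isolation. This is a standalone sketch, not the router code itself; `validate_tag_regex` is a hypothetical helper name, and the error wording is only modelled on the diff excerpt quoted in the review below:

```python
import re
from typing import List, Optional


def validate_tag_regex(model_name: str, patterns: List[str]) -> None:
    """Raise ValueError on the first pattern that does not compile."""
    for pattern in patterns:
        try:
            re.compile(pattern)
        except re.error as exc:
            raise ValueError(
                f"Invalid regex in tag_regex for model '{model_name}': "
                f"{pattern!r} ({exc})"
            ) from exc


# A valid pattern passes silently.
validate_tag_regex("claude-sonnet", [r"^User-Agent: claude-code\/"])

# An unbalanced parenthesis is rejected before any request is served.
startup_error: Optional[str] = None
try:
    validate_tag_regex("claude-sonnet", ["^User-Agent: (unclosed"])
except ValueError as err:
    startup_error = str(err)
```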

Exact tag match still takes precedence over regex match. Unmatched requests fall through to default-tagged deployments as before.

Adds a new `tag_regex` field to litellm_params that lets operators route requests based on regex patterns matched against request headers (primarily User-Agent) without requiring per-developer tag configuration.

Use case: route all Claude Code traffic (User-Agent: claude-code/x.y.z) to a dedicated deployment by setting:

    tag_regex:
      - "^User-Agent: claude-code\\/"

in the deployment's litellm_params.

Works alongside existing `tags` routing; exact tag match takes precedence over regex match. Unmatched requests fall through to deployments tagged `default`.

The matched deployment, pattern, and user_agent are recorded in `metadata["tag_routing"]` so they flow through to SpendLogs automatically.

greptile-apps · 2026-03-14T00:51:51Z

Greptile Summary

This PR adds tag_regex routing to the LiteLLM router, allowing operators to route requests to specific deployments based on regex patterns matched against request headers (initially User-Agent) without requiring per-developer tag configuration. The implementation hooks into the existing get_deployments_for_tag pipeline, adds a has_regex_deployments guard to preserve backward compatibility for operators using only plain-tag routing, and validates regex patterns at startup to fail fast on misconfiguration.

Key changes:

litellm/types/router.py — tag_regex: Optional[List[str]] added to GenericLiteLLMParams and LiteLLMParamsTypedDict; tags made explicit on GenericLiteLLMParams
litellm/router_strategy/tag_based_routing.py — _is_valid_deployment_tag_regex() and _match_deployment() helpers added; get_deployments_for_tag() extended with header-string construction and provenance logging
litellm/router.py — startup regex validation added; pre-existing asyncio.current_task() Pyright warning fixed
12 mock-only unit tests covering normal routing, fallback, match_any=False policy enforcement, and provenance metadata

Issues noted (not already addressed in previous review threads):

The ValueError raised when no deployments match (lines 222–225) omits user_agent from its message, making it hard to debug failures triggered purely by regex routing
matched_deployment in the tag_routing provenance block stores the non-unique model alias (model_name) rather than the unique deployment ID from model_info.id, reducing the observability value of the field

Confidence Score: 3/5

Mostly safe for operators using only plain-tag routing, but the new regex routing path has two open observability gaps and several design concerns flagged in previous review threads that are not fully resolved.
The backward-compatibility guard (has_regex_deployments) addresses the most dangerous regression. The match_any=False policy enforcement is well-tested. However, issues flagged in previous rounds — including the tag_routing provenance being recorded only for the first match (now addressed in code but the matched_deployment field still records the non-unique alias), the error message gap when user_agent is the sole trigger, and the documented tag_filtering_match_any=False semantics divergence for regex paths — leave the feature in a state where debugging and observability are weaker than they should be before merge.
Pay close attention to litellm/router_strategy/tag_based_routing.py, specifically the error-message path (lines 222–225) and the provenance block (lines 209–216).

Important Files Changed

Filename	Overview
litellm/router_strategy/tag_based_routing.py	Core implementation of tag_regex routing. Introduces `_is_valid_deployment_tag_regex`, `_match_deployment`, and updates `get_deployments_for_tag`. The `has_regex_deployments` guard correctly preserves backward compatibility. The error message on the no-deployments path is missing `user_agent` context. `matched_deployment` in provenance metadata uses the non-unique `model_name` alias instead of the unique deployment ID.
litellm/router.py	Adds startup-time regex validation and fixes a pre-existing Pyright warning about `asyncio.current_task()` returning `None`. Validation now runs before `_add_deployment` per the previous review suggestion.
litellm/types/router.py	Adds `tags` and `tag_regex` to `GenericLiteLLMParams` and `tag_regex` to `LiteLLMParamsTypedDict`, making the routing fields explicit in both the Pydantic model and TypedDict. Clean, no issues.
tests/test_litellm/router_strategy/test_router_tag_regex_routing.py	12 unit tests covering `_is_valid_deployment_tag_regex` and `get_deployments_for_tag` with regex patterns. Tests use only mocks, no real network calls. Good coverage of edge cases including `match_any=False` policy.
docs/my-website/docs/proxy/tag_routing.md	Documentation for `tag_regex` feature, including config example, routing priority table, observability info, and a security caution about `User-Agent` being spoofable. Well-written.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Incoming Request] --> B{enable_tag_filtering?}
    B -- No --> Z[Return all healthy deployments]
    B -- Yes --> C{request_kwargs present?}
    C -- No --> Z
    C -- Yes --> D[Extract request_tags, user_agent]
    D --> E{has_regex_deployments in candidate set?}
    E -- No --> F{request_tags present?}
    E -- Yes --> G[Build header_strings from user_agent]
    F -- No --> Z
    F -- Yes --> H[has_tag_filter = True]
    G --> H
    H --> I[For each deployment: _match_deployment]
    I --> J{Step 1: deployment has tags AND request has tags?}
    J -- Yes --> K{is_valid_deployment_tag?}
    K -- Yes --> L[matched_via = 'tags']
    K -- No --> M{Step 2: deployment has tag_regex?}
    J -- No --> M
    M -- No --> N[No match — skip]
    M -- Yes --> O{strict_tag_check_failed?}
    O -- Yes --> N
    O -- No --> P{regex matches header_strings?}
    P -- No --> N
    P -- Yes --> L2[matched_via = 'tag_regex']
    L --> Q[Add to new_healthy_deployments]
    L2 --> Q
    N --> R{deployment has 'default' tag?}
    Q --> R
    R -- Yes --> S[Add to default_deployments]
    R -- No --> T{All deployments processed}
    S --> T
    T --> U{new_healthy_deployments empty?}
    U -- No --> V[Return new_healthy_deployments]
    U -- Yes --> W{default_deployments empty?}
    W -- No --> X[Return default_deployments]
    W -- Yes --> Y[Raise ValueError]
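
The per-deployment steps and the final selection in the flowchart can be approximated in Python. This is a simplified sketch, not the code in tag_based_routing.py: it assumes filtering is already active (the enable_tag_filtering and has_regex_deployments branches are left out), it assumes strict mode means every request tag must be present on the deployment, and `match_deployment` / `select_deployments` are illustrative names:

```python
import re
from typing import Any, Dict, List, Optional


def match_deployment(
    deployment: Dict[str, Any],
    request_tags: List[str],
    header_strings: List[str],
    match_any: bool = True,
) -> Optional[str]:
    """Return "tags", "tag_regex", or None, following the two flowchart steps."""
    params = deployment.get("litellm_params", {})
    dep_tags = params.get("tags") or []
    dep_regex = params.get("tag_regex") or []

    strict_tag_check_failed = False
    # Step 1: exact tag match takes precedence over regex.
    if dep_tags and request_tags:
        if match_any:
            tags_ok = any(tag in dep_tags for tag in request_tags)
        else:
            # Assumed strict semantics: all request tags must be present.
            tags_ok = all(tag in dep_tags for tag in request_tags)
        if tags_ok:
            return "tags"
        strict_tag_check_failed = not match_any

    # Step 2: regex match, skipped when the strict tag check already failed.
    if dep_regex and not strict_tag_check_failed:
        for pattern in dep_regex:
            if any(re.search(pattern, header) for header in header_strings):
                return "tag_regex"
    return None


def select_deployments(
    deployments: List[Dict[str, Any]],
    request_tags: List[str],
    user_agent: str,
    match_any: bool = True,
) -> List[Dict[str, Any]]:
    """Prefer matched deployments, then default-tagged ones, else raise."""
    header_strings = [f"User-Agent: {user_agent}"] if user_agent else []
    matched: List[Dict[str, Any]] = []
    defaults: List[Dict[str, Any]] = []
    for dep in deployments:
        if match_deployment(dep, request_tags, header_strings, match_any):
            matched.append(dep)
        if "default" in (dep.get("litellm_params", {}).get("tags") or []):
            defaults.append(dep)
    if matched:
        return matched
    if defaults:
        return defaults
    raise ValueError("no deployments matched tags, tag_regex, or default")


deployments = [
    {
        "model_name": "claude-sonnet",
        "litellm_params": {"tag_regex": [r"^User-Agent: claude-code\/"]},
    },
    {"model_name": "claude-sonnet", "litellm_params": {"tags": ["default"]}},
]
claude_route = select_deployments(deployments, [], "claude-code/2.0.0")
other_route = select_deployments(deployments, [], "curl/8.4.0")
```

With the two deployments from the PR description, Claude Code traffic lands on the regex deployment and everything else on the default one.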

Comments Outside Diff (1)

litellm/router_strategy/tag_based_routing.py, line 222-225 (link)

Error message omits user_agent when it triggered filtering

has_tag_filter can now become True purely because of user_agent + has_regex_deployments, with request_tags remaining None. In that case the ValueError message reads tags=None, hiding the actual trigger that caused the filter to activate and making the error very hard to diagnose.

Consider including user_agent in the message:
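
One possible shape for such a message is sketched below. The helper name `build_no_match_message` and the exact wording are assumptions made for illustration; they are not taken from the LiteLLM source or from the reviewer's suggestion:

```python
from typing import List, Optional


def build_no_match_message(
    model: str,
    request_tags: Optional[List[str]],
    user_agent: Optional[str],
) -> str:
    """Hypothetical message builder that reports both possible filter triggers."""
    return (
        "No deployments matched tag routing. "
        f"Passed model={model}, tags={request_tags}, user_agent={user_agent!r}"
    )


# The regex-only case: no request tags, but a User-Agent activated filtering.
message = build_no_match_message("claude-sonnet", None, "claude-code/2.0.0")
```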

Last reviewed commit: c009f4f

greptile-apps · 2026-03-14T00:51:55Z

litellm/router_strategy/tag_based_routing.py

+        has_tag_filter = bool(request_tags) or bool(header_strings)
+        if has_tag_filter:


Backwards-incompatible behavior change for existing tag-filter users

has_tag_filter now becomes True whenever user_agent is present in the request metadata, even for users who have never configured tag_regex. This widens the condition that previously only triggered on request_tags.

Consider the following existing setup:

enable_tag_filtering: true

Deployments with no default tag (relying on the fall-through "return all deployments" path for untagged requests)

The proxy stores User-Agent in metadata["user_agent"] (common proxy behavior)

Before this PR: Requests without request_tags but with user_agent in metadata skip the filter block, fall through to the untagged path, and return healthy_deployments.

After this PR: Those same requests enter the filter block. Since no deployment matches (no tag_regex config, no default tag), both new_healthy_deployments and default_deployments stay empty, and a ValueError (no_deployments_with_tag_routing) is raised — breaking existing traffic.

Per the backwards-compatibility policy, this new code path should be guarded so it only activates when the operator has explicitly opted in (e.g., when any deployment in the list actually defines tag_regex patterns, or via a dedicated flag).
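
A guard of the kind described here might look like the following sketch. It assumes deployments are plain dicts with a `litellm_params` mapping; `should_filter` is a hypothetical name, not a function in tag_based_routing.py:

```python
from typing import Any, Dict, List, Optional


def should_filter(
    request_tags: Optional[List[str]],
    user_agent: str,
    deployments: List[Dict[str, Any]],
) -> bool:
    """Activate header-based filtering only when some deployment opted in."""
    has_regex_deployments = any(
        dep.get("litellm_params", {}).get("tag_regex") for dep in deployments
    )
    # Header strings are only built when tag_regex is configured somewhere,
    # so a bare User-Agent cannot switch filtering on by itself.
    header_strings = (
        [f"User-Agent: {user_agent}"]
        if user_agent and has_regex_deployments
        else []
    )
    return bool(request_tags) or bool(header_strings)


plain_tag_deployments = [{"litellm_params": {"tags": ["teamA"]}}]
regex_deployments = [
    {"litellm_params": {"tag_regex": [r"^User-Agent: claude-code\/"]}}
]

legacy_untagged = should_filter(None, "curl/8.4.0", plain_tag_deployments)
opted_in = should_filter(None, "curl/8.4.0", regex_deployments)
legacy_tagged = should_filter(["teamA"], "", plain_tag_deployments)
```

Under this guard, an untagged request against a plain-tag setup keeps its pre-PR behaviour, because the presence of a User-Agent alone no longer counts as a filter.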

Rule Used: What: avoid backwards-incompatible changes without... (source)

greptile-apps · 2026-03-14T00:51:56Z

litellm/router_strategy/tag_based_routing.py

+                    metadata["tag_routing"] = {
+                        "matched_deployment": deployment.get("model_name"),
+                        "matched_via": matched_via,
+                        "matched_value": matched_value,
+                        "request_tags": request_tags or [],
+                        "user_agent": user_agent,
+                    }
                    new_healthy_deployments.append(deployment)


tag_routing provenance block overwritten by each matching deployment

The metadata["tag_routing"] dict is assigned inside the per-deployment loop. When multiple deployments match (e.g., two deployments both have tag_regex patterns that match the incoming User-Agent), the block is overwritten for every hit. The final stored value will reflect only the last matched deployment, while new_healthy_deployments contains all of them.

This makes the matched_deployment field in tag_routing inconsistent with the actual routing outcome (the load balancer will pick from all matched deployments, not necessarily the last one). Operators relying on this provenance for debugging will see misleading data.

A safer approach would be to write tag_routing only once — either on the first match, or after the loop using a summary of all matched deployments.
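
The second option, a summary written after the loop, could be sketched as below. This is only an illustration of the alternative: the field names `matched_deployments`, `matched_via`, and `matched_values`, and the `deployment_id` key, are hypothetical and not part of this PR:

```python
from typing import Any, Dict, List


def summarize_matches(matches: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Collapse per-deployment match records into one provenance block."""
    return {
        "matched_deployments": [m["deployment_id"] for m in matches],
        "matched_via": sorted({m["matched_via"] for m in matches}),
        "matched_values": sorted({m["matched_value"] for m in matches}),
    }


# Two deployments matched the same User-Agent pattern.
matches = [
    {
        "deployment_id": "dep-1",
        "matched_via": "tag_regex",
        "matched_value": r"^User-Agent: claude-code\/",
    },
    {
        "deployment_id": "dep-2",
        "matched_via": "tag_regex",
        "matched_value": r"^User-Agent: claude-code\/",
    },
]

metadata: Dict[str, Any] = {}
# Written once, after the loop, so no single deployment overwrites another.
metadata["tag_routing"] = summarize_matches(matches)
```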

greptile-apps · 2026-03-14T00:51:57Z

litellm/router_strategy/tag_based_routing.py

+            try:
+                if re.search(pattern, header_str):
+                    return pattern
+            except re.error:
+                verbose_logger.warning(
+                    "tag_regex: invalid pattern %r — skipping", pattern
+                )
+    return None


Invalid-pattern warning logged multiple times per invalid regex

The except re.error block is inside the inner loop over header_strings, so a single bad pattern produces one warning per header string rather than one warning per pattern. With multiple header sources added in the future this would produce noisy, repeated logs for the same root cause.

Consider breaking out of the inner loop after logging the warning:

for pattern in tag_regexes:
    pattern_valid = True
    for header_str in header_strings:
        try:
            if re.search(pattern, header_str):
                return pattern
        except re.error:
            verbose_logger.warning(
                "tag_regex: invalid pattern %r — skipping", pattern
            )
            pattern_valid = False
            break  # one warning per pattern is enough
    if not pattern_valid:
        continue

greptile-apps · 2026-03-14T00:51:58Z

tests/test_litellm/router_strategy/test_router_tag_regex_routing.py

+def test_regex_returns_first_matching_pattern():
+    """When multiple patterns are provided, returns the first match."""
+    result = _is_valid_deployment_tag_regex(
+        tag_regexes=[r"^User-Agent: cursor\/", r"^User-Agent: claude-code\/"],
+        header_strings=["User-Agent: claude-code/2.0.0"],
+    )
+    assert result == r"^User-Agent: claude-code\/"


Misleading test name — returns second pattern in the list, not the first

The docstring says "returns the first match" but the pattern list is [cursor, claude-code] and the header is "claude-code/2.0.0". The cursor pattern does not match, so the function returns claude-code — which is the second pattern. The function returns the first matching pattern, but that is not the first pattern overall.

Consider renaming to clarify intent:

Suggested change

-def test_regex_returns_first_matching_pattern():
-    """When multiple patterns are provided, returns the first match."""
-    result = _is_valid_deployment_tag_regex(
-        tag_regexes=[r"^User-Agent: cursor\/", r"^User-Agent: claude-code\/"],
-        header_strings=["User-Agent: claude-code/2.0.0"],
-    )
-    assert result == r"^User-Agent: claude-code\/"
+def test_regex_returns_first_matching_pattern_skipping_non_match():
+    """Patterns are tried in order; the first one that matches is returned."""

…ning noise

Three issues from code review:

1. Backwards-compat: `has_tag_filter` was widened to activate on any non-empty User-Agent, which would raise ValueError for existing deployments using plain tags without a `default` fallback. Fix: only activate header-based regex filtering when at least one candidate deployment has `tag_regex` configured.
2. Metadata overwrite: `metadata["tag_routing"]` was overwritten for every matching deployment in the loop, leaving inaccurate provenance when multiple deployments match. Fix: write only for the first match.
3. Warning noise: an invalid regex pattern logged one warning per header string rather than once per pattern. Fix: compile first (catching re.error once), then iterate over header strings.

Also adds two new tests covering these cases, and adds docs page for tag_regex routing with a Claude Code walk-through.
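
The "compile first, then iterate" fix for the warning noise can be sketched as follows. This is a standalone approximation, assuming the standard `logging` module in place of LiteLLM's `verbose_logger`; the function name and the counting handler are illustrative only:

```python
import logging
import re
from typing import List, Optional

logger = logging.getLogger("tag_regex_sketch")


class CountingHandler(logging.Handler):
    """Counts emitted records so the number of warnings can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.count = 0

    def emit(self, record: logging.LogRecord) -> None:
        self.count += 1


def first_match_compile_once(
    tag_regexes: List[str], header_strings: List[str]
) -> Optional[str]:
    """Compile each pattern once; warn once per invalid pattern."""
    for pattern in tag_regexes:
        try:
            compiled = re.compile(pattern)
        except re.error:
            logger.warning("tag_regex: invalid pattern %r, skipping", pattern)
            continue
        if any(compiled.search(header) for header in header_strings):
            return pattern
    return None


handler = CountingHandler()
logger.addHandler(handler)

# One invalid pattern, two header strings: a single warning is logged.
result = first_match_compile_once(
    ["(bad", r"^User-Agent: claude-code\/"],
    ["User-Agent: claude-code/2.0.0", "X-Other: value"],
)
```

Because compilation happens outside the loop over header strings, the number of warnings depends only on the number of invalid patterns, not on how many header sources are added later.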

ishaan-jaff · 2026-03-14T01:00:55Z

@greptileai

ishaan-jaff · 2026-03-14T01:06:09Z

@greptileai please re-review, all three issues from your previous review have been addressed

greptile-apps · 2026-03-14T01:11:22Z

litellm/router_strategy/tag_based_routing.py

+        # Currently we match against User-Agent; format matches "^User-Agent: claude-code/..."
+        user_agent = metadata.get("user_agent", "")
+        header_strings: List[str] = (
+            [f"User-Agent: {user_agent}"] if user_agent else []
+        )


User-Agent is trivially spoofable by any API consumer

User-Agent is a client-supplied HTTP header — any user with a valid API key can set it to any value they choose. The PR's primary use case ("route Claude Code traffic to dedicated AWS accounts for 5,000 employees") implies that the tag_regex-matched deployment represents dedicated infrastructure with potentially different capacity, cost-allocation, or model access compared to the default deployment.

Because the routing decision is based solely on this unverified header, any user can route to the "dedicated" deployment by sending:

User-Agent: claude-code/1.0

This means tag_regex routing should be treated as a traffic classification hint (e.g., for billing visibility or capacity planning) rather than an access-control mechanism. The docs currently frame it as infrastructure isolation ("dedicated AWS accounts") which implies a stronger guarantee than the implementation provides.

At minimum, the documentation should include a prominent warning such as:

Security note: User-Agent is a client-supplied header and can be set to any value. Do not rely on tag_regex routing to enforce access controls or spend limits — use team/key-based routing for that.

greptile-apps · 2026-03-14T01:11:23Z

litellm/router_strategy/tag_based_routing.py

+                # 2. Regex match against request headers (new)
+                if matched_via is None and deployment_tag_regex and header_strings:
+                    regex_match = _is_valid_deployment_tag_regex(
+                        deployment_tag_regex, header_strings
+                    )
+                    if regex_match is not None:
+                        matched_via = "tag_regex"
+                        matched_value = regex_match


tag_filtering_match_any=False is silently ignored for regex routing

match_any (read at line 110) controls whether an exact-tag match requires any or all request tags to be present on the deployment. However, _is_valid_deployment_tag_regex doesn't accept a match_any argument — regex routing always uses "first match wins" semantics.

This creates a silent inconsistency: an operator who sets tag_filtering_match_any: false to enforce strict all-tags matching will find that tag_regex patterns bypass this policy entirely. A request carrying tags: ["a"] against a deployment with tags: ["a", "b"] would correctly fail the strict check (step 1), but if the same deployment also has a matching tag_regex, it will still be selected (step 2) — violating the operator's intent.

Consider either:

Documenting explicitly that tag_regex always uses "any match" semantics regardless of tag_filtering_match_any, or

Skipping the regex check when match_any=False and the tags check already failed, to respect the strict-filtering intent:

# 2. Regex match against request headers (new)
# Note: when match_any=False, skip regex if tag check already ran and failed,
# to honour the operator's strict-filtering intent.
if matched_via is None and deployment_tag_regex and header_strings:
    if match_any or not (deployment_tags and request_tags):
        regex_match = _is_valid_deployment_tag_regex(...)

greptile-apps · 2026-03-14T01:11:24Z

litellm/router.py

+            _tag_regex = deployment.litellm_params.get("tag_regex") or []
+            for pattern in _tag_regex:
+                try:
+                    re.compile(pattern)
+                except re.error as exc:
+                    raise ValueError(
+                        f"Invalid regex in tag_regex for model '{deployment.model_name}': "
+                        f"{pattern!r} — {exc}"
+                    ) from exc


Startup validation runs after _add_deployment — partial state possible on failure

The regex validation block fires after self._add_deployment(deployment=deployment) has already run (line 6530). If a deployment has an invalid tag_regex pattern, _add_deployment has already executed (potentially modifying router-internal state) before the ValueError is raised. While the error correctly aborts startup, it may leave partially-initialised internal state if _add_deployment has side effects beyond what is undone by the raised exception.

Consider moving the regex validation to run before _add_deployment so the deployment is only processed after it is known to be fully valid:

# Validate tag_regex patterns BEFORE adding the deployment
_tag_regex = deployment.litellm_params.get("tag_regex") or []
for pattern in _tag_regex:
    try:
        re.compile(pattern)
    except re.error as exc:
        raise ValueError(
            f"Invalid regex in tag_regex for model '{deployment.model_name}': "
            f"{pattern!r} — {exc}"
        ) from exc
deployment = self._add_deployment(deployment=deployment)

…page

- Add ## Regex-based tag routing (tag_regex) section to existing tag_routing.md instead of a separate page
- Remove tag_regex_routing.md standalone doc (odd UX to have a separate page for a sub-feature)
- Remove proxy/tag_regex_routing from sidebars.js
- Add match_any=False debug warning in tag_based_routing.py when regex routing fires under strict mode (regex always uses OR semantics)

…enforcement, validation order

- Strengthen security note in tag_routing.md: explicitly state User-Agent is client-supplied and can be set to any value; frame tag_regex as a traffic classification hint, not an access-control mechanism
- Move tag_regex startup validation before _add_deployment() so an invalid pattern never leaves partial router state
- Enforce match_any=False strict-tag policy: when a deployment has both tags and tag_regex and the strict tag check fails, skip the regex fallback rather than silently bypassing the operator's intent
- Extract per-deployment match logic into _match_deployment() helper to keep get_deployments_for_tag() readable
- Add two new tests: strict-mode blocks regex fallback, regex-only deployment still matches under match_any=False

greptile-apps · 2026-03-14T03:23:01Z

litellm/router_strategy/tag_based_routing.py

+                    if "tag_routing" not in metadata:
+                        metadata["tag_routing"] = {
+                            "matched_deployment": deployment.get("model_name"),
+                            "matched_via": match_result["matched_via"],
+                            "matched_value": match_result["matched_value"],
+                            "request_tags": request_tags or [],
+                            "user_agent": user_agent,
+                        }


matched_deployment records the non-unique model alias, not the deployment ID

deployment.get("model_name") returns the shared alias (e.g., "claude-sonnet") that is identical for every deployment behind the same model name. When an operator is debugging which physical deployment was selected, seeing "claude-sonnet" in matched_deployment gives no extra information over the model field that is already logged.

The unique identifier is deployment.get("model_info", {}).get("id"). Since tag_routing is written only for the first match (correct) and it flows to SpendLogs, using the deployment ID here would make the observability block genuinely useful:

Suggested change

-if "tag_routing" not in metadata:
-    metadata["tag_routing"] = {
-        "matched_deployment": deployment.get("model_name"),
-        "matched_via": match_result["matched_via"],
-        "matched_value": match_result["matched_value"],
-        "request_tags": request_tags or [],
-        "user_agent": user_agent,
-    }
+if "tag_routing" not in metadata:
+    metadata["tag_routing"] = {
+        "matched_deployment": deployment.get("model_info", {}).get("id") or deployment.get("model_name"),
+        "matched_via": match_result["matched_via"],
+        "matched_value": match_result["matched_value"],
+        "request_tags": request_tags or [],
+        "user_agent": user_agent,
+    }

* fix(bedrock): respect s3_region_name for batch file uploads (#23569)
* fix(bedrock): respect s3_region_name for batch file uploads (GovCloud fix)
* fix: s3_region_name always wins over aws_region_name for S3 signing (Greptile feedback)
* fix: _filter_headers_for_aws_signature - Bedrock KB (#23571)
* fix: _filter_headers_for_aws_signature
* fix: filter None header values in all post-signing re-merge paths

  Addresses Greptile feedback: None-valued headers were being filtered during SigV4 signing but re-merged back into the final headers dict afterward, which would cause downstream HTTP client failures.

  Made-with: Cursor
* feat(router): tag_regex routing — route by User-Agent regex without per-developer tag config (#23594)
* feat(router): add tag_regex support for header-based routing
* fix(tag_regex): address backwards-compat, metadata overwrite, and warning noise
* refactor(tag_regex): remove unnecessary _healthy_list copy
* docs: merge tag_regex section into tag_routing.md, remove standalone page
* fix(tag_regex): address greptile review - security docs, strict-mode enforcement, validation order
* fix(ci): apply Black formatting to 14 files and stabilize flaky caplog tests

  - Run Black formatter on 14 files that were failing the lint check
  - Replace caplog-based assertions in TestAliasConflicts with unittest.mock.patch on verbose_logger.warning for xdist compatibility
  - The caplog fixture can produce empty text in pytest-xdist workers in certain CI environments, causing flaky test failures

  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

…der (#23663)
* fix: forward extra_headers to HuggingFace embedding calls (#23525)
  Fixes #23502
  The huggingface_embed.embedding() call was not receiving the headers parameter, causing extra_headers (e.g., X-HF-Bill-To) to be silently dropped. Other providers (openrouter, vercel_ai_gateway, bedrock) already pass headers correctly. This fix adds headers=headers to match the behavior of other providers.
  Co-authored-by: Jah-yee <sparklab@outlook.com>
* fix: add getPopupContainer to Select components in fallback modal to fix z-index issue (#23516)
  The model dropdown menus in the Add Fallbacks modal were rendering behind the modal overlay because Ant Design portals Select dropdowns to document.body by default. By setting getPopupContainer to attach the dropdown to its parent element, the dropdown inherits the modal's stacking context and renders above the modal.
  Fixes #17895
* PR #22867 added _remove_scope_from_cache_control for Bedrock and Azur… (#23183)
* PR #22867 added _remove_scope_from_cache_control for Bedrock and Azure AI but omitted Vertex AI. This applies the same pattern to VertexAIPartnerModelsAnthropicMessagesConfig."
* PR #22867 added _remove_scope_from_cache_control for Bedrock and Azure AI but omitted Vertex AI. This applies the same pattern to VertexAIPartnerModelsAnthropicMessagesConfig."
* PR #22867 added _remove_scope_from_cache_control to AzureAnthropicMessagesConfig but missed VertexAIPartnerModelsAnthropicMessagesConfi
  Rather than duplicating the method again, moved it up to the base AnthropicMessagesConfig so all providers inherit it, and removed the now-redundant copy from the Azure AI subclass.
* PR #22867 added _remove_scope_from_cache_control to AzureAnthropicMessagesConfig but missed VertexAIPartnerModelsAnthropicMessagesConfi
  Rather than duplicating the method again, moved it up to the base AnthropicMessagesConfig so all providers inherit it, and removed the now-redundant copy from the Azure AI subclass.
---------
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
* fix: auto-fill reasoning_content for moonshot kimi reasoning models in multi-turn tool calling (#23580)
* Handle response.failed, response.incomplete, and response.cancelled (#23492)
* Handle response.failed, response.incomplete, and response.cancelled terminal events in background streaming
  Previously the background streaming task only handled response.completed and hardcoded the final status to "completed". This missed three other terminal event types from the OpenAI streaming spec, causing failed/incomplete/cancelled responses to be incorrectly marked as completed.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  Committed-By-Agent: claude
* Remove unused terminal_response_data variable
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  Committed-By-Agent: claude
* Address code review: derive fallback status from event type, rewrite tests as integration tests
  1. Replace hardcoded "completed" fallback in response_data.get("status") with _event_to_status lookup so that response.incomplete and response.cancelled events get the correct fallback if the response body ever omits the status field.
  2. Replace duplicated-logic unit tests with integration tests that exercise background_streaming_task directly using mocked streaming responses and assert on the final update_state call arguments.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  Committed-By-Agent: claude
* Remove dead mock_processor and unused mock_response parameter from test helper
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  Committed-By-Agent: claude
* Remove FastAPI and UserAPIKeyAuth imports from test file
  These types were only used as Mock(spec=...) arguments. Drop the spec constraints and remove the top-level imports to avoid pulling FastAPI into test files outside litellm/proxy/.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  Committed-By-Agent: claude
* Log warning when streaming response has no body_iterator
  If base_process_llm_request returns a non-streaming response (no body_iterator), log a warning since this likely indicates a misconfiguration or provider error rather than a successful completion.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  Committed-By-Agent: claude
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix(security): bump tar to 7.5.11 and tornado to 6.5.5 (#23602)
* fix(security): bump tar to 7.5.11 and tornado to 6.5.5
  - tar >=7.5.11: fixes CVE-2026-31802 (HIGH) in node-pkg
  - tornado >=6.5.5: fixes CVE-2026-31958 (HIGH) and GHSA-78cv-mqj4-43f7 (MEDIUM) in python-pkg
  Addresses vulnerabilities found in ghcr.io/berriai/litellm:main-v1.82.0-stable Trivy scan.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: document tar override is enforced via Dockerfile, not npm
* fix: revert invalid JSON comment in package.json tar override
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* [Feat] - Ishaan main merge branch (#23596)
* fix(bedrock): respect s3_region_name for batch file uploads (#23569)
* fix(bedrock): respect s3_region_name for batch file uploads (GovCloud fix)
* fix: s3_region_name always wins over aws_region_name for S3 signing (Greptile feedback)
* fix: _filter_headers_for_aws_signature - Bedrock KB (#23571)
* fix: _filter_headers_for_aws_signature
* fix: filter None header values in all post-signing re-merge paths
  Addresses Greptile feedback: None-valued headers were being filtered during SigV4 signing but re-merged back into the final headers dict afterward, which would cause downstream HTTP client failures.
Made-with: Cursor
* feat(router): tag_regex routing — route by User-Agent regex without per-developer tag config (#23594)
* feat(router): add tag_regex support for header-based routing
  Adds a new `tag_regex` field to litellm_params that lets operators route requests based on regex patterns matched against request headers — primarily User-Agent — without requiring per-developer tag configuration.
  Use case: route all Claude Code traffic (User-Agent: claude-code/x.y.z) to a dedicated deployment by setting:
    tag_regex:
      - "^User-Agent: claude-code\\/"
  in the deployment's litellm_params.
  Works alongside existing `tags` routing; exact tag match takes precedence over regex match. Unmatched requests fall through to deployments tagged `default`.
  The matched deployment, pattern, and user_agent are recorded in `metadata["tag_routing"]` so they flow through to SpendLogs automatically.
* fix(tag_regex): address backwards-compat, metadata overwrite, and warning noise
  Three issues from code review:
  1. Backwards-compat: `has_tag_filter` was widened to activate on any non-empty User-Agent, which would raise ValueError for existing deployments using plain tags without a `default` fallback. Fix: only activate header-based regex filtering when at least one candidate deployment has `tag_regex` configured.
  2. Metadata overwrite: `metadata["tag_routing"]` was overwritten for every matching deployment in the loop, leaving inaccurate provenance when multiple deployments match. Fix: write only for the first match.
  3. Warning noise: an invalid regex pattern logged one warning per header string rather than once per pattern. Fix: compile first (catching re.error once), then iterate over header strings.
  Also adds two new tests covering these cases, and adds docs page for tag_regex routing with a Claude Code walk-through.
* refactor(tag_regex): remove unnecessary _healthy_list copy
* docs: merge tag_regex section into tag_routing.md, remove standalone page
  - Add ## Regex-based tag routing (tag_regex) section to existing tag_routing.md instead of a separate page
  - Remove tag_regex_routing.md standalone doc (odd UX to have a separate page for a sub-feature)
  - Remove proxy/tag_regex_routing from sidebars.js
  - Add match_any=False debug warning in tag_based_routing.py when regex routing fires under strict mode (regex always uses OR semantics)
* fix(tag_regex): address greptile review - security docs, strict-mode enforcement, validation order
  - Strengthen security note in tag_routing.md: explicitly state User-Agent is client-supplied and can be set to any value; frame tag_regex as a traffic classification hint, not an access-control mechanism
  - Move tag_regex startup validation before _add_deployment() so an invalid pattern never leaves partial router state
  - Enforce match_any=False strict-tag policy: when a deployment has both tags and tag_regex and the strict tag check fails, skip the regex fallback rather than silently bypassing the operator's intent
  - Extract per-deployment match logic into _match_deployment() helper to keep get_deployments_for_tag() readable
  - Add two new tests: strict-mode blocks regex fallback, regex-only deployment still matches under match_any=False
* fix(ci): apply Black formatting to 14 files and stabilize flaky caplog tests
  - Run Black formatter on 14 files that were failing the lint check
  - Replace caplog-based assertions in TestAliasConflicts with unittest.mock.patch on verbose_logger.warning for xdist compatibility
  - The caplog fixture can produce empty text in pytest-xdist workers in certain CI environments, causing flaky test failures
  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: tiktoken cache nonroot offline (#23498)
* fix: restore offline tiktoken cache for non-root envs
  Made-with: Cursor
* chore: mkdir for custom tiktoken cache dir
  Made-with: Cursor
* test: patch tiktoken.get_encoding in custom-dir test to avoid network
  Made-with: Cursor
* test: clear CUSTOM_TIKTOKEN_CACHE_DIR in helper for test isolation
  Made-with: Cursor
* test: restore default_encoding module state after custom-dir test
  Made-with: Cursor
* fix: normalize content_filtered finish_reason (#23564)
  Map provider finish_reason "content_filtered" to the OpenAI-compatible "content_filter" and extend core_helpers tests to cover this case.
  Made-with: Cursor
* fix: Fixes #23185 (#23647)
* fix: merge annotations from all streaming chunks in stream_chunk_builder
  Previously, stream_chunk_builder only took annotations from the first chunk that contained them, losing any annotations from later chunks. This is a problem because providers like Gemini/Vertex AI send grounding metadata (converted to annotations) in the final streaming chunk, while other providers may spread annotations across multiple chunks.
  Changes:
  - Collect and merge annotations from ALL annotation-bearing chunks instead of only using the first one
---------
Co-authored-by: RoomWithOutRoof <166608075+Jah-yee@users.noreply.github.com>
Co-authored-by: Jah-yee <sparklab@outlook.com>
Co-authored-by: Ethan T. <ethanchang32@gmail.com>
Co-authored-by: Awais Qureshi <awais.qureshi@arbisoft.com>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: Pradyumna Yadav <pradyumna.aky@gmail.com>
Co-authored-by: xianzongxie-stripe <87151258+xianzongxie-stripe@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Joe Reyna <joseph.reyna@gmail.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
Co-authored-by: milan-berri <milan@berri.ai>
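
To make the tag_regex commit messages above easier to follow, here is a minimal sketch of the per-deployment matching they describe: exact tag match first, regex as a fallback, patterns compiled once so an invalid one warns a single time, and no regex fallback when a strict (`match_any=False`) tag check fails. This is not the code from `tag_based_routing.py`. The function names, the logger name, the `matched_via` values, the use of `pattern.search`, and the exact strict-mode rule (every request tag must be on the deployment) are assumptions made for illustration.

```python
import logging
import re
from typing import Dict, List, Optional

# Hypothetical logger name, used only in this sketch.
logger = logging.getLogger("tag_regex_sketch")


def compile_patterns(patterns: List[str]) -> List[re.Pattern]:
    """Compile each pattern once, warning a single time per invalid pattern."""
    compiled: List[re.Pattern] = []
    for raw in patterns:
        try:
            compiled.append(re.compile(raw))
        except re.error as exc:
            logger.warning("skipping invalid tag_regex %r: %s", raw, exc)
    return compiled


def match_deployment(
    litellm_params: Dict,
    request_tags: List[str],
    header_strings: List[str],
    match_any: bool = True,
) -> Optional[Dict[str, str]]:
    """Return a small provenance dict when the deployment matches, else None."""
    tags = litellm_params.get("tags") or []
    patterns = litellm_params.get("tag_regex") or []

    # Exact tag match takes precedence over regex match.
    if tags and request_tags:
        if match_any:
            hit = next((t for t in request_tags if t in tags), None)
        else:
            # Assumed strict rule: every request tag must be on the deployment.
            hit = request_tags[0] if set(request_tags) <= set(tags) else None
        if hit is not None:
            return {"matched_via": "tags", "matched_value": hit}
        if not match_any:
            # Strict tag check failed: skip the regex fallback.
            return None

    # Regex fallback over "Header-Name: value" strings (OR semantics).
    for pattern in compile_patterns(patterns):
        for header in header_strings:
            if pattern.search(header):
                return {"matched_via": "tag_regex", "matched_value": pattern.pattern}
    return None


if __name__ == "__main__":
    params = {"tag_regex": [r"^User-Agent: claude-code\/"]}
    print(match_deployment(params, [], ["User-Agent: claude-code/1.2.3"]))
```

A caller that wants the "write only for the first match" behaviour from the commit log would record the returned dict for the first deployment that matches and leave it unchanged for later ones.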

vercel bot deployed to Preview March 14, 2026 00:49

greptile-apps bot reviewed Mar 14, 2026

vercel bot deployed to Preview March 14, 2026 00:57

refactor(tag_regex): remove unnecessary _healthy_list copy

259240f

vercel bot deployed to Preview March 14, 2026 01:06

greptile-apps bot reviewed Mar 14, 2026

ishaan-jaff changed the base branch from main to litellm_ishaan_march_13 March 14, 2026 03:13

vercel bot deployed to Preview March 14, 2026 03:15

vercel bot deployed to Preview March 14, 2026 03:20

greptile-apps bot reviewed Mar 14, 2026

ishaan-jaff merged commit bd17d12 into litellm_ishaan_march_13 Mar 14, 2026
5 checks passed

		has_tag_filter = bool(request_tags) or bool(header_strings)
		if has_tag_filter:
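
The condition quoted above appears to be the widened check that the commit log calls a backwards-compatibility problem: any non-empty User-Agent would switch tag filtering on, even for deployments that only use plain tags. The sketch below shows the narrower gating described there, where header strings count only if some candidate deployment configures `tag_regex`. The function name `should_apply_tag_filter` and the deployment dict shape are hypothetical; this is not the merged implementation.

```python
from typing import Dict, List


def should_apply_tag_filter(
    request_tags: List[str],
    header_strings: List[str],
    candidate_deployments: List[Dict],
) -> bool:
    """Decide whether tag filtering should run for this request."""
    # Header-based regex filtering is opt-in: it only activates when at
    # least one candidate deployment has tag_regex configured.
    any_regex_configured = any(
        (d.get("litellm_params") or {}).get("tag_regex")
        for d in candidate_deployments
    )
    return bool(request_tags) or (bool(header_strings) and any_regex_configured)


if __name__ == "__main__":
    plain = [{"litellm_params": {"model": "openai/gpt-4o", "tags": ["teamA"]}}]
    # A User-Agent alone no longer activates filtering for plain-tag setups.
    print(should_apply_tag_filter([], ["User-Agent: curl/8.0"], plain))
```

With this gating, existing configurations that use plain tags and have no `default` deployment see the same behaviour as before the feature was added.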

Conversation

ishaan-jaff commented Mar 14, 2026

vercel bot commented Mar 14, 2026

greptile-apps bot commented Mar 14, 2026

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Flowchart

Comments Outside Diff (1)

greptile-apps bot Mar 14, 2026

greptile-apps bot Mar 14, 2026

greptile-apps bot Mar 14, 2026

greptile-apps bot Mar 14, 2026

ishaan-jaff commented Mar 14, 2026

ishaan-jaff commented Mar 14, 2026

greptile-apps bot Mar 14, 2026

greptile-apps bot Mar 14, 2026

greptile-apps bot Mar 14, 2026

greptile-apps bot Mar 14, 2026
