[Feat] Policies - Allow connecting Policies to Tags, Simulating Policies, Viewing how many keys, teams it applies on #20904

Merged
ishaan-jaff merged 18 commits into main from litellm_policies_tags on Feb 11, 2026

Conversation

Member

@ishaan-jaff ishaan-jaff commented Feb 11, 2026

[Feat] Policies - Allow connecting Policies to Tags, Simulating Policies, Viewing how many keys, teams it applies on

Adds tag-based policy attachments so platform teams can say "all requests with tag healthcare get HIPAA guardrails" without manually assigning policies to every key/team. Also adds a Policy Simulator UI and blast radius preview so admins can debug and preview policy behavior before deploying.

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have Added testing in the tests/litellm/ directory, Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature
✅ Test

Changes

Contributor

greptile-apps bot commented Feb 11, 2026

Greptile Overview

Greptile Summary

Adds tag-based scoping to the policy engine, a new Policy Simulator UI tab, and a blast-radius estimation endpoint for policy attachments.

  • Tag support: PolicyScope, PolicyAttachment, PolicyMatchContext, and the Prisma schema all gain a tags field. Tags use opt-in semantics — an empty tag list means "don't filter by tags" (unlike teams/keys/models which default to match-all). Tag matching reuses the existing wildcard pattern infrastructure (health-*).
  • Policy Simulator (/policies/resolve): New endpoint and UI panel to debug which guardrails would apply for a given team/key/model/tags context. Syncs from DB on each call.
  • Impact estimation (/policies/attachments/estimate-impact): Previews how many keys/teams an attachment would affect before creating it. Fetches all keys and teams from the database and filters in Python, which is a scalability concern.
  • Policy attribution: The pre-call path now tracks why each policy matched (e.g., tag:healthcare, team:health-team) and exposes it via a new x-litellm-policy-sources response header.
  • Refactors: Module-level singletons removed from policy_endpoints.py in favor of lazy get_*() calls; PolicyRegistry.sync_policies_from_db now correctly sets _initialized = True.
  • The estimate_attachment_impact endpoint performs multiple unbounded find_many(where={}) queries (fetching all keys/teams), which could be expensive at scale. Consider adding pagination or DB-level filtering.
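
The opt-in tag semantics described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's code: `Scope`, `Context`, and this `scope_matches` are hypothetical stand-ins, only the team and tag dimensions are modelled, and the standard-library `fnmatchcase` substitutes for the project's own wildcard helper.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass
class Scope:
    # Hypothetical stand-in for PolicyScope.
    teams: List[str] = field(default_factory=lambda: ["*"])  # default: match all
    tags: List[str] = field(default_factory=list)            # empty: tags not checked


@dataclass
class Context:
    # Hypothetical stand-in for PolicyMatchContext.
    team_alias: Optional[str] = None
    tags: List[str] = field(default_factory=list)


def _any_match(value: str, patterns: List[str]) -> bool:
    return any(fnmatchcase(value, pat) for pat in patterns)


def scope_matches(scope: Scope, ctx: Context) -> bool:
    # Teams behave as match-all by default because the default pattern is "*".
    if not _any_match(ctx.team_alias or "", scope.teams):
        return False
    # Tags are opt-in: the check is skipped when the scope lists no tags.
    if scope.tags:
        return any(_any_match(tag, scope.tags) for tag in ctx.tags)
    return True
```

Under these assumptions, a scope with `tags=["health-*"]` matches a request tagged `health-records`, rejects a request with no tags, and a scope with no tags ignores request tags entirely.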

Confidence Score: 3/5

  • Functionally correct but has scalability concerns in the estimate-impact endpoint that could cause performance issues on large deployments.
  • Core tag matching logic is well-tested and sound. The main concern is the estimate-impact endpoint fetching unbounded data from the database, which violates the project's own guideline about avoiding expensive DB operations. The resolve endpoint also syncs full state from DB on every call. These are not bugs but could cause problems at scale.
  • litellm/proxy/policy_engine/policy_resolve_endpoints.py — the estimate_attachment_impact function performs multiple unbounded DB queries and the resolve_policies_for_context function syncs all policies/attachments from DB on every request.

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/proxy/policy_engine/policy_resolve_endpoints.py | New file with /policies/resolve and /policies/attachments/estimate-impact endpoints. The estimate-impact endpoint fetches ALL keys and ALL teams from the database (up to 3 separate unbounded queries) with no pagination, which is a performance risk at scale. The resolve endpoint triggers full DB syncs on every call. |
| litellm/proxy/policy_engine/attachment_registry.py | Added tag support to attachments and new get_attached_policies_with_reasons method. Clean refactor that preserves backward compatibility via delegating get_attached_policies to the new method. |
| litellm/proxy/policy_engine/policy_endpoints.py | Removed module-level singleton instantiation (POLICY_REGISTRY, ATTACHMENT_REGISTRY) in favor of calling get_policy_registry() / get_attachment_registry() at each endpoint. This avoids import-time side effects and is a good improvement. |
| litellm/proxy/policy_engine/policy_matcher.py | Added tag matching to scope_matches. Tags use opt-in semantics (empty scope tags = skip check), consistent with documentation. Clean implementation. |
| litellm/proxy/litellm_pre_call_utils.py | Updated policy engine integration in the pre-call path to include tags from request body in the match context, and to track policy attribution sources. Uses get_tags_from_request_body helper. |
| litellm/types/proxy/policy_engine/policy_types.py | Added tags field to PolicyScope and PolicyAttachment models with appropriate defaults and documentation. Tags use opt-in semantics (empty = not checked) vs teams/keys/models (empty = match all). |
| litellm/types/proxy/policy_engine/resolver_types.py | Added new request/response types: PolicyResolveRequest, PolicyResolveResponse, PolicyMatchDetail, AttachmentImpactResponse, and added tags field to existing types. Well-structured Pydantic models. |
| tests/test_litellm/proxy/policy_engine/test_attachment_registry.py | Added TestTagBasedAttachments class with tag matching tests (exact, wildcard, combined with team). All mock/in-memory, no network calls. |
| tests/test_litellm/proxy/policy_engine/test_policy_matcher.py | Added TestPolicyMatcherScopeMatchingWithTags class testing tag matching in scope_matches. All mock/in-memory, no network calls. |
| ui/litellm-dashboard/src/components/policies/add_attachment_form.tsx | Added tags field, impact estimation button, and refactored data loading to handle various API response shapes. Extracted attachment building into shared utility. |
| ui/litellm-dashboard/src/components/policies/policy_test_panel.tsx | New Policy Simulator panel allowing users to test which policies/guardrails apply for a given context. Clean implementation. |

Sequence Diagram

sequenceDiagram
    participant UI as Dashboard UI
    participant Resolve as /policies/resolve
    participant Impact as /policies/attachments/estimate-impact
    participant PR as PolicyRegistry
    participant AR as AttachmentRegistry
    participant PM as PolicyMatcher
    participant DB as Database

    Note over UI: Policy Simulator Tab
    UI->>Resolve: POST {team_alias, key_alias, model, tags}
    Resolve->>DB: sync_policies_from_db()
    Resolve->>DB: sync_attachments_from_db()
    Resolve->>AR: get_attached_policies_with_reasons(context)
    AR->>PM: scope_matches(scope, context) [incl. tags]
    AR-->>Resolve: [{policy_name, matched_via}]
    Resolve->>PM: get_policies_with_matching_conditions()
    Resolve->>PR: resolve_policy_guardrails()
    Resolve-->>UI: {effective_guardrails, matched_policies}

    Note over UI: Estimate Impact (before creating attachment)
    UI->>Impact: POST {policy_name, tags, teams, keys}
    Impact->>DB: find_many(keys) [unbounded]
    Impact->>DB: find_many(teams) [unbounded]
    Impact-->>UI: {affected_keys_count, affected_teams_count, samples}

    Note over UI: Pre-call Path (request time)
    participant Req as LLM Request
    Req->>AR: get_attached_policies_with_reasons(context w/ tags)
    AR->>PM: scope_matches (checks tags)
    AR-->>Req: matching policies + reasons
    Req->>Req: add x-litellm-policy-sources header

@greptile-apps greptile-apps bot left a comment

26 files reviewed, 3 comments


Comment on lines +186 to +191
if prisma_client is None:
    raise HTTPException(status_code=500, detail="Database not connected")

try:
    # Sync from DB to ensure in-memory state is current
    await get_policy_registry().sync_policies_from_db(prisma_client)

estimate_attachment_impact issues multiple unbounded find_many(where={}) queries — fetching all keys and all teams from the database with no pagination or limit. In a deployment with thousands of keys/teams, this will be slow and memory-intensive. Worse, when both tag_patterns and team_patterns are provided, the teams table is queried twice (lines 211 and 236), and the keys table can be queried up to 3 times (lines 188, 254, 266).

Consider adding take limits to the Prisma queries or using database-level filtering (e.g., Prisma json filtering on metadata) instead of fetching everything into Python and filtering in-memory.

Context Used: Rule from dashboard - What: Avoid creating new database requests or Router objects in the critical request path.

Why: Cre... (source)

Comment on lines +69 to +71
return affected



Calling sync_policies_from_db and sync_attachments_from_db on every /policies/resolve request forces two full DB round-trips. These syncs are already performed on a background schedule. For a debug/simulation endpoint this may be acceptable, but it's worth noting this would be expensive if called frequently.

Consider gating the sync behind a ?force_sync=true query param, or relying on the existing background sync.
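
The gating idea can be sketched without any web framework. This is an illustration only: `FakeRegistry`, `sync_from_db`, and `resolve` are hypothetical names, and the real endpoint would read `force_sync` from the query string rather than from a function argument.

```python
import asyncio


class FakeRegistry:
    # Hypothetical stand-in for a registry that can re-sync from the database.
    def __init__(self) -> None:
        self.sync_calls = 0

    async def sync_from_db(self) -> None:
        self.sync_calls += 1  # a real implementation would query the DB here


async def resolve(registry: FakeRegistry, force_sync: bool = False) -> str:
    # Only pay for the DB round-trip when the caller explicitly asks for it.
    if force_sync:
        await registry.sync_from_db()
    return "resolved from in-memory state"


registry = FakeRegistry()
asyncio.run(resolve(registry))                   # relies on cached state
asyncio.run(resolve(registry, force_sync=True))  # triggers exactly one sync
```

The trade-off is staleness: with the default, the simulator reflects whatever the background sync last loaded.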

Context Used: Rule from dashboard - What: Avoid creating new database requests or Router objects in the critical request path.

Why: Cre... (source)

)

if not match_results:
    return PolicyResolveResponse(

When key_alias is empty/None, this falls back to str(key.token)[:8] + "..." — leaking a prefix of the hashed token in the API response. While these are hashed tokens (not raw API keys), exposing even partial hashes in an estimation response could be an unnecessary information disclosure.

Consider using a generic placeholder like "(unnamed key)" instead.

@ishaan-jaff
Member Author

@greptile review again

Contributor

greptile-apps bot commented Feb 11, 2026

Greptile Overview

Greptile Summary

Adds tag-based policy attachments, a /policies/resolve debug endpoint, and a /policies/attachments/estimate-impact blast-radius preview endpoint. Also introduces a Policy Simulator UI tab and blast-radius popovers in the dashboard.

  • Tag scoping: PolicyScope, PolicyAttachment, and PolicyMatchContext gain a tags field with opt-in semantics (empty = don't check, unlike teams/keys/models which default to ["*"]). Tags are matched using existing wildcard helpers with AND-logic across scope dimensions.
  • Match attribution: New get_attached_policies_with_reasons method on AttachmentRegistry powers both the x-litellm-policy-sources response header and the simulator UI.
  • Critical path change: add_guardrails_from_policy_engine in litellm_pre_call_utils.py now extracts tags from the request body and passes them into policy matching. This adds a small amount of work to every request that uses the policy engine.
  • Impact estimation: The estimate-impact endpoint queries the DB (bounded by MAX_POLICY_ESTIMATE_IMPACT_ROWS=1000) and filters in Python. Unnamed keys/teams create duplicate placeholder entries that inflate the count.
  • Lazy registry access: policy_endpoints.py now calls get_policy_registry() / get_attachment_registry() per-request instead of caching module-level singletons — this is a correctness improvement.
  • Schema migration: All three schema.prisma files add a tags String[] @default([]) column to LiteLLM_PolicyAttachmentTable.

Confidence Score: 3/5

  • Generally safe feature addition with good test coverage, but the unnamed-key duplicate counting bug in impact estimation and the aggressive sys.modules clearing in conftest warrant attention before merge.
  • The core policy matching and tag scoping logic is well-designed with solid test coverage. The critical-path change in litellm_pre_call_utils.py is lightweight. The impact estimation endpoint has a counting bug (duplicate unnamed placeholders) and the conftest module-clearing is risky for test isolation. No security issues found.
  • litellm/proxy/policy_engine/policy_resolve_endpoints.py (unnamed key/team counting bug), tests/test_litellm/proxy/policy_engine/conftest.py (sys.modules clearing side-effects)

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/proxy/policy_engine/policy_resolve_endpoints.py | New file: policy resolve & impact estimation endpoints. Fetches up to 1000 rows per query for tag-based impact; unnamed keys/teams create duplicate "(unnamed key)" entries inflating counts. |
| litellm/proxy/policy_engine/attachment_registry.py | Added tag support to attachment loading/matching/persistence and new get_attached_policies_with_reasons method for match attribution. Clean refactor of existing get_attached_policies. |
| litellm/proxy/policy_engine/policy_matcher.py | Added tag matching to scope_matches with opt-in semantics (empty tags = don't check). Correct AND-logic with other scope fields. |
| litellm/proxy/litellm_pre_call_utils.py | Critical path change: extracts tags from request body, passes to PolicyMatchContext, and tracks policy attribution in metadata for response headers. |
| litellm/proxy/common_utils/callback_utils.py | Added add_policy_sources_to_metadata and corresponding x-litellm-policy-sources header generation in get_logging_caching_headers. |
| litellm/types/proxy/policy_engine/policy_types.py | Added tags field to PolicyScope and PolicyAttachment with correct opt-in semantics (None/empty = don't check, unlike teams/keys/models). |
| tests/test_litellm/proxy/policy_engine/conftest.py | New conftest that aggressively removes all litellm modules from sys.modules at import time. Designed for worktree use but could interfere with other tests in the same pytest session. |
| tests/test_litellm/proxy/policy_engine/test_attachment_registry.py | Good test coverage: tag matching, wildcards, AND-logic with teams, match attribution, tags-only attachments, no-scope catch-all. All mock-based. |

Sequence Diagram

sequenceDiagram
    participant UI as Dashboard UI
    participant Proxy as LiteLLM Proxy
    participant PE as Policy Engine
    participant DB as Database

    Note over UI,DB: Policy Simulator Flow
    UI->>Proxy: POST /policies/resolve {tags, team, key, model}
    Proxy->>PE: AttachmentRegistry.get_attached_policies_with_reasons(context)
    PE->>PE: PolicyMatcher.scope_matches (teams + keys + models + tags)
    PE-->>Proxy: matched policies + reasons
    Proxy->>PE: PolicyMatcher.get_policies_with_matching_conditions
    Proxy->>PE: PolicyResolver.resolve_policy_guardrails
    Proxy-->>UI: {effective_guardrails, matched_policies}

    Note over UI,DB: Impact Estimation Flow
    UI->>Proxy: POST /policies/attachments/estimate-impact {tags, teams, keys}
    Proxy->>DB: find_many keys/teams (take=1000)
    Proxy->>Proxy: Filter by tag/alias patterns in Python
    Proxy-->>UI: {affected_keys_count, affected_teams_count, samples}

    Note over UI,DB: Request-Time Tag Matching (Critical Path)
    UI->>Proxy: POST /chat/completions {metadata: {tags: ["healthcare"]}}
    Proxy->>PE: get_tags_from_request_body + build PolicyMatchContext
    PE->>PE: get_attached_policies_with_reasons (includes tag matching)
    PE-->>Proxy: policies + attribution reasons
    Proxy->>Proxy: add_policy_sources_to_metadata (for x-litellm-policy-sources header)
    Proxy-->>UI: Response + x-litellm-policy-sources header

@greptile-apps greptile-apps bot left a comment

27 files reviewed, 2 comments


Comment on lines +85 to +100
affected: list = []
keys = await prisma_client.db.litellm_verificationtoken.find_many(  # type: ignore
    where={}, order={"created_at": "desc"}, take=MAX_POLICY_ESTIMATE_IMPACT_ROWS,
)
for key in keys:
    key_alias = key.key_alias or ""
    key_tags = _get_tags_from_metadata(
        key.metadata, getattr(key, "metadata_json", None)
    )
    if key_tags and any(
        RouteChecks._route_matches_wildcard_pattern(route=tag, pattern=pat)
        for tag in key_tags
        for pat in tag_patterns
    ):
        affected.append(key_alias or _UNNAMED_KEY_PLACEHOLDER)
return affected

Multiple unnamed keys (keys without key_alias) all get appended as "(unnamed key)", which inflates affected_keys_count with duplicate placeholder entries. The same issue applies to _find_affected_teams_by_tags with "(unnamed team)".

Consider deduplicating or counting unnamed entities separately:

Suggested change
affected: list = []
unnamed_count = 0
keys = await prisma_client.db.litellm_verificationtoken.find_many(  # type: ignore
    where={}, order={"created_at": "desc"}, take=MAX_POLICY_ESTIMATE_IMPACT_ROWS,
)
for key in keys:
    key_alias = key.key_alias or ""
    key_tags = _get_tags_from_metadata(
        key.metadata, getattr(key, "metadata_json", None)
    )
    if key_tags and any(
        RouteChecks._route_matches_wildcard_pattern(route=tag, pattern=pat)
        for tag in key_tags
        for pat in tag_patterns
    ):
        if key_alias:
            affected.append(key_alias)
        else:
            unnamed_count += 1
if unnamed_count:
    affected.append(f"(unnamed key) x{unnamed_count}")
return affected

Comment on lines +27 to +30
# Clear ALL cached litellm modules so they're re-imported from the worktree
_to_remove = [key for key in sys.modules if key == "litellm" or key.startswith("litellm.")]
for key in _to_remove:
    del sys.modules[key]

Deleting all cached litellm.* modules from sys.modules at conftest import time is a side-effect that runs even when these tests are collected alongside other test modules in the same pytest session. This could cause hard-to-debug re-import issues for tests in sibling directories that depend on already-imported litellm state.

Is this worktree fix still needed, or can the tests run without it?


- Track unnamed keys/teams as separate counts instead of inflating
  affected_keys_count with duplicate "(unnamed key)" placeholders.
  Added unnamed_keys_count and unnamed_teams_count to response.
- Push alias pattern matching to DB via _build_alias_where() which
  converts exact patterns to Prisma "in" and suffix wildcards to
  "startsWith" filters.
- Gate sync_policies_from_db/sync_attachments_from_db behind
  force_sync query param (default false) to avoid 2 DB round-trips
  on every /policies/resolve request.
- Remove worktree-only conftest.py that cleared sys.modules at import
  time — no longer needed since code moved to main repo.
- Rename MAX_ESTIMATE_IMPACT_ROWS → MAX_POLICY_ESTIMATE_IMPACT_ROWS.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
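
The alias push-down described in the commit message above might look roughly like this. The real `_build_alias_where()` is not shown in this thread, so `build_alias_where` below is a hypothetical reconstruction. The filter keys `"in"` and `"startsWith"` are copied from the commit message and may be spelled differently in the actual database client. Patterns with a wildcard anywhere other than the end are ignored in this sketch and would need a Python-side fallback.

```python
from typing import Any, Dict, List


def build_alias_where(field: str, patterns: List[str]) -> Dict[str, Any]:
    """Hypothetical sketch: translate alias patterns into a where-filter dict."""
    exact = [p for p in patterns if "*" not in p]
    # Only trailing-"*" patterns such as "health-*" are translated to prefixes.
    prefixes = [p[:-1] for p in patterns if p.endswith("*") and "*" not in p[:-1]]

    if "" in prefixes:
        return {}  # a bare "*" matches everything, so no filter is needed

    conditions: List[Dict[str, Any]] = []
    if exact:
        conditions.append({field: {"in": exact}})
    for prefix in prefixes:
        conditions.append({field: {"startsWith": prefix}})

    if not conditions:
        # Nothing translatable: match no rows rather than all rows.
        return {field: {"in": []}}
    if len(conditions) == 1:
        return conditions[0]
    return {"OR": conditions}
```

For example, `build_alias_where("team_alias", ["a", "b-*"])` produces an `OR` of one `in` filter and one `startsWith` filter, so the database does the matching instead of Python.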
@ishaan-jaff
Member Author

@greptile review again pls

Contributor

greptile-apps bot commented Feb 11, 2026

Greptile Overview

Greptile Summary

Adds tag-based policy attachments so policies can be scoped by metadata.tags on keys/teams or tags passed in request bodies (e.g., "all requests tagged healthcare get HIPAA guardrails"). Also adds two new endpoints (/policies/resolve for simulating policy matching, /policies/attachments/estimate-impact for blast-radius preview), a Policy Simulator UI tab, and inline impact popovers on the attachment table.

  • Tag-based scoping: New tags field on PolicyScope, PolicyAttachment, and the Prisma schema. Tags use opt-in semantics — empty tags means "don't check" (unlike teams/keys/models which default to ["*"]). Matching uses AND logic with other scope dimensions.
  • Policy Simulator: New /policies/resolve POST endpoint and UI panel that shows which policies/guardrails would apply for a given team/key/model/tags context, with match attribution reasons.
  • Blast-radius preview: New /policies/attachments/estimate-impact endpoint scans keys/teams to estimate how many would be affected by an attachment before creation.
  • Match attribution: get_attached_policies_with_reasons now returns why each policy matched (e.g., "tag:healthcare", "scope:*"), surfaced in a new x-litellm-policy-sources response header.
  • Refactors: Module-level singleton constants in policy_endpoints.py replaced with function calls; _initialized flag properly set after DB sync in policy_registry.py.
  • The x-litellm-policy-sources header uses commas as delimiters, but reason strings can also contain commas (e.g., "tag:healthcare, team:health-team"), creating parsing ambiguity for consumers.
  • to_policy_scope() for global attachments drops tags, so a global attachment with tags silently ignores the tag constraint.
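
To make the match-attribution idea concrete, here is a small sketch of how a reason string could be assembled. `describe_match_reason` is a hypothetical helper, not the PR's `get_attached_policies_with_reasons`. It assumes the caller has already established that the scope matched, uses `fnmatchcase` in place of the project's wildcard helper, and joins reasons with `", "` as in the examples quoted in this review.

```python
from fnmatch import fnmatchcase
from typing import List, Optional


def describe_match_reason(
    scope_teams: List[str],
    scope_tags: List[str],
    team_alias: Optional[str],
    request_tags: List[str],
) -> str:
    """Hypothetical helper: explain which scope dimensions caused a match."""
    reasons: List[str] = []
    # Report the first request tag that satisfies any tag pattern.
    for tag in request_tags:
        if any(fnmatchcase(tag, pat) for pat in scope_tags):
            reasons.append(f"tag:{tag}")
            break
    # Report the team only when the scope names teams explicitly.
    if team_alias and scope_teams != ["*"]:
        if any(fnmatchcase(team_alias, pat) for pat in scope_teams):
            reasons.append(f"team:{team_alias}")
    # Fall back to a catch-all reason when nothing specific matched.
    return ", ".join(reasons) if reasons else "scope:*"
```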

Confidence Score: 3/5

  • Functional but has a header parsing ambiguity bug and a potential logic issue with global+tags attachments that should be clarified before merge.
  • The core tag-matching logic is well-tested and sound. However, two issues lower confidence: (1) the x-litellm-policy-sources header format is ambiguous when reason strings contain commas, which will break downstream parsers; (2) to_policy_scope() silently drops tags for global attachments, which may cause unexpected behavior. The estimate-impact endpoint does full-table scans (with a take limit), acceptable for a preview tool but worth noting. Test coverage is good for the matching logic but the new endpoints lack integration tests.
  • litellm/proxy/common_utils/callback_utils.py (header format ambiguity), litellm/types/proxy/policy_engine/policy_types.py (global scope drops tags), litellm/proxy/policy_engine/policy_resolve_endpoints.py (duplicate DB queries for teams)

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/proxy/policy_engine/policy_resolve_endpoints.py | New file: adds /policies/resolve and /policies/attachments/estimate-impact endpoints. Tag-based impact queries do full-table scans with take limit; duplicate team queries when both tag_patterns and team_patterns are provided. |
| litellm/proxy/policy_engine/attachment_registry.py | Adds tag support to attachment loading, creation, syncing, and a new get_attached_policies_with_reasons method for match attribution. Clean refactor with proper deduplication via seen_policies set. |
| litellm/proxy/policy_engine/policy_matcher.py | Adds tag matching to scope_matches with opt-in semantics: empty scope tags means "don't check". Logic is clean and well-commented. |
| litellm/proxy/litellm_pre_call_utils.py | Integrates tag extraction from request body into policy matching context and adds policy source attribution tracking for the new response header. |
| litellm/proxy/common_utils/callback_utils.py | Adds x-litellm-policy-sources header and add_policy_sources_to_metadata helper. Header format uses commas as delimiters but reason strings can also contain commas, creating parsing ambiguity. |
| litellm/types/proxy/policy_engine/policy_types.py | Adds tags field to PolicyScope and PolicyAttachment with opt-in semantics. to_policy_scope() for global attachments doesn't pass through tags, which may silently drop constraints. |
| tests/test_litellm/proxy/policy_engine/test_attachment_registry.py | Good test coverage: tag matching, wildcards, combined tag+team AND logic, match attribution reasons, tags-only attachment, and no-scope catch-all. All mock-based, no network calls. |
| tests/test_litellm/proxy/policy_engine/test_policy_matcher.py | Tests tag scope matching: exact match, wildcard, no-match, empty tags, and combined tag+team AND logic. Clean and thorough. |
| ui/litellm-dashboard/src/components/policies/policy_test_panel.tsx | New Policy Simulator UI component. Loads teams/keys/models for dropdown selection, sends resolve request, and displays matched policies and guardrails in a table. |
| ui/litellm-dashboard/src/components/policies/add_attachment_form.tsx | Adds tags field to attachment form, impact estimation preview, and fixes response parsing for teams/keys/models API calls. Refactors attachment data building into shared helper. |

Sequence Diagram

sequenceDiagram
    participant UI as Admin UI
    participant Proxy as LiteLLM Proxy
    participant AR as AttachmentRegistry
    participant PM as PolicyMatcher
    participant PR as PolicyResolver
    participant DB as Database

    Note over UI,DB: Policy Simulator Flow
    UI->>Proxy: POST /policies/resolve {tags, team, key, model}
    Proxy->>AR: get_attached_policies_with_reasons(context)
    AR->>PM: scope_matches(scope, context) [incl. tag check]
    PM-->>AR: match results with reasons
    AR-->>Proxy: [{policy_name, matched_via}]
    Proxy->>PM: get_policies_with_matching_conditions()
    Proxy->>PR: resolve_policy_guardrails() per policy
    PR-->>Proxy: guardrails list
    Proxy-->>UI: {effective_guardrails, matched_policies}

    Note over UI,DB: Blast Radius Preview Flow
    UI->>Proxy: POST /policies/attachments/estimate-impact
    Proxy->>DB: find_many keys (where={}, take=1000)
    Proxy->>DB: find_many teams (where={}, take=1000)
    Proxy->>Proxy: filter by tag/team/key patterns in Python
    Proxy-->>UI: {affected_keys_count, affected_teams_count, samples}

    Note over UI,DB: Request-Time Tag Matching
    participant Client as API Client
    Client->>Proxy: POST /chat/completions {metadata: {tags: ["healthcare"]}}
    Proxy->>AR: get_attached_policies_with_reasons(context w/ tags)
    AR-->>Proxy: matching policies + reasons
    Proxy->>Proxy: add_policy_sources_to_metadata()
    Proxy-->>Client: response + x-litellm-policy-sources header

@greptile-apps greptile-apps bot left a comment

26 files reviewed, 3 comments


Comment on lines +400 to +402
headers["x-litellm-policy-sources"] = ",".join(
    f"{name}={reason}" for name, reason in sources.items()
)

The matched_via reason string (e.g. "tag:healthcare, team:health-team") contains commas. Using , as both the entry delimiter and allowing it inside values makes parsing ambiguous. For example, "hipaa=tag:healthcare, team:health-team,baseline=scope:*" is unparsable.

Consider using ; as the entry separator, or a structured format like JSON, to avoid ambiguity for downstream consumers.
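
The ambiguity and the semicolon alternative can be demonstrated in a few lines. The `sources` mapping mirrors the example in this comment; `parse_policy_sources` is a hypothetical consumer-side parser, included only to show that the semicolon form round-trips.

```python
from typing import Dict

sources: Dict[str, str] = {
    "hipaa": "tag:healthcare, team:health-team",
    "baseline": "scope:*",
}

# Comma-joined entries: a naive split on "," yields a fragment with no "=".
comma_header = ",".join(f"{name}={reason}" for name, reason in sources.items())
comma_parts = comma_header.split(",")

# Semicolon-joined entries: commas inside a reason no longer collide.
semi_header = "; ".join(f"{name}={reason}" for name, reason in sources.items())


def parse_policy_sources(header: str) -> Dict[str, str]:
    """Hypothetical consumer-side parser for the semicolon-delimited form."""
    parsed: Dict[str, str] = {}
    for entry in header.split(";"):
        name, _, reason = entry.strip().partition("=")
        parsed[name] = reason
    return parsed
```

Splitting the comma form gives three fragments for two policies, while the semicolon form parses back to the original mapping, provided reason strings never contain a semicolon.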

Comment on lines +87 to +88
keys = await prisma_client.db.litellm_verificationtoken.find_many(  # type: ignore
    where={}, order={"created_at": "desc"}, take=MAX_POLICY_ESTIMATE_IMPACT_ROWS,

_find_affected_keys_by_tags queries all keys with where={} (no filter) and then filters in Python. Even with the take=1000 limit, this fetches 1000 full key rows from the DB and iterates them in-memory on every call. For tag-based lookups this is unavoidable since tags live inside JSON metadata, but this should be documented as an O(N) scan.

More importantly, _find_affected_teams_by_tags (line 114) does the same where={} scan. When both tag_patterns and team_patterns are provided in estimate_attachment_impact, the teams table gets queried twice (once here, once in _find_affected_by_team_patterns). Consider caching the first teams query result and passing it to _find_affected_by_team_patterns to avoid the duplicate round-trip.
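
A sketch of the "fetch once, filter twice" approach suggested here. Everything in it is hypothetical: `FakeTeamTable` stands in for the teams table and counts round-trips, rows are plain dicts with `alias` and `tags` keys, and `fnmatchcase` replaces the project's wildcard helper.

```python
import asyncio
from fnmatch import fnmatchcase
from typing import Dict, List


class FakeTeamTable:
    # Hypothetical stand-in for the teams table; counts round-trips.
    def __init__(self, rows: List[Dict]) -> None:
        self.rows = rows
        self.queries = 0

    async def find_many(self, take: int) -> List[Dict]:
        self.queries += 1
        return self.rows[:take]


def filter_teams_by_tags(teams: List[Dict], tag_patterns: List[str]) -> List[str]:
    # Pure in-memory filter over rows that were already fetched.
    return [
        t["alias"] for t in teams
        if any(fnmatchcase(tag, pat) for tag in t["tags"] for pat in tag_patterns)
    ]


def filter_teams_by_alias(teams: List[Dict], team_patterns: List[str]) -> List[str]:
    return [t["alias"] for t in teams if any(fnmatchcase(t["alias"], p) for p in team_patterns)]


async def estimate_impact(
    table: FakeTeamTable, tag_patterns: List[str], team_patterns: List[str]
) -> List[str]:
    teams = await table.find_many(take=1000)  # single query, reused below
    matched = filter_teams_by_tags(teams, tag_patterns) + filter_teams_by_alias(teams, team_patterns)
    return sorted(set(matched))


table = FakeTeamTable([
    {"alias": "health-team", "tags": ["healthcare"]},
    {"alias": "finance-team", "tags": ["finance"]},
])
affected = asyncio.run(estimate_impact(table, ["health*"], ["finance-*"]))
```

Because the filters are synchronous functions over pre-fetched rows, supplying both pattern kinds still costs a single round-trip.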

Context Used: Rule from dashboard - What: Avoid creating new database requests or Router objects in the critical request path.

Why: Cre... (source)

Contributor

greptile-apps bot commented Feb 11, 2026

Additional Comments (1)

litellm/types/proxy/policy_engine/policy_types.py
When a global attachment (scope is "*") has tags set, to_policy_scope() ignores them because it returns a PolicyScope with only teams/keys/models set to ["*"], without including tags. This means a global attachment with tags silently drops the tag constraint and matches everything.

If this is intentional (global means "override all scoping"), consider adding a comment to document this behavior. If not, the tags should be preserved in the returned PolicyScope.
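
If the tag constraint is meant to survive on global attachments, the conversion could carry it through, as in the sketch below. `Scope`, `Attachment`, and `to_scope` are hypothetical stand-ins for `PolicyScope`, `PolicyAttachment`, and `to_policy_scope()`. The field layout is assumed, and this shows only the "preserve tags" option, not the "document the override" option.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Scope:
    # Hypothetical stand-in for PolicyScope.
    teams: List[str]
    keys: List[str]
    models: List[str]
    tags: List[str] = field(default_factory=list)


@dataclass
class Attachment:
    # Hypothetical stand-in for PolicyAttachment.
    policy: str
    scope: Optional[str] = None  # "*" marks a global attachment
    teams: List[str] = field(default_factory=list)
    keys: List[str] = field(default_factory=list)
    models: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)

    def to_scope(self) -> Scope:
        if self.scope == "*":
            # Global on teams/keys/models, but keep any tag constraint.
            return Scope(teams=["*"], keys=["*"], models=["*"], tags=list(self.tags))
        return Scope(
            teams=self.teams or ["*"],
            keys=self.keys or ["*"],
            models=self.models or ["*"],
            tags=list(self.tags),
        )
```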

- Fetch teams table once in estimate_attachment_impact and reuse for
  both tag-based and alias-based lookups (was querying teams twice when
  both tag_patterns and team_patterns were provided).
- Convert tag/team filter functions from async DB queries to sync
  filters that operate on pre-fetched data (_filter_keys_by_tags,
  _filter_teams_by_tags).
- Fix comma ambiguity in x-litellm-policy-sources header: use '; '
  as entry delimiter since matched_via values can contain commas.
- Use '+' as the within-value separator in matched_via reason strings
  (e.g. "tag:healthcare+team:health-team") to avoid conflict with
  header delimiters.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ishaan-jaff
Member Author

@greptile can u review again ?

Contributor

greptile-apps bot commented Feb 11, 2026

Greptile Overview

Greptile Summary

Adds tag-based policy attachments, a Policy Simulator UI, and blast-radius preview (impact estimation) for policy attachments. Key changes:

  • Tag-based scoping: PolicyScope and PolicyAttachment gain a tags field with opt-in semantics — unlike teams/keys/models (which default to ["*"] when empty), empty tags means "don't check tags." This allows attachments like "all requests tagged healthcare get HIPAA guardrails."
  • Match attribution: get_attached_policies_with_reasons() returns why each policy matched (e.g. tag:healthcare, team:health-team, scope:*), powering both the new x-litellm-policy-sources response header and the Policy Simulator UI.
  • /policies/resolve endpoint: Debug endpoint to simulate which guardrails would apply for a given team/key/model/tags context. Gated behind ?force_sync=true for DB re-sync (defaults to in-memory cache).
  • /policies/attachments/estimate-impact endpoint: Preview blast radius before creating an attachment — shows how many keys/teams would be affected. Uses MAX_POLICY_ESTIMATE_IMPACT_ROWS (default 1000) to cap DB scans.
  • Policy Simulator UI tab: New tab in the policies dashboard letting admins simulate requests and see matched policies/guardrails.
  • Schema migration: tags String[] @default([]) added to LiteLLM_PolicyAttachmentTable across all 3 schema files.
  • Singleton refactor: Replaced module-level POLICY_REGISTRY/ATTACHMENT_REGISTRY constants in policy_endpoints.py with per-call get_*() lookups to avoid stale singleton state.
  • Good test coverage for tag matching, wildcards, AND-logic with other dimensions, and match attribution.
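
The opt-in tag semantics and the AND-logic across dimensions described above can be sketched as follows. This is an illustration only, not the code in policy_matcher.py: the function names are hypothetical, wildcard handling via fnmatch is an assumption, and so is the choice that a request without tags never matches a tag-scoped attachment.

```python
from fnmatch import fnmatchcase
from typing import List, Optional


def dimension_matches(value: Optional[str], patterns: Optional[List[str]]) -> bool:
    """teams/keys/models: an empty pattern list behaves like ["*"]."""
    effective = patterns or ["*"]
    if value is None:
        # Assumption: a missing value only matches the catch-all pattern.
        return "*" in effective
    return any(fnmatchcase(value, pattern) for pattern in effective)


def tags_match(
    context_tags: Optional[List[str]], scope_tags: Optional[List[str]]
) -> bool:
    """tags: an empty scope means "don't check tags" (opt-in)."""
    if not scope_tags:
        return True
    if not context_tags:
        return False
    # ANY-match: one request tag matching one scope pattern is enough.
    return any(
        fnmatchcase(tag, pattern) for tag in context_tags for pattern in scope_tags
    )


def scope_matches_sketch(
    team: Optional[str],
    tags: Optional[List[str]],
    scope_teams: Optional[List[str]],
    scope_tags: Optional[List[str]],
) -> bool:
    # AND-logic: every dimension that is configured has to match.
    return dimension_matches(team, scope_teams) and tags_match(tags, scope_tags)
```

Under these assumptions, an attachment scoped to tag healthcare and team health-team matches only requests that satisfy both, and an attachment with no tags ignores request tags entirely.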

Confidence Score: 4/5

  • This PR is safe to merge with minor issues — the core matching logic is well-tested, and the new endpoints are admin-only debug tools.
  • Score reflects solid test coverage for the core tag matching/attribution logic, correct opt-in semantics for tags, and proper capping of DB queries. The duplicate auth dependency issue follows a pre-existing codebase pattern. The estimate-impact endpoint does in-memory filtering of up to 1000 rows which is acceptable for an admin tool. No security vulnerabilities found.
  • Pay attention to litellm/proxy/policy_engine/policy_resolve_endpoints.py — the duplicate Depends(user_api_key_auth) in both decorator and parameter causes auth to run twice per request.

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/proxy/policy_engine/policy_resolve_endpoints.py | New file: policy resolve and attachment impact estimation endpoints. Filters tag-based lookups in memory (capped at 1000 rows). Multiple inline imports of RouteChecks. |
| litellm/proxy/policy_engine/attachment_registry.py | Added tag-based matching, match attribution via get_attached_policies_with_reasons, and _describe_match_reason. Tags field propagated through all CRUD paths. |
| litellm/proxy/policy_engine/policy_matcher.py | Added tag scope matching with opt-in semantics (empty tags = don't check). Correct ANY-match logic for context tags vs scope tag patterns. |
| litellm/proxy/policy_engine/policy_endpoints.py | Replaced module-level singleton constants with per-call get_policy_registry()/get_attachment_registry() calls, which avoids stale singleton issues. |
| litellm/proxy/common_utils/callback_utils.py | Added x-litellm-policy-sources header using semicolon delimiter and add_policy_sources_to_metadata helper. Correctly uses ; to avoid comma ambiguity. |
| litellm/proxy/litellm_pre_call_utils.py | Integrated tag extraction into policy matching context and added attribution tracking for the x-litellm-policy-sources header. |
| litellm/types/proxy/policy_engine/policy_types.py | Added tags field to PolicyScope and PolicyAttachment with opt-in semantics (None = not checked, unlike teams/keys/models which default to ["*"]). |
| tests/test_litellm/proxy/policy_engine/test_attachment_registry.py | Good test coverage: tag matching, wildcards, combined tag+team AND logic, match attribution reasons, tags-only attachments, and no-scope catch-all. All mock-based. |
| tests/test_litellm/proxy/policy_engine/test_policy_matcher.py | Added tag scope matching tests: exact, wildcard, no-match, empty context tags, opt-in semantics, and combined tag+team AND logic. |
| ui/litellm-dashboard/src/components/policies/add_attachment_form.tsx | Added tags field, impact preview button, and refactored form data building into shared helper. Switched from keyInfoCall to keyListCall. |
| ui/litellm-dashboard/src/components/policies/policy_test_panel.tsx | New Policy Simulator UI component. Loads teams/keys/models for dropdowns and calls /policies/resolve. |

Sequence Diagram

sequenceDiagram
    participant Admin as Admin UI
    participant Proxy as LiteLLM Proxy
    participant AR as AttachmentRegistry
    participant PM as PolicyMatcher
    participant PR as PolicyResolver
    participant DB as Database

    Note over Admin,DB: Policy Simulator Flow
    Admin->>Proxy: POST /policies/resolve {tags, team, key, model}
    Proxy->>AR: get_attached_policies_with_reasons(context)
    AR->>PM: scope_matches(scope, context) [incl. tag matching]
    PM-->>AR: matched + reasons
    AR-->>Proxy: [{policy_name, matched_via}]
    Proxy->>PM: get_policies_with_matching_conditions()
    Proxy->>PR: resolve_policy_guardrails() per policy
    PR-->>Proxy: guardrails list
    Proxy-->>Admin: {effective_guardrails, matched_policies}

    Note over Admin,DB: Impact Estimation Flow
    Admin->>Proxy: POST /policies/attachments/estimate-impact
    Proxy->>DB: find_many teams (take=1000)
    Proxy->>DB: find_many keys (take=1000)
    Proxy->>Proxy: Filter by tags/teams/keys in-memory
    Proxy-->>Admin: {affected_keys_count, affected_teams_count, samples}

    Note over Admin,DB: Request-time Tag Matching
    Admin->>Proxy: POST /chat/completions {metadata: {tags: ["healthcare"]}}
    Proxy->>AR: get_attached_policies_with_reasons(context w/ tags)
    AR-->>Proxy: matched policies + reasons
    Proxy->>Proxy: add_policy_sources_to_metadata()
    Proxy-->>Admin: Response + x-litellm-policy-sources header

Contributor

@greptile-apps greptile-apps bot left a comment

26 files reviewed, 4 comments


Comment on lines +1 to +7
"""
Policy resolve and attachment impact estimation endpoints.

- /policies/resolve — debug which guardrails apply for a given context
- /policies/attachments/estimate-impact — preview blast radius before creating an attachment
"""

The _describe_match_reason in attachment_registry.py uses + as the delimiter between reasons (e.g. tag:healthcare+team:health-team), but the x-litellm-policy-sources header in callback_utils.py uses = between name and reason and ; between entries (e.g. hipaa=tag:healthcare; baseline=scope:*). The matched_via value itself can contain : — consider documenting the format for downstream consumers.
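
For downstream consumers, the format described above could be produced and parsed roughly as follows. This is a sketch, not the code in callback_utils.py: both function names are hypothetical, and it assumes policy names contain no "=" or ";" and that reason values contain no "+" or ";".

```python
from typing import Dict, List


def format_policy_sources(sources: Dict[str, str]) -> str:
    """Join name=reason entries with '; ', e.g. 'hipaa=tag:healthcare; baseline=scope:*'."""
    return "; ".join(f"{name}={reason}" for name, reason in sources.items())


def parse_policy_sources(header: str) -> Dict[str, List[str]]:
    """Split a header value into {policy_name: [reason, ...]}."""
    parsed: Dict[str, List[str]] = {}
    for entry in header.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the first '=' only; reasons themselves contain ':'.
        name, _, reason = entry.partition("=")
        # '+' separates multiple reasons inside one matched_via value.
        parsed[name] = reason.split("+") if reason else []
    return parsed


header = format_policy_sources(
    {"hipaa": "tag:healthcare+team:health-team", "baseline": "scope:*"}
)
print(header)
print(parse_policy_sources(header))
```

Splitting on ";" first, then on the first "=", then on "+" keeps the ":" inside each reason intact.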

Comment on lines +307 to +312

# ─────────────────────────────────────────────────────────────────────────────
# Attachment Impact Estimation Endpoint
# ─────────────────────────────────────────────────────────────────────────────


Same duplicate auth dependency — user_api_key_auth runs twice per request. Remove from decorator.

Suggested change
# ─────────────────────────────────────────────────────────────────────────────
# Attachment Impact Estimation Endpoint
# ─────────────────────────────────────────────────────────────────────────────
@router.post(
    "/policies/attachments/estimate-impact",
    tags=["Policies"],
    response_model=AttachmentImpactResponse,
)

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
@ishaan-jaff ishaan-jaff merged commit f836201 into main Feb 11, 2026
7 of 12 checks passed
krrishdholakia pushed a commit that referenced this pull request Feb 11, 2026
…e OpenAI (#20883)

* feat: add opus 4.5 and 4.6 to use output_format param

* generate poetry lock with 2.3.2 poetry

* restore poetry lock

* e2e tests, key delete, update tpm rpm, and regenerate

* Split e2e ui testing for browser

* new login with sso button in login page

* option to hide usage indicator

* fix(cloudzero): update CBF field mappings per LIT-1907 (#20906)

* fix(cloudzero): update CBF field mappings per LIT-1907

Phase 1 field updates for CloudZero integration:

ADD/UPDATE:
- resource/account: Send concat(api_key_alias, '|', api_key_prefix)
- resource/service: Send model_group instead of service_type
- resource/usage_family: Send provider instead of hardcoded 'llm-usage'
- action/operation: NEW - Send team_id
- resource/id: Send model name instead of CZRN
- resource/tag:organization_alias: Add if exists
- resource/tag:project_alias: Add if exists
- resource/tag:user_alias: Add if exists

REMOVE:
- resource/tag:total_tokens: Removed
- resource/tag:team_id: Removed (team_id now in action/operation)

Fixes LIT-1907

* Update litellm/integrations/cloudzero/transform.py

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* fix: define api_key_alias variable, update CBFRecord docstring

- Fix F821 lint error: api_key_alias was used but not defined
- Update CBFRecord docstring to reflect LIT-1907 field mappings
- Remove unused Optional import

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* Add banner notifying of breaking change

* Add semgrep & Fix OOMs (#20912)

* [Feat] Policies - Allow connecting Policies to Tags, Simulating Policies, Viewing how many keys, teams it applies on  (#20904)

* init schema with TAGS

* ui: add policy test

* resolvePoliciesCall

* add_policy_sources_to_metadata + headers

* types Policy

* preview Impact

* def _describe_match_reason(

* match based on TAGs

* TestTagBasedAttachments

* test fixes

* add policy_resolve_router

* add_guardrails_from_policy_engine

* TestMatchAttribution

* refactor

* fix

* fix: address Greptile review feedback on policy resolve endpoints

- Track unnamed keys/teams as separate counts instead of inflating
  affected_keys_count with duplicate "(unnamed key)" placeholders.
  Added unnamed_keys_count and unnamed_teams_count to response.
- Push alias pattern matching to DB via _build_alias_where() which
  converts exact patterns to Prisma "in" and suffix wildcards to
  "startsWith" filters.
- Gate sync_policies_from_db/sync_attachments_from_db behind
  force_sync query param (default false) to avoid 2 DB round-trips
  on every /policies/resolve request.
- Remove worktree-only conftest.py that cleared sys.modules at import
  time — no longer needed since code moved to main repo.
- Rename MAX_ESTIMATE_IMPACT_ROWS → MAX_POLICY_ESTIMATE_IMPACT_ROWS.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: eliminate duplicate DB queries and fix header delimiter ambiguity

- Fetch teams table once in estimate_attachment_impact and reuse for
  both tag-based and alias-based lookups (was querying teams twice when
  both tag_patterns and team_patterns were provided).
- Convert tag/team filter functions from async DB queries to sync
  filters that operate on pre-fetched data (_filter_keys_by_tags,
  _filter_teams_by_tags).
- Fix comma ambiguity in x-litellm-policy-sources header: use '; '
  as entry delimiter since matched_via values can contain commas.
- Use '+' as the within-value separator in matched_via reason strings
  (e.g. "tag:healthcare+team:health-team") to avoid conflict with
  header delimiters.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Update litellm/proxy/policy_engine/policy_resolve_endpoints.py

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* fix: type error & better error handling (#20689)

* [Docs] Add docs guide for using policies  (#20914)

* init schema with TAGS

* ui: add policy test

* resolvePoliciesCall

* add_policy_sources_to_metadata + headers

* types Policy

* preview Impact

* def _describe_match_reason(

* match based on TAGs

* TestTagBasedAttachments

* test fixes

* add policy_resolve_router

* add_guardrails_from_policy_engine

* TestMatchAttribution

* refactor

* fix

* fix: address Greptile review feedback on policy resolve endpoints

- Track unnamed keys/teams as separate counts instead of inflating
  affected_keys_count with duplicate "(unnamed key)" placeholders.
  Added unnamed_keys_count and unnamed_teams_count to response.
- Push alias pattern matching to DB via _build_alias_where() which
  converts exact patterns to Prisma "in" and suffix wildcards to
  "startsWith" filters.
- Gate sync_policies_from_db/sync_attachments_from_db behind
  force_sync query param (default false) to avoid 2 DB round-trips
  on every /policies/resolve request.
- Remove worktree-only conftest.py that cleared sys.modules at import
  time — no longer needed since code moved to main repo.
- Rename MAX_ESTIMATE_IMPACT_ROWS → MAX_POLICY_ESTIMATE_IMPACT_ROWS.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: eliminate duplicate DB queries and fix header delimiter ambiguity

- Fetch teams table once in estimate_attachment_impact and reuse for
  both tag-based and alias-based lookups (was querying teams twice when
  both tag_patterns and team_patterns were provided).
- Convert tag/team filter functions from async DB queries to sync
  filters that operate on pre-fetched data (_filter_keys_by_tags,
  _filter_teams_by_tags).
- Fix comma ambiguity in x-litellm-policy-sources header: use '; '
  as entry delimiter since matched_via values can contain commas.
- Use '+' as the within-value separator in matched_via reason strings
  (e.g. "tag:healthcare+team:health-team") to avoid conflict with
  header delimiters.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs v1 guide with UI imgs

* docs fix

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add dashscope/qwen3-max model with tiered pricing (#20919)

Add support for Alibaba Cloud's Qwen3-Max model with:
- 258K input tokens, 65K output tokens
- Tiered pricing based on context window usage (0-32K, 32K-128K, 128K-252K)
- Function calling and tool choice support
- Reasoning capabilities enabled

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix linting

* docs: add Greptile review requirement to PR template (#20762)

* fix(azure): preserve content_policy_violation error details from Azure OpenAI

Closes #20811

Azure OpenAI returns rich error payloads for content policy violations
(inner_error with ResponsibleAIPolicyViolation, content_filter_results,
revised_prompt). Previously these details were lost when:

1. The top-level error code was not "content_policy_violation" but the
   inner_error.code was "ResponsibleAIPolicyViolation" -- the structured
   check only examined the top-level code.

2. The DALL-E image generation polling path stringified the error JSON
   into the message field instead of setting the structured body, making
   it impossible for exception_type() to extract error details.

3. The string-based fallback detector used "invalid_request_error" as a
   content-policy indicator, which is too broad and could misclassify
   regular bad-request errors.

Changes:
- exception_mapping_utils.py: Check inner_error.code for
  ResponsibleAIPolicyViolation when top-level code is not
  content_policy_violation. Replace overly broad "invalid_request_error"
  string match with specific Azure safety-system messages.
- azure.py: Set structured body on AzureOpenAIError in both async and
  sync DALL-E polling paths so exception_type() can inspect error details.
- test_azure_exception_mapping.py: Add regression tests covering the
  exact error payloads from issue #20811.
- Fix pre-existing lint: duplicate PerplexityResponsesConfig dict key,
  unused RouteChecks top-level import.

---------

Co-authored-by: Kelvin Tran <kelvin-tran@users.noreply.github.com>
Co-authored-by: yuneng-jiang <yuneng.jiang@gmail.com>
Co-authored-by: shin-bot-litellm <shin-bot-litellm@berri.ai>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Alexsander Hamir <alexsanderhamirgomesbaptista@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Harshit Jain <48647625+Harshit28j@users.noreply.github.com>
Co-authored-by: ken <122603020@qq.com>
Co-authored-by: Sameer Kankute <sameer@berri.ai>