[Feat] Policies - Allow connecting Policies to Tags, Simulating Policies, Viewing how many keys, teams it applies on #20904
Greptile Overview
Greptile Summary
Adds tag-based scoping to the policy engine, a new Policy Simulator UI tab, and a blast-radius estimation endpoint for policy attachments.
Confidence Score: 3/5
|
| Filename | Overview |
|---|---|
| litellm/proxy/policy_engine/policy_resolve_endpoints.py | New file with /policies/resolve and /policies/attachments/estimate-impact endpoints. The estimate-impact endpoint fetches ALL keys and ALL teams from the database (up to 3 separate unbounded queries) with no pagination, which is a performance risk at scale. The resolve endpoint triggers full DB syncs on every call. |
| litellm/proxy/policy_engine/attachment_registry.py | Added tag support to attachments and new get_attached_policies_with_reasons method. Clean refactor that preserves backward compatibility via delegating get_attached_policies to the new method. |
| litellm/proxy/policy_engine/policy_endpoints.py | Removed module-level singleton instantiation (POLICY_REGISTRY, ATTACHMENT_REGISTRY) in favor of calling get_policy_registry() / get_attachment_registry() at each endpoint. This avoids import-time side effects and is a good improvement. |
| litellm/proxy/policy_engine/policy_matcher.py | Added tag matching to scope_matches. Tags use opt-in semantics (empty scope tags = skip check), consistent with documentation. Clean implementation. |
| litellm/proxy/litellm_pre_call_utils.py | Updated policy engine integration in the pre-call path to include tags from request body in the match context, and to track policy attribution sources. Uses get_tags_from_request_body helper. |
| litellm/types/proxy/policy_engine/policy_types.py | Added tags field to PolicyScope and PolicyAttachment models with appropriate defaults and documentation. Tags use opt-in semantics (empty = not checked) vs teams/keys/models (empty = match all). |
| litellm/types/proxy/policy_engine/resolver_types.py | Added new request/response types: PolicyResolveRequest, PolicyResolveResponse, PolicyMatchDetail, AttachmentImpactResponse, and added tags field to existing types. Well-structured Pydantic models. |
| tests/test_litellm/proxy/policy_engine/test_attachment_registry.py | Added TestTagBasedAttachments class with tag matching tests (exact, wildcard, combined with team). All mock/in-memory — no network calls. |
| tests/test_litellm/proxy/policy_engine/test_policy_matcher.py | Added TestPolicyMatcherScopeMatchingWithTags class testing tag matching in scope_matches. All mock/in-memory — no network calls. |
| ui/litellm-dashboard/src/components/policies/add_attachment_form.tsx | Added tags field, impact estimation button, and refactored data loading to handle various API response shapes. Extracted attachment building into shared utility. |
| ui/litellm-dashboard/src/components/policies/policy_test_panel.tsx | New Policy Simulator panel allowing users to test which policies/guardrails apply for a given context. Clean implementation. |
Sequence Diagram
sequenceDiagram
participant UI as Dashboard UI
participant Resolve as /policies/resolve
participant Impact as /policies/attachments/estimate-impact
participant PR as PolicyRegistry
participant AR as AttachmentRegistry
participant PM as PolicyMatcher
participant DB as Database
Note over UI: Policy Simulator Tab
UI->>Resolve: POST {team_alias, key_alias, model, tags}
Resolve->>DB: sync_policies_from_db()
Resolve->>DB: sync_attachments_from_db()
Resolve->>AR: get_attached_policies_with_reasons(context)
AR->>PM: scope_matches(scope, context) [incl. tags]
AR-->>Resolve: [{policy_name, matched_via}]
Resolve->>PM: get_policies_with_matching_conditions()
Resolve->>PR: resolve_policy_guardrails()
Resolve-->>UI: {effective_guardrails, matched_policies}
Note over UI: Estimate Impact (before creating attachment)
UI->>Impact: POST {policy_name, tags, teams, keys}
Impact->>DB: find_many(keys) [unbounded]
Impact->>DB: find_many(teams) [unbounded]
Impact-->>UI: {affected_keys_count, affected_teams_count, samples}
Note over UI: Pre-call Path (request time)
participant Req as LLM Request
Req->>AR: get_attached_policies_with_reasons(context w/ tags)
AR->>PM: scope_matches (checks tags)
AR-->>Req: matching policies + reasons
Req->>Req: add x-litellm-policy-sources header
    if prisma_client is None:
        raise HTTPException(status_code=500, detail="Database not connected")

    try:
        # Sync from DB to ensure in-memory state is current
        await get_policy_registry().sync_policies_from_db(prisma_client)
estimate_attachment_impact issues multiple unbounded find_many(where={}) queries — fetching all keys and all teams from the database with no pagination or limit. In a deployment with thousands of keys/teams, this will be slow and memory-intensive. Worse, when both tag_patterns and team_patterns are provided, the teams table is queried twice (lines 211 and 236), and the keys table can be queried up to 3 times (lines 188, 254, 266).
Consider adding take limits to the Prisma queries or using database-level filtering (e.g., Prisma json filtering on metadata) instead of fetching everything into Python and filtering in-memory.
Context Used: Rule from dashboard - What: Avoid creating new database requests or Router objects in the critical request path.
Why: Cre... (source)
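The `take` suggestion above could look roughly like the sketch below. It is not the PR's code: `FakeTable`, `fetch_capped`, and the `MAX_ROWS` value are hypothetical, and the stand-in only imitates the `find_many(where=..., take=...)` call shape with plain dicts instead of Prisma model objects. It requests one row more than the cap so the caller can tell when the estimate was truncated.

```python
import asyncio
from typing import Any, Dict, List, Optional

MAX_ROWS = 1000  # hypothetical cap chosen for illustration


class FakeTable:
    """In-memory stand-in that mimics only find_many(where=..., take=...)."""

    def __init__(self, rows: List[Dict[str, Any]]) -> None:
        self._rows = rows

    async def find_many(
        self, where: Optional[Dict[str, Any]] = None, take: Optional[int] = None
    ) -> List[Dict[str, Any]]:
        rows = list(self._rows)  # `where` is ignored by this stand-in
        return rows if take is None else rows[:take]


async def fetch_capped(table: FakeTable, limit: int = MAX_ROWS) -> Dict[str, Any]:
    # Ask for one extra row so we can report whether the cap was reached.
    rows = await table.find_many(where={}, take=limit + 1)
    return {"rows": rows[:limit], "truncated": len(rows) > limit}


table = FakeTable([{"id": i} for i in range(5)])
result = asyncio.run(fetch_capped(table, limit=3))
small = asyncio.run(fetch_capped(table, limit=10))
```

Returning a `truncated` flag lets the UI say "at least N" rather than presenting a capped count as exact.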
    return affected
Calling sync_policies_from_db and sync_attachments_from_db on every /policies/resolve request forces two full DB round-trips. These syncs are already performed on a background schedule. For a debug/simulation endpoint this may be acceptable, but it's worth noting this would be expensive if called frequently.
Consider gating the sync behind a ?force_sync=true query param, or relying on the existing background sync.
Context Used: Rule from dashboard - What: Avoid creating new database requests or Router objects in the critical request path.
Why: Cre... (source)
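A minimal sketch of the gating idea, assuming a hypothetical `FakeRegistry` with a `sync_from_db` method in place of the real registries and a plain function in place of the endpoint. The only point it illustrates is that the DB round-trip happens solely when the caller opts in.

```python
import asyncio


class FakeRegistry:
    """Hypothetical registry that counts how often it is synced."""

    def __init__(self) -> None:
        self.sync_calls = 0

    async def sync_from_db(self) -> None:
        self.sync_calls += 1


async def resolve(registry: FakeRegistry, force_sync: bool = False) -> str:
    # Skip the DB round-trip unless the caller explicitly asks for fresh state.
    if force_sync:
        await registry.sync_from_db()
    return "resolved"


registry = FakeRegistry()
asyncio.run(resolve(registry))                   # relies on the background sync
asyncio.run(resolve(registry, force_sync=True))  # forces one sync
```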
    )

    if not match_results:
        return PolicyResolveResponse(
When key_alias is empty/None, this falls back to str(key.token)[:8] + "..." — leaking a prefix of the hashed token in the API response. While these are hashed tokens (not raw API keys), exposing even partial hashes in an estimation response could be an unnecessary information disclosure.
Consider using a generic placeholder like "(unnamed key)" instead.
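As a rough illustration of the placeholder approach (the helper name `display_name` and the constant are assumptions for this sketch), the label is built from the alias alone so no part of the token hash can reach the response:

```python
from typing import Optional

UNNAMED_KEY_PLACEHOLDER = "(unnamed key)"  # assumed constant name


def display_name(key_alias: Optional[str]) -> str:
    """Return the alias, or a fixed placeholder; never derive text from the token."""
    return key_alias or UNNAMED_KEY_PLACEHOLDER
```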
|
@greptile review again
Greptile Overview
Greptile Summary
Adds tag-based policy attachments, a
Confidence Score: 3/5
|
| Filename | Overview |
|---|---|
| litellm/proxy/policy_engine/policy_resolve_endpoints.py | New file: policy resolve & impact estimation endpoints. Fetches up to 1000 rows per query for tag-based impact; unnamed keys/teams create duplicate "(unnamed key)" entries inflating counts. |
| litellm/proxy/policy_engine/attachment_registry.py | Added tag support to attachment loading/matching/persistence and new get_attached_policies_with_reasons method for match attribution. Clean refactor of existing get_attached_policies. |
| litellm/proxy/policy_engine/policy_matcher.py | Added tag matching to scope_matches with opt-in semantics (empty tags = don't check). Correct AND-logic with other scope fields. |
| litellm/proxy/litellm_pre_call_utils.py | Critical path change: extracts tags from request body, passes to PolicyMatchContext, and tracks policy attribution in metadata for response headers. |
| litellm/proxy/common_utils/callback_utils.py | Added add_policy_sources_to_metadata and corresponding x-litellm-policy-sources header generation in get_logging_caching_headers. |
| litellm/types/proxy/policy_engine/policy_types.py | Added tags field to PolicyScope and PolicyAttachment with correct opt-in semantics (None/empty = don't check, unlike teams/keys/models). |
| tests/test_litellm/proxy/policy_engine/conftest.py | New conftest that aggressively removes all litellm modules from sys.modules at import time. Designed for worktree use but could interfere with other tests in the same pytest session. |
| tests/test_litellm/proxy/policy_engine/test_attachment_registry.py | Good test coverage: tag matching, wildcards, AND-logic with teams, match attribution, tags-only attachments, no-scope catch-all. All mock-based. |
Sequence Diagram
sequenceDiagram
participant UI as Dashboard UI
participant Proxy as LiteLLM Proxy
participant PE as Policy Engine
participant DB as Database
Note over UI,DB: Policy Simulator Flow
UI->>Proxy: POST /policies/resolve {tags, team, key, model}
Proxy->>PE: AttachmentRegistry.get_attached_policies_with_reasons(context)
PE->>PE: PolicyMatcher.scope_matches (teams + keys + models + tags)
PE-->>Proxy: matched policies + reasons
Proxy->>PE: PolicyMatcher.get_policies_with_matching_conditions
Proxy->>PE: PolicyResolver.resolve_policy_guardrails
Proxy-->>UI: {effective_guardrails, matched_policies}
Note over UI,DB: Impact Estimation Flow
UI->>Proxy: POST /policies/attachments/estimate-impact {tags, teams, keys}
Proxy->>DB: find_many keys/teams (take=1000)
Proxy->>Proxy: Filter by tag/alias patterns in Python
Proxy-->>UI: {affected_keys_count, affected_teams_count, samples}
Note over UI,DB: Request-Time Tag Matching (Critical Path)
UI->>Proxy: POST /chat/completions {metadata: {tags: ["healthcare"]}}
Proxy->>PE: get_tags_from_request_body + build PolicyMatchContext
PE->>PE: get_attached_policies_with_reasons (includes tag matching)
PE-->>Proxy: policies + attribution reasons
Proxy->>Proxy: add_policy_sources_to_metadata (for x-litellm-policy-sources header)
Proxy-->>UI: Response + x-litellm-policy-sources header
    affected: list = []
    keys = await prisma_client.db.litellm_verificationtoken.find_many(  # type: ignore
        where={}, order={"created_at": "desc"}, take=MAX_POLICY_ESTIMATE_IMPACT_ROWS,
    )
    for key in keys:
        key_alias = key.key_alias or ""
        key_tags = _get_tags_from_metadata(
            key.metadata, getattr(key, "metadata_json", None)
        )
        if key_tags and any(
            RouteChecks._route_matches_wildcard_pattern(route=tag, pattern=pat)
            for tag in key_tags
            for pat in tag_patterns
        ):
            affected.append(key_alias or _UNNAMED_KEY_PLACEHOLDER)
    return affected
Multiple unnamed keys (keys without key_alias) all get appended as "(unnamed key)", which inflates affected_keys_count with duplicate placeholder entries. The same issue applies to _find_affected_teams_by_tags with "(unnamed team)".
Consider deduplicating or counting unnamed entities separately:
    affected: list = []
    unnamed_count = 0
    keys = await prisma_client.db.litellm_verificationtoken.find_many(  # type: ignore
        where={}, order={"created_at": "desc"}, take=MAX_POLICY_ESTIMATE_IMPACT_ROWS,
    )
    for key in keys:
        key_alias = key.key_alias or ""
        key_tags = _get_tags_from_metadata(
            key.metadata, getattr(key, "metadata_json", None)
        )
        if key_tags and any(
            RouteChecks._route_matches_wildcard_pattern(route=tag, pattern=pat)
            for tag in key_tags
            for pat in tag_patterns
        ):
            if key_alias:
                affected.append(key_alias)
            else:
                unnamed_count += 1
    if unnamed_count:
        affected.append(f"(unnamed key) x{unnamed_count}")
    return affected
    # Clear ALL cached litellm modules so they're re-imported from the worktree
    _to_remove = [key for key in sys.modules if key == "litellm" or key.startswith("litellm.")]
    for key in _to_remove:
        del sys.modules[key]
Deleting all cached litellm.* modules from sys.modules at conftest import time is a side-effect that runs even when these tests are collected alongside other test modules in the same pytest session. This could cause hard-to-debug re-import issues for tests in sibling directories that depend on already-imported litellm state.
Is this worktree fix still needed, or can the tests run without it?
- Track unnamed keys/teams as separate counts instead of inflating affected_keys_count with duplicate "(unnamed key)" placeholders. Added unnamed_keys_count and unnamed_teams_count to response.
- Push alias pattern matching to DB via _build_alias_where() which converts exact patterns to Prisma "in" and suffix wildcards to "startsWith" filters.
- Gate sync_policies_from_db/sync_attachments_from_db behind force_sync query param (default false) to avoid 2 DB round-trips on every /policies/resolve request.
- Remove worktree-only conftest.py that cleared sys.modules at import time (no longer needed since code moved to main repo).
- Rename MAX_ESTIMATE_IMPACT_ROWS → MAX_POLICY_ESTIMATE_IMPACT_ROWS.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
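The alias push-down described in this commit message might work along the lines of the sketch below. The real `_build_alias_where()` is not shown on this page, so the function here (`build_alias_where`), its signature, and the `OR` combination are assumptions; the `"in"` and `"startsWith"` key names are copied from the commit message and have not been checked against the Prisma client in use. Patterns with a wildcard anywhere other than the end are simply ignored in this sketch.

```python
from typing import Any, Dict, List


def build_alias_where(field: str, patterns: List[str]) -> Dict[str, Any]:
    """Translate alias patterns into a Prisma-style `where` dict (illustrative only)."""
    exact = [p for p in patterns if "*" not in p]
    # Suffix wildcard: exactly one "*" and it is the last character.
    prefixes = [p[:-1] for p in patterns if p.endswith("*") and "*" not in p[:-1]]

    conditions: List[Dict[str, Any]] = []
    if exact:
        conditions.append({field: {"in": exact}})
    for prefix in prefixes:
        conditions.append({field: {"startsWith": prefix}})

    if not conditions:
        return {}
    if len(conditions) == 1:
        return conditions[0]
    return {"OR": conditions}


where = build_alias_where("key_alias", ["prod", "dev-*"])
```

Note that a bare `*` becomes an empty prefix here, which matches every non-null alias; whether unnamed rows should also count is a decision the sketch leaves open.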
|
@greptile review again pls
Greptile Overview
Greptile Summary
Adds tag-based policy attachments so policies can be scoped by
Confidence Score: 3/5
|
| Filename | Overview |
|---|---|
| litellm/proxy/policy_engine/policy_resolve_endpoints.py | New file: adds /policies/resolve and /policies/attachments/estimate-impact endpoints. Tag-based impact queries do full-table scans with take limit; duplicate team queries when both tag_patterns and team_patterns are provided. |
| litellm/proxy/policy_engine/attachment_registry.py | Adds tag support to attachment loading, creation, syncing, and a new get_attached_policies_with_reasons method for match attribution. Clean refactor with proper deduplication via seen_policies set. |
| litellm/proxy/policy_engine/policy_matcher.py | Adds tag matching to scope_matches with opt-in semantics — empty scope tags means "don't check". Logic is clean and well-commented. |
| litellm/proxy/litellm_pre_call_utils.py | Integrates tag extraction from request body into policy matching context and adds policy source attribution tracking for the new response header. |
| litellm/proxy/common_utils/callback_utils.py | Adds x-litellm-policy-sources header and add_policy_sources_to_metadata helper. Header format uses commas as delimiters but reason strings can also contain commas, creating parsing ambiguity. |
| litellm/types/proxy/policy_engine/policy_types.py | Adds tags field to PolicyScope and PolicyAttachment with opt-in semantics. to_policy_scope() for global attachments doesn't pass through tags, which may silently drop constraints. |
| tests/test_litellm/proxy/policy_engine/test_attachment_registry.py | Good test coverage: tag matching, wildcards, combined tag+team AND logic, match attribution reasons, tags-only attachment, and no-scope catch-all. All mock-based, no network calls. |
| tests/test_litellm/proxy/policy_engine/test_policy_matcher.py | Tests tag scope matching: exact match, wildcard, no-match, empty tags, and combined tag+team AND logic. Clean and thorough. |
| ui/litellm-dashboard/src/components/policies/policy_test_panel.tsx | New Policy Simulator UI component. Loads teams/keys/models for dropdown selection, sends resolve request, and displays matched policies and guardrails in a table. |
| ui/litellm-dashboard/src/components/policies/add_attachment_form.tsx | Adds tags field to attachment form, impact estimation preview, and fixes response parsing for teams/keys/models API calls. Refactors attachment data building into shared helper. |
Sequence Diagram
sequenceDiagram
participant UI as Admin UI
participant Proxy as LiteLLM Proxy
participant AR as AttachmentRegistry
participant PM as PolicyMatcher
participant PR as PolicyResolver
participant DB as Database
Note over UI,DB: Policy Simulator Flow
UI->>Proxy: POST /policies/resolve {tags, team, key, model}
Proxy->>AR: get_attached_policies_with_reasons(context)
AR->>PM: scope_matches(scope, context) [incl. tag check]
PM-->>AR: match results with reasons
AR-->>Proxy: [{policy_name, matched_via}]
Proxy->>PM: get_policies_with_matching_conditions()
Proxy->>PR: resolve_policy_guardrails() per policy
PR-->>Proxy: guardrails list
Proxy-->>UI: {effective_guardrails, matched_policies}
Note over UI,DB: Blast Radius Preview Flow
UI->>Proxy: POST /policies/attachments/estimate-impact
Proxy->>DB: find_many keys (where={}, take=1000)
Proxy->>DB: find_many teams (where={}, take=1000)
Proxy->>Proxy: filter by tag/team/key patterns in Python
Proxy-->>UI: {affected_keys_count, affected_teams_count, samples}
Note over UI,DB: Request-Time Tag Matching
participant Client as API Client
Client->>Proxy: POST /chat/completions {metadata: {tags: ["healthcare"]}}
Proxy->>AR: get_attached_policies_with_reasons(context w/ tags)
AR-->>Proxy: matching policies + reasons
Proxy->>Proxy: add_policy_sources_to_metadata()
Proxy-->>Client: response + x-litellm-policy-sources header
    headers["x-litellm-policy-sources"] = ",".join(
        f"{name}={reason}" for name, reason in sources.items()
    )
The matched_via reason string (e.g. "tag:healthcare, team:health-team") contains commas. Using , as both the entry delimiter and allowing it inside values makes parsing ambiguous. For example, "hipaa=tag:healthcare, team:health-team,baseline=scope:*" is unparsable.
Consider using ; as the entry separator, or a structured format like JSON, to avoid ambiguity for downstream consumers.
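The ambiguity can be reproduced with a few lines. This is only a demonstration using the example values from the comment above; the `sources` dict and the naive split are illustrative, not the proxy's code.

```python
sources = {
    "hipaa": "tag:healthcare, team:health-team",
    "baseline": "scope:*",
}

# Comma as entry delimiter: a consumer splitting on "," gets a stray fragment.
comma_header = ",".join(f"{name}={reason}" for name, reason in sources.items())
naive_entries = comma_header.split(",")

# Semicolon as entry delimiter: entries survive a round trip intact.
semi_header = "; ".join(f"{name}={reason}" for name, reason in sources.items())
parsed = dict(entry.split("=", 1) for entry in semi_header.split("; "))
```

With the comma-joined header there are two policies but three fragments, and the middle fragment has no `name=` part at all.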
    keys = await prisma_client.db.litellm_verificationtoken.find_many(  # type: ignore
        where={}, order={"created_at": "desc"}, take=MAX_POLICY_ESTIMATE_IMPACT_ROWS,
_find_affected_keys_by_tags queries all keys with where={} (no filter) and then filters in Python. Even with the take=1000 limit, this fetches 1000 full key rows from the DB and iterates them in-memory on every call. For tag-based lookups this is unavoidable since tags live inside JSON metadata, but this should be documented as an O(N) scan.
More importantly, _find_affected_teams_by_tags (line 114) does the same where={} scan. When both tag_patterns and team_patterns are provided in estimate_attachment_impact, the teams table gets queried twice (once here, once in _find_affected_by_team_patterns). Consider caching the first teams query result and passing it to _find_affected_by_team_patterns to avoid the duplicate round-trip.
Context Used: Rule from dashboard - What: Avoid creating new database requests or Router objects in the critical request path.
Why: Cre... (source)
Additional Comments (1)
If this is intentional (global means "override all scoping"), consider adding a comment to document this behavior. If not, the tags should be preserved in the returned |
- Fetch teams table once in estimate_attachment_impact and reuse for both tag-based and alias-based lookups (was querying teams twice when both tag_patterns and team_patterns were provided).
- Convert tag/team filter functions from async DB queries to sync filters that operate on pre-fetched data (_filter_keys_by_tags, _filter_teams_by_tags).
- Fix comma ambiguity in x-litellm-policy-sources header: use '; ' as entry delimiter since matched_via values can contain commas.
- Use '+' as the within-value separator in matched_via reason strings (e.g. "tag:healthcare+team:health-team") to avoid conflict with header delimiters.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
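The "fetch once, filter twice" shape described here can be sketched as follows. The row layout (dicts with `alias` and `metadata`), the function names, and the use of `fnmatch.fnmatchcase` are all assumptions made for this example; the PR itself matches wildcards with `RouteChecks._route_matches_wildcard_pattern`, whose exact semantics may differ.

```python
from fnmatch import fnmatchcase
from typing import Any, Dict, List


def filter_by_tags(rows: List[Dict[str, Any]], tag_patterns: List[str]) -> List[str]:
    """Return aliases of rows whose metadata tags match any pattern."""
    matched = []
    for row in rows:
        tags = (row.get("metadata") or {}).get("tags") or []
        if any(fnmatchcase(tag, pat) for tag in tags for pat in tag_patterns):
            matched.append(row["alias"])
    return matched


def filter_by_alias(rows: List[Dict[str, Any]], alias_patterns: List[str]) -> List[str]:
    """Return aliases that match any alias pattern."""
    return [
        row["alias"]
        for row in rows
        if any(fnmatchcase(row["alias"], pat) for pat in alias_patterns)
    ]


# Pretend this list came from a single DB query; both filters reuse it.
teams = [
    {"alias": "health-team", "metadata": {"tags": ["healthcare"]}},
    {"alias": "finance-team", "metadata": {"tags": ["finance"]}},
    {"alias": "health-research", "metadata": None},
]
by_tag = filter_by_tags(teams, ["health*"])
by_alias = filter_by_alias(teams, ["health-*"])
```

Because both filters are plain synchronous functions over the same list, adding a third pattern type costs no extra round-trip.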
|
@greptile can u review again ?
Greptile Overview
Greptile Summary
Adds tag-based policy attachments, a Policy Simulator UI, and blast-radius preview (impact estimation) for policy attachments. Key changes:
Confidence Score: 4/5
|
| Filename | Overview |
|---|---|
| litellm/proxy/policy_engine/policy_resolve_endpoints.py | New file: policy resolve and attachment impact estimation endpoints. Uses unbounded in-memory filtering for tag-based lookups (capped at 1000 rows). Multiple inline imports of RouteChecks. |
| litellm/proxy/policy_engine/attachment_registry.py | Added tag-based matching, match attribution via get_attached_policies_with_reasons, and _describe_match_reason. Tags field propagated through all CRUD paths. |
| litellm/proxy/policy_engine/policy_matcher.py | Added tag scope matching with opt-in semantics (empty tags = don't check). Correct ANY-match logic for context tags vs scope tag patterns. |
| litellm/proxy/policy_engine/policy_endpoints.py | Replaced module-level singleton constants with per-call get_policy_registry()/get_attachment_registry() calls — avoids stale singleton issues. |
| litellm/proxy/common_utils/callback_utils.py | Added x-litellm-policy-sources header using semicolon delimiter and add_policy_sources_to_metadata helper. Correctly uses ; to avoid comma ambiguity. |
| litellm/proxy/litellm_pre_call_utils.py | Integrated tag extraction into policy matching context and added attribution tracking for the x-litellm-policy-sources header. |
| litellm/types/proxy/policy_engine/policy_types.py | Added tags field to PolicyScope and PolicyAttachment with opt-in semantics (None = not checked, unlike teams/keys/models which default to ["*"]). |
| tests/test_litellm/proxy/policy_engine/test_attachment_registry.py | Good test coverage: tag matching, wildcards, combined tag+team AND logic, match attribution reasons, tags-only attachments, and no-scope catch-all. All mock-based. |
| tests/test_litellm/proxy/policy_engine/test_policy_matcher.py | Added tag scope matching tests: exact, wildcard, no-match, empty context tags, opt-in semantics, and combined tag+team AND logic. |
| ui/litellm-dashboard/src/components/policies/add_attachment_form.tsx | Added tags field, impact preview button, and refactored form data building into shared helper. Switched from keyInfoCall to keyListCall. |
| ui/litellm-dashboard/src/components/policies/policy_test_panel.tsx | New Policy Simulator UI component. Loads teams/keys/models for dropdowns and calls /policies/resolve. |
Sequence Diagram
sequenceDiagram
participant Admin as Admin UI
participant Proxy as LiteLLM Proxy
participant AR as AttachmentRegistry
participant PM as PolicyMatcher
participant PR as PolicyResolver
participant DB as Database
Note over Admin,DB: Policy Simulator Flow
Admin->>Proxy: POST /policies/resolve {tags, team, key, model}
Proxy->>AR: get_attached_policies_with_reasons(context)
AR->>PM: scope_matches(scope, context) [incl. tag matching]
PM-->>AR: matched + reasons
AR-->>Proxy: [{policy_name, matched_via}]
Proxy->>PM: get_policies_with_matching_conditions()
Proxy->>PR: resolve_policy_guardrails() per policy
PR-->>Proxy: guardrails list
Proxy-->>Admin: {effective_guardrails, matched_policies}
Note over Admin,DB: Impact Estimation Flow
Admin->>Proxy: POST /policies/attachments/estimate-impact
Proxy->>DB: find_many teams (take=1000)
Proxy->>DB: find_many keys (take=1000)
Proxy->>Proxy: Filter by tags/teams/keys in-memory
Proxy-->>Admin: {affected_keys_count, affected_teams_count, samples}
Note over Admin,DB: Request-time Tag Matching
Admin->>Proxy: POST /chat/completions {metadata: {tags: ["healthcare"]}}
Proxy->>AR: get_attached_policies_with_reasons(context w/ tags)
AR-->>Proxy: matched policies + reasons
Proxy->>Proxy: add_policy_sources_to_metadata()
Proxy-->>Admin: Response + x-litellm-policy-sources header
    """
    Policy resolve and attachment impact estimation endpoints.

    - /policies/resolve — debug which guardrails apply for a given context
    - /policies/attachments/estimate-impact — preview blast radius before creating an attachment
    """
The _describe_match_reason in attachment_registry.py uses + as the delimiter between reasons (e.g. tag:healthcare+team:health-team), but the x-litellm-policy-sources header in callback_utils.py uses = between name and reason and ; between entries (e.g. hipaa=tag:healthcare; baseline=scope:*). The matched_via value itself can contain : — consider documenting the format for downstream consumers.
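For downstream consumers, the format described in this comment could be read with a parser like the one below. It is a sketch under stated assumptions: entries are separated by `;`, name and reason by the first `=`, and individual reasons by `+`, and neither policy names nor tag values contain any of those three characters. The function names are hypothetical.

```python
from typing import Dict, List


def format_policy_sources(sources: Dict[str, List[str]]) -> str:
    """Build a header value such as 'hipaa=tag:healthcare+team:health-team; baseline=scope:*'."""
    return "; ".join(f"{name}={'+'.join(reasons)}" for name, reasons in sources.items())


def parse_policy_sources(header: str) -> Dict[str, List[str]]:
    """Split the header back into {policy_name: [reason, ...]}."""
    parsed: Dict[str, List[str]] = {}
    for entry in header.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the first "=" only; ":" inside reasons is left untouched.
        name, _, reason = entry.partition("=")
        parsed[name] = reason.split("+")
    return parsed


header = "hipaa=tag:healthcare+team:health-team; baseline=scope:*"
parsed = parse_policy_sources(header)
```

The `:` inside each reason separates the match kind from its value and needs no special handling at this level.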
    # ─────────────────────────────────────────────────────────────────────────────
    # Attachment Impact Estimation Endpoint
    # ─────────────────────────────────────────────────────────────────────────────
Same duplicate auth dependency — user_api_key_auth runs twice per request. Remove from decorator.
    # ─────────────────────────────────────────────────────────────────────────────
    # Attachment Impact Estimation Endpoint
    # ─────────────────────────────────────────────────────────────────────────────
    @router.post(
        "/policies/attachments/estimate-impact",
        tags=["Policies"],
        response_model=AttachmentImpactResponse,
    )
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
[Feat] Policies - Allow connecting Policies to Tags, Simulating Policies, Viewing how many keys, teams it applies on
Adds tag-based policy attachments so platform teams can say "all requests with tag healthcare get HIPAA guardrails" without manually assigning policies to every key/team. Also adds a Policy Simulator UI and a blast-radius preview so admins can debug and preview policy behavior before deploying.
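To make the tag scoping concrete, here is a minimal sketch of the opt-in semantics the reviews describe: teams default to `["*"]`, an empty tag list on the scope means tags are not checked, and when tags are set, any request tag matching any scope pattern is enough, combined with the other fields by AND. The `Scope` and `Context` classes are simplified stand-ins for `PolicyScope` and `PolicyMatchContext`, and `fnmatchcase` stands in for the proxy's own wildcard matcher.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass
class Scope:
    teams: List[str] = field(default_factory=lambda: ["*"])
    tags: Optional[List[str]] = None  # None/empty means "do not check tags"


@dataclass
class Context:
    team_alias: Optional[str] = None
    tags: List[str] = field(default_factory=list)


def scope_matches(scope: Scope, ctx: Context) -> bool:
    # Teams: the default ["*"] matches every request in this sketch.
    team = ctx.team_alias or ""
    if not any(fnmatchcase(team, pat) for pat in scope.teams):
        return False
    # Tags: only checked when the scope opts in by listing tag patterns.
    if scope.tags:
        return any(fnmatchcase(tag, pat) for tag in ctx.tags for pat in scope.tags)
    return True


hipaa = Scope(tags=["healthcare"])
tagged_request = Context(team_alias="any-team", tags=["healthcare"])
untagged_request = Context(team_alias="any-team")
```

With this shape, a policy attached only by tag applies to every key and team that sends the tag, which is the "without manually assigning" behavior the description refers to.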
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
tests/litellm/ directory, Adding at least 1 test is a hard requirement - see details
make test-unit
CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🆕 New Feature
✅ Test
Changes