Conversation
Greptile Overview

Greptile Summary

Adds Semgrep to CI as a release gate and bounds the previously unbounded asyncio queues (GCS logging, DB spend-update queues, and the Nova Sonic audio cookbook) so they can no longer grow without limit.
Confidence Score: 2/5
| Filename | Overview |
|---|---|
| .circleci/config.yml | Adds semgrep CI job with custom rules only, gates publish_to_pypi on semgrep passing. Clean addition. |
| cookbook/nova_sonic_realtime.py | Bounds audio queue with configurable maxsize (default 10,000). Uses shared env var name LITELLM_ASYNCIO_QUEUE_MAXSIZE but different default (10,000 vs 1,000). |
| litellm/constants.py | Adds LITELLM_ASYNCIO_QUEUE_MAXSIZE constant (default 1000). Clean addition. |
| litellm/integrations/gcs_bucket/gcs_bucket.py | Bounds GCS log queue and adds pre-put flush. Race condition: full() + flush_queue() + put() is not atomic; concurrent producers can cause put() to still block. Also flush_queue() override does not use the flush_lock from the parent class. |
| litellm/proxy/db/db_transaction_queue/base_update_queue.py | Bounds base queue to LITELLM_ASYNCIO_QUEUE_MAXSIZE (1000). add_update() uses blocking put() which will block callers on the request path when queue is full. |
| litellm/proxy/db/db_transaction_queue/daily_spend_update_queue.py | Overrides queue with same bounded maxsize. Aggregation check (qsize >= 2000) is unreachable since queue maxsize is 1000. |
| litellm/proxy/db/db_transaction_queue/spend_update_queue.py | Same issue as daily_spend_update_queue: aggregation check at 2000 is unreachable with maxsize 1000. |
Sequence Diagram
```mermaid
sequenceDiagram
    participant Req as Request Handler
    participant SQ as SpendUpdateQueue
    participant AQ as asyncio.Queue(maxsize=1000)
    participant Sched as Scheduler (10-15s)
    participant DB as Database
    Req->>SQ: add_update(spend_item)
    SQ->>AQ: await put(item)
    alt Queue not full
        AQ-->>SQ: item added
        SQ->>SQ: check qsize >= 2000 (unreachable)
    else Queue full (1000 items)
        AQ-->>SQ: BLOCKS until space available
        Note over Req,SQ: Request stalls here
    end
    loop Every 10-15 seconds
        Sched->>SQ: flush_all_updates_from_in_memory_queue()
        SQ->>AQ: get() up to 1000 items
        AQ-->>SQ: drained items
        SQ->>DB: write aggregated spend
        Note over AQ: Space freed, blocked put() resumes
    end
```
```python
if self.log_queue.full():
    await self.flush_queue()
await self.log_queue.put(
```
TOCTOU race: between full() returning True and put() executing, concurrent coroutines can re-fill the queue after the flush. The put() on line 78 can still block. Also, the overridden flush_queue() (line 381) doesn't acquire self.flush_lock, so concurrent flushes from periodic_flush() and this manual flush can race.
Consider using put_nowait() in a try/except QueueFull block to avoid blocking the logging path.
```python
from typing import Optional

# Bounded queue size for audio chunks (configurable via env to avoid unbounded memory)
AUDIO_QUEUE_MAXSIZE = int(os.getenv("LITELLM_ASYNCIO_QUEUE_MAXSIZE", 10_000))
```
This uses the same env var LITELLM_ASYNCIO_QUEUE_MAXSIZE as the proxy queues but with a different default (10,000 vs 1,000). This could confuse users who set the env var expecting uniform behavior — proxy queues get 1000 from constants.py while this cookbook gets 10,000 from the hardcoded fallback here.
Additional Comments (3)
Either lower
```
…e OpenAI (#20883)

* feat: add opus 4.5 and 4.6 to use outout_format param
* generate poetry lock with 2.3.2 poetry
* restore poetry lock
* e2e tests, key delete, update tpm rpm, and regenerate
* Split e2e ui testing for browser
* new login with sso button in login page
* option to hide usage indicator
* fix(cloudzero): update CBF field mappings per LIT-1907 (#20906)

  * fix(cloudzero): update CBF field mappings per LIT-1907

    Phase 1 field updates for CloudZero integration:

    ADD/UPDATE:
    - resource/account: Send concat(api_key_alias, '|', api_key_prefix)
    - resource/service: Send model_group instead of service_type
    - resource/usage_family: Send provider instead of hardcoded 'llm-usage'
    - action/operation: NEW - Send team_id
    - resource/id: Send model name instead of CZRN
    - resource/tag:organization_alias: Add if exists
    - resource/tag:project_alias: Add if exists
    - resource/tag:user_alias: Add if exists

    REMOVE:
    - resource/tag:total_tokens: Removed
    - resource/tag:team_id: Removed (team_id now in action/operation)

    Fixes LIT-1907

  * Update litellm/integrations/cloudzero/transform.py

    Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

  * fix: define api_key_alias variable, update CBFRecord docstring
    - Fix F821 lint error: api_key_alias was used but not defined
    - Update CBFRecord docstring to reflect LIT-1907 field mappings
    - Remove unused Optional import

  ---------

  Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
  Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* Add banner notifying of breaking change
* Add semgrep & Fix OOMs (#20912)
* [Feat] Policies - Allow connecting Policies to Tags, Simulating Policies, Viewing how many keys, teams it applies on (#20904)

  * init schema with TAGS
  * ui: add policy test
  * resolvePoliciesCall
  * add_policy_sources_to_metadata + headers
  * types Policy
  * preview Impact
  * def _describe_match_reason(
  * match based on TAGs
  * TestTagBasedAttachments
  * test fixes
  * add policy_resolve_router
  * add_guardrails_from_policy_engine
  * TestMatchAttribution
  * refactor
  * fix
  * fix: address Greptile review feedback on policy resolve endpoints
    - Track unnamed keys/teams as separate counts instead of inflating affected_keys_count with duplicate "(unnamed key)" placeholders. Added unnamed_keys_count and unnamed_teams_count to response.
    - Push alias pattern matching to DB via _build_alias_where() which converts exact patterns to Prisma "in" and suffix wildcards to "startsWith" filters.
    - Gate sync_policies_from_db/sync_attachments_from_db behind force_sync query param (default false) to avoid 2 DB round-trips on every /policies/resolve request.
    - Remove worktree-only conftest.py that cleared sys.modules at import time — no longer needed since code moved to main repo.
    - Rename MAX_ESTIMATE_IMPACT_ROWS → MAX_POLICY_ESTIMATE_IMPACT_ROWS.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

  * fix: eliminate duplicate DB queries and fix header delimiter ambiguity
    - Fetch teams table once in estimate_attachment_impact and reuse for both tag-based and alias-based lookups (was querying teams twice when both tag_patterns and team_patterns were provided).
    - Convert tag/team filter functions from async DB queries to sync filters that operate on pre-fetched data (_filter_keys_by_tags, _filter_teams_by_tags).
    - Fix comma ambiguity in x-litellm-policy-sources header: use '; ' as entry delimiter since matched_via values can contain commas.
    - Use '+' as the within-value separator in matched_via reason strings (e.g. "tag:healthcare+team:health-team") to avoid conflict with header delimiters.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

  * Update litellm/proxy/policy_engine/policy_resolve_endpoints.py

    Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

  ---------

  Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
  Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* fix: type error & better error handling (#20689)
* [Docs] Add docs guide for using policies (#20914)

  (repeats the #20904 policy commit list above, then:)

  * docs v1 guide with UI imgs
  * docs fix

  ---------

  Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add dashscope/qwen3-max model with tiered pricing (#20919)

  Add support for Alibaba Cloud's Qwen3-Max model with:
  - 258K input tokens, 65K output tokens
  - Tiered pricing based on context window usage (0-32K, 32K-128K, 128K-252K)
  - Function calling and tool choice support
  - Reasoning capabilities enabled

  Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix linting
* docs: add Greptile review requirement to PR template (#20762)
* fix(azure): preserve content_policy_violation error details from Azure OpenAI

  Closes #20811

  Azure OpenAI returns rich error payloads for content policy violations (inner_error with ResponsibleAIPolicyViolation, content_filter_results, revised_prompt). Previously these details were lost when:
  1. The top-level error code was not "content_policy_violation" but the inner_error.code was "ResponsibleAIPolicyViolation" -- the structured check only examined the top-level code.
  2. The DALL-E image generation polling path stringified the error JSON into the message field instead of setting the structured body, making it impossible for exception_type() to extract error details.
  3. The string-based fallback detector used "invalid_request_error" as a content-policy indicator, which is too broad and could misclassify regular bad-request errors.

  Changes:
  - exception_mapping_utils.py: Check inner_error.code for ResponsibleAIPolicyViolation when top-level code is not content_policy_violation. Replace overly broad "invalid_request_error" string match with specific Azure safety-system messages.
  - azure.py: Set structured body on AzureOpenAIError in both async and sync DALL-E polling paths so exception_type() can inspect error details.
  - test_azure_exception_mapping.py: Add regression tests covering the exact error payloads from issue #20811.
  - Fix pre-existing lint: duplicate PerplexityResponsesConfig dict key, unused RouteChecks top-level import.

---------

Co-authored-by: Kelvin Tran <kelvin-tran@users.noreply.github.com>
Co-authored-by: yuneng-jiang <yuneng.jiang@gmail.com>
Co-authored-by: shin-bot-litellm <shin-bot-litellm@berri.ai>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Alexsander Hamir <alexsanderhamirgomesbaptista@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Harshit Jain <48647625+Harshit28j@users.noreply.github.com>
Co-authored-by: ken <122603020@qq.com>
Co-authored-by: Sameer Kankute <sameer@berri.ai>
```
This reverts commit b7993b1.
Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- Testing in the `tests/litellm/` directory - adding at least 1 test is a hard requirement - see details
- `make test-unit`
- CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🧹 Refactoring
Changes