fix: prevent double-counting of litellm_proxy_total_requests_metric #21159
Conversation
@greptile review
Greptile Overview

Greptile Summary

This PR fixes double-counting of litellm_proxy_total_requests_metric.
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| litellm/integrations/prometheus.py | Consolidates litellm_proxy_total_requests_metric increment into async_log_success_event (for both streaming and non-streaming), removes it from async_post_call_success_hook to prevent double-counting, and removes dead code in _increment_token_metrics. The logic is correct. |
| litellm/proxy/utils.py | Rewrites _init_litellm_callbacks to replace string callbacks in-place rather than appending initialized instances, preventing duplicate entries. The approach is sound but has a subtle behavioral change where initialized string callbacks skip add_litellm_callback registration. |
| tests/litellm/proxy/test_init_litellm_callbacks.py | New unit tests for the _init_litellm_callbacks fix. Covers string replacement, deduplication, unrecognized callbacks, and multiple string callbacks. Tests are well-structured but don't verify that callbacks are still functional after replacement. |
| tests/enterprise/litellm_enterprise/enterprise_callbacks/test_prometheus_logging_callbacks.py | Updated test expectations to verify litellm_proxy_total_requests_metric is NOT incremented in async_post_call_success_hook. Changes align with the new behavior. |
| tests/enterprise/litellm_enterprise/integrations/test_prometheus.py | Updated semantic validation tests so that async_log_success_event is expected to increment the counter for both streaming and non-streaming, and async_post_call_success_hook is expected to NOT increment. Assertions and messages updated consistently. |
Sequence Diagram
```mermaid
sequenceDiagram
    participant Client
    participant Proxy as LiteLLM Proxy
    participant Logging as litellm_logging (core)
    participant Prom as PrometheusLogger
    Client->>Proxy: LLM Request
    Proxy->>Logging: Route to LLM provider
    alt Non-Streaming Response
        Logging->>Logging: async_success_handler()
        Logging->>Prom: async_log_success_event()
        Prom->>Prom: litellm_proxy_total_requests_metric.inc() ✅
    else Streaming Response (complete)
        Logging->>Logging: async_success_handler() [stream complete]
        Logging->>Prom: async_log_success_event()
        Prom->>Prom: litellm_proxy_total_requests_metric.inc() ✅
    end
    Proxy->>Prom: async_post_call_success_hook()
    Note over Prom: No-op (pass) — metric NOT incremented here<br/>to prevent double-counting
    Proxy->>Client: Success Response
```
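The consolidated increment can be sketched with a minimal stand-in logger. This is a hypothetical simplification for illustration: SketchLogger, simulate_request, and the Counter-backed metric are invented here and are not the real PrometheusLogger or the prometheus_client API.

```python
import asyncio
from collections import Counter

class SketchLogger:
    """Hypothetical stand-in for PrometheusLogger (not the real class)."""

    def __init__(self):
        # Counter stands in for a prometheus_client Counter.
        self.metrics = Counter()

    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        # Single increment point: fires once per completed request,
        # for both streaming and non-streaming responses.
        self.metrics["litellm_proxy_total_requests_metric"] += 1

    async def async_post_call_success_hook(self, data, user_api_key_dict, response):
        # Intentionally a no-op; incrementing here as well caused double-counting.
        pass

async def simulate_request(logger):
    # Both hooks fire for a successful request, but only one of them increments.
    await logger.async_log_success_event({}, None, 0, 0)
    await logger.async_post_call_success_hook({}, None, None)

logger = SketchLogger()
asyncio.run(simulate_request(logger))
print(logger.metrics["litellm_proxy_total_requests_metric"])  # 1
```

Because the hook is a no-op, running both callbacks per request still yields exactly one increment.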
Last reviewed commit: 0f9f0da
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
@greptile re-review with the change
Greptile Overview

Greptile Summary

This PR aims to fix double-counting of litellm_proxy_total_requests_metric.
Confidence Score: 1/5
| Filename | Overview |
|---|---|
| litellm/proxy/utils.py | Critical bug: string_callbacks_to_replace dict is never populated because the assignment was removed in the second commit, so no string callbacks are ever initialized or replaced. The else branch for non-string callbacks was also removed, breaking callback registration entirely. |
| litellm/integrations/prometheus.py | Consolidates litellm_proxy_total_requests_metric increment into async_log_success_event for both streaming and non-streaming, and removes the duplicate increment from async_post_call_success_hook. Logic change is sound. |
| tests/litellm/proxy/test_init_litellm_callbacks.py | New tests assert correct behavior (string callbacks replaced in-place), but the current implementation code they test has a bug — these tests would fail when run against the code in this PR. |
| tests/enterprise/litellm_enterprise/enterprise_callbacks/test_prometheus_logging_callbacks.py | Updated to assert that async_post_call_success_hook no longer increments the total requests metric. Changes are consistent with the prometheus.py changes. |
| tests/enterprise/litellm_enterprise/integrations/test_prometheus.py | Updated semantic validation tests to expect that async_log_success_event always increments (for both streaming and non-streaming), and that async_post_call_success_hook does not. Correct test updates. |
Flowchart
```mermaid
flowchart TD
    A["litellm.callbacks = ['prometheus']"] --> B["_init_litellm_callbacks()"]
    B --> C{"Is callback a string?"}
    C -->|Yes| D["_init_custom_logger_compatible_class()"]
    D --> E{"initialized_callback is not None?"}
    E -->|Yes - BUG: this branch missing| F["string_callbacks_to_replace[idx] = initialized_callback"]
    E -->|No| G["Skip (leave string in list)"]
    C -->|No - BUG: else branch missing| H["add_litellm_callback(callback)"]
    F --> I["Replace string in litellm.callbacks"]
    I --> J["add_litellm_callback(initialized_callback)"]
    subgraph "Prometheus Metric Flow (Fixed correctly)"
        K["async_log_success_event()"] -->|"Always increments"| L["litellm_proxy_total_requests_metric.inc()"]
        M["async_post_call_success_hook()"] -->|"Now pass (no-op)"| N["No increment"]
    end
    style E fill:#ff6666,stroke:#cc0000,color:#000
    style C fill:#ff6666,stroke:#cc0000,color:#000
    style K fill:#66cc66,stroke:#009900,color:#000
    style M fill:#66cc66,stroke:#009900,color:#000
```
Last reviewed commit: b672cc0
litellm/proxy/utils.py
Outdated
```diff
 for idx, callback in enumerate(litellm.callbacks):
     if isinstance(callback, str):
-        callback = litellm.litellm_core_utils.litellm_logging._init_custom_logger_compatible_class(  # type: ignore
+        initialized_callback = litellm.litellm_core_utils.litellm_logging._init_custom_logger_compatible_class(
             cast(_custom_logger_compatible_callbacks_literal, callback),
             internal_usage_cache=self.internal_usage_cache.dual_cache,
             llm_router=llm_router,
         )
-    if callback is None:
-        continue
-    litellm.logging_callback_manager.add_litellm_callback(callback)
+# Replace string entries in litellm.callbacks with initialized instances
+for idx, initialized_callback in string_callbacks_to_replace.items():
+    litellm.callbacks[idx] = initialized_callback
+    litellm.logging_callback_manager.add_litellm_callback(initialized_callback)
```
string_callbacks_to_replace is never populated — callbacks silently break
The string_callbacks_to_replace dict is created at line 457 but never receives any entries. The first commit (0f9f0da4) correctly had:

```python
if initialized_callback is not None:
    string_callbacks_to_replace[idx] = initialized_callback
```

The second commit (b672cc0c) removed this assignment without replacing it. As a result, the replacement loop at lines 468-470 iterates over an empty dict, and no string callbacks are ever initialized or replaced. This means the entire fix for the double-counting bug is ineffective: string callbacks like "prometheus" remain as raw strings in litellm.callbacks and are never turned into PrometheusLogger instances via this code path.
Additionally, the else branch that registered pre-existing CustomLogger instances with add_litellm_callback was also removed, so non-string callbacks are no longer registered with the callback manager either.
Suggested change:

```diff
 for idx, callback in enumerate(litellm.callbacks):
     if isinstance(callback, str):
         initialized_callback = litellm.litellm_core_utils.litellm_logging._init_custom_logger_compatible_class(
             cast(_custom_logger_compatible_callbacks_literal, callback),
             internal_usage_cache=self.internal_usage_cache.dual_cache,
             llm_router=llm_router,
         )
-    if callback is None:
-        continue
-    litellm.logging_callback_manager.add_litellm_callback(callback)
+        if initialized_callback is not None:
+            string_callbacks_to_replace[idx] = initialized_callback
+    else:
+        litellm.logging_callback_manager.add_litellm_callback(callback)
 # Replace string entries in litellm.callbacks with initialized instances
 for idx, initialized_callback in string_callbacks_to_replace.items():
     litellm.callbacks[idx] = initialized_callback
     litellm.logging_callback_manager.add_litellm_callback(initialized_callback)
```
@greptile re-review the commit. I reversed the change that led to a bug.
Greptile Overview

Greptile Summary

This PR fixes double-counting of litellm_proxy_total_requests_metric.
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| litellm/proxy/utils.py | Refactored _init_litellm_callbacks to replace string callbacks in-place rather than appending initialized instances, preventing string+instance duplicates. Has minor trailing whitespace. Logic is correct. |
| litellm/integrations/prometheus.py | Moves litellm_proxy_total_requests_metric increment from two locations (streaming in async_log_success_event + all in async_post_call_success_hook) to a single location (async_log_success_event for all requests). Removes dead code in _increment_token_metrics. Correctly fixes double-counting for streaming requests. |
| tests/litellm/proxy/test_init_litellm_callbacks.py | New test file with 4 well-structured tests covering string callback replacement, existing instance deduplication, unrecognized callback handling, and multiple callback replacement. Good coverage of the new in-place replacement logic. |
| tests/enterprise/litellm_enterprise/enterprise_callbacks/test_prometheus_logging_callbacks.py | Updated test for async_post_call_success_hook to assert the metric is NOT incremented (was previously asserting it WAS incremented). Correctly reflects the new behavior. |
| tests/enterprise/litellm_enterprise/integrations/test_prometheus.py | Updated semantic validation tests to assert litellm_proxy_total_requests_metric is incremented once in async_log_success_event (for both streaming and non-streaming) and not again in async_post_call_success_hook. Correctly validates the fix. |
Flowchart
```mermaid
flowchart TD
    A[LLM Request Completes] --> B{Request Type?}
    B -->|Streaming| C[async_log_success_event]
    B -->|Non-Streaming| C
    C --> D[_increment_top_level_request_and_spend_metrics]
    C --> E["Increment litellm_proxy_total_requests_metric ✅ (single location)"]
    C --> F[_increment_token_metrics]
    A --> G[async_post_call_success_hook]
    G --> H["pass (no-op) — metric NOT incremented here"]
    style E fill:#4CAF50,color:#fff
    style H fill:#FF9800,color:#fff
```
Last reviewed commit: 2113d08
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Poetry Lock CI Failure - Action Needed

Hi! 👋 Your PR is failing CI with this error:

The Issue

Your PR was created before the latest poetry.lock sync on main.

Solution

Please rebase your branch onto the latest main to pick up the synchronized lock file:

```shell
git fetch upstream main  # or 'origin' if that's your main remote
git rebase upstream/main
git push --force-with-lease
```

This will resolve the poetry lock error and your tests should run properly. ✅ No changes needed to your code - this is just a matter of updating the base! 🚀
@greptile re-review with the new commit now that I have removed the else branch, as it was redundant for non-string callbacks.
Greptile Overview

Greptile Summary

This PR fixes double-counting of litellm_proxy_total_requests_metric.
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| litellm/integrations/prometheus.py | Consolidates litellm_proxy_total_requests_metric increment to async_log_success_event for both streaming and non-streaming requests, removes it from async_post_call_success_hook and cleans up dead code in _increment_token_metrics. |
| litellm/proxy/utils.py | Rewrites _init_litellm_callbacks to replace string callbacks in-place rather than appending instances, preventing duplicates. However, initialized callbacks are no longer registered via add_litellm_callback; this is currently functionally equivalent but bypasses the callback manager's dedup/validation layer. |
| tests/litellm/proxy/test_init_litellm_callbacks.py | New test file with thorough unit tests covering the in-place replacement of string callbacks, handling of unrecognized strings, and prevention of duplicates. |
| tests/enterprise/litellm_enterprise/enterprise_callbacks/test_prometheus_logging_callbacks.py | Updates test_async_post_call_success_hook to assert the metric is NOT incremented, matching the new behavior. |
| tests/enterprise/litellm_enterprise/integrations/test_prometheus.py | Updates semantic validation tests to expect the metric increment in async_log_success_event for both streaming and non-streaming, and confirms async_post_call_success_hook no longer increments. |
Flowchart
```mermaid
flowchart TD
    A["Proxy startup: _init_litellm_callbacks()"] --> B{"litellm.callbacks entry is string?"}
    B -- Yes --> C["Initialize via _init_custom_logger_compatible_class()"]
    C --> D{"Initialization returned instance?"}
    D -- Yes --> E["Replace string in-place at same index"]
    D -- No --> F["Leave string in litellm.callbacks"]
    B -- No --> G["Skip (already an instance)"]
    H["Request succeeds"] --> I["async_log_success_event fires"]
    I --> J["Increment litellm_proxy_total_requests_metric (streaming + non-streaming)"]
    H --> K["async_post_call_success_hook fires"]
    K --> L["No-op (pass) — metric NOT incremented here"]
```
Last reviewed commit: ad07e53
Merged d448682 into litellm_oss_staging_02_16_2026
Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
Root cause
When "prometheus" was added as a string to litellm.callbacks, _init_litellm_callbacks() converted it to a PrometheusLogger instance and added it via add_litellm_callback() without removing the original string. Both the string and the instance stayed in litellm.callbacks. post_call_success_hook() iterated over both entries, so async_post_call_success_hook() ran twice per request and incremented litellm_proxy_total_requests_metric twice.
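The root cause above can be reproduced with a minimal sketch, assuming a simplified callback list and hook dispatch: PromLogger, init_callbacks_buggy, init_callbacks_fixed, and run_request are hypothetical stand-ins for the real proxy machinery, not LiteLLM APIs.

```python
import asyncio

class PromLogger:
    """Hypothetical stand-in for PrometheusLogger."""
    instance = None  # strings in the callback list resolve to this singleton

    def __init__(self):
        self.total_requests = 0

    async def async_post_call_success_hook(self):
        self.total_requests += 1

def init_callbacks_buggy(callbacks):
    # Old behavior: append the initialized instance but keep the string entry.
    logger = PromLogger()
    PromLogger.instance = logger
    callbacks.append(logger)
    return logger

def init_callbacks_fixed(callbacks):
    # New behavior: replace the string in place at the same index.
    logger = PromLogger()
    PromLogger.instance = logger
    for idx, cb in enumerate(callbacks):
        if isinstance(cb, str):
            callbacks[idx] = logger
    return logger

async def run_request(callbacks):
    # post_call_success_hook iterates every entry; strings resolve to the instance.
    for cb in callbacks:
        cb = PromLogger.instance if isinstance(cb, str) else cb
        await cb.async_post_call_success_hook()

callbacks = ["prometheus"]
logger = init_callbacks_buggy(callbacks)
asyncio.run(run_request(callbacks))
print(logger.total_requests)  # 2: the hook ran for both the string and the instance

callbacks = ["prometheus"]
logger = init_callbacks_fixed(callbacks)
asyncio.run(run_request(callbacks))
print(logger.total_requests)  # 1: single entry, single increment
```

With the old append behavior the list holds both the string and the instance, so one request increments twice; in-place replacement keeps a single entry.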
The metric after one non-streaming request (screenshots):

Before: (screenshot)

After: (screenshot)
- Added at least 1 test in the tests/litellm/ directory (adding at least 1 test is a hard requirement - see details)
- make test-unit
- Reviewed by @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🐛 Bug Fix
✅ Test
Changes
Fixes
litellm/proxy/utils.py – In _init_litellm_callbacks(), string callbacks are now replaced in-place with their initialized instances instead of appending them, so each callback appears only once in litellm.callbacks.
litellm/integrations/prometheus.py – The metric is incremented only in async_log_success_event (for both streaming and non-streaming). The increment was removed from async_post_call_success_hook to avoid double-counting. Dead code in _increment_token_metrics was removed.
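The in-place replacement described above can be sketched as follows. This is a hedged simplification: init_string_callbacks, make_logger, and FakeLogger are hypothetical names invented for illustration; the real logic lives in _init_litellm_callbacks and uses _init_custom_logger_compatible_class.

```python
def init_string_callbacks(callbacks, make_logger):
    """Replace recognized string callbacks in place; leave everything else alone."""
    replacements = {}
    for idx, cb in enumerate(callbacks):
        if isinstance(cb, str):
            logger = make_logger(cb)  # returns None for unrecognized strings
            if logger is not None:
                replacements[idx] = logger
    for idx, logger in replacements.items():
        callbacks[idx] = logger       # replace at the same index, never append
    return callbacks

class FakeLogger:
    """Hypothetical logger class standing in for PrometheusLogger."""
    def __init__(self, name):
        self.name = name

cbs = init_string_callbacks(
    ["prometheus", "unknown"],
    lambda name: FakeLogger(name) if name == "prometheus" else None,
)
print(len(cbs))               # 2: list length unchanged, no duplicate entries
print(type(cbs[0]).__name__)  # FakeLogger: string replaced in place
print(cbs[1])                 # unknown: unrecognized string left as-is
```

Replacing at the same index preserves list length and ordering, which is what prevents a string and its initialized instance from coexisting in litellm.callbacks.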