
fix: prevent double-counting of litellm_proxy_total_requests_metric#21159

Merged
krrishdholakia merged 5 commits into litellm_oss_staging_02_16_2026 from litellm_double_counting_promo_metric
Feb 16, 2026

Conversation


@shivamrawat1 shivamrawat1 commented Feb 13, 2026

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR
Root cause
When "prometheus" was added as a string to litellm.callbacks, _init_litellm_callbacks() converted it to a PrometheusLogger instance and added it via add_litellm_callback() without removing the original string. Both the string and the instance stayed in litellm.callbacks. post_call_success_hook() iterated over both entries, so async_post_call_success_hook() ran twice per request and incremented litellm_proxy_total_requests_metric twice.
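The failure mode is easy to reproduce in isolation. The sketch below uses simplified stand-ins (this `PrometheusLogger` is a toy class, not litellm's), but follows the same shape: the string stays in the list after the instance is appended, so the hook fires twice per request.

```python
import asyncio

# Toy stand-in for the real PrometheusLogger; only tracks a counter.
class PrometheusLogger:
    def __init__(self):
        self.total_requests = 0

    async def async_post_call_success_hook(self):
        self.total_requests += 1

callbacks = ["prometheus"]   # user config: callback given as a string
logger = PrometheusLogger()

# Buggy init: the instance is appended, but the original string stays behind.
callbacks.append(logger)

async def post_call_success_hook():
    # Both the string and the instance resolve to the same logger,
    # so the hook runs twice for a single request.
    for cb in callbacks:
        cb = logger if cb == "prometheus" else cb
        await cb.async_post_call_success_hook()

asyncio.run(post_call_success_hook())
print(logger.total_requests)  # 2 for one request: double-counted
```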

The metric after one non-streaming request:

Before: [screenshot: Screenshot 2026-02-13 at 6 43 06 PM]

After: [screenshot: Screenshot 2026-02-13 at 6 36 39 PM]

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement; see details)
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable, but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🐛 Bug Fix
✅ Test

Changes

Fixes
litellm/proxy/utils.py – In _init_litellm_callbacks(), string callbacks are now replaced in-place with their initialized instances instead of appending them, so each callback appears only once in litellm.callbacks.

litellm/integrations/prometheus.py – The metric is incremented only in async_log_success_event (for both streaming and non-streaming). The increment was removed from async_post_call_success_hook to avoid double-counting. Dead code in _increment_token_metrics was removed.
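As a rough illustration of the in-place replacement described above (all names here, including `_init_logger` and `FakePrometheusLogger`, are hypothetical stand-ins for the real litellm helpers):

```python
# Sketch of the fix: recognized strings are replaced at their index,
# never appended, so each callback appears exactly once.
class FakePrometheusLogger:
    pass

def _init_logger(name):
    # Stand-in for _init_custom_logger_compatible_class:
    # returns an instance for recognized names, None otherwise.
    return FakePrometheusLogger() if name == "prometheus" else None

def init_callbacks(callbacks):
    replacements = {}
    for idx, cb in enumerate(callbacks):
        if isinstance(cb, str):
            instance = _init_logger(cb)
            if instance is not None:
                replacements[idx] = instance
    # Replace strings in-place; unrecognized strings are left untouched.
    for idx, instance in replacements.items():
        callbacks[idx] = instance
    return callbacks

cbs = init_callbacks(["prometheus", "unknown_logger"])
print(len(cbs))                                   # 2: nothing was appended
print(isinstance(cbs[0], FakePrometheusLogger))   # True: replaced in-place
print(cbs[1])                                     # unknown_logger: left as-is
```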


vercel bot commented Feb 13, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
litellm | Ready | Preview, Comment | Feb 14, 2026 6:19pm

Request Review

@shivamrawat1
Collaborator Author

@greptile review


greptile-apps bot commented Feb 13, 2026

Greptile Overview

Greptile Summary

This PR fixes double-counting of litellm_proxy_total_requests_metric in Prometheus by addressing two root causes: (1) string callbacks in litellm.callbacks were being appended as initialized instances without removing the original string, causing the same callback to fire twice, and (2) the metric was being incremented in both async_log_success_event (for streaming) and async_post_call_success_hook (for all requests), causing streaming requests to be counted twice.

  • litellm/proxy/utils.py: _init_litellm_callbacks() now replaces string callbacks in-place with initialized instances instead of appending, preventing duplicate entries in litellm.callbacks.
  • litellm/integrations/prometheus.py: litellm_proxy_total_requests_metric is now incremented only in async_log_success_event for all successful requests (both streaming and non-streaming). The increment was removed from async_post_call_success_hook, and dead code in _increment_token_metrics was cleaned up.
  • Tests: New unit tests validate the in-place replacement behavior, and existing enterprise tests were updated to match the new single-increment semantics.
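The single-increment semantics described above can be sketched with a toy class (`FixedLogger` and `simulate` are hypothetical, not litellm code):

```python
import asyncio

# Toy model of the fixed logger: one increment point, no-op post-call hook.
class FixedLogger:
    def __init__(self):
        self.total_requests = 0

    async def async_log_success_event(self, stream):
        # Single increment point for both streaming and non-streaming.
        self.total_requests += 1

    async def async_post_call_success_hook(self):
        pass  # increment removed here to prevent double-counting

async def simulate():
    logger = FixedLogger()
    for stream in (True, False):  # one streaming + one non-streaming request
        await logger.async_log_success_event(stream=stream)
        await logger.async_post_call_success_hook()
    return logger.total_requests

print(asyncio.run(simulate()))  # 2: exactly one increment per request
```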

Confidence Score: 4/5

  • This PR is safe to merge — it correctly fixes the double-counting bug with a well-reasoned approach and includes comprehensive tests.
  • The fix correctly consolidates the metric increment into a single location (async_log_success_event) and resolves the callback duplication in _init_litellm_callbacks. Both root causes are addressed with clear code and comments. Tests cover the callback replacement logic and updated Prometheus metric semantics. One minor style concern: initialized string callbacks bypass add_litellm_callback, which is functionally harmless today but could be a maintenance risk if that method gains side effects in the future.
  • litellm/proxy/utils.py — the in-place replacement of string callbacks bypasses add_litellm_callback registration, which could matter if that method gains side effects in the future.

Important Files Changed

Filename Overview
litellm/integrations/prometheus.py Consolidates litellm_proxy_total_requests_metric increment into async_log_success_event (for both streaming and non-streaming), removes it from async_post_call_success_hook to prevent double-counting, and removes dead code in _increment_token_metrics. The logic is correct.
litellm/proxy/utils.py Rewrites _init_litellm_callbacks to replace string callbacks in-place rather than appending initialized instances, preventing duplicate entries. The approach is sound but has a subtle behavioral change where initialized string callbacks skip add_litellm_callback registration.
tests/litellm/proxy/test_init_litellm_callbacks.py New unit tests for the _init_litellm_callbacks fix. Covers string replacement, deduplication, unrecognized callbacks, and multiple string callbacks. Tests are well-structured but don't verify that callbacks are still functional after replacement.
tests/enterprise/litellm_enterprise/enterprise_callbacks/test_prometheus_logging_callbacks.py Updated test expectations to verify litellm_proxy_total_requests_metric is NOT incremented in async_post_call_success_hook. Changes align with the new behavior.
tests/enterprise/litellm_enterprise/integrations/test_prometheus.py Updated semantic validation tests so that async_log_success_event is expected to increment the counter for both streaming and non-streaming, and async_post_call_success_hook is expected to NOT increment. Assertions and messages updated consistently.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Proxy as LiteLLM Proxy
    participant Logging as litellm_logging (core)
    participant Prom as PrometheusLogger

    Client->>Proxy: LLM Request
    Proxy->>Logging: Route to LLM provider

    alt Non-Streaming Response
        Logging->>Logging: async_success_handler()
        Logging->>Prom: async_log_success_event()
        Prom->>Prom: litellm_proxy_total_requests_metric.inc() ✅
    else Streaming Response (complete)
        Logging->>Logging: async_success_handler() [stream complete]
        Logging->>Prom: async_log_success_event()
        Prom->>Prom: litellm_proxy_total_requests_metric.inc() ✅
    end

    Proxy->>Prom: async_post_call_success_hook()
    Note over Prom: No-op (pass) — metric NOT incremented here<br/>to prevent double-counting

    Proxy->>Client: Success Response

Last reviewed commit: 0f9f0da

@greptile-apps greptile-apps bot left a comment

5 files reviewed, 1 comment
@shivamrawat1
Collaborator Author

@greptile re-review with the change


greptile-apps bot commented Feb 14, 2026

Greptile Overview

Greptile Summary

This PR aims to fix double-counting of litellm_proxy_total_requests_metric by: (1) consolidating the metric increment into async_log_success_event for all requests (both streaming and non-streaming), removing it from async_post_call_success_hook; and (2) replacing string callbacks in-place in litellm.callbacks to prevent both a string and an initialized instance from coexisting.

  • Prometheus metric consolidation (litellm/integrations/prometheus.py): The litellm_proxy_total_requests_metric increment is moved out of the if stream is True conditional in async_log_success_event so it fires for all successful requests. The async_post_call_success_hook is gutted to a pass to avoid the second increment. This change is logically correct.
  • Critical bug in callback initialization (litellm/proxy/utils.py): The first commit (0f9f0da4) correctly implemented the in-place replacement logic, but the second commit (b672cc0c) accidentally removed the lines that populate string_callbacks_to_replace and the else branch for non-string callbacks. As a result, string_callbacks_to_replace is always empty, no string callbacks are ever initialized, and non-string callbacks are never registered with the callback manager. The new unit tests in tests/litellm/proxy/test_init_litellm_callbacks.py would fail against the current code.
  • Test updates in the enterprise test files are consistent with the prometheus.py changes and correctly validate the new behavior.

Confidence Score: 1/5

  • This PR has a critical logic bug in litellm/proxy/utils.py that silently breaks all string callback initialization.
  • The second commit (b672cc0) removed the lines that populate the string_callbacks_to_replace dictionary and the else branch for non-string callbacks. This means no string callbacks (like "prometheus") are ever initialized or replaced, and no pre-existing instance callbacks are registered with the callback manager. The Prometheus metric consolidation in prometheus.py is correct, but the callback initialization bug makes the overall PR unsafe to merge. The new unit tests would fail against the current code.
  • litellm/proxy/utils.py requires immediate attention — the _init_litellm_callbacks method has a critical bug where string_callbacks_to_replace is never populated.

Important Files Changed

Filename Overview
litellm/proxy/utils.py Critical bug: string_callbacks_to_replace dict is never populated because the assignment was removed in the second commit, so no string callbacks are ever initialized or replaced. The else branch for non-string callbacks was also removed, breaking callback registration entirely.
litellm/integrations/prometheus.py Consolidates litellm_proxy_total_requests_metric increment into async_log_success_event for both streaming and non-streaming, and removes the duplicate increment from async_post_call_success_hook. Logic change is sound.
tests/litellm/proxy/test_init_litellm_callbacks.py New tests assert correct behavior (string callbacks replaced in-place), but the current implementation code they test has a bug — these tests would fail when run against the code in this PR.
tests/enterprise/litellm_enterprise/enterprise_callbacks/test_prometheus_logging_callbacks.py Updated to assert that async_post_call_success_hook no longer increments the total requests metric. Changes are consistent with the prometheus.py changes.
tests/enterprise/litellm_enterprise/integrations/test_prometheus.py Updated semantic validation tests to expect that async_log_success_event always increments (for both streaming and non-streaming), and that async_post_call_success_hook does not. Correct test updates.

Flowchart

flowchart TD
    A["litellm.callbacks = ['prometheus']"] --> B["_init_litellm_callbacks()"]
    B --> C{"Is callback a string?"}
    C -->|Yes| D["_init_custom_logger_compatible_class()"]
    D --> E{"initialized_callback is not None?"}
    E -->|Yes - BUG: this branch missing| F["string_callbacks_to_replace[idx] = initialized_callback"]
    E -->|No| G["Skip (leave string in list)"]
    C -->|No - BUG: else branch missing| H["add_litellm_callback(callback)"]
    F --> I["Replace string in litellm.callbacks"]
    I --> J["add_litellm_callback(initialized_callback)"]

    subgraph "Prometheus Metric Flow (Fixed correctly)"
        K["async_log_success_event()"] -->|"Always increments"| L["litellm_proxy_total_requests_metric.inc()"]
        M["async_post_call_success_hook()"] -->|"Now pass (no-op)"| N["No increment"]
    end

    style E fill:#ff6666,stroke:#cc0000,color:#000
    style C fill:#ff6666,stroke:#cc0000,color:#000
    style K fill:#66cc66,stroke:#009900,color:#000
    style M fill:#66cc66,stroke:#009900,color:#000

Last reviewed commit: b672cc0

@greptile-apps greptile-apps bot left a comment

5 files reviewed, 1 comment

Comment on lines +459 to +470
    for idx, callback in enumerate(litellm.callbacks):
        if isinstance(callback, str):
-           callback = litellm.litellm_core_utils.litellm_logging._init_custom_logger_compatible_class(  # type: ignore
+           initialized_callback = litellm.litellm_core_utils.litellm_logging._init_custom_logger_compatible_class(
                cast(_custom_logger_compatible_callbacks_literal, callback),
                internal_usage_cache=self.internal_usage_cache.dual_cache,
                llm_router=llm_router,
            )

        if callback is None:
            continue

        litellm.logging_callback_manager.add_litellm_callback(callback)

    # Replace string entries in litellm.callbacks with initialized instances
    for idx, initialized_callback in string_callbacks_to_replace.items():
        litellm.callbacks[idx] = initialized_callback
        litellm.logging_callback_manager.add_litellm_callback(initialized_callback)

string_callbacks_to_replace is never populated — callbacks silently break

The string_callbacks_to_replace dict is created at line 457 but never receives any entries. The first commit (0f9f0da4) correctly had:

if initialized_callback is not None:
    string_callbacks_to_replace[idx] = initialized_callback

The second commit (b672cc0c) removed this assignment without replacing it. As a result, the replacement loop at lines 468-470 iterates over an empty dict, and no string callbacks are ever initialized or replaced. This means the entire fix for the double-counting bug is ineffective — string callbacks like "prometheus" remain as raw strings in litellm.callbacks and are never turned into PrometheusLogger instances via this code path.

Additionally, the else branch that registered pre-existing CustomLogger instances with add_litellm_callback was also removed, so non-string callbacks are no longer registered with the callback manager either.

Suggested change

Current:

    for idx, callback in enumerate(litellm.callbacks):
        if isinstance(callback, str):
            initialized_callback = litellm.litellm_core_utils.litellm_logging._init_custom_logger_compatible_class(
                cast(_custom_logger_compatible_callbacks_literal, callback),
                internal_usage_cache=self.internal_usage_cache.dual_cache,
                llm_router=llm_router,
            )
        if callback is None:
            continue
        litellm.logging_callback_manager.add_litellm_callback(callback)
    # Replace string entries in litellm.callbacks with initialized instances
    for idx, initialized_callback in string_callbacks_to_replace.items():
        litellm.callbacks[idx] = initialized_callback
        litellm.logging_callback_manager.add_litellm_callback(initialized_callback)

Suggested:

    for idx, callback in enumerate(litellm.callbacks):
        if isinstance(callback, str):
            initialized_callback = litellm.litellm_core_utils.litellm_logging._init_custom_logger_compatible_class(
                cast(_custom_logger_compatible_callbacks_literal, callback),
                internal_usage_cache=self.internal_usage_cache.dual_cache,
                llm_router=llm_router,
            )
            if initialized_callback is not None:
                string_callbacks_to_replace[idx] = initialized_callback
        else:
            litellm.logging_callback_manager.add_litellm_callback(callback)
    # Replace string entries in litellm.callbacks with initialized instances
    for idx, initialized_callback in string_callbacks_to_replace.items():
        litellm.callbacks[idx] = initialized_callback
        litellm.logging_callback_manager.add_litellm_callback(initialized_callback)

@shivamrawat1
Collaborator Author

@greptile re-review the commit. I reversed the change that led to the bug.


greptile-apps bot commented Feb 14, 2026

Greptile Overview

Greptile Summary

This PR fixes double-counting of litellm_proxy_total_requests_metric in Prometheus logging. The root cause was twofold: (1) streaming requests had the metric incremented in both async_log_success_event and async_post_call_success_hook, and (2) string callbacks like "prometheus" in litellm.callbacks were being initialized and appended as instances without removing the original string, causing iteration over both entries.

  • Prometheus metric fix: litellm_proxy_total_requests_metric is now incremented in a single location (async_log_success_event) for both streaming and non-streaming requests. The increment was removed from async_post_call_success_hook (now a no-op pass). Dead code in _increment_token_metrics that computed labels for litellm_proxy_total_requests_metric but never used them was also removed.
  • Callback initialization fix: _init_litellm_callbacks now replaces string callbacks in-place at their index in litellm.callbacks rather than appending initialized instances, preventing duplicate entries (string + instance).
  • Tests: Added 4 new unit tests in tests/litellm/proxy/test_init_litellm_callbacks.py covering the in-place replacement logic. Updated existing Prometheus tests to validate that async_post_call_success_hook no longer increments the metric and that async_log_success_event increments it exactly once for all request types.
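The replacement behavior those tests pin down can be shown as a minimal self-contained sketch (`StubLogger` and this `init_callbacks` are hypothetical stand-ins, not the real test code):

```python
# Minimal version of the kind of assertion the new tests make:
# after init, strings are replaced in-place and nothing is appended.
class StubLogger:
    def __init__(self, name):
        self.name = name

def init_callbacks(callbacks, known=("prometheus",)):
    for idx, cb in enumerate(callbacks):
        if isinstance(cb, str) and cb in known:
            callbacks[idx] = StubLogger(cb)  # in-place, no append
    return callbacks

def test_string_replaced_in_place():
    cbs = init_callbacks(["prometheus"])
    assert len(cbs) == 1                 # no duplicate entry was appended
    assert not isinstance(cbs[0], str)   # the string became an instance

def test_unrecognized_string_left_alone():
    cbs = init_callbacks(["not_a_logger"])
    assert cbs == ["not_a_logger"]

test_string_replaced_in_place()
test_unrecognized_string_left_alone()
print("ok")
```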

Confidence Score: 4/5

  • This PR is safe to merge — it correctly consolidates a duplicated metric increment into a single location and fixes callback list pollution.
  • The prometheus metric change is straightforward and correct: streaming requests were double-counted because both async_log_success_event and async_post_call_success_hook incremented the same counter. The fix consolidates to one location. The callback initialization change correctly prevents string+instance duplicates in litellm.callbacks. Tests are comprehensive and validate both the positive (metric incremented once) and negative (not incremented in the removed path) cases. Minor trailing whitespace issue noted but non-blocking.
  • litellm/proxy/utils.py — has trailing whitespace on lines 478-479 that may cause lint failures.

Important Files Changed

Filename Overview
litellm/proxy/utils.py Refactored _init_litellm_callbacks to replace string callbacks in-place rather than appending initialized instances, preventing string+instance duplicates. Has minor trailing whitespace. Logic is correct.
litellm/integrations/prometheus.py Moves litellm_proxy_total_requests_metric increment from two locations (streaming in async_log_success_event + all in async_post_call_success_hook) to a single location (async_log_success_event for all requests). Removes dead code in _increment_token_metrics. Correctly fixes double-counting for streaming requests.
tests/litellm/proxy/test_init_litellm_callbacks.py New test file with 4 well-structured tests covering string callback replacement, existing instance deduplication, unrecognized callback handling, and multiple callback replacement. Good coverage of the new in-place replacement logic.
tests/enterprise/litellm_enterprise/enterprise_callbacks/test_prometheus_logging_callbacks.py Updated test for async_post_call_success_hook to assert the metric is NOT incremented (was previously asserting it WAS incremented). Correctly reflects the new behavior.
tests/enterprise/litellm_enterprise/integrations/test_prometheus.py Updated semantic validation tests to assert litellm_proxy_total_requests_metric is incremented once in async_log_success_event (for both streaming and non-streaming) and not again in async_post_call_success_hook. Correctly validates the fix.

Flowchart

flowchart TD
    A[LLM Request Completes] --> B{Request Type?}
    B -->|Streaming| C[async_log_success_event]
    B -->|Non-Streaming| C
    C --> D[_increment_top_level_request_and_spend_metrics]
    C --> E["Increment litellm_proxy_total_requests_metric ✅ (single location)"]
    C --> F[_increment_token_metrics]
    A --> G[async_post_call_success_hook]
    G --> H["pass (no-op) — metric NOT incremented here"]

    style E fill:#4CAF50,color:#fff
    style H fill:#FF9800,color:#fff

Last reviewed commit: 2113d08

@greptile-apps greptile-apps bot left a comment

5 files reviewed, 1 comment
@jquinter
Contributor

Poetry Lock CI Failure - Action Needed

Hi! 👋

Your PR is failing CI with this error:

pyproject.toml changed significantly since poetry.lock was last generated. 
Run `poetry lock` to fix the lock file.

The Issue

Your PR was created before the latest pyproject.toml updates on main (Feb 14). The poetry.lock file has since been synced on main.

Solution

Please rebase your branch onto the latest main to pick up the synchronized lock file:

git fetch upstream main  # or 'origin' if that's your main remote
git rebase upstream/main
git push --force-with-lease

This will resolve the poetry lock error and your tests should run properly. ✅

No changes needed to your code - this is just a matter of updating the base! 🚀

@shivamrawat1
Collaborator Author

@greptile re-review with the new commit, now that I have removed the else branch; it was redundant for non-string callbacks.


greptile-apps bot commented Feb 14, 2026

Greptile Overview

Greptile Summary

This PR fixes double-counting of litellm_proxy_total_requests_metric in Prometheus by making two changes:

  • litellm/proxy/utils.py: _init_litellm_callbacks() now replaces string callbacks (e.g., "prometheus") in litellm.callbacks in-place with their initialized CustomLogger instances, instead of appending instances alongside the original strings. This prevents duplicate entries that caused hooks iterating litellm.callbacks to fire twice per callback.
  • litellm/integrations/prometheus.py: The litellm_proxy_total_requests_metric increment is consolidated into async_log_success_event (for both streaming and non-streaming requests). The increment was removed from async_post_call_success_hook to prevent double-counting. Dead code in _increment_token_metrics (an unused _labels computation) was also cleaned up.
  • Tests are updated to validate the new single-increment behavior and a new test file covers the in-place callback replacement logic.

Confidence Score: 4/5

  • This PR correctly fixes a real double-counting bug with appropriate tests and minimal risk of regression.
  • The core logic changes are sound: consolidating the metric increment to a single location and preventing duplicate callback entries. The in-place replacement approach is simple and effective. Tests cover the key scenarios. The only minor concern is that initialized callbacks bypass add_litellm_callback, but this is currently functionally equivalent since that method only appends to litellm.callbacks with dedup — which is already handled by the in-place replacement.
  • litellm/proxy/utils.py deserves the most attention since it changes the callback initialization flow.

Important Files Changed

Filename Overview
litellm/integrations/prometheus.py Consolidates litellm_proxy_total_requests_metric increment to async_log_success_event for both streaming and non-streaming requests, removes it from async_post_call_success_hook and cleans up dead code in _increment_token_metrics.
litellm/proxy/utils.py Rewrites _init_litellm_callbacks to replace string callbacks in-place rather than appending instances, preventing duplicates. However, initialized callbacks are no longer registered via add_litellm_callback, which currently is functionally equivalent but bypasses the callback manager's dedup/validation layer.
tests/litellm/proxy/test_init_litellm_callbacks.py New test file with thorough unit tests covering the in-place replacement of string callbacks, handling of unrecognized strings, and prevention of duplicates.
tests/enterprise/litellm_enterprise/enterprise_callbacks/test_prometheus_logging_callbacks.py Updates test_async_post_call_success_hook to assert the metric is NOT incremented, matching the new behavior.
tests/enterprise/litellm_enterprise/integrations/test_prometheus.py Updates semantic validation tests to expect the metric increment in async_log_success_event for both streaming and non-streaming, and confirms async_post_call_success_hook no longer increments.

Flowchart

flowchart TD
    A["Proxy startup: _init_litellm_callbacks()"] --> B{"litellm.callbacks entry is string?"}
    B -- Yes --> C["Initialize via _init_custom_logger_compatible_class()"]
    C --> D{"Initialization returned instance?"}
    D -- Yes --> E["Replace string in-place at same index"]
    D -- No --> F["Leave string in litellm.callbacks"]
    B -- No --> G["Skip (already an instance)"]

    H["Request succeeds"] --> I["async_log_success_event fires"]
    I --> J["Increment litellm_proxy_total_requests_metric (streaming + non-streaming)"]
    H --> K["async_post_call_success_hook fires"]
    K --> L["No-op (pass) — metric NOT incremented here"]

Last reviewed commit: ad07e53

@greptile-apps greptile-apps bot left a comment

5 files reviewed, no comments

@krrishdholakia changed the base branch from main to litellm_oss_staging_02_16_2026 on February 16, 2026 at 17:10
@krrishdholakia merged commit d448682 into litellm_oss_staging_02_16_2026 on Feb 16, 2026
58 of 78 checks passed
3 participants