[Fix] Team Usage Spend Truncated Due to Pagination #22938
The /team/daily/activity endpoint used Prisma pagination (page_size=1000) but the UI only fetched page 1. Teams with many keys/models easily exceed 1000 rows in LiteLLM_DailyTeamSpend, causing truncated totals. Switches the endpoint to use SQL GROUP BY via get_daily_activity_aggregated with include_entity_breakdown=True, returning all data in a single response while preserving per-team breakdown. Also adds timezone parameter support. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Greptile Summary

This PR fixes a real-data bug where team spend totals were truncated by pagination. The key changes, and two items to be aware of, are detailed in the per-file table below.
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| litellm/proxy/management_endpoints/common_daily_activity.py | Adds include_entity_id param to _build_aggregated_sql_query (controls whether entity_id is in SELECT/GROUP BY), upgrades api_key to accept List[str] with correct IN clause, and adds include_entity_breakdown to get_daily_activity_aggregated. Logic is sound; the only concern is the lack of a LIMIT on the resulting SQL query which could be expensive for very large unfiltered requests. |
| litellm/proxy/management_endpoints/team_endpoints.py | Switches /team/daily/activity from get_daily_activity (paginated Prisma find_many) to get_daily_activity_aggregated (SQL GROUP BY). Adds timezone parameter; deprecates but retains page/page_size for backward compat. A misleading docstring for the new timezone parameter uses JS convention (positive=west-of-UTC) without calling it out explicitly. |
| tests/test_litellm/proxy/management_endpoints/test_team_endpoints.py | All existing tests updated to mock get_daily_activity_aggregated instead of get_daily_activity. New test test_get_team_daily_activity_uses_aggregated_with_entity_breakdown verifies include_entity_breakdown=True and timezone passthrough. Tests are mock-only, in line with CI requirements. Coverage looks thorough. |
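The `api_key` list widening described in the table above could be sketched roughly as follows. This is an illustrative helper, not the actual LiteLLM code; the function name and the `param_index` bookkeeping are assumptions about how a parameterized `IN` clause might be assembled:

```python
from typing import List, Optional, Tuple, Union


def build_api_key_filter(
    api_key: Optional[Union[str, List[str]]], param_index: int
) -> Tuple[str, List[str], int]:
    """Build a WHERE fragment for a single key or a list of keys.

    A list becomes a parameterized IN clause ($n, $n+1, ...), so values
    are bound as query parameters rather than interpolated into the SQL.
    Returns (clause, params, next_param_index).
    """
    if api_key is None:
        return "", [], param_index
    if isinstance(api_key, str):
        return f"api_key = ${param_index}", [api_key], param_index + 1
    placeholders = ", ".join(f"${param_index + i}" for i in range(len(api_key)))
    return f"api_key IN ({placeholders})", list(api_key), param_index + len(api_key)


clause, params, next_idx = build_api_key_filter(["sk-a", "sk-b"], 3)
print(clause)  # api_key IN ($3, $4)
print(params)  # ['sk-a', 'sk-b']
```

The point of threading `param_index` through is that the clause composes with other filters (dates, team IDs) that already occupy earlier placeholder positions.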
Sequence Diagram
```mermaid
sequenceDiagram
    participant UI as UI / API Client
    participant TE as team_endpoints.py<br/>/team/daily/activity
    participant CDA as common_daily_activity.py<br/>get_daily_activity_aggregated
    participant SQL as _build_aggregated_sql_query
    participant DB as PostgreSQL<br/>LiteLLM_DailyTeamSpend
    UI->>TE: GET /team/daily/activity<br/>(team_ids, start_date, end_date,<br/>model, api_key, timezone, page, page_size)
    note over TE: page/page_size accepted<br/>but are no-ops (deprecated)
    TE->>TE: Auth check & team membership validation
    TE->>TE: Build final_api_key_filter<br/>(user keys if non-admin)
    TE->>CDA: get_daily_activity_aggregated(<br/>include_entity_breakdown=True,<br/>timezone_offset_minutes=timezone)
    CDA->>SQL: _build_aggregated_sql_query(<br/>include_entity_id=True)
    SQL-->>CDA: SQL + params<br/>GROUP BY (team_id, date, api_key,<br/>model, model_group, provider,<br/>mcp_tool, endpoint)
    CDA->>DB: query_raw(sql, *params)
    DB-->>CDA: Pre-aggregated rows<br/>(all matching rows, no pagination)
    CDA->>CDA: _aggregate_spend_records(<br/>entity_id_field="team_id",<br/>entity_metadata_field=team_alias_metadata)
    CDA-->>TE: SpendAnalyticsPaginatedResponse<br/>(page=1, total_pages=1, has_more=False)
    TE-->>UI: Full spend data with per-team breakdown
```
Last reviewed commit: d0e4804
```
page (int): Deprecated, kept for backward compatibility. All results are returned in a single page.
page_size (int): Deprecated, kept for backward compatibility.
exclude_team_ids (Optional[str]): Comma-separated list of team IDs to exclude.
timezone (Optional[int]): Timezone offset in minutes from UTC (e.g., 480 for PST).
```
Misleading timezone offset convention in docstring
The docstring says "Timezone offset in minutes from UTC (e.g., 480 for PST)", but standard UTC offset notation for PST is −480 (UTC−8). This uses JavaScript's Date.getTimezoneOffset() convention (positive = west/behind UTC), which is the opposite of the IANA/ISO standard.
While the underlying _adjust_dates_for_timezone implementation is consistent (positive = west of UTC, so PST = +480), API consumers who follow the standard UTC-offset convention would pass −480 for PST and get the end-date expansion applied to the wrong hemisphere. This is a functional risk for anyone calling this new timezone parameter.
Consider clarifying the docstring to explicitly call out the JS convention and distinguish it from standard UTC offsets:
```diff
- timezone (Optional[int]): Timezone offset in minutes from UTC (e.g., 480 for PST).
+ timezone (Optional[int]): Timezone offset in minutes using JavaScript's Date.getTimezoneOffset() convention — positive values are *west* of UTC (e.g., 480 for PST = UTC-8). This is the opposite of standard UTC offset notation.
```
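A minimal sketch of the sign convention at issue. The helper name is hypothetical (the actual `_adjust_dates_for_timezone` is not shown); it only illustrates why passing the standard UTC-offset sign shifts the date window the wrong way:

```python
from datetime import datetime, timedelta


def adjust_end_date_for_timezone(end_date: datetime, timezone_offset_minutes: int) -> datetime:
    # JS getTimezoneOffset() convention: positive minutes mean the client
    # is WEST of (behind) UTC, so PST (UTC-8) is +480, not -480.
    # A client west of UTC needs the end date extended forward in UTC so
    # its local end of day is fully covered.
    return end_date + timedelta(minutes=timezone_offset_minutes)


utc_end = datetime(2024, 6, 30, 0, 0)
# PST client (UTC-8), JS convention (+480): window extends to 08:00 UTC next day.
print(adjust_end_date_for_timezone(utc_end, 480))   # 2024-06-30 08:00:00
# Standard UTC-offset sign (-480): window shrinks toward the wrong hemisphere.
print(adjust_end_date_for_timezone(utc_end, -480))  # 2024-06-29 16:00:00
```

The eight-hour swing in the wrong direction is exactly the "end-date expansion applied to the wrong hemisphere" risk described above.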
```diff
@@ -563,7 +578,7 @@ def _build_aggregated_sql_query(
         SUM(failed_requests)::bigint AS failed_requests
     FROM "{pg_table}"
     WHERE {where_clause}
-    GROUP BY date, api_key, model, model_group, custom_llm_provider,
+    GROUP BY {entity_group_by} date, api_key, model, model_group, custom_llm_provider,
         mcp_namespaced_tool_name, endpoint
     ORDER BY date DESC
```
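The `include_entity_id` switch in the hunk above can be pictured with a simplified sketch. This is not the actual `_build_aggregated_sql_query` (which selects many more columns); it only demonstrates how the entity column is threaded into both SELECT and GROUP BY:

```python
def build_aggregated_sql(pg_table: str, where_clause: str, include_entity_id: bool) -> str:
    # When include_entity_id is True, the entity column (team_id here) is
    # added to both SELECT and GROUP BY, so the per-entity breakdown
    # survives aggregation; otherwise rows collapse across entities.
    entity_cols = "team_id, " if include_entity_id else ""
    return (
        f"SELECT {entity_cols}date, api_key, model, "
        f"SUM(spend)::float AS spend "
        f'FROM "{pg_table}" '
        f"WHERE {where_clause} "
        f"GROUP BY {entity_cols}date, api_key, model "
        f"ORDER BY date DESC"
    )


sql = build_aggregated_sql("LiteLLM_DailyTeamSpend", "date >= $1", include_entity_id=True)
print(sql)  # ... GROUP BY team_id, date, api_key, model ...
```

Reusing the same fragment for SELECT and GROUP BY keeps the two lists in sync, which is what the real `{entity_group_by}` placeholder accomplishes in the diff.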
Unbounded result set when querying without entity filter
The SQL query has no LIMIT clause. When include_entity_breakdown=True (always the case for the team endpoint now) and no team_ids filter is provided — e.g., a proxy admin loading the full dashboard — the query groups by (team_id, date, api_key, model, model_group, custom_llm_provider, mcp_namespaced_tool_name, endpoint). For a large deployment with many teams × keys × models × days, this can still produce tens or hundreds of thousands of grouped rows pulled entirely into Python memory in a single request.
The original paginated approach bounded memory per request via take=page_size. The new approach trades that safety valve for correctness (which is the right call), but the trade-off is worth surfacing. Consider either:
- Adding a defensive `LIMIT` with a generous cap (e.g., 500,000 rows) and logging a warning if it's hit, or
- Documenting this scalability assumption explicitly so operators are aware.
This is not a blocker for the fix itself, but is worth tracking for very large multi-team deployments.
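The first option could look roughly like this. It is a sketch under assumptions: the cap value, the wrapper name, and the `query_raw` call shape are illustrative, not part of the PR:

```python
import logging

logger = logging.getLogger(__name__)

MAX_AGGREGATED_ROWS = 500_000  # hypothetical cap, not in the PR


def fetch_aggregated_rows(db, sql: str, params: list) -> list:
    # Request one row past the cap: if we get cap+1 rows back, the result
    # was truncated, so surface a warning for operators instead of
    # silently returning a partial total (the bug this PR fixes).
    rows = db.query_raw(f"{sql} LIMIT {MAX_AGGREGATED_ROWS + 1}", *params)
    if len(rows) > MAX_AGGREGATED_ROWS:
        logger.warning(
            "Aggregated spend query exceeded %d rows; result truncated.",
            MAX_AGGREGATED_ROWS,
        )
        rows = rows[:MAX_AGGREGATED_ROWS]
    return rows
```

The "cap + 1" trick distinguishes "exactly at the cap" from "truncated" without a second COUNT query.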
Relevant issues
Summary
Failure Path (Before Fix)
The `/team/daily/activity` endpoint used Prisma's `find_many` with `skip`/`take` pagination. The UI sends `page_size=1000` but only fetches page 1. The `LiteLLM_DailyTeamSpend` table stores one row per unique `(team_id, date, api_key, model, provider, endpoint)` combination — a team with 141 keys and multiple models over 30 days produces ~1.3M rows (user confirmed `total_pages: 1329`). The `total_spend` in the response was computed only from the first 1000 rows.

This caused the team spend to appear identical across all date ranges (7-day, MTD, 30-day, YTD) since it always returned the same newest 1000 rows.
Fix
Switches `/team/daily/activity` from paginated Prisma queries to SQL `GROUP BY` via `get_daily_activity_aggregated`, returning all data in a single response. Adds an `include_entity_breakdown` option to preserve per-team breakdown data in the response. Also adds `timezone` parameter support and `api_key` list filtering for the aggregated query path.

Changes
- `common_daily_activity.py`: Added `include_entity_id` param to `_build_aggregated_sql_query` to optionally include entity_id in SELECT/GROUP BY. Widened `api_key` type to accept `List[str]` with proper SQL IN clause handling. Added `include_entity_breakdown` param to `get_daily_activity_aggregated`.
- `team_endpoints.py`: `/team/daily/activity` now calls `get_daily_activity_aggregated` instead of `get_daily_activity`. Added `timezone` parameter. `page`/`page_size` kept in signature for backward compat but are no-ops.
- `test_team_endpoints.py`: Updated existing tests to mock `get_daily_activity_aggregated`. Added a test verifying `include_entity_breakdown=True` and timezone passthrough.

Testing
- `total_pages: 1329` confirms pagination truncation as root cause

Type
🐛 Bug Fix
✅ Test