
"Fix: Subtract cached tokens from batch cost calculation"#19704

Open
priyam-that wants to merge 1 commit into BerriAI:main from priyam-that:fix/cached-token-batch-cost-calculation

Conversation

@priyam-that
Contributor

  • Fixes issue Cost calculation incorrectly charges for cached tokens #19680 where cached tokens were being charged in batch operations
  • Handles both cache_read_input_tokens (Anthropic/OpenAI) and prompt_tokens_details.cached_tokens (z.ai/Bedrock/Gemini) formats
  • Applies fix to both input_cost_per_token_batches and input_cost_per_token paths
  • Prevents overcharging users by 10x+ on requests with high cache hit rates

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory. Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

- Fixes issue BerriAI#19680 where cached tokens were being charged in batch operations
- Handles both cache_read_input_tokens (Anthropic/OpenAI) and prompt_tokens_details.cached_tokens (z.ai/Bedrock/Gemini) formats
- Applies fix to both input_cost_per_token_batches and input_cost_per_token paths
- Prevents overcharging users by 10x+ on requests with high cache hit rates
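The fix described above can be sketched roughly as follows. This is a hedged illustration, not LiteLLM's actual implementation: the helper names (`get_cached_tokens`, `batch_input_cost`) and the flat `usage` dict are hypothetical, but the two provider formats it checks (`cache_read_input_tokens` and `prompt_tokens_details.cached_tokens`) are the ones the PR names.

```python
# Hypothetical sketch of the cached-token subtraction described in this PR.
# Helper names and the dict-shaped usage payload are illustrative only.

def get_cached_tokens(usage: dict) -> int:
    """Return cached prompt tokens, checking both provider formats."""
    # Anthropic/OpenAI-style top-level field
    cached = usage.get("cache_read_input_tokens")
    if cached is None:
        # z.ai/Bedrock/Gemini-style nested field
        details = usage.get("prompt_tokens_details") or {}
        cached = details.get("cached_tokens")
    return cached or 0


def batch_input_cost(
    usage: dict,
    cost_per_token: float,
    cost_per_cached_token: float = 0.0,
) -> float:
    """Bill non-cached tokens at the full input rate and cached
    tokens at the (cheaper) cache-read rate, instead of charging
    every prompt token at the full rate."""
    prompt_tokens = usage.get("prompt_tokens", 0)
    cached = min(get_cached_tokens(usage), prompt_tokens)
    non_cached = prompt_tokens - cached
    return non_cached * cost_per_token + cached * cost_per_cached_token
```

With a 90% cache hit rate (e.g. `{"prompt_tokens": 1000, "cache_read_input_tokens": 900}` at a rate of 1e-5 per token and a free cache-read rate), this bills 100 tokens instead of 1000, matching the "10x+ overcharge" figure in the description.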
@vercel

vercel bot commented Jan 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: litellm · Deployment: Ready · Review: Preview, Comment · Updated (UTC): Jan 24, 2026 6:08pm


@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

