docs: add PayGo/priority cost tracking for Gemini Vertex AI #22948
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -104,9 +104,18 @@ There are other keys you can use to specify costs for different scenarios and mo | |||||||||
| - `input_cost_per_video_per_second` - Cost per second of video input | ||||||||||
| - `input_cost_per_video_per_second_above_128k_tokens` - Video cost for large contexts | ||||||||||
| - `input_cost_per_character` - Character-based pricing for some providers | ||||||||||
| - `input_cost_per_token_priority` / `output_cost_per_token_priority` - Priority/PayGo pricing (Vertex AI Gemini, Bedrock) | ||||||||||
| - `input_cost_per_token_flex` / `output_cost_per_token_flex` - Batch/flex pricing | ||||||||||
|
Comment on lines +107 to +108 (Contributor)
Misleading Bedrock attribution for priority/flex pricing keys
The description says The code does have a fallback ( Consider either:
Suggested change
|
||||||||||
|
|
||||||||||
| These keys evolve based on how new models handle multimodality. The latest version can be found at [https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json](https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json). | ||||||||||
|
|
||||||||||
| ### Service Tier / PayGo Pricing (Vertex AI, Bedrock) | ||||||||||
|
|
||||||||||
| For providers that support multiple pricing tiers (e.g., Vertex AI PayGo, Bedrock service tiers), LiteLLM automatically applies the correct cost based on the response: | ||||||||||
|
|
||||||||||
| - **Vertex AI Gemini**: Uses `usageMetadata.trafficType` (`ON_DEMAND_PRIORITY` → priority, `FLEX`/`BATCH` → flex). See [Vertex AI - PayGo / Priority Cost Tracking](../providers/vertex.md#paygo--priority-cost-tracking). | ||||||||||
| - **Bedrock**: Uses `serviceTier` from the response. See [Bedrock - Usage - Service Tier](../providers/bedrock.md#usage---service-tier). | ||||||||||
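The tier selection described in the two bullets above can be sketched roughly as follows. This is a minimal illustration, not LiteLLM's actual implementation: the names `TRAFFIC_TYPE_TO_TIER`, `pick_cost_key`, and `estimate_cost` are hypothetical, the fallback to the base key when no tier-specific price is configured is an assumption, and the prices are placeholders chosen to keep the arithmetic obvious.

```python
from typing import Dict, Optional

# Assumed mapping from Vertex AI `usageMetadata.trafficType` values
# (as listed in the docs text above) to a pricing-key suffix.
TRAFFIC_TYPE_TO_TIER: Dict[str, str] = {
    "ON_DEMAND_PRIORITY": "priority",
    "FLEX": "flex",
    "BATCH": "flex",
}


def pick_cost_key(
    base_key: str, traffic_type: Optional[str], model_info: Dict[str, float]
) -> str:
    """Return the tier-specific cost key if one is configured, else the base key."""
    tier = TRAFFIC_TYPE_TO_TIER.get(traffic_type or "")
    if tier is None:
        return base_key
    tiered_key = f"{base_key}_{tier}"
    # Assumption: fall back to standard pricing when the tiered key is absent.
    return tiered_key if tiered_key in model_info else base_key


def estimate_cost(
    model_info: Dict[str, float],
    prompt_tokens: int,
    completion_tokens: int,
    traffic_type: Optional[str] = None,
) -> float:
    """Estimate request cost from token counts and the selected pricing tier."""
    in_key = pick_cost_key("input_cost_per_token", traffic_type, model_info)
    out_key = pick_cost_key("output_cost_per_token", traffic_type, model_info)
    return prompt_tokens * model_info[in_key] + completion_tokens * model_info[out_key]


# Placeholder prices, not real rates.
example_model_info = {
    "input_cost_per_token": 1.0,
    "output_cost_per_token": 2.0,
    "input_cost_per_token_priority": 2.0,
    "output_cost_per_token_priority": 4.0,
}

standard_cost = estimate_cost(example_model_info, 10, 5)
priority_cost = estimate_cost(example_model_info, 10, 5, "ON_DEMAND_PRIORITY")
# No flex keys are configured above, so this falls back to standard pricing.
flex_cost = estimate_cost(example_model_info, 10, 5, "FLEX")
```

With these placeholder prices, the priority request costs twice the standard one, and the `FLEX` request is billed at the standard rate because no `_flex` keys are present in `example_model_info`.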
|
|
||||||||||
| ## Zero-Cost Models (Bypass Budget Checks) | ||||||||||
|
|
||||||||||
| **Use Case**: You have on-premises or free models that should be accessible even when users exceed their budget limits. | ||||||||||
|
|
||||||||||
Inconsistent backtick formatting for `standard` tier
The `priority` and `flex` values in the `LiteLLM service_tier` column are wrapped in backticks, but `standard` is not. For consistency:
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!