docs: add PayGo/priority cost tracking for Gemini Vertex AI #22948
Sameerlite merged 1 commit into main
- Add PayGo / Priority Cost Tracking section to Vertex AI provider docs
- Document trafficType to service_tier mapping (ON_DEMAND_PRIORITY, FLEX, etc.)
- Add service tier cost keys to custom pricing docs
- Add provider-specific cost tracking note to spend tracking overview

Made-with: Cursor
Greptile Summary
This is a documentation-only PR that adds a "PayGo / Priority Cost Tracking" section to the Vertex AI provider docs, cross-links in the spend tracking and custom pricing docs, and documents the
Key observations:
Confidence Score: 5/5
| Filename | Overview |
|---|---|
| docs/my-website/docs/providers/vertex.md | Adds a "PayGo / Priority Cost Tracking" section that accurately describes the trafficType→service_tier mapping implemented in vertex_ai/cost_calculator.py; minor backtick formatting inconsistency in the table. |
| docs/my-website/docs/proxy/cost_tracking.md | Adds a single informational sentence with cross-links to provider-specific cost tracking docs; links and anchors are all valid. |
| docs/my-website/docs/proxy/custom_pricing.md | Adds priority/flex pricing key descriptions and a new sub-section on Service Tier / PayGo Pricing; the claim that input_cost_per_token_priority applies to Bedrock is misleading since Bedrock models don't currently have those keys in model_prices_and_context_window.json. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Vertex AI API Response] --> B{usageMetadata.trafficType?}
B -->|ON_DEMAND_PRIORITY| C[service_tier = 'priority']
B -->|FLEX or BATCH| D[service_tier = 'flex']
B -->|ON_DEMAND or absent| E[service_tier = None / standard]
C --> F[Look up input_cost_per_token_priority\nin model_prices_and_context_window.json]
D --> G[Look up input_cost_per_token_flex\nin model_prices_and_context_window.json]
E --> H[Look up input_cost_per_token\nin model_prices_and_context_window.json]
F -->|key missing| I[Fallback to input_cost_per_token]
G -->|key missing| I
H --> J[Final spend recorded]
I --> J
Last reviewed commit: baa5d72
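The flowchart above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not LiteLLM's actual implementation: the function names, the `EXAMPLE_MODEL_INFO` entry, and its prices are hypothetical. Only the `trafficType` values, the `service_tier` names, and the `input_cost_per_token*` key names are taken from this PR.

```python
from typing import Optional

# Hypothetical pricing entry; the prices are made up for illustration.
EXAMPLE_MODEL_INFO = {
    "input_cost_per_token": 1.0e-6,
    "input_cost_per_token_priority": 2.0e-6,
    # "input_cost_per_token_flex" is deliberately absent to exercise the fallback.
}


def extract_traffic_type(response_json: dict) -> Optional[str]:
    """Read usageMetadata.trafficType from a raw response dict, if present."""
    usage = response_json.get("usageMetadata") or {}
    return usage.get("trafficType")


def traffic_type_to_service_tier(traffic_type: Optional[str]) -> Optional[str]:
    """Map a Vertex AI trafficType to a service tier (None means standard)."""
    if traffic_type == "ON_DEMAND_PRIORITY":
        return "priority"
    if traffic_type in ("FLEX", "BATCH"):
        return "flex"
    # ON_DEMAND, absent, or unrecognized values use standard pricing.
    return None


def cost_per_token(model_info: dict, base_key: str, service_tier: Optional[str]) -> float:
    """Prefer the tier-specific key; fall back to the base key if it is missing."""
    if service_tier is not None:
        tier_key = f"{base_key}_{service_tier}"
        if tier_key in model_info:
            return model_info[tier_key]
    return model_info[base_key]


def input_cost(model_info: dict, response_json: dict, prompt_tokens: int) -> float:
    """Compute the input-token cost for one response."""
    tier = traffic_type_to_service_tier(extract_traffic_type(response_json))
    return prompt_tokens * cost_per_token(model_info, "input_cost_per_token", tier)
```

With this entry, a response tagged `FLEX` is priced at the standard `input_cost_per_token` rate because no flex key exists, which mirrors the "key missing" branch in the flowchart.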
| Vertex AI `trafficType` | LiteLLM `service_tier` | Pricing applied |
|-------------------------|-------------------------|-----------------|
| `ON_DEMAND_PRIORITY` | `priority` | PayGo / priority pricing (`input_cost_per_token_priority`, `output_cost_per_token_priority`) |
| `ON_DEMAND` | standard | Default on-demand pricing |
Inconsistent backtick formatting for standard tier
The priority and flex values in the LiteLLM service_tier column are wrapped in backticks, but standard is not. For consistency:
Before:
| `ON_DEMAND` | standard | Default on-demand pricing |
After:
| `ON_DEMAND` | `standard` | Default on-demand pricing |
- `input_cost_per_token_priority` / `output_cost_per_token_priority` - Priority/PayGo pricing (Vertex AI Gemini, Bedrock)
- `input_cost_per_token_flex` / `output_cost_per_token_flex` - Batch/flex pricing
Misleading Bedrock attribution for priority/flex pricing keys
The description says input_cost_per_token_priority / output_cost_per_token_priority applies to "(Vertex AI Gemini, Bedrock)", but inspecting model_prices_and_context_window.json shows that no Bedrock model entries currently have input_cost_per_token_priority or input_cost_per_token_flex keys. Only Vertex AI (and Google Gemini, Azure, OpenAI) models have these entries populated.
The code does have a fallback (_get_cost_per_unit in llm_cost_calc/utils.py) that gracefully returns standard pricing when a tier-specific key is missing, so Bedrock requests won't error — but the cost tracking will silently use standard rates regardless of the requested service tier, which is the opposite of what users reading this note might expect.
Consider either:
- Removing "Bedrock" from this line (since the keys are not pre-populated for it), or
- Clarifying that for Bedrock, these keys can be manually set in custom pricing to enable tier-differentiated tracking, but no out-of-the-box priority/flex pricing data exists for Bedrock today.
Before:
- `input_cost_per_token_priority` / `output_cost_per_token_priority` - Priority/PayGo pricing (Vertex AI Gemini, Bedrock)
- `input_cost_per_token_flex` / `output_cost_per_token_flex` - Batch/flex pricing
After:
- `input_cost_per_token_priority` / `output_cost_per_token_priority` - Priority/PayGo pricing (Vertex AI Gemini; can be manually set for other providers)
- `input_cost_per_token_flex` / `output_cost_per_token_flex` - Batch/flex pricing (Vertex AI Gemini; can be manually set for other providers)
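The silent fallback described in this comment can be made visible with a small check that reports which tier-specific keys a pricing entry lacks. The sketch below is an assumption-laden illustration: `missing_tier_keys`, both example entries, and all prices are hypothetical, and it does not call any LiteLLM API. Only the key names come from this PR.

```python
from typing import Dict, List

BASE_KEYS = ("input_cost_per_token", "output_cost_per_token")
TIER_SUFFIXES = ("priority", "flex")

# Hypothetical entry with standard pricing only (made-up prices).
STANDARD_ONLY_ENTRY: Dict[str, float] = {
    "input_cost_per_token": 3.0e-6,
    "output_cost_per_token": 1.5e-5,
}

# The same entry after tier keys are set manually (made-up prices).
ENTRY_WITH_MANUAL_TIERS: Dict[str, float] = {
    **STANDARD_ONLY_ENTRY,
    "input_cost_per_token_priority": 6.0e-6,
    "output_cost_per_token_priority": 3.0e-5,
    "input_cost_per_token_flex": 1.5e-6,
    "output_cost_per_token_flex": 7.5e-6,
}


def missing_tier_keys(model_info: Dict[str, float]) -> List[str]:
    """List tier-specific cost keys absent from a pricing entry.

    Any key returned here would be priced at the standard rate
    if a request came back with that service tier.
    """
    return [
        f"{base}_{tier}"
        for base in BASE_KEYS
        for tier in TIER_SUFFIXES
        if f"{base}_{tier}" not in model_info
    ]
```

An entry that returns a non-empty list is one where tier-differentiated tracking is not in effect for the listed keys, which is the situation the comment flags for Bedrock.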
Summary
Documents the existing PayGo (priority) cost tracking support for Gemini/Vertex AI.
Changes
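- Vertex AI provider docs: document how LiteLLM maps `usageMetadata.trafficType` to the correct pricing tier (`ON_DEMAND_PRIORITY` → `priority`, `FLEX`/`BATCH` → `flex`)
- Custom pricing docs: add service tier cost keys (`input_cost_per_token_priority`, etc.) and a "Service Tier / PayGo Pricing" subsection linking to Vertex and Bedrock docs

Context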
LiteLLM already supports PayGo cost tracking for Gemini/Vertex AI (PR #21909 in v1.82.0). This PR adds documentation so users can discover and understand the feature.