docs: add PayGo/priority cost tracking for Gemini Vertex AI #22948

Merged
Sameerlite merged 1 commit into main from litellm_vertex-paygo-docs on Mar 6, 2026

Conversation

@Sameerlite (Collaborator)

Summary

Documents the existing PayGo (priority) cost tracking support for Gemini/Vertex AI.

Changes

  • Vertex AI provider doc: Added "PayGo / Priority Cost Tracking" section explaining how LiteLLM maps usageMetadata.trafficType to the correct pricing tier (ON_DEMAND_PRIORITY → priority, FLEX/BATCH → flex)
  • Custom pricing doc: Added service tier cost keys (input_cost_per_token_priority, etc.) and a "Service Tier / PayGo Pricing" subsection linking to Vertex and Bedrock docs
  • Cost tracking doc: Added note about provider-specific cost tracking (Vertex PayGo, Bedrock service tiers, Azure base model)

Context

LiteLLM already supports PayGo cost tracking for Gemini/Vertex AI (PR #21909 in v1.82.0). This PR adds documentation so users can discover and understand the feature.
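The traffic-type mapping the docs describe can be sketched in Python. This is a hypothetical helper for illustration only; the actual logic lives in `litellm/llms/vertex_ai/cost_calculator.py` and may differ in detail:

```python
# Hypothetical sketch of the trafficType -> service_tier mapping described
# in the new docs. Not the real LiteLLM implementation.
from typing import Optional

TRAFFIC_TYPE_TO_SERVICE_TIER = {
    "ON_DEMAND_PRIORITY": "priority",
    "FLEX": "flex",
    "BATCH": "flex",
}


def service_tier_from_usage_metadata(usage_metadata: dict) -> Optional[str]:
    """Map a Vertex AI usageMetadata blob to a LiteLLM service tier.

    ON_DEMAND (or a missing trafficType) falls through to None,
    i.e. standard on-demand pricing.
    """
    traffic_type = usage_metadata.get("trafficType")
    return TRAFFIC_TYPE_TO_SERVICE_TIER.get(traffic_type)


print(service_tier_from_usage_metadata({"trafficType": "ON_DEMAND_PRIORITY"}))  # priority
print(service_tier_from_usage_metadata({"trafficType": "ON_DEMAND"}))  # None
```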


@Sameerlite merged commit 31c43ba into main on Mar 6, 2026 (28 of 42 checks passed).
greptile-apps bot (Contributor) commented on Mar 6, 2026

Greptile Summary

This is a documentation-only PR that adds a "PayGo / Priority Cost Tracking" section to the Vertex AI provider docs, cross-links in the spend tracking and custom pricing docs, and documents the input_cost_per_token_priority/flex cost keys. The underlying feature (PR #21909, v1.82.0) is already implemented in litellm/llms/vertex_ai/cost_calculator.py and litellm_core_utils/llm_cost_calc/utils.py.

Key observations:

  • The Vertex AI section accurately reflects the code: `trafficType` from `usageMetadata` is mapped to `service_tier` (`ON_DEMAND_PRIORITY` → `priority`, `FLEX`/`BATCH` → `flex`) and the matching cost keys are looked up in `model_prices_and_context_window.json`.
  • The anchor links (#paygo--priority-cost-tracking, #usage---service-tier, #set-base_model-for-cost-tracking-eg-azure-deployments) are all valid.
  • The custom_pricing.md description of input_cost_per_token_priority / output_cost_per_token_priority attributing these keys to "Bedrock" is potentially misleading — no Bedrock model currently has these tier-specific keys in model_prices_and_context_window.json. The code falls back silently to standard pricing for Bedrock, so tier-differentiated cost tracking does not work out of the box for Bedrock as the note implies.
  • A minor formatting inconsistency: `standard` in the vertex.md table is missing backticks while `priority` and `flex` are wrapped.

Confidence Score: 5/5

  • This is a documentation-only PR with no code changes; safe to merge after addressing the Bedrock attribution inaccuracy.
  • All changes are in Markdown docs with no runtime impact. The Vertex AI documentation is accurate and matches the code. The only substantive concern is a potentially misleading mention of Bedrock in the pricing key description, which may confuse users but does not affect functionality.
  • docs/my-website/docs/proxy/custom_pricing.md — the Bedrock attribution for priority/flex pricing keys should be clarified.

Important Files Changed

| Filename | Overview |
|----------|----------|
| docs/my-website/docs/providers/vertex.md | Adds a "PayGo / Priority Cost Tracking" section that accurately describes the trafficType → service_tier mapping implemented in vertex_ai/cost_calculator.py; minor backtick formatting inconsistency in the table. |
| docs/my-website/docs/proxy/cost_tracking.md | Adds a single informational sentence with cross-links to provider-specific cost tracking docs; links and anchors are all valid. |
| docs/my-website/docs/proxy/custom_pricing.md | Adds priority/flex pricing key descriptions and a new "Service Tier / PayGo Pricing" sub-section; the claim that input_cost_per_token_priority applies to Bedrock is misleading since Bedrock models don't currently have those keys in model_prices_and_context_window.json. |

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Vertex AI API Response] --> B{usageMetadata.trafficType?}
    B -->|ON_DEMAND_PRIORITY| C[service_tier = 'priority']
    B -->|FLEX or BATCH| D[service_tier = 'flex']
    B -->|ON_DEMAND or absent| E[service_tier = None / standard]
    C --> F[Look up input_cost_per_token_priority\nin model_prices_and_context_window.json]
    D --> G[Look up input_cost_per_token_flex\nin model_prices_and_context_window.json]
    E --> H[Look up input_cost_per_token\nin model_prices_and_context_window.json]
    F -->|key missing| I[Fallback to input_cost_per_token]
    G -->|key missing| I
    H --> J[Final spend recorded]
    I --> J
```

Last reviewed commit: baa5d72
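The fallback path in the flowchart can be sketched as follows. The function name and price values here are illustrative, not LiteLLM's actual `_get_cost_per_unit` implementation, and the prices are made-up examples:

```python
# Illustrative sketch of the tier-aware cost-key lookup with silent fallback
# to standard pricing, as shown in the flowchart. Key names follow
# model_prices_and_context_window.json; prices and function are hypothetical.
from typing import Optional

MODEL_PRICES = {
    "vertex_ai/gemini-2.5-pro": {
        "input_cost_per_token": 1.25e-06,              # standard rate (example value)
        "input_cost_per_token_priority": 2.5e-06,      # priority rate (example value)
        # Note: no "input_cost_per_token_flex" key here, so flex falls back.
    },
}


def input_cost_per_token(model: str, service_tier: Optional[str]) -> float:
    entry = MODEL_PRICES[model]
    if service_tier:
        tiered_key = f"input_cost_per_token_{service_tier}"
        if tiered_key in entry:
            return entry[tiered_key]
        # Tier key missing (e.g. Bedrock today): silently use standard pricing.
    return entry["input_cost_per_token"]


print(input_cost_per_token("vertex_ai/gemini-2.5-pro", "priority"))  # 2.5e-06
print(input_cost_per_token("vertex_ai/gemini-2.5-pro", "flex"))      # 1.25e-06 (fallback)
```

This silent fallback is the behavior the review comment below flags: a missing tier key never errors, it just bills at the standard rate.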

| Vertex AI `trafficType` | LiteLLM `service_tier` | Pricing applied |
|-------------------------|-------------------------|-----------------|
| `ON_DEMAND_PRIORITY` | `priority` | PayGo / priority pricing (`input_cost_per_token_priority`, `output_cost_per_token_priority`) |
| `ON_DEMAND` | standard | Default on-demand pricing |

Inconsistent backtick formatting for standard tier

The `priority` and `flex` values in the LiteLLM `service_tier` column are wrapped in backticks, but standard is not. For consistency:

Suggested change
| `ON_DEMAND` | standard | Default on-demand pricing |
| `ON_DEMAND` | `standard` | Default on-demand pricing |


Comment on lines +107 to +108
- `input_cost_per_token_priority` / `output_cost_per_token_priority` - Priority/PayGo pricing (Vertex AI Gemini, Bedrock)
- `input_cost_per_token_flex` / `output_cost_per_token_flex` - Batch/flex pricing

Misleading Bedrock attribution for priority/flex pricing keys

The description says `input_cost_per_token_priority` / `output_cost_per_token_priority` apply to "(Vertex AI Gemini, Bedrock)", but inspecting `model_prices_and_context_window.json` shows that no Bedrock model entry currently has `input_cost_per_token_priority` or `input_cost_per_token_flex` keys. Only Vertex AI (and Google Gemini, Azure, OpenAI) models have these entries populated.

The code does have a fallback (`_get_cost_per_unit` in `llm_cost_calc/utils.py`) that gracefully returns standard pricing when a tier-specific key is missing, so Bedrock requests won't error. But cost tracking will silently use standard rates regardless of the requested service tier, which is the opposite of what users reading this note might expect.

Consider either:

  • Removing "Bedrock" from this line (since the keys are not pre-populated for it), or
  • Clarifying that for Bedrock, these keys can be manually set in custom pricing to enable tier-differentiated tracking, but no out-of-the-box priority/flex pricing data exists for Bedrock today.
Suggested change
- `input_cost_per_token_priority` / `output_cost_per_token_priority` - Priority/PayGo pricing (Vertex AI Gemini, Bedrock)
- `input_cost_per_token_flex` / `output_cost_per_token_flex` - Batch/flex pricing
- `input_cost_per_token_priority` / `output_cost_per_token_priority` - Priority/PayGo pricing (Vertex AI Gemini; can be manually set for other providers)
- `input_cost_per_token_flex` / `output_cost_per_token_flex` - Batch/flex pricing (Vertex AI Gemini; can be manually set for other providers)
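For the second option, a manually supplied pricing entry might look like this. The model name and all prices are placeholders, and whether LiteLLM's custom pricing honors these tier keys for Bedrock out of the box is exactly what the note should clarify:

```python
# Hypothetical custom-pricing entry that supplies tier keys for a Bedrock
# model by hand (no such keys ship in model_prices_and_context_window.json
# today). All values are placeholders, not real Bedrock prices.
custom_pricing = {
    "bedrock/anthropic.claude-3-sonnet": {
        "input_cost_per_token": 3.0e-06,
        "output_cost_per_token": 1.5e-05,
        # Manually added tier keys to enable tier-differentiated tracking:
        "input_cost_per_token_priority": 6.0e-06,
        "output_cost_per_token_priority": 3.0e-05,
    },
}

# With an entry like this registered (e.g. via litellm.register_model),
# a priority-tier request could resolve the *_priority keys instead of
# silently falling back to standard rates.
print(custom_pricing["bedrock/anthropic.claude-3-sonnet"]["input_cost_per_token_priority"])
```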
