fix(cost_calc): update custom_llm_provider when base_model has different provider prefix #22906
Conversation
Greptile Summary

This PR fixes a cost calculation bug (#22257) where a `base_model` carrying a different provider prefix caused cost lookups to return 0. Key changes:
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| litellm/cost_calculator.py | Adds a _provider_overridden guard after _select_model_name_for_cost_calc to extract and apply the provider prefix from selected_model when base_model carries a different provider than custom_llm_provider; also gates the hidden_params provider read behind this flag to prevent the corrected value from being overwritten. The logic is sound with no critical errors found. |
| tests/local_testing/test_completion_cost.py | Adds four new tests covering cross-provider base_model override via full pipeline, direct completion_cost call, hidden_params guard, and same-provider no-regression case. All tests use mocked responses (no real network calls). |
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["completion_cost() called\nmodel='anthropic/my-deploy'\nbase_model='gemini/gemini-2.0-flash'\ncustom_llm_provider='anthropic'"] --> B["_select_model_name_for_cost_calc()\nbase_model is not None →\nreturn_model = 'gemini/gemini-2.0-flash'\nalready has known prefix → no prefix added"]
    B --> C{"base_model != None\nselected_model != None\nnot custom_pricing?"}
    C -- Yes --> D["Split selected_model on '/'\n_parts = ['gemini', 'gemini-2.0-flash']\nextracted = 'gemini'"]
    D --> E{"extracted != custom_llm_provider?\n'gemini' != 'anthropic'"}
    E -- Yes --> F["custom_llm_provider = 'gemini'\n_provider_overridden = True"]
    E -- No --> G["No change\n_provider_overridden = False"]
    C -- No --> G
    F --> H["Loop over potential_model_names"]
    H --> I{"hidden_params present?"}
    I -- Yes --> J{"_provider_overridden?"}
    J -- True --> K["Skip hidden_params provider read\ncustom_llm_provider stays 'gemini'"]
    J -- False --> L["custom_llm_provider = hidden_params.get('custom_llm_provider', ...)"]
    K --> M["cost_per_token('gemini/gemini-2.0-flash', provider='gemini')\n→ cost > 0 ✓"]
    L --> M
```
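The guard in the flowchart can be sketched as a small standalone function. This is a sketch only: `LLM_PROVIDERS` stands in for litellm's `LlmProvidersSet`, and `resolve_cost_provider` is a hypothetical name, not litellm's API.

```python
from typing import Optional, Tuple

# Hypothetical stand-in for litellm's LlmProvidersSet.
LLM_PROVIDERS = {"anthropic", "gemini", "openai", "azure"}

def resolve_cost_provider(
    selected_model: str,
    custom_llm_provider: str,
    base_model: Optional[str],
    custom_pricing: bool,
) -> Tuple[str, bool]:
    """Return (provider, overridden): override only when base_model is in play,
    custom pricing is off, and selected_model carries a different known prefix."""
    if base_model is not None and not custom_pricing:
        parts = selected_model.split("/", 1)
        if len(parts) > 1 and parts[0] in LLM_PROVIDERS:
            extracted = parts[0]
            if extracted != custom_llm_provider:
                return extracted, True
    return custom_llm_provider, False

# Cross-provider case from the flowchart: provider flips to "gemini".
print(resolve_cost_provider("gemini/gemini-2.0-flash", "anthropic",
                            "gemini/gemini-2.0-flash", False))
```

The same function also covers the two no-op branches: a same-provider prefix and the `custom_pricing=True` early return both leave the provider untouched.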
Last reviewed commit: 213c1d9
litellm/cost_calculator.py
Outdated
```python
if base_model is not None and selected_model is not None:
    _parts = selected_model.split("/", 1)
    if len(_parts) > 1 and _parts[0] in LlmProvidersSet:
        custom_llm_provider = _parts[0]
```
Guard fires even when custom_pricing=True ignores base_model
When custom_pricing=True, _select_model_name_for_cost_calc returns the original model (not base_model) — see its early-return path:
```python
if custom_pricing is True:
    ...
    return_model = model  # base_model is completely ignored here
elif base_model is not None:
    return_model = base_model
```

So in the scenario `custom_pricing=True`, `base_model="gemini/gemini-2.0-flash"`, and original model `"anthropic/claude-3"`:

- `selected_model = "anthropic/claude-3"` (from the original model, not `base_model`)
- The new guard fires (because `base_model is not None`) and sets `custom_llm_provider = "anthropic"`, which happens to be correct only by coincidence.
The comment says the block is for "when base_model overrides model", but it fires even when it doesn't. A safer guard would limit the update to when base_model actually determined selected_model:
Suggested change:

```diff
-if base_model is not None and selected_model is not None:
+if base_model is not None and selected_model is not None and not custom_pricing:
     _parts = selected_model.split("/", 1)
     if len(_parts) > 1 and _parts[0] in LlmProvidersSet:
         custom_llm_provider = _parts[0]
```
This concern is addressed — the code has been updated since this comment (marked outdated). The guard now explicitly checks not custom_pricing before overriding.
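The ordering issue this thread describes can be reproduced with a toy stand-in (illustrative names only, not litellm's actual function):

```python
# Toy stand-in for _select_model_name_for_cost_calc's early-return ordering.
def select_model_name(model, base_model, custom_pricing):
    if custom_pricing:
        return model          # base_model is completely ignored here
    if base_model is not None:
        return base_model
    return model

# With custom_pricing=True the prefix comes from the original model, so a guard
# keyed only on `base_model is not None` would read the wrong prefix.
print(select_model_name("anthropic/claude-3", "gemini/gemini-2.0-flash", True))
print(select_model_name("anthropic/claude-3", "gemini/gemini-2.0-flash", False))
```

This is why the final guard also checks `not custom_pricing` before extracting a prefix from `selected_model`.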
litellm/cost_calculator.py
Outdated
```python
# When base_model overrides model and carries its own provider prefix
# (e.g. base_model="gemini/gemini-2.0-flash" on an anthropic deployment),
# align custom_llm_provider so cost_per_token builds the correct key.
if base_model is not None and selected_model is not None:
    _parts = selected_model.split("/", 1)
    if len(_parts) > 1 and _parts[0] in LlmProvidersSet:
        custom_llm_provider = _parts[0]
```
Unintended custom_llm_provider override for same-provider base_model with prefix
The condition updates custom_llm_provider whenever base_model is set AND selected_model already carries a provider prefix — regardless of whether that prefix actually differs from the current custom_llm_provider. This means users who today pass e.g. base_model="azure/gpt-4" on an Azure deployment (both prefix and provider are "azure") or base_model="openai/gpt-4" on an Azure deployment will see custom_llm_provider silently switched from "azure" to "openai", routing cost calculation to openai_cost_per_token instead of azure_openai_cost_per_token. While fixing the cross-provider bug, this is a subtle backwards-incompatible change for those callers.
Per the project's backward-compatibility policy, such a behavioral change should be gated or at least clearly documented. Consider restricting the override to cases where the new provider actually differs:
```python
if base_model is not None and selected_model is not None and not custom_pricing:
    _parts = selected_model.split("/", 1)
    if len(_parts) > 1 and _parts[0] in LlmProvidersSet:
        extracted = _parts[0]
        if extracted != custom_llm_provider:
            custom_llm_provider = extracted
```

Context Used: Rule from dashboard - What: avoid backwards-incompatible changes without user-controlled flags
Why: This breaks current ... (source)
This concern is addressed — the code has been updated since this comment (marked outdated). The guard now compares extracted provider against current custom_llm_provider and only overrides when they differ.
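The second half of the fix, the `_provider_overridden` flag described in the summary table, keeps a later `hidden_params` read from undoing the corrected provider. A minimal sketch (`final_provider` is a hypothetical helper, not litellm code):

```python
# Sketch of the hidden_params gating: an explicit override wins over
# whatever provider hidden_params carries.
def final_provider(provider, provider_overridden, hidden_params):
    if hidden_params and not provider_overridden:
        provider = hidden_params.get("custom_llm_provider", provider)
    return provider

# The corrected "gemini" survives even if hidden_params still says "anthropic".
print(final_provider("gemini", True, {"custom_llm_provider": "anthropic"}))
# Without an override, hidden_params wins as before.
print(final_provider("anthropic", False, {"custom_llm_provider": "azure"}))
```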
recheck

All review feedback has been addressed in the latest push. The PR is ready for re-review and merge. Thank you! 🙏

All review feedback addressed in latest push. Ready for re-review.

All review feedback addressed — remaining unresolved threads are automated bot suggestions that have been acknowledged with replies. No merge conflicts. Ready for maintainer review 🙏
```diff
 self._map_optional_params_to_responses_api_request(
-    optional_params, responses_api_request
+    optional_params, responses_api_request,
+    model=model, custom_llm_provider=custom_llm_provider,
 )
```
TypeError: unexpected keyword arguments passed to _map_optional_params_to_responses_api_request
The call at line 327 passes model=model and custom_llm_provider=custom_llm_provider as keyword arguments, but the method signature (line 224) only accepts self, optional_params, and responses_api_request. Any code path that calls transform_request — i.e. every chat-completion request routed through LiteLLMResponsesTransformationHandler — will raise:
```text
TypeError: _map_optional_params_to_responses_api_request() got an unexpected keyword argument 'model'
```
Either the method signature needs to be updated to accept these new parameters, or these extra kwargs must be removed from the call site. Since model and custom_llm_provider are not used inside _map_optional_params_to_responses_api_request today, removing them from the call is the minimal fix:
```diff
 self._map_optional_params_to_responses_api_request(
-    optional_params, responses_api_request,
-    model=model, custom_llm_provider=custom_llm_provider,
+    optional_params, responses_api_request
 )
```
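The failure mode is easy to reproduce with a toy class (stand-in names, not the real handler):

```python
# Toy reproduction: a method whose signature lacks the keywords the call
# site passes raises TypeError before the body ever runs.
class Handler:
    def _map_optional_params(self, optional_params, request):
        request.update(optional_params)

h = Handler()
try:
    h._map_optional_params({"temperature": 0.2}, {}, model="gpt-4o")
except TypeError as e:
    print(e)  # mentions the unexpected keyword 'model'
```

Python validates keyword arguments against the signature at call time, so every request through the broken path fails regardless of input values.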
```diff
@@ -38,6 +38,7 @@
     ResponsesAPIStreamEvents,
 )
 from litellm.types.utils import GenericStreamingChunk, ModelResponseStream
+from litellm.utils import supports_reasoning
```
Unused import added
supports_reasoning is imported from litellm.utils but is never referenced anywhere in this file. This import appears to be left over from an incomplete or unrelated change. It should be removed to keep the module clean.
```diff
-from litellm.utils import supports_reasoning
```

(Remove this line entirely.)
recheck

1 similar comment

recheck
```diff
@@ -38,6 +38,7 @@
     ResponsesAPIStreamEvents,
 )
 from litellm.types.utils import GenericStreamingChunk, ModelResponseStream
+from litellm.utils import supports_reasoning
```
supports_reasoning is imported but never used in this file. This should be removed to keep imports clean.
```diff
 from litellm.types.utils import GenericStreamingChunk, ModelResponseStream
-from litellm.utils import supports_reasoning
```
```diff
 self._map_optional_params_to_responses_api_request(
-    optional_params, responses_api_request
+    optional_params, responses_api_request,
+    model=model, custom_llm_provider=custom_llm_provider,
 )
```
TypeError: The call passes model= and custom_llm_provider= as keyword arguments, but the method signature at line 224 only accepts self, optional_params, and responses_api_request. This will raise TypeError: _map_optional_params_to_responses_api_request() got unexpected keyword argument 'model' at runtime for every request routed through LiteLLMResponsesTransformationHandler.
Since model and custom_llm_provider are not used inside _map_optional_params_to_responses_api_request, remove these extra kwargs from the call:
```diff
 self._map_optional_params_to_responses_api_request(
-    optional_params, responses_api_request,
-    model=model, custom_llm_provider=custom_llm_provider,
+    optional_params, responses_api_request
 )
```
…ent provider

When base_model carries a provider prefix that differs from the deployment provider (e.g. base_model='gemini/gemini-2.0-flash' on an anthropic/ deployment), the custom_llm_provider was not updated, causing cost_per_token to build an invalid model key and return 0. After _select_model_name_for_cost_calc resolves the model name from base_model, extract the provider prefix and update custom_llm_provider so the downstream cost lookup uses the correct provider.

Fixes #22257

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…nd diff check

- Skip custom_llm_provider override when custom_pricing=True (base_model unused)
- Only override when extracted provider differs from current custom_llm_provider
- Add direct completion_cost unit test for cross-provider base_model
- Add same-provider no-regression test (e.g. openai/gpt-4o on openai deployment)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

… base_model

- Add _provider_overridden flag to prevent hidden_params from undoing base_model fix
- Add direct unit test verifying hidden_params doesn't override extracted provider

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

When reasoning_auto_summary is globally enabled, the reasoning param was injected unconditionally for all models including non-reasoning ones (e.g. gpt-4o-mini), causing OpenAI API errors. Now gated on supports_reasoning(model, custom_llm_provider) check.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The transformation.py changes pass model= and custom_llm_provider= kwargs to _map_optional_params_to_responses_api_request() which only accepts (self, optional_params, responses_api_request), causing a TypeError at runtime for every Responses API request. Reverted to upstream version; cost_calculator.py fix is unaffected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Merged b622694 into BerriAI:litellm_oss_staging_03_06_2026
Summary

Fixes #22257

When `base_model` carries a provider prefix that differs from the deployment provider (e.g. `base_model="gemini/gemini-2.0-flash"` on an `anthropic/` deployment), the `custom_llm_provider` was not updated after `_select_model_name_for_cost_calc()` resolved the model name from `base_model`. This caused `cost_per_token()` to build an invalid model key like `anthropic/gemini/gemini-2.0-flash`, which would never be found in `model_cost`, resulting in a cost of 0.

Fix

After `_select_model_name_for_cost_calc()` returns in `completion_cost()`, if `base_model` was provided and the selected model contains a known provider prefix, extract it and update `custom_llm_provider` so the downstream cost lookup uses the correct provider.

Changed file: `litellm/cost_calculator.py`

Test

Added `test_cost_calculator_base_model_cross_provider` which uses `base_model="gemini/gemini-2.0-flash"` on an `anthropic/` deployment and asserts `response_cost > 0`.
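The key-building problem the Summary describes can be illustrated in isolation. The pricing map and `cost_key` helper below are hypothetical stand-ins; the real lookup goes through litellm's `model_cost` map and `cost_per_token()`.

```python
# Hypothetical pricing map; the real one is litellm's model_cost.
model_cost = {"gemini/gemini-2.0-flash": {"input_cost_per_token": 1e-7}}

def cost_key(model: str, provider: str) -> str:
    # Naive key construction: prepend the provider unless already prefixed with it.
    return model if model.startswith(provider + "/") else f"{provider}/{model}"

# Before the fix: provider stays "anthropic", producing a key that misses.
print(cost_key("gemini/gemini-2.0-flash", "anthropic"))  # not in model_cost
# After the fix: provider is corrected to "gemini" and the lookup hits.
print(cost_key("gemini/gemini-2.0-flash", "gemini"))     # in model_cost
```

A miss in the pricing map is what ultimately surfaced as a reported cost of 0.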