
fix: preserve deployment custom pricing for /responses and /messages#23239

Closed
DmitriyAlergant wants to merge 17 commits into BerriAI:main from DmitriyAlergant:fix-custom-pricing

Conversation

@DmitriyAlergant
Contributor

Relevant issues

#23185

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
    All good aside from 7 failures caused by the responses package shadowing (AttributeError: module 'responses' has no attribute 'activate'); this is a pre-existing issue
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • [will request] I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

Custom pricing from model_info config was ignored for /v1/responses (streaming) and /v1/messages (all modes) endpoints, causing cost calculations to use standard pricing instead. Regression introduced by: PR #20679 (commit 1ee43b11de). See repro steps in the issue linked above.

@vercel

vercel bot commented Mar 10, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: litellm | Deployment: Ready | Actions: Preview, Comment | Updated (UTC): Mar 14, 2026 6:16pm


@DmitriyAlergant
Contributor Author

@greptileai please review

@greptile-apps
Contributor

greptile-apps bot commented Mar 10, 2026

Greptile Summary

This PR fixes a regression introduced by PR #20679 where custom pricing from model_info config was silently ignored for /v1/responses (streaming) and /v1/messages (all modes) endpoints. The root cause was that the Router places deployment model_info in kwargs["litellm_metadata"]["model_info"], but both affected code paths either dropped litellm_metadata entirely or used a broken or-chain (metadata or kwargs.get("litellm_metadata") or {}) that silently discarded the router-injected metadata when user metadata was non-empty.
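The or-chain failure mode described above can be reproduced in isolation. A hypothetical sketch (the function name is illustrative, not actual LiteLLM code): a truthy user `metadata` short-circuits the chain, so the router-injected `litellm_metadata` (carrying `model_info` with the deployment's custom pricing) never reaches the callbacks.

```python
def broken_pick(metadata, litellm_metadata):
    # The pre-fix pattern: first truthy value wins, rest are discarded.
    return metadata or litellm_metadata or {}

user_metadata = {"trace_id": "abc"}  # non-empty user-supplied metadata
router_metadata = {
    "model_info": {"id": "deploy-123", "input_cost_per_token": 0.01}
}

chosen = broken_pick(user_metadata, router_metadata)
assert "model_info" not in chosen  # deployment pricing silently dropped
```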

Key changes:

  • Adds get_model_info_from_litellm_params to core_helpers.py: reads model_info from three lookup locations in priority order, using a sentinel (_MISSING_MODEL_INFO) to distinguish an absent key from an explicitly empty {}.
  • Adds merge_metadata_preserving_deployment_model_info to core_helpers.py: merges user-supplied metadata into litellm_metadata while ensuring the deployment's model_info is never overwritten by user-supplied values.
  • Fixes responses/main.py and llm_http_handler.py to use the merge helper when constructing metadata_for_callbacks / litellm_params["metadata"].
  • Fixes anthropic_passthrough_logging_handler.py to explicitly pass router_model_id and custom_pricing to litellm.completion_cost and to merge logging-obj metadata into kwargs["litellm_params"]["metadata"].
  • Updates all three cost calculation sites in litellm_logging.py (non-streaming, sync-streaming, async-streaming) to extract router_model_id from model_call_details["litellm_params"] via the new helper.
  • Comprehensive test coverage: unit tests for both helpers, and mock-based integration tests for all streaming modes and both endpoint types, including user-metadata passthrough and user-supplied model_info clobber scenarios.
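A minimal sketch of the two helpers' behavior (illustrative only; the real implementations live in core_helpers.py and handle more lookup locations and edge cases):

```python
_MISSING_MODEL_INFO = object()  # sentinel: distinguishes "key absent" from explicit {}

def get_model_info_sketch(litellm_params: dict) -> dict:
    """Check metadata, then top level, then litellm_metadata, in priority order."""
    metadata = litellm_params.get("metadata") or {}
    for source in (
        metadata,
        litellm_params,
        litellm_params.get("litellm_metadata") or {},
    ):
        found = source.get("model_info", _MISSING_MODEL_INFO)
        if found is not _MISSING_MODEL_INFO:
            # An explicit {} in a higher-priority location is honored,
            # not skipped via truthiness.
            return found if isinstance(found, dict) else {}
    return {}

def merge_metadata_sketch(litellm_metadata: dict, user_metadata: dict) -> dict:
    """User keys win everywhere except model_info, which the router owns."""
    merged = dict(litellm_metadata)
    merged.update(user_metadata)
    if "model_info" in litellm_metadata:
        merged["model_info"] = litellm_metadata["model_info"]
    return merged

# Explicit empty dict in the highest-priority location wins:
assert get_model_info_sketch(
    {"metadata": {"model_info": {}}, "model_info": {"id": "x"}}
) == {}

# User metadata cannot clobber the deployment's model_info:
merged = merge_metadata_sketch(
    {"model_info": {"id": "deploy-123"}},
    {"model_info": {"id": "spoofed"}, "trace_id": "t1"},
)
assert merged["model_info"]["id"] == "deploy-123"
assert merged["trace_id"] == "t1"
```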

Confidence Score: 4/5

  • Safe to merge — the fix is targeted, well-tested with mocked integration tests, and backwards-compatible; custom pricing that was previously broken for these endpoints will now work correctly.
  • The helpers are correctly implemented with a proper sentinel pattern. All three affected code paths (responses streaming, responses non-streaming, messages all modes) are fixed and covered by tests. The minor outstanding concern is that router_model_id is now injected into cost calculation for all non-streaming call types (not just responses/messages) via the broadened _process_hidden_params_and_response_cost change, though this is guarded by the existing elif response_cost is not None: pass branch for nearly all non-responses paths (which set cost via _hidden_params). The implementation is well-structured with a shared utility rather than scattered logic.
  • The litellm_logging.py changes deserve a second look to confirm no unintended cost-calculation side-effects for non-responses call types in the else branch of _process_hidden_params_and_response_cost.

Important Files Changed

Filename Overview
litellm/litellm_core_utils/core_helpers.py Adds two well-implemented helper functions: get_model_info_from_litellm_params (with sentinel pattern for explicit empty dict) and merge_metadata_preserving_deployment_model_info (preserves deployment model_info while merging user metadata). Clean, centralized solution. Minor: or {} for metadata uses falsy semantics while model_info uses sentinel, creating a subtle inconsistency (e.g., metadata=None falls through to lower-priority locations).
litellm/litellm_core_utils/litellm_logging.py Three cost calculation call sites updated to extract router_model_id from model_call_details["litellm_params"] via the new helper — covering non-streaming, sync-streaming, and async-streaming paths. The change in _process_hidden_params_and_response_cost now applies to all non-streaming call types (not just responses/messages), which is guarded by the existing elif response_cost is not None: pass branch for most non-responses paths.
litellm/llms/custom_httpx/llm_http_handler.py Fixes the /v1/messages path by using merge_metadata_preserving_deployment_model_info to build the litellm_params["metadata"] passed to update_environment_variables, ensuring model_info from litellm_metadata reaches the logging object's metadata for cost calculation.
litellm/proxy/pass_through_endpoints/llm_provider_handlers/anthropic_passthrough_logging_handler.py Explicitly passes router_model_id and custom_pricing to completion_cost for the passthrough path. Also merges logging-obj metadata into kwargs["litellm_params"]["metadata"] using the helper, preserving deployment model_info while allowing existing pipeline keys to win for non-model_info fields. Well-tested by the new unit test.
litellm/responses/main.py Core bug fix: replaces the broken metadata or kwargs.get("litellm_metadata") or {} or-chain with merge_metadata_preserving_deployment_model_info(...), ensuring litellm_metadata["model_info"] (router-injected) is never silently dropped when user metadata is also non-empty.
tests/test_litellm/test_custom_pricing_metadata_propagation.py New comprehensive test file covering unit tests for both helpers and integration tests for all responses API paths (async non-streaming, async streaming, sync streaming) and the messages API path. All HTTP calls are properly mocked. Minor: list[str] type annotation (line 143) is Python 3.9+ syntax — consistent with the rest of the codebase.
tests/test_litellm/litellm_core_utils/test_litellm_logging.py Adds unit tests verifying the sentinel-based empty-dict behavior of get_model_info_from_litellm_params and the model_info precedence of merge_metadata_preserving_deployment_model_info. Also adds integration tests for _process_hidden_params_and_response_cost and success_handler confirming router_model_id is passed to _response_cost_calculator for aresponses call types.
tests/test_litellm/proxy/pass_through_endpoints/llm_provider_handlers/test_anthropic_passthrough_logging_handler.py Adds TestAnthropicPassthroughLoggingPayload class with streaming integration tests using a mocked HTTP response and a CostCapturingCallback to verify custom pricing is applied for the passthrough streaming path. Also adds a unit test verifying the metadata merge doesn't clobber existing litellm_params keys like stream_response.

Sequence Diagram

sequenceDiagram
    participant User
    participant Router
    participant responses_main as responses/main.py
    participant llm_http as llm_http_handler.py
    participant passthrough as anthropic_passthrough_logging_handler.py
    participant logging as litellm_logging.py
    participant core_helpers as core_helpers.py

    User->>Router: aresponses(model, input, metadata={...})
    Router->>Router: _update_kwargs_with_deployment()<br/>sets kwargs["litellm_metadata"]["model_info"]
    Router->>responses_main: responses(kwargs incl. litellm_metadata)

    responses_main->>core_helpers: merge_metadata_preserving_deployment_model_info(<br/>litellm_metadata=kwargs["litellm_metadata"],<br/>user_metadata=metadata, model_info=kwargs["model_info"])
    core_helpers-->>responses_main: metadata_for_callbacks<br/>(model_info preserved from litellm_metadata)

    responses_main->>logging: update_environment_variables(<br/>litellm_params={"metadata": metadata_for_callbacks})
    Note over logging: self.litellm_params["metadata"]["model_info"] = deployment model_info

    responses_main->>llm_http: HTTP call

    alt /v1/messages path
        llm_http->>core_helpers: merge_metadata_preserving_deployment_model_info(...)
        core_helpers-->>llm_http: merged_metadata (model_info preserved)
        llm_http->>logging: update_environment_variables(litellm_params={"metadata": merged_metadata})
    end

    alt Anthropic passthrough path
        passthrough->>core_helpers: get_model_info_from_litellm_params(logging_obj.litellm_params)
        core_helpers-->>passthrough: model_info
        passthrough->>passthrough: completion_cost(router_model_id=model_info["id"],<br/>custom_pricing=use_custom_pricing_for_model(...))
    end

    llm_http-->>responses_main: response
    responses_main->>logging: success_handler(result)
    logging->>core_helpers: get_model_info_from_litellm_params(model_call_details["litellm_params"])
    core_helpers-->>logging: model_info with deployment id
    logging->>logging: _response_cost_calculator(result,<br/>router_model_id=model_info["id"])
    Note over logging: Custom pricing applied via deployment ID lookup
    logging-->>User: async_log_success_event(kwargs, response_cost=custom_price)

Last reviewed commit: b19fcaa

Comment on lines +146 to +149
kwargs["model"] = model
kwargs.setdefault("litellm_params", {})
if litellm_params:
    kwargs["litellm_params"].update(litellm_params)
Contributor


Unconditional .update(litellm_params) may overwrite existing kwargs["litellm_params"] keys

litellm_params here is extracted from logging_obj.model_call_details["litellm_params"], which was populated by update_environment_variables in llm_http_handler.py. It contains not only metadata (the key you need to propagate) but also preset_cache_key: None, stream_response: {}, and all of anthropic_messages_optional_request_params (e.g. max_tokens, stream, etc.).

If kwargs["litellm_params"] already holds data that was set earlier in the proxy request pipeline, calling .update(litellm_params) overwrites any conflicting keys. The intent appears to be to ensure metadata with model_info reaches the downstream logging callbacks — a targeted merge on just metadata would be safer:

kwargs.setdefault("litellm_params", {})
if litellm_params:
    # Only propagate metadata (which carries model_info for custom pricing).
    # Avoid clobbering unrelated keys already set in kwargs["litellm_params"].
    kwargs["litellm_params"].setdefault("metadata", {})
    kwargs["litellm_params"]["metadata"].update(
        litellm_params.get("metadata") or {}
    )

Comment on lines +1655 to +1666
_lp_meta = (
    self.model_call_details
    .get("litellm_params", {})
    .get("metadata", {}) or {}
)
_lp = self.model_call_details.get("litellm_params", {}) or {}
_mi = (
    _lp_meta.get("model_info")
    or _lp.get("model_info")
    or (_lp.get("litellm_metadata", {}) or {}).get("model_info")
    or {}
)
Contributor


Duplicated model_info resolution logic

The identical 3-location lookup pattern:

_lp_meta = self.model_call_details.get("litellm_params", {}).get("metadata", {}) or {}
_lp = self.model_call_details.get("litellm_params", {}) or {}
_mi = (
    _lp_meta.get("model_info")
    or _lp.get("model_info")
    or (_lp.get("litellm_metadata", {}) or {}).get("model_info")
    or {}
)

appears verbatim three times: here (non-streaming path), at the streaming path around line 2491, and in anthropic_passthrough_logging_handler.py lines 122–129. Any future change to where model_info can live (e.g. a fourth location) requires updating all three copies.

Consider extracting it to a small helper on the Logging class (or as a module-level function), e.g.:

def _extract_model_info_from_litellm_params(litellm_params: dict) -> dict:
    metadata = litellm_params.get("metadata", {}) or {}
    return (
        metadata.get("model_info")
        or litellm_params.get("model_info")
        or (litellm_params.get("litellm_metadata", {}) or {}).get("model_info")
        or {}
    )

Comment on lines +4442 to +4454
def _get_model_info_from_litellm_params(
    litellm_params: Optional[dict],
) -> Dict[str, Any]:
    if litellm_params is None:
        return {}

    metadata = litellm_params.get("metadata", {}) or {}
    return (
        metadata.get("model_info")
        or litellm_params.get("model_info")
        or (litellm_params.get("litellm_metadata", {}) or {}).get("model_info")
        or {}
    )
Contributor


_get_model_info_from_litellm_params is annotated to return Dict[str, Any], but the expression chain can return a non-dict truthy value. If someone stores a non-dict value under model_info (e.g., a string), call sites like line 1656 (model_info.get("id")) will raise AttributeError at runtime.

Add a defensive cast to guarantee the return type:

Suggested change

Before:

def _get_model_info_from_litellm_params(
    litellm_params: Optional[dict],
) -> Dict[str, Any]:
    if litellm_params is None:
        return {}
    metadata = litellm_params.get("metadata", {}) or {}
    return (
        metadata.get("model_info")
        or litellm_params.get("model_info")
        or (litellm_params.get("litellm_metadata", {}) or {}).get("model_info")
        or {}
    )

After:

def _get_model_info_from_litellm_params(
    litellm_params: Optional[dict],
) -> Dict[str, Any]:
    if litellm_params is None:
        return {}
    metadata = litellm_params.get("metadata", {}) or {}
    result = (
        metadata.get("model_info")
        or litellm_params.get("model_info")
        or (litellm_params.get("litellm_metadata", {}) or {}).get("model_info")
        or {}
    )
    return result if isinstance(result, dict) else {}

Comment on lines +290 to +293
if "model_info" not in request_payload and self.litellm_metadata:
    _mi = self.litellm_metadata.get("model_info")
    if _mi:
        request_payload["model_info"] = _mi
Contributor


Top-level model_info injection bypasses the established lookup path

_get_model_info_from_litellm_params (the helper added by this PR) looks inside litellm_params["metadata"], litellm_params["model_info"], or litellm_params["litellm_metadata"] — it does not look at the top-level key being set here (request_payload["model_info"]).

After the fix in responses/main.py, litellm_params["metadata"]["model_info"] is already populated correctly when _handle_completed_response runs. The streaming cost path at litellm_logging.py:2473 extracts model_info from model_call_details["litellm_params"] (via the same helper), so this top-level injection has no effect on cost calculation.

The change appears to target update_response_metadata / set_hidden_params so that hidden_params["model_id"] gets set — but the explicit router_model_id argument now passed by litellm_logging.py renders that fallback unnecessary.

Consider documenting the exact code path this unblocks (i.e., which consumer of request_payload["model_info"] requires it), or remove the block if it is truly redundant, to avoid future confusion.

Comment on lines 1651 to 1657
model_info = _get_model_info_from_litellm_params(
    self.model_call_details.get("litellm_params", {}) or {}
)
self.model_call_details["response_cost"] = self._response_cost_calculator(
    result=logging_result,
    router_model_id=model_info.get("id"),
)
Contributor


Broader scope than intended for this fix

The new router_model_id injection at lines 1651–1657 affects ALL non-streaming call types that reach the else branch (chat completions, embeddings, etc.), not just /v1/responses. While regressions are prevented by the earlier elif self.model_call_details.get("response_cost") is not None: pass guard, it is worth verifying that for these other call types _get_model_info_from_litellm_params consistently returns {} (or a dict without "id") so that router_model_id=None is passed and cost calculation is unchanged.

If any existing call type stores a model_info with an "id" in litellm_params (which many router-dispatched calls do), _response_cost_calculator will now be called with a non-None router_model_id where it previously wasn't. If the deployment ID happens to exist in litellm.model_cost with unexpectedly-priced entries, this could silently change cost calculations for those call types.

Consider scoping the router_model_id injection to the specific call types this PR targets (aresponses, responses) by guarding on self.call_type:

# Only apply router_model_id for response API call types
if self.call_type in ("aresponses", "responses"):
    model_info = _get_model_info_from_litellm_params(
        self.model_call_details.get("litellm_params", {}) or {}
    )
    router_model_id = model_info.get("id")
else:
    router_model_id = None
self.model_call_details["response_cost"] = self._response_cost_calculator(
    result=logging_result,
    router_model_id=router_model_id,
)

Comment on lines +145 to +149
if litellm_params.get("metadata") is not None:
    existing_metadata = kwargs["litellm_params"].get("metadata", {}) or {}
    merged_metadata = dict(existing_metadata)
    merged_metadata.update(litellm_params.get("metadata") or {})
    kwargs["litellm_params"]["metadata"] = merged_metadata
Contributor


Merge direction may silently overwrite proxy-pipeline metadata

The merge applies logging_obj's metadata on top of the existing kwargs["litellm_params"]["metadata"]:

merged_metadata = dict(existing_metadata)
merged_metadata.update(litellm_params.get("metadata") or {})  # logging_obj wins

litellm_params.get("metadata") here is the merged_metadata built by llm_http_handler.py, which already contains all user-provided metadata keys plus the deployment's model_info. Because logging_obj's metadata wins, any key that the proxy pipeline added to kwargs["litellm_params"]["metadata"] after the initial handler ran (e.g., a middleware that injects user or trace_id) will be silently overwritten if that key also appears in the logging object's metadata.

For correctness in the custom-pricing case you only need to ensure the model_info key from litellm_params reaches kwargs["litellm_params"]["metadata"]. A targeted write is safer:

kwargs.setdefault("litellm_params", {})
kwargs["litellm_params"].setdefault("metadata", {})
if litellm_params.get("metadata", {}).get("model_info"):
    kwargs["litellm_params"]["metadata"]["model_info"] = (
        litellm_params["metadata"]["model_info"]
    )

Comment on lines +145 to +152
if litellm_params.get("metadata") is not None:
    logging_metadata = litellm_params.get("metadata") or {}
    existing_metadata = kwargs["litellm_params"].get("metadata", {}) or {}
    merged_metadata = dict(logging_metadata)
    merged_metadata.update(existing_metadata)
    if logging_metadata.get("model_info") is not None:
        merged_metadata["model_info"] = logging_metadata["model_info"]
    kwargs["litellm_params"]["metadata"] = merged_metadata
Contributor


Merge start-point is counter-intuitive but correctly tested

The merge starts from logging_metadata and then overlays existing_metadata, which means existing pipeline keys win for every field except model_info. The test at test_custom_pricing_metadata_propagation.py:785 (shared_field == "from-existing") confirms this is intentional.

A brief comment would help future readers understand the direction:

Suggested change
Before:

if litellm_params.get("metadata") is not None:
    logging_metadata = litellm_params.get("metadata") or {}
    existing_metadata = kwargs["litellm_params"].get("metadata", {}) or {}
    merged_metadata = dict(logging_metadata)
    merged_metadata.update(existing_metadata)
    if logging_metadata.get("model_info") is not None:
        merged_metadata["model_info"] = logging_metadata["model_info"]
    kwargs["litellm_params"]["metadata"] = merged_metadata

After:

if litellm_params.get("metadata") is not None:
    logging_metadata = litellm_params.get("metadata") or {}
    existing_metadata = kwargs["litellm_params"].get("metadata", {}) or {}
    # Start from logging_obj metadata, let existing pipeline metadata win for
    # all keys, then unconditionally restore deployment model_info.
    merged_metadata = dict(logging_metadata)
    merged_metadata.update(existing_metadata)
    if logging_metadata.get("model_info") is not None:
        merged_metadata["model_info"] = logging_metadata["model_info"]
    kwargs["litellm_params"]["metadata"] = merged_metadata

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines +556 to +558
deadline = time.time() + 2.0
while cost_callback.response_cost is None and time.time() < deadline:
    time.sleep(0.01)
Contributor


Polling loop is redundant for sync streaming

For sync streaming, log_success_event is called synchronously as each chunk is processed inside the for _chunk in response loop above. By the time that loop exits, cost_callback.response_cost is already set. The while polling loop will exit immediately on the first iteration.

Consider simplifying:

Suggested change
Before:

deadline = time.time() + 2.0
while cost_callback.response_cost is None and time.time() < deadline:
    time.sleep(0.01)

After:

# For sync streaming, cost_callback.response_cost is already set
# by the time the for-loop above completes (callbacks are synchronous).

If there is a theoretical case where the callback is deferred (e.g. a background thread), a brief comment explaining that intent would help readers understand why the poll is needed.

@DmitriyAlergant
Contributor Author

@greptileai @greptile-apps can you review one last commit 1f3530185 and update the summary?

Comment on lines +4477 to +4478
or litellm_params.get("model_info")
or (litellm_params.get("litellm_metadata", {}) or {}).get("model_info")
Contributor


Empty-dict short-circuit in or-chain silently skips explicit model_info: {}

The chained or evaluates Python truthiness, so if metadata["model_info"] is an empty dict {} (falsy), the expression falls through to litellm_params.get("model_info") even though model_info was explicitly set in the highest-priority location.

In practice this is benign (an empty model_info has no custom pricing), but it creates an inconsistency: a caller who deliberately sets metadata={"model_info": {}} to reset any inherited model info would silently get a different (lower-priority) model_info returned instead of {}.

Consider using an explicit sentinel instead of truthiness:

_MISSING = object()

metadata_mi = metadata.get("model_info", _MISSING)
if metadata_mi is not _MISSING:
    result = metadata_mi
else:
    lp_mi = litellm_params.get("model_info", _MISSING)
    if lp_mi is not _MISSING:
        result = lp_mi
    else:
        result = (litellm_params.get("litellm_metadata") or {}).get("model_info", {})
return result if isinstance(result, dict) else {}

@DmitriyAlergant
Contributor Author

DmitriyAlergant commented Mar 13, 2026

@ishaan-berri can you review?

This fixes a regression you introduced in #20679

4/5 Greptile confidence; 5.0 is not feasible at this complexity, since it keeps coming up with ridiculously minor and far-fetched stuff.

    pass
else:
    router_model_id = None
    if self.call_type in (
Member


this is a weird hardcoding, why do it this way? @DmitriyAlergant

@DmitriyAlergant
Contributor Author

@krrishdholakia to be honest this was mainly to satisfy "the beast" (greptile): #23239 (comment)

She is not wrong, since other endpoints are not properly covered by tests for this change. And this core pathway is complex and not 100% understood for implications on all possible endpoints.

Based on your comment, I pushed a commit to remove this guard, and the tests are still passing.

@krrishdholakia
Member

Fix on main
