fix(proxy): log alias model mismatch at debug, not warning #21749
Elliot0122 wants to merge 2 commits into BerriAI:main
Conversation
Also fixes a pre-existing pyright type narrowing error in _handle_llm_api_exception for httpx.HTTPStatusError.
Greptile Summary
This PR adds an optional upstream_model parameter to _override_openai_response_model so that known alias mismatches log at DEBUG while unexpected mismatches still log at WARNING.
Confidence Score: 3/5
| Filename | Overview |
|---|---|
| litellm/proxy/common_request_processing.py | Adds upstream_model parameter to _override_openai_response_model to distinguish known alias mismatches (DEBUG) from unexpected ones (WARNING). Also fixes pyright type narrowing for httpx.HTTPStatusError. The dict branch has an inconsistency — it doesn't apply the same upstream_model check, leading to different log levels for the same logical mismatch depending on response type. |
| tests/test_litellm/proxy/test_common_request_processing.py | Adds four well-structured mock-only tests covering the new upstream_model parameter: known alias (debug), unknown mismatch (warning), no upstream model (warning), and no mismatch (no logging). All tests use proper mocking with no network calls. |
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["_override_openai_response_model called"] --> B{requested_model falsy?}
    B -->|Yes| Z[Return early]
    B -->|No| C{Fallback detected?}
    C -->|Yes| Z
    C -->|No| D{response_obj is dict?}
    D -->|Yes| E{downstream != requested?}
    E -->|Yes| F["LOG: DEBUG (always)"]
    E -->|No| G["Set dict model → requested"]
    F --> G
    G --> Z
    D -->|No| H{has model attr?}
    H -->|No| I["LOG: ERROR"]
    I --> Z
    H -->|Yes| J{downstream != requested?}
    J -->|No| K["setattr model → requested"]
    J -->|Yes| L{upstream_model provided AND downstream == upstream?}
    L -->|Yes| M["LOG: DEBUG (known alias)"]
    L -->|No| N["LOG: WARNING (unexpected mismatch)"]
    M --> K
    N --> K
```
Last reviewed commit: 4b3cb5e
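The flow above can be reduced to a short Python sketch. The function name, the string return values, and the omission of the fallback check are assumptions made for illustration; this is not litellm's actual implementation.

```python
from typing import Any, Optional

def override_response_model(
    response_obj: Any,
    requested_model: Optional[str],
    upstream_model: Optional[str] = None,
) -> str:
    """Override response model per the flowchart; return the log level chosen.

    The 'Fallback detected?' branch from the flowchart is omitted for brevity.
    """
    if not requested_model:
        return "none"  # requested_model falsy: return early, no override
    if isinstance(response_obj, dict):
        # Dict branch: mismatch always logs at DEBUG, then the model is set.
        level = "debug" if response_obj.get("model") != requested_model else "none"
        response_obj["model"] = requested_model
        return level
    if not hasattr(response_obj, "model"):
        return "error"  # cannot override: log at ERROR
    downstream_model = response_obj.model
    level = "none"
    if downstream_model != requested_model:
        if upstream_model is not None and downstream_model == upstream_model:
            level = "debug"  # known alias mismatch: expected
        else:
            level = "warning"  # unexpected mismatch
    response_obj.model = requested_model
    return level
```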
```python
else:
    verbose_proxy_logger.warning(
        "%s: response model mismatch - requested=%r downstream=%r. Overriding response.model to requested model.",
        log_context,
        requested_model,
        downstream_model,
    )
```
Re-introduces WARNING that was intentionally removed
PR #20994 (commit a2e9e73b6) deliberately changed all model-mismatch logs from WARNING to DEBUG because high-traffic customers using model aliases were generating millions of warnings per day, flooding logs and causing disk space issues.
This PR re-introduces WARNING for the non-alias mismatch case (when upstream_model is None or doesn't match downstream_model). However, many legitimate scenarios produce this mismatch without an upstream_model being available — e.g., the streaming path in proxy_server.py:5059 doesn't pass upstream_model at all.
Consider whether the WARNING reintroduction here is intentional, given the history of #20994. If it is, the streaming path in proxy_server.py should also be updated for consistency.
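For consistency, each call site would resolve the alias target once and pass it through as upstream_model. The helper name and the alias-map shape below are hypothetical, purely to illustrate the pattern the comment is asking for:

```python
from typing import Optional

# Hypothetical alias map; litellm's real alias configuration differs.
MODEL_ALIASES = {"my-alias": "hosted_vllm/llama-3"}

def resolve_upstream_model(requested_model: str) -> Optional[str]:
    """Return the upstream model a proxy alias maps to, or None if not an alias.

    Passing this result as upstream_model lets known alias mismatches log at
    DEBUG on every path, streaming included, instead of WARNING.
    """
    return MODEL_ALIASES.get(requested_model)
```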
Additional Comments (1)
If the intent is to surface genuine unexpected mismatches at WARNING, the streaming call sites should also pass upstream_model for consistency.
Review
1. Does this PR fix the issue it describes?
2. Has this issue already been solved elsewhere?
3. Are there other PRs addressing the same problem?
4. Are there other issues this potentially closes?

✅ LGTM — also fixes a pyright type error as bonus.
Type
🐛 Bug Fix
Changes
When a proxy model alias maps to an upstream model (e.g. `"my-alias"` → `"hosted_vllm/llama-3"`), the downstream response carries the upstream model name. Previously this triggered a `WARNING` log on every request even though the mismatch was expected.

`_override_openai_response_model` now accepts an optional `upstream_model` parameter. If the downstream model matches the upstream model, the override logs at `DEBUG` instead of `WARNING`. Genuine unexpected mismatches still log at `WARNING`.

Also fixes a pre-existing pyright type narrowing error in `_handle_llm_api_exception` for `httpx.HTTPStatusError` (was blocking commit via pre-commit hook).
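The pyright fix likely hinges on type narrowing: without an `isinstance` check, the checker sees the exception as a plain `Exception` and flags access to `HTTPStatusError`-only attributes. The sketch below reproduces the pattern with stand-in classes rather than httpx itself, so the class bodies are assumptions; httpx's real `HTTPStatusError` does carry a `.response` with a `.status_code`.

```python
class Response:
    """Stand-in for httpx.Response, carrying only a status code."""
    def __init__(self, status_code: int) -> None:
        self.status_code = status_code

class HTTPStatusError(Exception):
    """Stand-in for httpx.HTTPStatusError, which exposes a .response."""
    def __init__(self, response: Response) -> None:
        self.response = response

def status_from_exception(e: Exception) -> int:
    # Without this isinstance check, pyright treats `e` as Exception and
    # rejects `e.response`; the check narrows `e` to HTTPStatusError.
    if isinstance(e, HTTPStatusError):
        return e.response.status_code
    return 500  # fall back to a generic server error
```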