fix: exclude gpt-5.2-chat from temperature passthrough#22342
Conversation
When a gunicorn worker exits (e.g. from max_requests recycling), its per-process prometheus .db files remain on disk. For gauges using livesum/liveall mode, this means the dead worker's last-known values persist as if the process were still alive. Wire gunicorn's child_exit hook to call mark_process_dead() so live-tracking gauges accurately reflect only running workers.
…cleanup Add Prometheus child_exit cleanup for gunicorn workers
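The hook wiring described in that commit can be sketched as a minimal `gunicorn.conf.py`, assuming the standard `prometheus_client` multiprocess setup (with `PROMETHEUS_MULTIPROC_DIR` pointing at the shared metrics directory); this is an illustrative config fragment, not litellm's actual file:

```python
# gunicorn.conf.py — sketch of wiring gunicorn's child_exit hook to
# prometheus_client's multiprocess cleanup (assumes PROMETHEUS_MULTIPROC_DIR
# is set in the environment, as required by multiprocess mode).
from prometheus_client import multiprocess

def child_exit(server, worker):
    # Delete the dead worker's per-process .db files so livesum/liveall
    # gauges stop reporting its last-known values as if it were alive.
    multiprocess.mark_process_dead(worker.pid)
```

`child_exit` runs in the master process after a worker exits, which is what makes it a safe place to clean up that worker's metric files.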
…ng, and LLM Gateway (#21130) * docs: update AssemblyAI docs with Universal-3 Pro, Speech Understanding, and LLM Gateway provider config * feat: add AssemblyAI LLM Gateway as OpenAI-compatible provider
…th_info Tests were mocking the old method name `filter_server_ids_by_ip` but production code at server.py:774 calls `filter_server_ids_by_ip_with_info` which returns a (server_ids, blocked_count) tuple. The unmocked method on AsyncMock returned a coroutine, causing "cannot unpack non-iterable coroutine object" errors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
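The failure mode and its fix can be reproduced with a toy handler (names mirror the commit message; the real litellm production code differs):

```python
# Minimal reproduction: production code unpacks a (server_ids, blocked_count)
# tuple from the *renamed* method, so the mock must target the new name.
import asyncio
from unittest.mock import AsyncMock

async def handler(allowed_ids, client):
    # Mirrors the production call site: unpacks a two-tuple.
    server_ids, blocked_count = await client.filter_server_ids_by_ip_with_info(allowed_ids)
    return server_ids, blocked_count

client = AsyncMock()
# Mocking only the old name (filter_server_ids_by_ip) leaves this call
# unconfigured and the unpack fails; mocking the renamed method fixes it:
client.filter_server_ids_by_ip_with_info.return_value = (["s1", "s2"], 0)

ids, blocked = asyncio.run(handler(["s1", "s2", "s3"], client))
print(ids, blocked)  # → ['s1', 's2'] 0
```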
…ion behavior Tests were asserting no response.create/conversation.item.create sent to backend when guardrail blocks, but the implementation intentionally sends these to have the LLM voice the guardrail violation message to the user. Updated assertions to verify the correct guardrail flow: - response.cancel is sent to stop any in-progress response - conversation.item.create with violation message is injected - response.create is sent to voice the violation - original blocked content is NOT forwarded Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
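The intended event sequence can be sketched as follows (event shapes are simplified stand-ins for the OpenAI Realtime API payloads; not litellm's actual code):

```python
# Sketch of the guardrail-violation flow the updated assertions verify:
# cancel the in-progress response, inject the violation message, then ask
# the model to voice it. The original blocked content is never forwarded.
def guardrail_block_events(violation_message):
    return [
        {"type": "response.cancel"},                # stop any in-progress response
        {"type": "conversation.item.create",        # inject the violation message
         "item": {"type": "message",
                  "content": [{"type": "input_text", "text": violation_message}]}},
        {"type": "response.create"},                # have the LLM voice the violation
    ]

events = guardrail_block_events("This request was blocked by policy.")
print([e["type"] for e in events])
```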
The revert in 8565c70 removed the parallel_tool_calls handling from map_openai_params, and the subsequent fix d0445e1 only re-added the transform_request consumption but forgot to re-add the map_openai_params producer that sets _parallel_tool_use_config. This meant parallel_tool_calls was silently ignored for all Bedrock models. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
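The producer/consumer pairing the commit restores can be illustrated with a toy version (function and key names follow the commit message; Bedrock's real request shape and litellm's real signatures differ):

```python
# Illustrative producer/consumer split: map_openai_params *produces* the
# internal _parallel_tool_use_config, transform_request *consumes* it.
# The revert removed the producer, so the consumer never saw the value.
def map_openai_params(non_default_params, optional_params):
    if "parallel_tool_calls" in non_default_params:
        optional_params["_parallel_tool_use_config"] = {
            "disable_parallel_tool_use": not non_default_params["parallel_tool_calls"]
        }
    return optional_params

def transform_request(optional_params):
    request = {}
    config = optional_params.pop("_parallel_tool_use_config", None)
    if config is not None:
        request["toolConfig"] = config  # simplified Bedrock request field
    return request

params = map_openai_params({"parallel_tool_calls": False}, {})
request = transform_request(params)
print(request)
```

Without the producer, `params` stays empty and `transform_request` silently emits nothing, which is exactly the "silently ignored" behavior the commit fixes.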
Commit 99c62ca removed "azure" from _RESPONSES_API_PROVIDERS, routing Azure models through litellm.completion instead of litellm.responses. The test was not updated to match, causing it to assert against the wrong mock. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…22319) * feat: add in_flight_requests metric to /health/backlog + prometheus * refactor: clean class with static methods, add tests, fix sentinel pattern * docs: add in_flight_requests to prometheus metrics and latency troubleshooting
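The in-flight counting idea can be sketched with a thread-safe tracker (litellm's real implementation exports this through prometheus; the class and names here are illustrative):

```python
# Toy in-flight request tracker: increment on entry, decrement on exit,
# so the current count reflects only requests still being processed.
import threading

class InFlightTracker:
    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def __enter__(self):
        with self._lock:
            self._count += 1
        return self

    def __exit__(self, *exc):
        with self._lock:
            self._count -= 1

    @property
    def count(self):
        with self._lock:
            return self._count

tracker = InFlightTracker()
with tracker:
    print(tracker.count)  # → 1 while the request is in flight
print(tracker.count)      # → 0 after it completes
```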
PR #22271 added the LiteLLM_ClaudeCodePluginTable model to schema.prisma but did not include a corresponding migration file, causing test_aaaasschema_migration_check to fail. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Addresses Greptile review feedback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…sertions fix(test): update realtime guardrail test assertions for voice violation behavior
…e-test fix(test): update Azure pass-through test after Responses API routing change
fix(db): add missing migration for LiteLLM_ClaudeCodePluginTable
…s-map-params fix(bedrock): restore parallel_tool_calls mapping in map_openai_params
…e agents (#22329) * fix: enforce RBAC on agent endpoints — block non-admin create/update/delete - Add /v1/agents/{agent_id} to agent_routes so internal users can access GET-by-ID (previously returned 403 due to missing route pattern) - Add _check_agent_management_permission() guard to POST, PUT, PATCH, DELETE agent endpoints — only PROXY_ADMIN may mutate agents - Add user_api_key_dict param to delete_agent so the role check works - Add comprehensive unit tests for RBAC enforcement across all roles Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: mock prisma_client in internal user get-agent-by-id test Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * feat(ui): hide agent create/delete controls for non-admin users Match MCP servers pattern: wrap '+ Add New Agent' button in isAdmin conditional so internal users see a read-only agents view. Delete buttons in card and table were already gated. Update empty-state copy for non-admin users. Add 7 Vitest tests covering role-based visibility. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
…d-name fix(mcp): update test mocks for renamed filter_server_ids_by_ip_with_info
gpt-5.2-chat and gpt-5.2-chat-latest only support temperature=1 (like base gpt-5), not arbitrary values (like gpt-5.2). Update is_model_gpt_5_1_model() to exclude gpt-5.2-chat variants so drop_params correctly drops unsupported temperature values. Fixes #21911 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
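The updated classification logic can be reconstructed from the PR description and review flowchart as a standalone sketch (the real `litellm` implementation may differ in detail):

```python
# Sketch of is_model_gpt_5_1_model() with the new gpt-5.2-chat exclusion:
# gpt-5.1* and most gpt-5.2* models are temperature-flexible, but the
# "pro" and "chat" variants only support temperature=1.
def is_model_gpt_5_1_model(model_name: str) -> bool:
    if model_name.startswith("gpt-5.1"):
        return True
    if model_name.startswith("gpt-5.2"):
        if "pro" in model_name:
            return False
        if model_name.startswith("gpt-5.2-chat"):  # NEW exclusion
            return False
        return True
    return False

print(is_model_gpt_5_1_model("gpt-5.2"))              # → True
print(is_model_gpt_5_1_model("gpt-5.2-chat"))         # → False
print(is_model_gpt_5_1_model("gpt-5.2-chat-latest"))  # → False
```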
Greptile Summary

This PR fixes an issue where `gpt-5.2-chat` and `gpt-5.2-chat-latest` were classified as temperature-flexible models, so `drop_params: true` did not drop unsupported `temperature` values for them.
Confidence Score: 3/5
| Filename | Overview |
|---|---|
| litellm/llms/openai/chat/gpt_5_transformation.py | Correctly adds the `not model_name.startswith("gpt-5.2-chat")` exclusion to `is_model_gpt_5_1_model()`, preventing gpt-5.2-chat variants from being classified as temperature-flexible models. Logic change is minimal and correct. |
| tests/test_litellm/llms/openai/test_gpt5_transformation.py | Good regression tests added, but the new `test_gpt5_2_chat_temperature_restricted` function accidentally absorbed the body of the old `test_gpt5_2_pro_allows_reasoning_effort_xhigh` test, which no longer exists as a separate function. |
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[map_openai_params called with temperature] --> B{is_model_gpt_5_1_model?}
    B -->|Yes| C{reasoning_effort == 'none' or None?}
    C -->|Yes| D[Allow any temperature]
    C -->|No| E{temperature == 1?}
    B -->|No| E
    E -->|Yes| F[Allow temperature=1]
    E -->|No| G{drop_params?}
    G -->|Yes| H[Silently drop temperature]
    G -->|No| I[Raise UnsupportedParamsError]
    subgraph "is_model_gpt_5_1_model (updated)"
    J[model_name] --> K{startswith gpt-5.1?}
    K -->|Yes| L[Return True]
    K -->|No| M{startswith gpt-5.2?}
    M -->|No| N[Return False]
    M -->|Yes| O{"'pro' in name?"}
    O -->|Yes| N
    O -->|No| P{"startswith gpt-5.2-chat? (NEW)"}
    P -->|Yes| N
    P -->|No| L
    end
```
Last reviewed commit: b57a908
Additional Comments (1)
The body of the former `test_gpt5_2_pro_allows_reasoning_effort_xhigh` test was absorbed into the new `test_gpt5_2_chat_temperature_restricted` function. The code still executes (it will pass), but it tests an unrelated concern (`reasoning_effort="xhigh"` handling). These lines should be extracted back into their own test function.
Merged 8ae6374 into BerriAI:litellm_oss_staging_02_28_2026
Co-authored-by: Ryan Crabbe <rcrabbe@berkeley.edu>
Co-authored-by: ryan-crabbe <128659760+ryan-crabbe@users.noreply.github.com>
Co-authored-by: Dylan Duan <dylan.duan@assemblyai.com>
Co-authored-by: Julio Quinteros Pro <jquinter@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary

`drop_params: true` does not drop `temperature` for `gpt-5.2-chat`/`gpt-5.2-chat-latest` models. These models only support `temperature=1` (like base `gpt-5`), but were incorrectly classified as gpt-5.1/5.2 models that support arbitrary temperature.

Root Cause

`is_model_gpt_5_1_model()` uses `model_name.startswith("gpt-5.2")`, which matches `gpt-5.2-chat` variants. These chat variants don't support arbitrary temperature (unlike `gpt-5.2`/`gpt-5.2-codex`).

Fix

Add `not model_name.startswith("gpt-5.2-chat")` to the `is_gpt_5_2` classification check.

Test Changes

- Updated `test_gpt5_1_model_detection` to assert `gpt-5.2-chat` and `gpt-5.2-chat-latest` are NOT classified as gpt-5.1 models
- Added a `test_gpt5_2_chat_temperature_restricted` regression test covering:
  - `UnsupportedParamsError` for non-1 temperature
  - `temperature=1` is passed through
  - `drop_params=True` silently drops non-1 temperature

All 31 tests pass.
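The three behaviors the regression test covers can be sketched with a toy parameter-mapping function (the exception class and mapping logic are simplified stand-ins for litellm's real API):

```python
# Toy sketch of the temperature-mapping behavior: flexible models pass any
# value through; restricted models accept only temperature=1, and either
# drop or reject other values depending on drop_params.
class UnsupportedParamsError(Exception):
    pass

def map_temperature(model: str, temperature: float, drop_params: bool = False) -> dict:
    flexible = (
        model.startswith(("gpt-5.1", "gpt-5.2"))
        and not model.startswith("gpt-5.2-chat")
        and "pro" not in model
    )
    if flexible or temperature == 1:
        return {"temperature": temperature}
    if drop_params:
        return {}  # silently drop the unsupported value
    raise UnsupportedParamsError(f"{model} only supports temperature=1")

print(map_temperature("gpt-5.2-chat", 1))                      # → {'temperature': 1}
print(map_temperature("gpt-5.2-chat", 0.7, drop_params=True))  # → {}
```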
Fixes #21911