
fix: exclude gpt-5.2-chat from temperature passthrough#22342

Merged
krrishdholakia merged 17 commits into BerriAI:litellm_oss_staging_02_28_2026 from giulio-leone:fix/issue-21911-gpt52-chat-drop-temperature
Feb 28, 2026

Conversation

@giulio-leone

Summary

drop_params: true does not drop temperature for gpt-5.2-chat / gpt-5.2-chat-latest models. These models only support temperature=1 (like base gpt-5), but were incorrectly classified as gpt-5.1/5.2 models that support arbitrary temperature.

Root Cause

is_model_gpt_5_1_model() uses model_name.startswith("gpt-5.2") which matches gpt-5.2-chat variants. These chat variants don't support arbitrary temperature (unlike gpt-5.2/gpt-5.2-codex).

Fix

Add not model_name.startswith("gpt-5.2-chat") to the is_gpt_5_2 classification check.
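As a sketch, the updated classifier looks roughly like this (the function name follows the PR description, and the branch structure follows the review flowchart below; the actual implementation in gpt_5_transformation.py may differ in details):

```python
# Hedged sketch of the classifier after the fix; the real code lives in
# litellm/llms/openai/chat/gpt_5_transformation.py and may differ.
def is_model_gpt_5_1_model(model_name: str) -> bool:
    """True only for models that accept arbitrary temperature values."""
    if model_name.startswith("gpt-5.1"):
        return True
    is_gpt_5_2 = (
        model_name.startswith("gpt-5.2")
        and "pro" not in model_name
        # The fix: chat variants only support temperature=1, like base gpt-5.
        and not model_name.startswith("gpt-5.2-chat")
    )
    return is_gpt_5_2


print(is_model_gpt_5_1_model("gpt-5.2"))              # True
print(is_model_gpt_5_1_model("gpt-5.2-chat"))         # False
print(is_model_gpt_5_1_model("gpt-5.2-chat-latest"))  # False
```

With this check, gpt-5.2-chat variants fall through to the base gpt-5 path, where the temperature restriction applies.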

Test Changes

  • Updated test_gpt5_1_model_detection to assert gpt-5.2-chat and gpt-5.2-chat-latest are NOT classified as gpt-5.1 models
  • Added test_gpt5_2_chat_temperature_restricted regression test covering:
    • Raises UnsupportedParamsError for non-1 temperature
    • Allows temperature=1
    • drop_params=True silently drops non-1 temperature
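The three scenarios can be sketched with a simplified stand-in for map_openai_params (the UnsupportedParamsError class and the parameter handling below are illustrative, not the actual litellm code):

```python
# Simplified stand-in for litellm's temperature handling; illustrative only.
class UnsupportedParamsError(ValueError):
    pass


def map_openai_params(params: dict, model: str, drop_params: bool) -> dict:
    out = dict(params)
    temp = out.get("temperature")
    if model.startswith("gpt-5.2-chat") and temp is not None and temp != 1:
        if drop_params:
            out.pop("temperature")  # silently drop the unsupported value
        else:
            raise UnsupportedParamsError(f"{model} only supports temperature=1")
    return out


# 1) non-1 temperature raises when drop_params is off
try:
    map_openai_params({"temperature": 0.2}, "gpt-5.2-chat", drop_params=False)
except UnsupportedParamsError:
    pass

# 2) temperature=1 passes through
assert map_openai_params({"temperature": 1}, "gpt-5.2-chat", drop_params=False) == {"temperature": 1}

# 3) drop_params=True silently drops the value
assert "temperature" not in map_openai_params(
    {"temperature": 0.2}, "gpt-5.2-chat-latest", drop_params=True
)
```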

All 31 tests pass.

Fixes #21911

ryan-crabbe and others added 17 commits February 27, 2026 16:11
When a gunicorn worker exits (e.g. from max_requests recycling), its
per-process prometheus .db files remain on disk. For gauges using
livesum/liveall mode, this means the dead worker's last-known values
persist as if the process were still alive. Wire gunicorn's child_exit
hook to call mark_process_dead() so live-tracking gauges accurately
reflect only running workers.
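The wiring described above follows the standard prometheus_client multiprocess pattern; a minimal gunicorn.conf.py sketch (assuming PROMETHEUS_MULTIPROC_DIR is set, and not necessarily matching this PR's exact code) would be:

```python
# gunicorn.conf.py -- minimal sketch of the child_exit hook described above.
# Assumes prometheus_client multiprocess mode (PROMETHEUS_MULTIPROC_DIR set).
from prometheus_client import multiprocess


def child_exit(server, worker):
    # Delete the dead worker's per-process metric files so livesum/liveall
    # gauges stop reporting its last-known values.
    multiprocess.mark_process_dead(worker.pid)
```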

Add Prometheus child_exit cleanup for gunicorn workers
docs: update AssemblyAI docs with Universal-3 Pro, Speech Understanding, and LLM Gateway (#21130)

* docs: update AssemblyAI docs with Universal-3 Pro, Speech Understanding, and LLM Gateway provider config

* feat: add AssemblyAI LLM Gateway as OpenAI-compatible provider
fix(mcp): update test mocks to use renamed filter_server_ids_by_ip_with_info

Tests were mocking the old method name `filter_server_ids_by_ip` but production
code at server.py:774 calls `filter_server_ids_by_ip_with_info` which returns
a (server_ids, blocked_count) tuple. The unmocked method on AsyncMock returned
a coroutine, causing "cannot unpack non-iterable coroutine object" errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
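The fix pattern is straightforward with unittest.mock.AsyncMock: configure the renamed method to return the tuple. The method name matches the commit; the manager object and values here are illustrative, not the actual litellm test code:

```python
# Illustrative sketch: mock the renamed tuple-returning async method so the
# caller can unpack its awaited result. Names besides the method are made up.
import asyncio
from unittest.mock import AsyncMock

manager = AsyncMock()
# Configure the method production code actually calls; leaving it unconfigured
# (or mocking only the old name filter_server_ids_by_ip) breaks the unpacking.
manager.filter_server_ids_by_ip_with_info.return_value = (["server-1"], 2)


async def main():
    server_ids, blocked_count = await manager.filter_server_ids_by_ip_with_info("10.0.0.1")
    return server_ids, blocked_count


print(asyncio.run(main()))  # (['server-1'], 2)
```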
fix(test): update realtime guardrail test assertions for voice violation behavior

Tests were asserting no response.create/conversation.item.create sent to
backend when guardrail blocks, but the implementation intentionally sends
these to have the LLM voice the guardrail violation message to the user.

Updated assertions to verify the correct guardrail flow:
- response.cancel is sent to stop any in-progress response
- conversation.item.create with violation message is injected
- response.create is sent to voice the violation
- original blocked content is NOT forwarded

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The revert in 8565c70 removed the parallel_tool_calls handling from
map_openai_params, and the subsequent fix d0445e1 only re-added the
transform_request consumption but forgot to re-add the map_openai_params
producer that sets _parallel_tool_use_config. This meant parallel_tool_calls
was silently ignored for all Bedrock models.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Commit 99c62ca removed "azure" from _RESPONSES_API_PROVIDERS,
routing Azure models through litellm.completion instead of
litellm.responses. The test was not updated to match, causing it
to assert against the wrong mock.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
feat: add in_flight_requests metric to /health/backlog + prometheus (#22319)

* feat: add in_flight_requests metric to /health/backlog + prometheus

* refactor: clean class with static methods, add tests, fix sentinel pattern

* docs: add in_flight_requests to prometheus metrics and latency troubleshooting
PR #22271 added the LiteLLM_ClaudeCodePluginTable model to
schema.prisma but did not include a corresponding migration file,
causing test_aaaasschema_migration_check to fail.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: update stale docstring to match guardrail voicing behavior
Addresses Greptile review feedback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix(test): update realtime guardrail test assertions for voice violation behavior

fix(test): update Azure pass-through test after Responses API routing change
fix(db): add missing migration for LiteLLM_ClaudeCodePluginTable

fix(bedrock): restore parallel_tool_calls mapping in map_openai_params
[Feat] Agent RBAC Permission Fix - Ensure Internal Users cannot create agents (#22329)

* fix: enforce RBAC on agent endpoints — block non-admin create/update/delete

- Add /v1/agents/{agent_id} to agent_routes so internal users can
  access GET-by-ID (previously returned 403 due to missing route pattern)
- Add _check_agent_management_permission() guard to POST, PUT, PATCH,
  DELETE agent endpoints — only PROXY_ADMIN may mutate agents
- Add user_api_key_dict param to delete_agent so the role check works
- Add comprehensive unit tests for RBAC enforcement across all roles

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: mock prisma_client in internal user get-agent-by-id test

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* feat(ui): hide agent create/delete controls for non-admin users

Match MCP servers pattern: wrap '+ Add New Agent' button in
isAdmin conditional so internal users see a read-only agents view.
Delete buttons in card and table were already gated.
Update empty-state copy for non-admin users.
Add 7 Vitest tests covering role-based visibility.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

fix(mcp): update test mocks for renamed filter_server_ids_by_ip_with_info
gpt-5.2-chat and gpt-5.2-chat-latest only support temperature=1
(like base gpt-5), not arbitrary values (like gpt-5.2).
Update is_model_gpt_5_1_model() to exclude gpt-5.2-chat variants
so drop_params correctly drops unsupported temperature values.

Fixes #21911

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@CLAassistant

CLAassistant commented Feb 28, 2026

CLA assistant check: all committers have signed the CLA.

@greptile-apps
Contributor

greptile-apps bot commented Feb 28, 2026

Greptile Summary

This PR fixes an issue where gpt-5.2-chat and gpt-5.2-chat-latest models were incorrectly classified by is_model_gpt_5_1_model() as supporting arbitrary temperature values. The fix adds a not model_name.startswith("gpt-5.2-chat") check so these chat variants correctly fall through to the base gpt-5 temperature restriction (only temperature=1 allowed).

  • Core fix in gpt_5_transformation.py is correct and minimal — adds a single exclusion condition to is_gpt_5_2 classification
  • Test updates correctly verify gpt-5.2-chat variants are excluded from 5.1 model detection
  • Regression test test_gpt5_2_chat_temperature_restricted covers error, passthrough, and drop_params scenarios
  • Bug: The new test function accidentally absorbs the body of the former test_gpt5_2_pro_allows_reasoning_effort_xhigh test due to a bad merge/rebase — this standalone test is effectively deleted and its assertions are now orphaned inside an unrelated test function

Confidence Score: 3/5

  • The core logic fix is correct and safe to merge, but the test file has a structural bug that should be fixed first.
  • The production code change is a simple, correct one-line condition addition. However, the test file has an accidental merge issue where test_gpt5_2_pro_allows_reasoning_effort_xhigh was deleted and its body was absorbed into the new test function, which reduces test clarity and removes a named test case.
  • tests/test_litellm/llms/openai/test_gpt5_transformation.py — the orphaned test code at lines 431-437 needs to be extracted back into its own test_gpt5_2_pro_allows_reasoning_effort_xhigh function.

Important Files Changed

  • litellm/llms/openai/chat/gpt_5_transformation.py: Correctly adds the not model_name.startswith("gpt-5.2-chat") exclusion to is_model_gpt_5_1_model(), preventing gpt-5.2-chat variants from being classified as temperature-flexible models. The logic change is minimal and correct.
  • tests/test_litellm/llms/openai/test_gpt5_transformation.py: Good regression tests added, but the new test_gpt5_2_chat_temperature_restricted function accidentally absorbed the body of the old test_gpt5_2_pro_allows_reasoning_effort_xhigh test, which no longer exists as a separate function.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[map_openai_params called with temperature] --> B{is_model_gpt_5_1_model?}
    B -->|Yes| C{reasoning_effort == 'none' or None?}
    C -->|Yes| D[Allow any temperature]
    C -->|No| E{temperature == 1?}
    B -->|No| E
    E -->|Yes| F[Allow temperature=1]
    E -->|No| G{drop_params?}
    G -->|Yes| H[Silently drop temperature]
    G -->|No| I[Raise UnsupportedParamsError]

    subgraph "is_model_gpt_5_1_model (updated)"
        J[model_name] --> K{startswith gpt-5.1?}
        K -->|Yes| L[Return True]
        K -->|No| M{startswith gpt-5.2?}
        M -->|No| N[Return False]
        M -->|Yes| O{"'pro' in name?"}
        O -->|Yes| N
        O -->|No| P{"startswith gpt-5.2-chat? (NEW)"}
        P -->|Yes| N
        P -->|No| L
    end

Last reviewed commit: b57a908

Contributor

@greptile-apps greptile-apps bot left a comment


2 files reviewed, 1 comment


@greptile-apps
Contributor

greptile-apps bot commented Feb 28, 2026

Additional Comments (1)

tests/test_litellm/llms/openai/test_gpt5_transformation.py
Orphaned test code merged into wrong function

The body of the former test_gpt5_2_pro_allows_reasoning_effort_xhigh test (lines 431-437) has been accidentally absorbed into test_gpt5_2_chat_temperature_restricted. In the base branch, this was a standalone test function; in the PR, the def line was replaced by the new test but the old body was left behind inside the new function.

The code still executes (it will pass), but it tests an unrelated concern (reasoning_effort='xhigh' for gpt-5.2-pro) inside a function that should only test gpt-5.2-chat temperature restrictions. The separate test_gpt5_2_pro_allows_reasoning_effort_xhigh test no longer exists.

These lines should be extracted back into their own test function:

        assert "temperature" not in params


def test_gpt5_2_pro_allows_reasoning_effort_xhigh(config: OpenAIConfig):
    params = config.map_openai_params(
        non_default_params={"reasoning_effort": "xhigh"},
        optional_params={},
        model="gpt-5.2-pro",
        drop_params=False,
    )
    assert params["reasoning_effort"] == "xhigh"

@krrishdholakia krrishdholakia changed the base branch from main to litellm_oss_staging_02_28_2026 February 28, 2026 07:58
@krrishdholakia krrishdholakia merged commit 8ae6374 into BerriAI:litellm_oss_staging_02_28_2026 Feb 28, 2026
26 of 30 checks passed
Sameerlite pushed a commit that referenced this pull request Mar 2, 2026


Development

Successfully merging this pull request may close these issues.

[Bug]: drop_params: true does not drop temperature for gpt-5.2-chat / gpt-5.2-chat-latest (both OpenAI and Azure)

7 participants