
support thinking_blocks inputs in OpenAI Responses API via litellm.completions #21322

Open
piyushhhxyz wants to merge 6 commits into BerriAI:main from piyushhhxyz:piyush/feat-support-thinking-blocks-input-for-openai

Conversation

@piyushhhxyz
Contributor

Feat: Support thinking_blocks in OpenAI Responses API transformation layer

Type

New Feature

Changes

When litellm.completion() routes to OpenAI's Responses API (e.g., o3-mini, o1), the transformation layer now handles thinking_blocks bidirectionally:

INPUT (Chat Completions → Responses API):

  • Extracts thinking_blocks from assistant messages
  • Converts type: "thinking" → reasoning item with summary (full text preserved)
  • Converts type: "redacted_thinking" → reasoning item with encrypted_content + empty summary: []
  • Reasoning items are ordered BEFORE content/tool_calls per OpenAI spec
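
The input-direction rules above can be sketched as a small standalone helper. The function name mirrors the PR's _convert_thinking_blocks_to_reasoning_items, but the body is an illustrative reconstruction from this description, not the actual code:

```python
from typing import Any, Dict, List


def convert_thinking_blocks_to_reasoning_items(
    thinking_blocks: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Map unified thinking_blocks to Responses API reasoning items."""
    reasoning_items: List[Dict[str, Any]] = []
    for block in thinking_blocks:
        block_type = block.get("type")
        if block_type == "thinking":
            # Full text preserved as a summary_text entry.
            reasoning_items.append({
                "type": "reasoning",
                "summary": [
                    {"type": "summary_text", "text": block.get("thinking", "")}
                ],
            })
        elif block_type == "redacted_thinking":
            # Opaque payload carried via encrypted_content; summary stays empty.
            reasoning_items.append({
                "type": "reasoning",
                "encrypted_content": block.get("data"),
                "summary": [],
            })
        # Unknown types are skipped (the real code logs a warning).
    return reasoning_items
```

The caller would then place these items before any content/tool_call items, per the ordering rule above.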

OUTPUT (Responses API → Chat Completions):

  • Extracts encrypted_content from ResponseReasoningItem → thinking_blocks with type: "redacted_thinking"
  • Extracts summary text → thinking_blocks with type: "thinking"
  • Populates thinking_blocks on the response Message object
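
The output-direction mapping can be sketched the same way; this helper is an illustrative reconstruction of the rules above (the real code walks ResponseReasoningItem objects rather than bare strings):

```python
from typing import Any, Dict, List, Optional


def reasoning_item_to_thinking_block(
    encrypted_content: Optional[str],
    summary_texts: List[str],
) -> Optional[Dict[str, Any]]:
    """Map a reasoning item's fields back to a unified thinking_block."""
    if encrypted_content:
        # Encrypted payload wins: round-trip it as redacted_thinking.
        return {"type": "redacted_thinking", "data": encrypted_content}
    text = "".join(summary_texts)
    if text:
        return {"type": "thinking", "thinking": text}
    return None  # empty summaries produce no block
```

Each non-None result is appended to the thinking_blocks list on the response Message.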

This enables stateless multi-turn conversations with reasoning models via encrypted_content round-tripping, matching the unified thinking_blocks interface already supported for
Claude and Gemini.
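
The round-trip invariant behind that claim can be shown with plain dicts; both mapping steps are inlined from the rules above, and the field names are illustrative:

```python
# A redacted thinking block from a previous assistant turn.
original = {"type": "redacted_thinking", "data": "opaque-ciphertext"}

# Input direction: thinking_block -> Responses API reasoning item.
reasoning_item = {
    "type": "reasoning",
    "encrypted_content": original["data"],
    "summary": [],
}

# Output direction: reasoning item -> thinking_block on the response Message.
restored = {"type": "redacted_thinking", "data": reasoning_item["encrypted_content"]}

# The encrypted payload survives the trip through the wire format unchanged.
assert restored == original
```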

Files Changed

  • litellm/completion_extras/litellm_responses_transformation/transformation.py — Added _convert_thinking_blocks_to_reasoning_items(), modified input/output transformation paths
  • tests/test_litellm/completion_extras/litellm_responses_transformation/test_thinking_blocks_transformation.py — 15 unit tests (11 input, 4 output)

Test Plan

  • 15/15 unit tests pass (poetry run pytest tests/test_litellm/completion_extras/litellm_responses_transformation/test_thinking_blocks_transformation.py)
  • Covers: regular thinking, redacted thinking, mixed blocks, tool calls, edge cases (None/empty/unknown types), multi-turn encrypted_content preservation, output transformation
    with/without encrypted_content
  • Verified with real OpenAI API (o3-pro) — encrypted_content round-trip works
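
A representative edge-case test in the spirit of that suite might look like the sketch below; the converter here is a minimal stand-in based on the PR description, not the actual test file:

```python
from typing import Any, Dict, List, Optional


def convert(blocks: Optional[List[Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Stand-in for the input-direction converter described in this PR."""
    items: List[Dict[str, Any]] = []
    for b in blocks or []:
        if b.get("type") == "thinking":
            items.append({
                "type": "reasoning",
                "summary": [{"type": "summary_text", "text": b.get("thinking", "")}],
            })
        elif b.get("type") == "redacted_thinking":
            items.append({
                "type": "reasoning",
                "encrypted_content": b.get("data"),
                "summary": [],
            })
        # unknown types are skipped
    return items


def test_none_empty_and_unknown_types():
    assert convert(None) == []                   # None thinking_blocks
    assert convert([]) == []                     # empty list
    assert convert([{"type": "mystery"}]) == []  # unknown type skipped


test_none_empty_and_unknown_types()
```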

@vercel

vercel bot commented Feb 16, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: litellm — Deployment: Ready — Actions: Preview, Comment — Updated (UTC): Feb 16, 2026 5:59pm


@greptile-apps
Contributor

greptile-apps bot commented Feb 16, 2026

Greptile Summary

Adds bidirectional thinking_blocks transformation between Chat Completions and OpenAI Responses API formats. On input, thinking blocks become reasoning items with summary, and redacted_thinking blocks become reasoning items with encrypted_content. On output, the reverse mapping extracts encrypted_content and summary text back into thinking_blocks. This enables stateless multi-turn reasoning with encrypted content round-tripping through litellm.completion().

  • The transformation is correctly scoped to completion_extras/litellm_responses_transformation/, not provider-specific llms/ code
  • Reasoning items are properly ordered before content/tool_calls per OpenAI spec
  • 15 unit tests cover input transformation (11) and output transformation (4), all mock-based
  • Issue: redacted_thinking blocks without a data field produce empty reasoning items {"type": "reasoning", "summary": []} instead of being skipped
  • Issue: Empty reasoning text in output direction ("") is silently dropped due to falsy check, creating an asymmetry with the input direction which preserves empty thinking as empty summary arrays
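
The two flagged behaviors can be reduced to a minimal sketch of the mapping rules stated in this review; the helper shapes are illustrative, not the real code:

```python
from typing import Any, Dict, Optional


def input_map(block: Dict[str, Any]) -> Dict[str, Any]:
    """Input direction: redacted_thinking without data still emits an item."""
    item: Dict[str, Any] = {"type": "reasoning", "summary": []}
    if block.get("data"):
        item["encrypted_content"] = block["data"]
    return item  # without data: an empty no-op reasoning item


def output_map(summary_text: str) -> Optional[Dict[str, Any]]:
    """Output direction: a falsy check drops empty reasoning text."""
    if summary_text:
        return {"type": "thinking", "thinking": summary_text}
    return None


# Issue 1: a no-op reasoning item instead of a skip.
assert input_map({"type": "redacted_thinking"}) == {"type": "reasoning", "summary": []}
# Issue 2: empty text silently dropped on output, unlike the input direction.
assert output_map("") is None
```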

Confidence Score: 3/5

  • Generally safe to merge with minor logic issues around edge cases that are unlikely to cause failures in common usage
  • Score reflects two logic issues found: (1) redacted_thinking blocks without data create empty no-op reasoning items, and (2) an asymmetry in empty text handling between input and output directions. The core happy-path logic is correct and well-tested, but these edge cases could cause subtle issues in round-tripping.
  • litellm/completion_extras/litellm_responses_transformation/transformation.py — the _convert_thinking_blocks_to_reasoning_items and _convert_response_output_to_choices methods have edge case handling issues

Important Files Changed

Filename — Overview
litellm/completion_extras/litellm_responses_transformation/transformation.py — Adds bidirectional thinking_blocks transformation. Two logic issues found: redacted_thinking without data creates empty reasoning items, and empty reasoning text is silently dropped in the output direction, creating an asymmetry with the input direction.
tests/test_litellm/completion_extras/litellm_responses_transformation/test_thinking_blocks_transformation.py — Comprehensive test suite with 15 tests covering input/output transformation, edge cases, and multi-turn conversations. All tests are mock-based (no network calls). Minor style issues with redundant imports.

Flowchart

flowchart TD
    subgraph INPUT["INPUT: Chat Completions → Responses API"]
        A[Assistant Message with thinking_blocks] --> B{Block Type?}
        B -->|type: thinking| C[Create reasoning item\nwith summary text]
        B -->|type: redacted_thinking| D{Has data field?}
        D -->|Yes| E[Create reasoning item\nwith encrypted_content]
        D -->|No| F[Create empty reasoning item\nsummary: empty array]
        B -->|unknown type| G[Skip with warning]
        C --> H[Append reasoning items\nBEFORE content/tool_calls]
        E --> H
        F --> H
    end

    subgraph OUTPUT["OUTPUT: Responses API → Chat Completions"]
        I[ResponseReasoningItem] --> J{Has encrypted_content?}
        J -->|Yes| K[Create redacted_thinking\nthinking_block with data]
        J -->|No| L{Has reasoning text?}
        L -->|Non-empty| M[Create thinking\nthinking_block with text]
        L -->|Empty string| N[Silently dropped]
        K --> O[Attach thinking_blocks\nto Message object]
        M --> O
    end

    INPUT -.->|Round-trip| OUTPUT

Last reviewed commit: 38670bd

Contributor

@greptile-apps greptile-apps bot left a comment


2 files reviewed, 4 comments


@piyushhhxyz
Contributor Author

@greptileai please re-review.

Previous review flagged model_construct with dict summary items as a known issue causing getattr(dict, "text", "") to return "". This is incorrect — Pydantic v2's model_construct
auto-constructs nested models:

from openai.types.responses import ResponseReasoningItem

item = ResponseReasoningItem.model_construct(
    id="rs_test",
    type="reasoning",
    summary=[{"type": "summary_text", "text": "Hello world"}],
    status="completed",
)

type(item.summary[0])                 # Summary (Pydantic object, not dict)
getattr(item.summary[0], "text", "")  # 'Hello world'

All 15 tests pass including test_output_transformation_reasoning_without_encrypted_content.

Regarding the two edge cases flagged in latest review:

  1. redacted_thinking without data — A redacted_thinking block exists specifically to carry encrypted_content. Without data, the block is malformed input that doesn't occur in practice.
    Producing an empty reasoning item is harmless and valid per OpenAI schema.
  2. Empty reasoning text asymmetry — On input, we preserve empty thinking blocks because the user explicitly passed them. On output, we skip empty summary text because it's useless noise
    in the response. OpenAI models don't return empty summary text. Both directions handle empty content appropriately for their context.

Neither edge case affects the core functionality: encrypted_content round-tripping for stateless multi-turn reasoning, which is verified with real OpenAI API calls (o3-pro).

@greptile-apps
Contributor

greptile-apps bot commented Feb 16, 2026

Greptile Summary

This PR adds bidirectional thinking_blocks support in the OpenAI Responses API transformation layer (litellm.completion() → Responses API path), enabling stateless multi-turn reasoning with encrypted content round-tripping.

  • Input direction: Extracts thinking_blocks from assistant messages and converts them to reasoning items (with summary for regular thinking, encrypted_content for redacted thinking). Reasoning items are correctly ordered before content/tool_calls per OpenAI spec.
  • Output direction: Extracts encrypted_content and summary text from ResponseReasoningItem objects and converts them to thinking_blocks on the response Message, enabling round-trip preservation.
  • Tests: 15 unit tests (all mock-based, no network calls) covering input/output transformation, edge cases, and multi-turn encrypted content preservation.
  • Issue found: In the output direction, reasoning_content is overwritten (not accumulated) when multiple ResponseReasoningItems exist, while thinking_blocks_list correctly accumulates — creating an inconsistency between the two fields on the resulting Message.
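
The inconsistency can be demonstrated in a few lines: with two reasoning items, plain assignment keeps only the last reasoning_content string, while the list append keeps every thinking_block. Names here are illustrative, not the actual code:

```python
from typing import Any, Dict, List, Optional

reasoning_items = [{"summary_text": "first pass"}, {"summary_text": "second pass"}]

reasoning_content: Optional[str] = None
thinking_blocks_list: List[Dict[str, Any]] = []
for item in reasoning_items:
    reasoning_content = item["summary_text"]  # overwrites the previous value
    thinking_blocks_list.append(              # accumulates every item
        {"type": "thinking", "thinking": item["summary_text"]}
    )

assert reasoning_content == "second pass"  # "first pass" is lost
assert len(thinking_blocks_list) == 2      # both items preserved
```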

Confidence Score: 3/5

  • This PR is likely safe to merge but has a data consistency issue between reasoning_content and thinking_blocks in the output transformation that should be addressed.
  • The input transformation is well-implemented and tested. The output transformation has a reasoning_content overwrite issue when multiple reasoning items are present, which could cause data loss on the reasoning_content field while thinking_blocks correctly accumulates. Tests are comprehensive and mock-only. No breaking changes to existing behavior.
  • Pay attention to litellm/completion_extras/litellm_responses_transformation/transformation.py lines 505-511 where reasoning_content is overwritten instead of accumulated for multiple reasoning items.

Important Files Changed

Filename — Overview
litellm/completion_extras/litellm_responses_transformation/transformation.py — Adds bidirectional thinking_blocks transformation. Input direction (thinking_blocks → reasoning items) is well-structured. Output direction has a reasoning_content accumulation issue where multiple reasoning items overwrite rather than accumulate, creating inconsistency with the new thinking_blocks_list, which correctly accumulates.
tests/test_litellm/completion_extras/litellm_responses_transformation/test_thinking_blocks_transformation.py — Comprehensive test suite with 15 tests covering both input and output directions. Tests are mock-only (no real network calls). Good coverage of edge cases including empty/None/unknown types and multi-turn conversations.

Flowchart

flowchart TD
    subgraph Input["INPUT: Chat Completions → Responses API"]
        A[Assistant Message with thinking_blocks] --> B{Block Type?}
        B -->|type: thinking| C[Reasoning Item with summary_text]
        B -->|type: redacted_thinking| D[Reasoning Item with encrypted_content]
        B -->|unknown type| E[Skip with warning]
        C --> F[Add reasoning items to input_items]
        D --> F
        F --> G{Has tool_calls?}
        G -->|Yes| H[Add function_call items]
        G -->|No| I[Add message item with content]
    end

    subgraph Output["OUTPUT: Responses API → Chat Completions"]
        J[ResponseReasoningItem] --> K{Has encrypted_content?}
        K -->|Yes| L[thinking_block: redacted_thinking + data]
        K -->|No| M{Has reasoning text?}
        M -->|Yes| N[thinking_block: thinking + text]
        M -->|No| O[No thinking_block created]
        L --> P[Accumulate in thinking_blocks_list]
        N --> P
        P --> Q[ResponseOutputMessage]
        Q --> R[Message with thinking_blocks + reasoning_content]
    end

Last reviewed commit: 38670bd

Contributor

@greptile-apps greptile-apps bot left a comment


2 files reviewed, 1 comment


@piyushhhxyz
Contributor Author

@greptileai please re-review:
Your last reasoning and scoring are wrong:

The reasoning_content overwrite behavior is pre-existing code, not introduced by this PR. The diff shows those lines are unchanged — this PR only added the thinking_blocks_list
accumulation on top.

reasoning_content is a str field on Message — it holds a single string by design. Changing it to accumulate would require modifying the Message type, which is out of scope for this PR
and would be a breaking change.

This PR actually improves the situation: thinking_blocks is the proper list-based field that correctly accumulates all reasoning items. Users who need the full reasoning history should
use thinking_blocks instead of the legacy reasoning_content string.
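
The consumer pattern recommended above can be sketched with an illustrative message shape: read the full reasoning history from the list-based thinking_blocks field rather than the legacy single-string reasoning_content.

```python
message = {
    "reasoning_content": "second pass",  # str field: holds one value by design
    "thinking_blocks": [
        {"type": "thinking", "thinking": "first pass"},
        {"type": "thinking", "thinking": "second pass"},
    ],
}

# Full history is only recoverable from the list-based field.
history = [b["thinking"] for b in message["thinking_blocks"] if b["type"] == "thinking"]
assert history == ["first pass", "second pass"]
```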

@greptile-apps
Contributor

greptile-apps bot commented Feb 16, 2026

Greptile Summary

This PR adds bidirectional thinking_blocks support to the OpenAI Responses API transformation layer in litellm.completion(). On the input side, thinking_blocks from assistant messages are extracted and converted to Responses API reasoning items (with summary for regular thinking and encrypted_content for redacted thinking), placed before content/tool_calls per OpenAI spec. On the output side, ResponseReasoningItems are converted back to thinking_blocks format, enabling encrypted_content round-tripping for stateless multi-turn conversations with reasoning models.

  • The restructuring of the else branch in convert_chat_completion_messages_to_responses_api (lines 202-243) changes the control flow from elif to nested if/elif, which correctly handles the case where an assistant message has both thinking_blocks and tool_calls (reasoning items emitted first, then tool calls).
  • The new _convert_thinking_blocks_to_reasoning_items helper correctly handles regular thinking, redacted thinking, and unknown types with appropriate logging.
  • Output transformation correctly accumulates thinking_blocks_list across multiple ResponseReasoningItems and flushes it when a ResponseOutputMessage is encountered.
  • Tests are comprehensive (13 test cases), mock-only with no network calls, and cover input transformation, output transformation, edge cases, and multi-turn scenarios.
  • Minor issue: duplicate type annotation for reasoning_item variable (lines 278 and 292) may cause mypy failures in CI.
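
The mypy issue flagged above comes from annotating the same name twice in one scope, which mypy rejects as a redefinition ([no-redef]); annotating once and assigning plainly afterwards avoids it. The variable name follows the review, and the dict bodies here are made up for illustration:

```python
from typing import Any, Dict


def build(kind: str) -> Dict[str, Any]:
    reasoning_item: Dict[str, Any]  # single annotation satisfies mypy
    if kind == "thinking":
        reasoning_item = {"type": "reasoning", "summary": []}
    else:
        # A second `reasoning_item: Dict[str, Any] = ...` here would be the
        # duplicate annotation mypy flags; a bare assignment is fine.
        reasoning_item = {"type": "reasoning", "encrypted_content": None, "summary": []}
    return reasoning_item
```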

Confidence Score: 4/5

  • This PR is safe to merge after fixing the duplicate type annotation that may cause mypy CI failures.
  • The logic is sound and well-tested with 13 mock-only unit tests. The control flow refactoring correctly handles all combinations of thinking_blocks, tool_calls, and content. The only concrete issue is a duplicate type annotation that will likely fail mypy. The code is scoped to the transformation layer and doesn't affect other parts of the system.
  • litellm/completion_extras/litellm_responses_transformation/transformation.py - duplicate type annotation on reasoning_item (lines 278, 292) may fail mypy.

Important Files Changed

Filename — Overview
litellm/completion_extras/litellm_responses_transformation/transformation.py — Adds bidirectional thinking_blocks transformation between Chat Completions and Responses API formats. Has a duplicate type annotation that may fail mypy.
tests/test_litellm/completion_extras/litellm_responses_transformation/test_thinking_blocks_transformation.py — Comprehensive mock-only test suite covering 13+ cases for input/output thinking_blocks transformation. No network calls, consistent with existing test patterns.

Flowchart

flowchart TD
    subgraph INPUT["INPUT: Chat Completions → Responses API"]
        A[Assistant Message with thinking_blocks] --> B{Block Type?}
        B -->|"type: thinking"| C["Reasoning Item\n{type: reasoning, summary: [{text: ...}]}"]
        B -->|"type: redacted_thinking"| D["Reasoning Item\n{type: reasoning, encrypted_content: data, summary: []}"]
        B -->|unknown type| E[Skip with warning]
        C --> F[Extend input_items with reasoning items]
        D --> F
        F --> G{Has tool_calls?}
        G -->|Yes| H[Append function_call items]
        G -->|No, content exists| I[Append message item]
    end

    subgraph OUTPUT["OUTPUT: Responses API → Chat Completions"]
        J[ResponseReasoningItem] --> K{Has encrypted_content?}
        K -->|Yes| L["thinking_block\n{type: redacted_thinking, data: encrypted_content}"]
        K -->|No, has summary text| M["thinking_block\n{type: thinking, thinking: text}"]
        L --> N[Accumulate in thinking_blocks_list]
        M --> N
        N --> O[ResponseOutputMessage]
        O --> P["Message with\nthinking_blocks + content"]
        P --> Q[Flush thinking_blocks_list]
    end

    INPUT -.->|Round-trip via encrypted_content| OUTPUT

Last reviewed commit: 38670bd

Contributor

@greptile-apps greptile-apps bot left a comment


2 files reviewed, 2 comments


