
fix(websearch_interception): preserve thinking blocks in agentic loop follow-up messages#21604

Merged
Sameerlite merged 1 commit into BerriAI:main from michelligabriele:fix/websearch-thinking-blocks
Feb 26, 2026

Conversation

@michelligabriele
Contributor

When extended thinking is enabled, the websearch interception agentic loop builds a follow-up assistant message with only tool_use blocks. Anthropic's API requires assistant messages to start with thinking/redacted_thinking blocks when thinking is enabled, causing a 400 Bad Request.

Extract thinking blocks from the model's initial response, thread them through the agentic loop, and prepend them to the follow-up assistant message — matching the pattern used by anthropic_messages_pt in factory.py.

Fixes the error: "Expected 'thinking' or 'redacted_thinking', but found 'tool_use'"
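The shapes involved can be sketched in a few lines. This is an illustrative helper, not the PR's actual code; the dict shapes follow Anthropic's Messages API, where an assistant turn must begin with thinking/redacted_thinking blocks whenever extended thinking is enabled:

```python
def build_followup_assistant_message(thinking_blocks, tool_use_blocks):
    """Hypothetical helper: prepend thinking blocks so the assistant turn
    does not start with a tool_use block (which Anthropic rejects with a
    400 when extended thinking is enabled)."""
    return {
        "role": "assistant",
        "content": list(thinking_blocks) + list(tool_use_blocks),
    }

thinking_blocks = [
    {"type": "thinking", "thinking": "I should search the web.", "signature": "sig-abc"}
]
tool_use_blocks = [
    {"type": "tool_use", "id": "toolu_1", "name": "web_search", "input": {"query": "litellm"}}
]

msg = build_followup_assistant_message(thinking_blocks, tool_use_blocks)
# The first content block is now "thinking" rather than "tool_use".
```

Without the prepend, `msg["content"][0]["type"]` would be `"tool_use"`, which is exactly the shape the 400 error message complains about.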

Relevant issues

Fixes #20187

Related PRs: #20488 (by @mpcusack-altos) and #20489 (by @Quentin-M) attempt the same fix with broader scope. This PR takes a minimal, focused approach — fixing only the core thinking block issue in the Anthropic Messages API pass-through path.

Pre-Submission checklist

  • I have added tests in the tests/litellm/ directory (adding at least 1 test is a hard requirement - see details)
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🐛 Bug Fix

Changes

  • handler.py async_should_run_agentic_loop: Extract thinking/redacted_thinking blocks from the model response content and include them in the tools_dict passed to the agentic loop
  • handler.py async_run_agentic_loop / _execute_agentic_loop: Thread thinking_blocks through to transform_response
  • transformation.py transform_response / _transform_response_anthropic: Accept optional thinking_blocks parameter and prepend them before tool_use blocks in the follow-up assistant message (same pattern as anthropic_messages_pt in factory.py)
  • test_websearch_interception_thinking.py: 9 new unit tests covering thinking block extraction (dict + object responses), prepending, backward compatibility (no thinking / empty list), public API routing, and OpenAI path isolation
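The extraction step in the first bullet can be sketched as follows. This is a minimal illustration, not the PR's exact code: the function name is hypothetical, and it mirrors the dict-vs-object handling the Changes list describes (response content blocks may arrive as plain dicts or as attribute objects):

```python
from types import SimpleNamespace
from typing import Any, Dict, List

def extract_thinking_blocks(content: List[Any]) -> List[Dict]:
    """Hypothetical extractor: collect thinking/redacted_thinking blocks
    from response content, whether blocks are dicts or attribute objects."""
    thinking_blocks: List[Dict] = []
    for block in content:
        block_type = (
            block.get("type") if isinstance(block, dict) else getattr(block, "type", None)
        )
        if block_type not in ("thinking", "redacted_thinking"):
            continue  # skip tool_use, text, etc.
        if isinstance(block, dict):
            thinking_blocks.append(dict(block))
            continue
        # Object case: copy the fields relevant to each block type.
        d: Dict = {"type": block_type}
        if block_type == "thinking":
            d["thinking"] = getattr(block, "thinking", "")
            d["signature"] = getattr(block, "signature", "")
        else:  # redacted_thinking
            d["data"] = getattr(block, "data", "")
        thinking_blocks.append(d)
    return thinking_blocks

# Handles dict blocks and object blocks alike; non-thinking blocks are skipped.
content = [
    {"type": "thinking", "thinking": "plan", "signature": "s1"},
    SimpleNamespace(type="redacted_thinking", data="opaque"),
    {"type": "tool_use", "id": "toolu_1", "name": "web_search", "input": {}},
]
blocks = extract_thinking_blocks(content)
```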

@vercel

vercel bot commented Feb 19, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: litellm · Deployment: Ready · Actions: Preview, Comment · Updated (UTC): Feb 19, 2026 8:54pm


@greptile-apps
Contributor

greptile-apps bot commented Feb 19, 2026

Greptile Summary

Fixes "Expected 'thinking' or 'redacted_thinking', but found 'tool_use'" error when websearch interception is used with extended thinking enabled.

Key changes:

  • Extracts thinking/redacted_thinking blocks from initial model response in async_should_run_agentic_loop
  • Threads thinking blocks through async_run_agentic_loop → _execute_agentic_loop → transform_response
  • Prepends thinking blocks before tool_use blocks in follow-up assistant message, matching Anthropic API requirements
  • Adds 9 comprehensive unit tests covering extraction (dict + object responses), prepending, backward compatibility, and format isolation

Issues found:

  • Missing cache_control field preservation when converting thinking block objects to dicts (handler.py:322-336)

Confidence Score: 4/5

  • Safe to merge with one logic fix needed for cache_control field preservation
  • Well-structured fix with comprehensive tests, but missing cache_control field in thinking block object-to-dict conversion could cause issues if prompt caching is used with extended thinking
  • handler.py lines 322-336 need to preserve cache_control field

Important Files Changed

  • litellm/integrations/websearch_interception/handler.py — Extracts thinking/redacted_thinking blocks from the model response and threads them through the agentic loop. One issue: missing cache_control field preservation in object-to-dict conversion.
  • litellm/integrations/websearch_interception/transformation.py — Prepends thinking blocks before tool_use blocks in the assistant message. Implementation correctly matches Anthropic API requirements and includes proper backward compatibility.
  • tests/test_litellm/integrations/websearch_interception/test_websearch_interception_thinking.py — Comprehensive test coverage with 9 unit tests covering extraction, prepending, backward compatibility, and OpenAI path isolation. All tests use mocks (no network calls).

Sequence Diagram

sequenceDiagram
    participant Model as Anthropic Model
    participant Handler as WebSearchInterceptionLogger
    participant Transform as WebSearchTransformation
    participant Search as litellm.asearch()
    
    Model->>Handler: response with thinking + tool_use blocks
    Note over Handler: Extract thinking/redacted_thinking blocks<br/>from response.content
    Handler->>Transform: transform_request(response)
    Transform-->>Handler: tool_calls
    Note over Handler: Store thinking_blocks in tools_dict
    
    Handler->>Search: Execute searches in parallel
    Search-->>Handler: search_results
    
    Handler->>Transform: transform_response(tool_calls, search_results, thinking_blocks)
    Note over Transform: Prepend thinking_blocks before tool_use blocks<br/>in assistant message
    Transform-->>Handler: assistant_message, user_message
    
    Handler->>Model: Follow-up request with:<br/>1. thinking blocks (prepended)<br/>2. tool_use blocks<br/>3. tool_result blocks
    Model-->>Handler: Final response

Last reviewed commit: 4630793
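The final follow-up request from the diagram can be sketched end to end. This is a hedged illustration under assumed names (the function is hypothetical, not the PR's code): the assistant turn leads with the thinking blocks, and each search result is returned as a tool_result block referencing its tool_use id in a user turn:

```python
def build_followup_messages(thinking_blocks, tool_use_blocks, search_results):
    """Hypothetical assembly of the follow-up request: thinking blocks are
    prepended to the assistant turn; each search result becomes a tool_result
    block tied to its tool_use id in the next user turn."""
    assistant_msg = {
        "role": "assistant",
        "content": list(thinking_blocks) + list(tool_use_blocks),
    }
    tool_results = [
        {"type": "tool_result", "tool_use_id": tu["id"], "content": result}
        for tu, result in zip(tool_use_blocks, search_results)
    ]
    return [assistant_msg, {"role": "user", "content": tool_results}]

messages = build_followup_messages(
    [{"type": "thinking", "thinking": "search first", "signature": "s"}],
    [{"type": "tool_use", "id": "toolu_1", "name": "web_search", "input": {"query": "q"}}],
    ["result text"],
)
```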

@greptile-apps greptile-apps bot left a comment

3 files reviewed, 1 comment

Comment on lines +322 to +336
# Convert object to dict using getattr, matching the
# pattern in _detect_from_non_streaming_response
thinking_block_dict: Dict = {"type": block_type}
if block_type == "thinking":
    thinking_block_dict["thinking"] = getattr(
        block, "thinking", ""
    )
    thinking_block_dict["signature"] = getattr(
        block, "signature", ""
    )
else:  # redacted_thinking
    thinking_block_dict["data"] = getattr(
        block, "data", ""
    )
thinking_blocks.append(thinking_block_dict)

Missing cache_control field when converting object to dict.

thinking blocks can include an optional cache_control field (see ChatCompletionThinkingBlock and ChatCompletionRedactedThinkingBlock in types/llms/openai.py), but this conversion only copies type, thinking, signature, and data fields

Suggested change
Original:

# Convert object to dict using getattr, matching the
# pattern in _detect_from_non_streaming_response
thinking_block_dict: Dict = {"type": block_type}
if block_type == "thinking":
    thinking_block_dict["thinking"] = getattr(
        block, "thinking", ""
    )
    thinking_block_dict["signature"] = getattr(
        block, "signature", ""
    )
else:  # redacted_thinking
    thinking_block_dict["data"] = getattr(
        block, "data", ""
    )
thinking_blocks.append(thinking_block_dict)

Suggested:

# Convert object to dict using getattr, matching the
# pattern in _detect_from_non_streaming_response
thinking_block_dict: Dict = {"type": block_type}
if block_type == "thinking":
    thinking_block_dict["thinking"] = getattr(
        block, "thinking", ""
    )
    thinking_block_dict["signature"] = getattr(
        block, "signature", ""
    )
else:  # redacted_thinking
    thinking_block_dict["data"] = getattr(
        block, "data", ""
    )
# Preserve cache_control if present
cache_control = getattr(block, "cache_control", None)
if cache_control is not None:
    thinking_block_dict["cache_control"] = cache_control
thinking_blocks.append(thinking_block_dict)

@Sameerlite Sameerlite merged commit 1790a6b into BerriAI:main Feb 26, 2026
8 of 21 checks passed


Development

Successfully merging this pull request may close these issues.

[Bug]: Claude Code shows 0 output tokens when using websearch_interception in bedrock and tool call not recorded in request logs
