Description
Initial Checks
- I confirm that I'm using the latest version of Pydantic AI
- I confirm that I searched for my issue in https://github.com/pydantic/pydantic-ai/issues before opening this issue
UnexpectedModelBehavior with Google Gemini thinking mode: "Received empty model response"
When using Google Gemini models with thinking mode enabled, pydantic-ai occasionally raises UnexpectedModelBehavior: Received empty model response errors. This happens when the model returns a response containing only thinking content (ThinkingPart instances) without any actionable text or tool calls.
Steps to Reproduce
- Use a Google Gemini model with thinking mode enabled
- Send a complex query that causes the model to "think" extensively
- Model occasionally returns only ThinkingPart instances without actionable output
- pydantic-ai raises UnexpectedModelBehavior exception
Expected Behavior
The agent should handle thinking-only responses gracefully, either by:
- Prompting the model to provide actionable output after thinking
- Filtering out thinking-only responses and continuing the conversation
- Providing a more specific error message about thinking-only responses
Actual Behavior
pydantic_ai.exceptions.UnexpectedModelBehavior: Received empty model response
Root Cause Analysis
- Thinking parts filtered out: ThinkingPart instances are correctly filtered out when converting responses to Google's API format (see _content_model_response() in models/google.py), but this can result in completely empty parts arrays.
- Empty responses treated as errors: The agent graph in CallToolsNode._run_stream() raises UnexpectedModelBehavior('Received empty model response') when no text parts or tool calls are found.
- API compatibility issues: Empty model responses (with no parts) violate Google's API requirement for "at least one parts field", causing subsequent API calls to fail with 400 errors.
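To illustrate the first point, here is a minimal sketch of how dropping thinking parts can leave nothing to send. The ThinkingPart and TextPart classes and the to_api_parts() function are stand-ins written for this example; they are not pydantic-ai's real types or the actual conversion code in models/google.py.

```python
from dataclasses import dataclass


# Stand-in part types for illustration only; not pydantic-ai's real classes.
@dataclass
class ThinkingPart:
    content: str


@dataclass
class TextPart:
    content: str


def to_api_parts(parts: list) -> list[dict]:
    """Hypothetical converter: keep text parts, drop thinking parts."""
    return [{"text": p.content} for p in parts if isinstance(p, TextPart)]


thinking_only = [ThinkingPart("step 1..."), ThinkingPart("step 2...")]
mixed = [ThinkingPart("step 1..."), TextPart("final answer")]

print(to_api_parts(thinking_only))  # empty list: no parts left to send
print(to_api_parts(mixed))          # only the text part survives
```

A thinking-only response converts to an empty list, which is the shape that would then fail the "at least one parts field" requirement on the next request.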
Debugging Information
When this occurs, you can see the model streaming thinking content:
ThinkingPart START - content length: 411
ThinkingPartDelta - content_delta length: 486
ThinkingPartDelta - content_delta length: 633
...
But then the response contains no TextPart or ToolCallPart instances, only ThinkingPart instances.
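The condition described here can be expressed as a small predicate. This is a sketch using stand-in dataclasses defined in the example itself; the helper name is_thinking_only() is hypothetical and does not exist in pydantic-ai.

```python
from dataclasses import dataclass


# Stand-in part types for illustration only; not pydantic-ai's real classes.
@dataclass
class ThinkingPart:
    content: str


@dataclass
class TextPart:
    content: str


@dataclass
class ToolCallPart:
    tool_name: str


def is_thinking_only(parts: list) -> bool:
    """True when there is at least one part and every part is a ThinkingPart."""
    return bool(parts) and all(isinstance(p, ThinkingPart) for p in parts)


print(is_thinking_only([ThinkingPart("a"), ThinkingPart("b")]))
print(is_thinking_only([ThinkingPart("a"), ToolCallPart("search")]))
```

The bool(parts) guard keeps a response with no parts at all out of this category, so it can still be reported with the existing error.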
Impact
- Agents fail unexpectedly during normal operation
- Complex queries that require significant reasoning are more likely to trigger this
- Thinking mode becomes unreliable for production use cases
Proposed Solution
I've been investigating this issue and have a potential solution that involves:
- Detecting thinking-only responses: Check if a model response contains only ThinkingPart instances
- Automatic retry with prompt: Send a follow-up message asking the model to provide actionable output
- Message history filtering: Skip empty model responses when building API requests
- Loop prevention: Track retry attempts to prevent infinite loops
Would the maintainers be interested in a PR implementing this approach? I have a working implementation that successfully handles these scenarios while maintaining backward compatibility.
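For discussion, the four steps could look roughly like the sketch below. It is not the working implementation mentioned above: the classes are stand-ins, and call_model, NUDGE, run_with_retry() and max_retries are hypothetical names chosen for this example, not pydantic-ai APIs.

```python
from dataclasses import dataclass, field
from typing import Callable


# Stand-in types for illustration only; not pydantic-ai's real classes.
@dataclass
class ThinkingPart:
    content: str


@dataclass
class TextPart:
    content: str


@dataclass
class Response:
    parts: list = field(default_factory=list)


# Hypothetical follow-up prompt sent after a thinking-only response.
NUDGE = "Please provide your final answer or a tool call."


def is_thinking_only(resp: Response) -> bool:
    return bool(resp.parts) and all(isinstance(p, ThinkingPart) for p in resp.parts)


def has_sendable_parts(resp: Response) -> bool:
    return any(not isinstance(p, ThinkingPart) for p in resp.parts)


def run_with_retry(
    call_model: Callable[[list], Response], prompt: str, max_retries: int = 2
) -> Response:
    history: list = [prompt]
    for _ in range(max_retries + 1):
        # History filtering: skip responses that would convert to empty parts.
        request = [
            m for m in history
            if not (isinstance(m, Response) and not has_sendable_parts(m))
        ]
        resp = call_model(request)
        if not is_thinking_only(resp):
            return resp
        # Retry with prompt: record the response and ask for actionable output.
        history += [resp, NUDGE]
    # Loop prevention: give up once the retry budget is spent.
    raise RuntimeError(f"only thinking content after {max_retries} retries")


def fake_model(request: list) -> Response:
    # Thinking-only until the nudge appears in the request.
    if NUDGE in request:
        return Response([TextPart("done")])
    return Response([ThinkingPart("hmm")])


print(run_with_retry(fake_model, "question").parts)
```

In this sketch the thinking-only response stays in the local history but is left out of the next request, and the retry cap bounds the number of extra model calls.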
Alternative Approaches Considered
- Disable thinking mode: Not ideal as thinking mode provides valuable reasoning capabilities
- Catch and ignore: Would lose the reasoning context and potentially break conversation flow
- Convert thinking to text: Could work but might clutter the conversation with internal reasoning
The retry approach seems most robust as it preserves the thinking context while ensuring actionable output is eventually provided.
Thanks for the great work on pydantic-ai! Happy to contribute a solution if this approach sounds reasonable.
Example Code
Not reliably reproducible. I have one query, part of a complex private repo, that elicits the "thinking-only" responses.
Python, Pydantic AI & LLM client version
- pydantic-ai version: local install from commit 174fc482d7874bd2150dee51f54ec8ec64d2ce71
- Model: Google Gemini (gemini-2.5-pro, gemini-1.5-pro with thinking mode)
- Python version: 3.12
- OS: macOS