
UnexpectedModelBehavior with Google Gemini thinking mode: "Received empty model response" #2480

@ethanabrooks

Description

When using Google Gemini models with thinking mode enabled, pydantic-ai occasionally raises UnexpectedModelBehavior: Received empty model response errors. This happens when the model returns responses containing only thinking content (ThinkingPart instances) without any actionable text or tool calls.

Steps to Reproduce

  1. Use a Google Gemini model with thinking mode enabled
  2. Send a complex query that causes the model to "think" extensively
  3. Model occasionally returns only ThinkingPart instances without actionable output
  4. pydantic-ai raises UnexpectedModelBehavior exception
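The failure condition in steps 3–4 can be illustrated with a minimal sketch. The classes below are simplified stand-ins for pydantic-ai's part types, not the real implementations; only the shape of the check matters:

```python
from dataclasses import dataclass

# Hypothetical stand-ins for pydantic-ai's response part types,
# used only to illustrate the failure condition.
@dataclass
class ThinkingPart:
    content: str

@dataclass
class TextPart:
    content: str

def is_thinking_only(parts: list) -> bool:
    """True when the response has parts, but none of them are actionable."""
    return bool(parts) and all(isinstance(p, ThinkingPart) for p in parts)

# A response like the one that triggers the exception: thinking, nothing else.
response_parts = [ThinkingPart("Let me reason about this..."), ThinkingPart("Hmm...")]
print(is_thinking_only(response_parts))  # True -> agent raises UnexpectedModelBehavior
```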

Expected Behavior

The agent should handle thinking-only responses gracefully, for example by:

  • Prompting the model to provide actionable output after thinking
  • Filtering out thinking-only responses and continuing the conversation
  • Providing a more specific error message about thinking-only responses

Actual Behavior

pydantic_ai.exceptions.UnexpectedModelBehavior: Received empty model response

Root Cause Analysis

  1. Thinking parts filtered out: ThinkingPart instances are correctly filtered out when converting responses to Google's API format (see _content_model_response() in models/google.py), but this can result in completely empty parts arrays.

  2. Empty responses treated as errors: The agent graph in CallToolsNode._run_stream() raises UnexpectedModelBehavior('Received empty model response') when no text parts or tool calls are found.

  3. API compatibility issues: Empty model responses (with no parts) violate Google's API requirement for "at least one parts field", causing subsequent API calls to fail with 400 errors.
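Points 1 and 3 interact: filtering thinking parts out of a thinking-only response yields an empty parts array, which Google's API then rejects. A hypothetical sketch of that conversion (the real logic lives in `_content_model_response()` in `models/google.py`; these names and dict shapes are illustrative):

```python
class ThinkingPart:
    """Illustrative stand-in for pydantic-ai's ThinkingPart."""
    def __init__(self, content: str):
        self.content = content

def to_google_parts(parts: list) -> list[dict]:
    """Convert response parts to the wire format, dropping thinking parts."""
    api_parts = []
    for part in parts:
        if isinstance(part, ThinkingPart):
            continue  # thinking content is not sent back to the API
        api_parts.append({"text": part.content})
    return api_parts

# A thinking-only response converts to an empty parts array,
# violating Google's "at least one parts field" requirement (400 error).
empty = to_google_parts([ThinkingPart("reasoning...")])
print(empty)  # []
```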

Debugging Information

When this occurs, you can see the model streaming thinking content:

ThinkingPart START - content length: 411
ThinkingPartDelta - content_delta length: 486
ThinkingPartDelta - content_delta length: 633
...

But then the response contains no TextPart or ToolCallPart instances, only ThinkingPart instances.

Impact

  • Agents fail unexpectedly during normal operation
  • Complex queries that require significant reasoning are more likely to trigger this
  • Thinking mode becomes unreliable for production use cases

Proposed Solution

I've been investigating this issue and have a potential solution that involves:

  1. Detecting thinking-only responses: Check if a model response contains only ThinkingPart instances
  2. Automatic retry with prompt: Send a follow-up message asking the model to provide actionable output
  3. Message history filtering: Skip empty model responses when building API requests
  4. Loop prevention: Track retry attempts to prevent infinite loops
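The four steps above can be sketched as follows. Everything here is hypothetical (`run_model`, `NUDGE`, the dict-based message/part shapes are illustrative, not pydantic-ai APIs); it only demonstrates the detect-nudge-retry-cap flow:

```python
# Hypothetical sketch of the proposed retry strategy.
NUDGE = "Please respond with a final answer or a tool call, not only reasoning."
MAX_THINKING_RETRIES = 2  # loop prevention (step 4)

def run_with_thinking_retry(run_model, messages):
    """Call the model, re-prompting when it returns only thinking parts."""
    for attempt in range(MAX_THINKING_RETRIES + 1):
        parts = run_model(messages)
        thinking_only = bool(parts) and all(p["kind"] == "thinking" for p in parts)
        if not thinking_only:
            return parts
        # Keep the thinking turn in history, then ask for actionable output (step 2).
        messages = messages + [{"role": "user", "content": NUDGE}]
    raise RuntimeError("model returned only thinking parts after retries")

# Fake model: thinks on the first call, answers on the second --
# simulates the intermittent behaviour described in this issue.
calls = {"n": 0}
def fake_model(messages):
    calls["n"] += 1
    if calls["n"] == 1:
        return [{"kind": "thinking", "content": "pondering..."}]
    return [{"kind": "text", "content": "42"}]

result = run_with_thinking_retry(fake_model, [{"role": "user", "content": "question"}])
print(result)  # [{'kind': 'text', 'content': '42'}]
```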

Would the maintainers be interested in a PR implementing this approach? I have a working implementation that successfully handles these scenarios while maintaining backward compatibility.

Alternative Approaches Considered

  1. Disable thinking mode: Not ideal as thinking mode provides valuable reasoning capabilities
  2. Catch and ignore: Would lose the reasoning context and potentially break conversation flow
  3. Convert thinking to text: Could work but might clutter the conversation with internal reasoning
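For comparison, alternative 3 would be a small transform along these lines (part shapes are illustrative, not pydantic-ai classes); the clutter concern is that the reasoning then appears verbatim in the conversation:

```python
# Hypothetical sketch of alternative 3: surface reasoning as plain text
# so the response is never "empty".
def thinking_to_text(parts: list[dict]) -> list[dict]:
    """Downgrade thinking parts to text parts, leaving other parts untouched."""
    out = []
    for part in parts:
        if part["kind"] == "thinking":
            out.append({"kind": "text", "content": f"[reasoning] {part['content']}"})
        else:
            out.append(part)
    return out

converted = thinking_to_text([{"kind": "thinking", "content": "step 1..."}])
print(converted)  # [{'kind': 'text', 'content': '[reasoning] step 1...'}]
```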

The retry approach seems most robust as it preserves the thinking context while ensuring actionable output is eventually provided.

Thanks for the great work on pydantic-ai! Happy to contribute a solution if this approach sounds reasonable.

Example Code

Not reliably reproducible. I have one query, part of a complex private repo, that elicits the "thinking-only" responses.

Python, Pydantic AI & LLM client version

  • pydantic-ai version: local install from commit 174fc482d7874bd2150dee51f54ec8ec64d2ce71
  • Model: Google Gemini (gemini-2.5-pro, gemini-1.5-pro with thinking mode)
  • Python version: 3.12
  • OS: macOS

Labels

bug (Something isn't working)