UnexpectedModelBehavior with Google Gemini thinking mode: "Received empty model response"

### Initial Checks

- [x] I confirm that I'm using the latest version of Pydantic AI
- [x] I confirm that I searched for my issue in https://github.com/pydantic/pydantic-ai/issues before opening this issue

### Description

## Description

When using Google Gemini models with thinking mode enabled, pydantic-ai occasionally raises `UnexpectedModelBehavior: Received empty model response` errors. This happens when the model returns responses containing only thinking content (`ThinkingPart` instances) without any actionable text or tool calls.

## Steps to Reproduce

1. Use a Google Gemini model with thinking mode enabled
2. Send a complex query that causes the model to "think" extensively
3. Model occasionally returns only `ThinkingPart` instances without actionable output
4. pydantic-ai raises `UnexpectedModelBehavior` exception

## Expected Behavior

The agent should handle thinking-only responses gracefully, either by:

- Prompting the model to provide actionable output after thinking
- Filtering out thinking-only responses and continuing the conversation
- Providing a more specific error message about thinking-only responses

## Actual Behavior

```
pydantic_ai.exceptions.UnexpectedModelBehavior: Received empty model response
```

## Root Cause Analysis

1. **Thinking parts filtered out**: `ThinkingPart` instances are correctly filtered out when converting responses to Google's API format (see `_content_model_response()` in `models/google.py`), but this can result in completely empty `parts` arrays.

2. **Empty responses treated as errors**: The agent graph in `CallToolsNode._run_stream()` raises `UnexpectedModelBehavior('Received empty model response')` when no text parts or tool calls are found.

3. **API compatibility issues**: Empty model responses (with no parts) violate Google's API requirement for "at least one parts field", causing subsequent API calls to fail with 400 errors.
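Points 1 and 3 can be illustrated with a small standalone sketch. The dataclasses below are simplified stand-ins for pydantic-ai's part types, and `to_google_parts` and `build_contents` are hypothetical helpers written for this illustration; they are not the actual code in `models/google.py`.

```python
from dataclasses import dataclass


# Simplified stand-ins for pydantic-ai's message part classes (assumption:
# the real classes carry more fields than shown here).
@dataclass
class ThinkingPart:
    content: str


@dataclass
class TextPart:
    content: str


def to_google_parts(parts: list) -> list[dict]:
    """Hypothetical conversion that drops thinking parts, as described above."""
    return [{'text': p.content} for p in parts if not isinstance(p, ThinkingPart)]


def build_contents(history: list[list]) -> list[dict]:
    """Build request contents, skipping responses whose parts list ends up empty."""
    contents = []
    for response_parts in history:
        google_parts = to_google_parts(response_parts)
        if google_parts:  # an entry with an empty `parts` list is never emitted
            contents.append({'role': 'model', 'parts': google_parts})
    return contents


thinking_only = [ThinkingPart('step 1...'), ThinkingPart('step 2...')]
mixed = [ThinkingPart('reasoning'), TextPart('final answer')]

print(to_google_parts(thinking_only))          # nothing survives the filter
print(build_contents([thinking_only, mixed]))  # only the mixed response is kept
```

A thinking-only response converts to an empty list, which is the shape that would trip the "at least one parts field" requirement if it were sent as-is.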

## Debugging Information

When this occurs, you can see the model streaming thinking content:

```
ThinkingPart START - content length: 411
ThinkingPartDelta - content_delta length: 486
ThinkingPartDelta - content_delta length: 633
...
```

But then the response contains no `TextPart` or `ToolCallPart` instances, only `ThinkingPart` instances.
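A check for this condition could look roughly like the following. This is a minimal sketch using stand-in dataclasses rather than the real pydantic-ai part classes, and `is_thinking_only` is a hypothetical helper name.

```python
from dataclasses import dataclass, field


# Stand-ins for pydantic-ai's part types (assumption: simplified fields).
@dataclass
class ThinkingPart:
    content: str


@dataclass
class TextPart:
    content: str


@dataclass
class ToolCallPart:
    tool_name: str
    args: dict = field(default_factory=dict)


def is_thinking_only(parts: list) -> bool:
    """True when the response has parts, and every one of them is a ThinkingPart."""
    return bool(parts) and all(isinstance(p, ThinkingPart) for p in parts)


print(is_thinking_only([ThinkingPart('a'), ThinkingPart('b')]))
print(is_thinking_only([ThinkingPart('a'), ToolCallPart('search')]))
```

A response with no parts at all is deliberately not classified as thinking-only here, so the two cases can be reported separately.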

## Impact

- Agents fail unexpectedly during normal operation
- Complex queries that require significant reasoning are more likely to trigger this
- Thinking mode becomes unreliable for production use cases

## Proposed Solution

I've been investigating this issue and have a potential solution that involves:

1. **Detecting thinking-only responses**: Check if a model response contains only `ThinkingPart` instances
2. **Automatic retry with prompt**: Send a follow-up message asking the model to provide actionable output
3. **Message history filtering**: Skip empty model responses when building API requests
4. **Loop prevention**: Track retry attempts to prevent infinite loops
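The four steps above can be sketched as a small retry loop. This is an illustration under stated assumptions, not the working implementation mentioned below: the part classes are stand-ins, `call_model` is a hypothetical callable that takes the message history and returns a list of parts, and `NUDGE`, `ThinkingOnlyError`, and `run_with_retry` are names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable


# Stand-ins for pydantic-ai's part types (assumption: simplified fields).
@dataclass
class ThinkingPart:
    content: str


@dataclass
class TextPart:
    content: str


# Hypothetical follow-up prompt sent after a thinking-only response.
NUDGE = 'Please provide your final answer or a tool call.'


class ThinkingOnlyError(RuntimeError):
    """Raised when the model keeps returning only thinking content."""


def run_with_retry(
    call_model: Callable[[list], list], messages: list, max_retries: int = 2
) -> list:
    history = list(messages)  # copy so the caller's list is not mutated
    for _ in range(max_retries + 1):
        parts = call_model(history)
        thinking_only = bool(parts) and all(isinstance(p, ThinkingPart) for p in parts)
        if not thinking_only:
            return parts
        # Keep the thinking context in the history, then ask for actionable output.
        history.append(parts)
        history.append(NUDGE)
    raise ThinkingOnlyError(f'only thinking content after {max_retries} retries')


# Usage with a fake model that thinks once, then answers.
responses = iter([[ThinkingPart('hmm')], [TextPart('done')]])
print(run_with_retry(lambda history: next(responses), ['question']))
```

Bounding the loop with `max_retries` and raising a dedicated exception covers the loop-prevention step while also giving a more specific error than the generic empty-response message.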

Would the maintainers be interested in a PR implementing this approach? I have a working implementation that successfully handles these scenarios while maintaining backward compatibility.

## Alternative Approaches Considered

1. **Disable thinking mode**: Not ideal as thinking mode provides valuable reasoning capabilities
2. **Catch and ignore**: Would lose the reasoning context and potentially break conversation flow
3. **Convert thinking to text**: Could work but might clutter the conversation with internal reasoning

The retry approach seems most robust as it preserves the thinking context while ensuring actionable output is eventually provided.

Thanks for the great work on pydantic-ai! Happy to contribute a solution if this approach sounds reasonable.


### Example Code

Not reliably reproducible. I have one query, part of a complex private repo, that elicits the "thinking-only" responses.

### Python, Pydantic AI & LLM client version

- **pydantic-ai version**: local install from commit `174fc482d7874bd2150dee51f54ec8ec64d2ce71`
- **Model**: Google Gemini (gemini-2.5-pro, gemini-1.5-pro with thinking mode)
- **Python version**: 3.12
- **OS**: macOS

UnexpectedModelBehavior with Google Gemini thinking mode: "Received empty model response" #2480
