
[BUG] Error responses are not handled correctly for google openai/openrouter  #527

@sambhav

Description


When the API returns a 429 (rate limit exceeded), pydantic-ai raises a datetime-parsing exception instead of surfacing the rate-limit error message returned by the API.

This can easily be reproduced by using OpenRouter with one of the free Gemini models:

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(
    "google/gemini-2.0-flash-exp:free",
    base_url="https://openrouter.ai/api/v1",
    api_key="key",
)

agent = Agent(
    model=model,
    system_prompt='Be concise, reply with one sentence.',
)

result = agent.run_sync('Who are you?')
print(result.data)

Running the above raises:

Traceback (most recent call last):
  File "/Users/sam/dev/openai/openai_demo.py", line 32, in <module>
    result = agent.run_sync('Who are you?')
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sam/dev/openai/.venv/lib/python3.12/site-packages/pydantic_ai/agent.py", line 327, in run_sync
    return asyncio.get_event_loop().run_until_complete(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/[email protected]/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/Users/sam/dev/openai/.venv/lib/python3.12/site-packages/pydantic_ai/agent.py", line 255, in run
    model_response, request_usage = await agent_model.request(messages, model_settings)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sam/dev/openai/.venv/lib/python3.12/site-packages/pydantic_ai/models/openai.py", line 152, in request
    return self._process_response(response), _map_usage(response)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sam/dev/openai/.venv/lib/python3.12/site-packages/pydantic_ai/models/openai.py", line 207, in _process_response
    timestamp = datetime.fromtimestamp(response.created, tz=timezone.utc)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object cannot be interpreted as an integer

This happens because the error response is not handled in _process_response. The API returns a ChatCompletion whose fields are all None except for an error object:

ChatCompletion(id=None, choices=None, created=None, model=None, object=None, service_tier=None, system_fingerprint=None, usage=None, error={'message': 'Provider returned error', 'code': 429, 'metadata': {'raw': '{\n  "error": {\n    "code": 429,\n    "message": "Quota exceeded for aiplatform.googleapis.com/generate_content_requests_per_minute_per_project_per_base_model with base model: gemini-experimental. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/generative-ai/quotas-genai.",\n    "status": "RESOURCE_EXHAUSTED"\n  }\n}\n', 'provider_name': 'Google'}}, user_id='user_...')

We should check for the presence of the error object before reading the other fields, and surface its message and code appropriately.

Note: I have noticed this with both Google's OpenAI-compat API and OpenRouter's Gemini API.

