
Conversation

@ajac-zero ajac-zero commented Oct 5, 2025

Hi! This pull request takes a shot at implementing a dedicated OpenRouterModel. Closes #2936.

The differentiator of this PR is that the implementation minimizes code duplication by delegating the main logic to OpenAIChatModel, so the new model class serves as a convenience layer for OpenRouter-specific features.

The main thinking behind this solution is that, as long as the OpenRouter API remains fully accessible via the openai package, it would be inefficient to reimplement the internal logic on top of that same package again. Instead, we can use hooks to achieve the requested features.

I would like to get some thoughts on this implementation before starting to update the docs.
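
For context, the shape of the delegation is roughly the following (a minimal sketch; _process_response and the super() call appear in the review excerpts below, everything else is illustrative):

from openai.types.chat import ChatCompletion

from pydantic_ai.messages import ModelResponse
from pydantic_ai.models.openai import OpenAIChatModel


class OpenRouterModel(OpenAIChatModel):
    """Thin convenience layer; the request/response plumbing stays in OpenAIChatModel."""

    def _process_response(self, response: ChatCompletion | str) -> ModelResponse:
        model_response = super()._process_response(response=response)  # reuse the OpenAI logic
        # ... attach OpenRouter extras (downstream provider, reasoning details) here ...
        return model_response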

Addressed issues

  1. Closes Store OpenRouter provider metadata in ModelResponse vendor details #1849

Provider metadata can now be accessed via the 'downstream_provider' key in ModelResponse.provider_details:

from pydantic_ai import ModelRequest
from pydantic_ai.direct import model_request_sync
from pydantic_ai.models.openrouter import OpenRouterModel

model = OpenRouterModel('moonshotai/kimi-k2-0905')

response = model_request_sync(model, [ModelRequest.user_text_prompt('Who are you')])

assert response.provider_details is not None
print(response.provider_details['downstream_provider'])  # <-- Final provider that was routed to
# Output: AtlasCloud

  2. Closes Can I get thinking part from openrouter provider using google/gemini-2.5-pro? #2999

The new OpenRouterModelSettings supports OpenRouter's reasoning parameter; the thinking can then be accessed as a ThinkingPart in the model response:

from pydantic_ai import ModelRequest
from pydantic_ai.direct import model_request_sync
from pydantic_ai.models.openrouter import OpenRouterModel, OpenRouterModelSettings

model = OpenRouterModel('google/gemini-2.5-pro')

settings = OpenRouterModelSettings(openrouter_reasoning={'effort': 'high'})

response = model_request_sync(model, [ModelRequest.user_text_prompt('Who are you')], model_settings=settings)

print(response.parts[0])
# Output: ThinkingPart(content='**Identifying the Core Inquiry**\n\nI\'m grappling with the core question: "Who am I?" Initially, I\'m identifying the root of the query. The user wants a fundamental identity explained, and I\'ve begun by pinpointing the key words and associations. AI, specifically. Next step, I\'ll move onto broadening this.\n\n\n**Clarifying My Nature**\n\nI\'m now dissecting the definition of "language model," focusing on what that *means* in practical terms. I\'ve moved past simply stating the term and am now delving into how my functions—answering, generating, translating—are executed. This requires explaining my training on vast datasets and my lack of personal experience, which is key to the identity question. I am trying to find the right framing for this complex process.\n\n\n**Formulating a Direct Response**\n\nI\'m now trying to directly answer the question, avoiding technical jargon where possible. I\'m organizing my response. The essential elements have been identified: My nature, my capabilities, and what I *cannot* do. I\'m thinking of ways to explain these facts in a concise, accessible format, focusing on clarity for the user.\n\n\n**Constructing a Detailed Answer**\n\nI\'m now translating the structured plan into actual sentences. I\'m working on the opening, the "I am..." statement, and aiming for a direct, clear tone. Then, I am carefully crafting the explanation of my capabilities and limitations to avoid misunderstandings. I\'m actively searching for concise and impactful language.\n\n\n**Drafting the Final Response**\n\n\\n\\n\n\nI\'m now integrating all the elements I\'ve identified. I\'m beginning the final draft. I\'m focusing on flow and readability, weaving the key points—my nature, my origin, my abilities, and my constraints—into a cohesive narrative. The goal is a concise and informative self-description, tailored to the user\'s inquiry.\n\n\n', id='reasoning', provider_name='openrouter')

  3. Closes Handle error response from OpenRouter as exception instead of validation failure #2323. Closes OpenRouter uses non-compatible finish reason #2844

These depend on some downstream behavior from OpenRouter or its own downstream providers (namely, that a response of type 'error' carries a >= 400 status code), but in most cases it works as one would expect:

from pydantic_ai import ModelHTTPError, ModelRequest
from pydantic_ai.direct import model_request_sync
from pydantic_ai.models.openrouter import OpenRouterModel, OpenRouterModelSettings

model = OpenRouterModel('google/gemini-2.5-pro')

settings = OpenRouterModelSettings(
    openrouter_preferences={'only': ['azure']}  # Gemini is not available in Azure; guaranteed failure.
)

try:
    response = model_request_sync(model, [ModelRequest.user_text_prompt('Who are you')], model_settings=settings)
except ModelHTTPError as e:
    print(e)
# status_code: 404, model_name: google/gemini-2.5-pro, body: {'message': 'No allowed providers are available for the selected model.', 'code': 404}

  4. Add OpenRouterModel #1870 (comment)

Adds some additional type support for setting OpenRouter's provider routing options:

from pydantic_ai import ModelRequest
from pydantic_ai.direct import model_request_sync
from pydantic_ai.models.openrouter import OpenRouterModel, OpenRouterModelSettings

model = OpenRouterModel('moonshotai/kimi-k2-0905')

settings = OpenRouterModelSettings(
    openrouter_preferences={
        'order': ['moonshotai', 'deepinfra', 'fireworks', 'novita'],
        'allow_fallbacks': True,
        'require_parameters': True,
        'data_collection': 'allow',
        'zdr': True,
        'only': ['moonshotai', 'fireworks'],
        'ignore': ['deepinfra'],
        'quantizations': ['fp8'],
        'sort': 'throughput',
        'max_price': {'prompt': 1},
    }
)

response = model_request_sync(model, [ModelRequest.user_text_prompt('Who are you')], model_settings=settings)
assert response.provider_details is not None
print(response.provider_details['downstream_provider'])
# Output: Fireworks

@DouweM DouweM self-assigned this Oct 7, 2025
@DouweM DouweM left a comment (Collaborator)

@ajac-zero Thank you very much, Anibal!

return new_settings, customized_parameters

def _process_response(self, response: ChatCompletion | str) -> ModelResponse:
model_response = super()._process_response(response=response)
@DouweM (Collaborator):

I don't think we've actually fixed #2844 yet. I'd expect us to need to modify that field before calling super()._process_response, which would otherwise raise the validation error.

@ajac-zero (Author):

According to the OpenRouter docs, a response with the error finish reason will always have a response.error field, so it should get caught by the response.error checker below.

Following this logic, a response with this finish reason but no response.error is probably unintended and should raise an UnexpectedModelBehavior error. We could change this behavior to simply rewrite the error finish reason to stop.
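
For illustration, that fallback could look something like this (hypothetical, not what the PR does today):

# Inside _process_response, before delegating to super():
choice = response.choices[0]
if choice.finish_reason == 'error' and getattr(response, 'error', None) is None:
    choice.finish_reason = 'stop'  # coerce so the parent class's finish-reason validation doesn't fail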

@DouweM (Collaborator):

Ah right, checking _verify_response_is_not_error first makes it work.
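
That is, the ordering looks roughly like this (simplified from the PR; _verify_response_is_not_error is the helper named above):

def _process_response(self, response: ChatCompletion | str) -> ModelResponse:
    self._verify_response_is_not_error(response)  # raises ModelHTTPError for OpenRouter 'error' payloads
    return super()._process_response(response=response)  # only reached when the finish reason is valid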


provider_details: dict[str, str] = {}

if openrouter_provider := getattr(response, 'provider', None): # pragma: lax no cover
@DouweM (Collaborator):

Is there more interesting data on the OpenRouter response we could store?

@DouweM (Collaborator):

We should do this while streaming as well!

@ajac-zero (Author):

So far, I've added native_finish_reason and reasoning_details; we could add annotations as well if you think they belong there (since I think they are equivalent to the OpenAI annotations).
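
For illustration, storing the native finish reason could look like this (native_finish_reason is a field OpenRouter adds to each choice; the exact placement in the PR may differ):

if native_finish_reason := getattr(choice, 'native_finish_reason', None):
    provider_details['native_finish_reason'] = native_finish_reason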

@DouweM (Collaborator):

Annotations/citations we'll get to in #3126. As discussed above, I think we should either fully parse https://github.com/pydantic/pydantic-ai/issues/3126 or omit it for now.

@ajac-zero (Author):

Understood, I'd rather omit it for now so we can later use #3126 as a reference.

new_settings = _openrouter_settings_to_openai_settings(cast(OpenRouterModelSettings, merged_settings or {}))
return new_settings, customized_parameters

def _process_response(self, response: ChatCompletion | str) -> ModelResponse:
@DouweM (Collaborator):

Should we also implement this so we can support all thinking models on OpenRouter and fully close #2999?

# NOTE: We don't currently handle OpenRouter `reasoning_details`:
# - https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning-blocks
# If you need this, please file an issue.

I'm guessing the Google one you tried works because of this, which also works without a new model class:

# The `reasoning` field is only present in gpt-oss via Ollama and OpenRouter.
# - https://cookbook.openai.com/articles/gpt-oss/handle-raw-cot#chat-completions-api
# - https://openrouter.ai/docs/use-cases/reasoning-tokens#basic-usage-with-reasoning-tokens
if reasoning := getattr(choice.message, 'reasoning', None):
    items.append(ThinkingPart(id='reasoning', content=reasoning, provider_name=self.system))

@ajac-zero (Author):

I have been trying to replicate the bug on the issue without success 🤔.

According to the OpenRouter docs: "Reasoning tokens will appear in the reasoning field of each message". So the logic that is already in OpenAIChatModel should be enough, no?

We could use reasoning_details but the content is the same, with some supplementary info. I've added reasoning_details to the provider_details object in the meantime.

@DouweM (Collaborator):

@ajac-zero Among the supplementary info in reasoning_details is the signature, which Anthropic requires to be sent back, at least when used with AnthropicModel and BedrockModel, and presumably also when going through OpenRouter. You can check those classes for how we parse those signatures, store them on ThinkingPart, and send them back.

I'd be OK with doing that in a follow-up PR if necessary, but it'd be good to fully support OpenRouter when we launch this new model class.

@ajac-zero (Author):

Thanks for the pointers! I've added a conditional that adds the signature to the thinking part if it exists, and a wrapper around _map_messages to pass back the reasoning_details following this example.

Currently, the signature conditional assumes the first part will be a thinking part if reasoning_details is not None, because OpenAIChatModel._process_response takes care of that. But maybe adding a type check here would be better, in case the OpenAI model logic changes in the future? (A defensive variant is sketched below the snippet.)

if reasoning_details := getattr(choice.message, 'reasoning_details', None):
    provider_details['reasoning_details'] = reasoning_details

    if signature := reasoning_details[0].get('signature', None):
        thinking_part = cast(ThinkingPart, model_response.parts[0])
        thinking_part.signature = signature
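
A defensive variant with that type check could look like this (hypothetical; the current code relies on OpenAIChatModel always emitting the ThinkingPart first):

if reasoning_details := getattr(choice.message, 'reasoning_details', None):
    provider_details['reasoning_details'] = reasoning_details

    if signature := reasoning_details[0].get('signature', None):
        first_part = model_response.parts[0]
        if isinstance(first_part, ThinkingPart):  # only attach if the parts layout is as expected
            first_part.signature = signature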
