Add OpenRouterModel as OpenAIChatModel subclass #3089

Status: Open
ajac-zero wants to merge 48 commits into pydantic:main from ajac-zero:main.

Commits (48, all by ajac-zero):
- 227e873 Add OpenRouter support and test coverage
- c3c1546 Add OpenRouter reasoning config and refactor response details
- 10a1a17 Move OpenRouterModelSettings import into try block
- 5e64a62 Update pydantic_ai_slim/pydantic_ai/models/openrouter.py
- 5e787da Merge branch 'main' into main
- e219b8c Handle OpenRouter errors and extract response metadata
- c5e0600 Merge branch 'pydantic:main' into main
- 6f99fb2 Add type ignores to tests
- 83d14b1 Merge branch 'pydantic:main' into main
- ef3c6dd Send back reasoning_details/signature
- 0ba3691 Merge branch 'pydantic:main' into main
- ed9e7df add OpenRouterChatCompletion model
- 0689e29 Merge branch 'pydantic:main' into main
- 75adbb4 Update pydantic_ai_slim/pydantic_ai/models/openrouter.py
- ab9d690 Update pydantic_ai_slim/pydantic_ai/models/openrouter.py
- db1630d Merge branch 'main' into main
- 5700a19 fix spelling mistake
- ee93121 add openrouter web plugin
- ca45f8a WIP build reasoning_details from ThinkingParts
- 113c1cb Merge branch 'main' into main
- 5db26d0 Merge branch 'main' into main
- 91bee62 Merge branch 'main' into main
- b325816 wip reasoning details conversion
- 1db529f finish openrouter thinking part
- 3d7f1b4 add preserve reasoning tokens test
- 1d7a8a4 fix typing
- e81621b Merge branch 'main' into main
- c6aca8d Merge branch 'main' into main
- 516e823 remove <thinking> tags from content
- b8406d0 Merge branch 'main' into main
- c16c960 fix typing
- 63d1b84 Merge branch 'main' into main
- 0835073 Merge branch 'main' into main
- baede41 add _map_model_response method
- 89ef9a8 move assert_never import to typing_extensions
- ebc8d08 add tool calling test
- 21a78e4 replace process_response with hooks
- 0b37792 add stream hooks
- 8d090f0 simplify hooks
- e8c3c81 fix coverage/linting
- 02b8527 Merge branch 'main' into main
- 895ea03 Merge branch 'main' into main
- 7c50f07 fix lint
- 8e32475 replace OpenRouterThinking with encoding in 'id'
- 9d57be0 Merge branch 'main' into main
- 0a110d2 replace cast with asserts
- 64f75f8 Merge branch 'main' into main
- a1f5385 fix merge changes
Changes shown below are from 11 of the 48 commits.

File: pydantic_ai_slim/pydantic_ai/models/openrouter.py (new file, +337 lines)
```python
from typing import Any, Literal, cast

from openai import AsyncOpenAI
from openai.types.chat import ChatCompletion, ChatCompletionMessageParam
from pydantic import BaseModel
from typing_extensions import TypedDict

from ..exceptions import ModelHTTPError, UnexpectedModelBehavior
from ..messages import (
    ModelMessage,
    ModelResponse,
    ThinkingPart,
)
from ..profiles import ModelProfileSpec
from ..providers import Provider
from ..settings import ModelSettings
from . import ModelRequestParameters
from .openai import OpenAIChatModel, OpenAIChatModelSettings


class OpenRouterMaxPrice(TypedDict, total=False):
    """The object specifying the maximum price you want to pay for this request. USD price per million tokens, for prompt and completion."""

    prompt: int
    completion: int
    image: int
    audio: int
    request: int


LatestOpenRouterSlugs = Literal[
    'z-ai',
    'cerebras',
    'venice',
    'moonshotai',
    'morph',
    'stealth',
    'wandb',
    'klusterai',
    'openai',
    'sambanova',
    'amazon-bedrock',
    'mistral',
    'nextbit',
    'atoma',
    'ai21',
    'minimax',
    'baseten',
    'anthropic',
    'featherless',
    'groq',
    'lambda',
    'azure',
    'ncompass',
    'deepseek',
    'hyperbolic',
    'crusoe',
    'cohere',
    'mancer',
    'avian',
    'perplexity',
    'novita',
    'siliconflow',
    'switchpoint',
    'xai',
    'inflection',
    'fireworks',
    'deepinfra',
    'inference-net',
    'inception',
    'atlas-cloud',
    'nvidia',
    'alibaba',
    'friendli',
    'infermatic',
    'targon',
    'ubicloud',
    'aion-labs',
    'liquid',
    'nineteen',
    'cloudflare',
    'nebius',
    'chutes',
    'enfer',
    'crofai',
    'open-inference',
    'phala',
    'gmicloud',
    'meta',
    'relace',
    'parasail',
    'together',
    'google-ai-studio',
    'google-vertex',
]
"""Known providers in the OpenRouter marketplace."""

OpenRouterSlug = str | LatestOpenRouterSlugs
"""Possible OpenRouter provider slugs.

Since OpenRouter is constantly updating their list of providers, we explicitly list some known providers but
allow any name in the type hints.
See [the OpenRouter API](https://openrouter.ai/docs/api-reference/list-available-providers) for a full list.
"""

Transforms = Literal['middle-out']
"""Available message transforms for OpenRouter models with limited token windows.

Currently only supports 'middle-out', but is expected to grow in the future.
"""


class OpenRouterProvider(TypedDict, total=False):
    """Represents the 'Provider' object from the OpenRouter API."""

    order: list[OpenRouterSlug]
    """List of provider slugs to try in order (e.g. ["anthropic", "openai"]). [See details](https://openrouter.ai/docs/features/provider-routing#ordering-specific-providers)"""

    allow_fallbacks: bool
    """Whether to allow backup providers when the primary is unavailable. [See details](https://openrouter.ai/docs/features/provider-routing#disabling-fallbacks)"""

    require_parameters: bool
    """Only use providers that support all parameters in your request."""

    data_collection: Literal['allow', 'deny']
    """Control whether to use providers that may store data. [See details](https://openrouter.ai/docs/features/provider-routing#requiring-providers-to-comply-with-data-policies)"""

    zdr: bool
    """Restrict routing to only ZDR (Zero Data Retention) endpoints. [See details](https://openrouter.ai/docs/features/provider-routing#zero-data-retention-enforcement)"""

    only: list[OpenRouterSlug]
    """List of provider slugs to allow for this request. [See details](https://openrouter.ai/docs/features/provider-routing#allowing-only-specific-providers)"""

    ignore: list[str]
    """List of provider slugs to skip for this request. [See details](https://openrouter.ai/docs/features/provider-routing#ignoring-providers)"""

    quantizations: list[Literal['int4', 'int8', 'fp4', 'fp6', 'fp8', 'fp16', 'bf16', 'fp32', 'unknown']]
    """List of quantization levels to filter by (e.g. ["int4", "int8"]). [See details](https://openrouter.ai/docs/features/provider-routing#quantization)"""

    sort: Literal['price', 'throughput', 'latency']
    """Sort providers by price, throughput, or latency (e.g. "price"). [See details](https://openrouter.ai/docs/features/provider-routing#provider-sorting)"""

    max_price: OpenRouterMaxPrice
    """The maximum pricing you want to pay for this request. [See details](https://openrouter.ai/docs/features/provider-routing#max-price)"""


class OpenRouterReasoning(TypedDict, total=False):
    """Configuration for reasoning tokens in OpenRouter requests.

    Reasoning tokens allow models to show their step-by-step thinking process.
    You can configure this using either OpenAI-style effort levels or Anthropic-style
    token limits, but not both simultaneously.
    """

    effort: Literal['high', 'medium', 'low']
    """OpenAI-style reasoning effort level. Cannot be used with max_tokens."""

    max_tokens: int
    """Anthropic-style specific token limit for reasoning. Cannot be used with effort."""

    exclude: bool
    """Whether to exclude reasoning tokens from the response. Default is False. All models support this."""

    enabled: bool
    """Whether to enable reasoning with default parameters. Default is inferred from effort or max_tokens."""


class OpenRouterModelSettings(ModelSettings, total=False):
    """Settings used for an OpenRouter model request."""

    # ALL FIELDS MUST BE `openrouter_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS.

    openrouter_models: list[str]
    """A list of fallback models.

    These models will be tried, in order, if the main model returns an error. [See details](https://openrouter.ai/docs/features/model-routing#the-models-parameter)
    """

    openrouter_provider: OpenRouterProvider
    """OpenRouter routes requests to the best available providers for your model. By default, requests are load balanced across the top providers to maximize uptime.

    You can customize how your requests are routed using the provider object. [See more](https://openrouter.ai/docs/features/provider-routing)"""

    openrouter_preset: str
    """Presets allow you to separate your LLM configuration from your code.

    Create and manage presets through the OpenRouter web application to control provider routing, model selection, system prompts, and other parameters, then reference them in OpenRouter API requests. [See more](https://openrouter.ai/docs/features/presets)"""

    openrouter_transforms: list[Transforms]
    """To help with prompts that exceed the maximum context size of a model.

    Transforms work by removing or truncating messages from the middle of the prompt, until the prompt fits within the model's context window. [See more](https://openrouter.ai/docs/features/message-transforms)
    """

    openrouter_reasoning: OpenRouterReasoning
    """To control the reasoning tokens in the request.

    The reasoning config object consolidates settings for controlling reasoning strength across different models. [See more](https://openrouter.ai/docs/use-cases/reasoning-tokens)
    """
```
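Taken together, the settings class reads naturally at the call site. A minimal usage sketch (the provider slugs and values here are illustrative examples, not recommendations):

```python
# Illustrative sketch: prefer two specific providers, disable fallbacks,
# and request medium reasoning effort. All values are examples only.
settings = OpenRouterModelSettings(
    openrouter_provider={'order': ['anthropic', 'openai'], 'allow_fallbacks': False},
    openrouter_reasoning={'effort': 'medium'},
    openrouter_transforms=['middle-out'],
)
```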
```python
class OpenRouterError(BaseModel):
    """Utility class to validate error messages from OpenRouter."""

    code: int
    message: str


def _openrouter_settings_to_openai_settings(model_settings: OpenRouterModelSettings) -> OpenAIChatModelSettings:
    """Transforms an 'OpenRouterModelSettings' object into an 'OpenAIChatModelSettings' object.

    Args:
        model_settings: The 'OpenRouterModelSettings' object to transform.

    Returns:
        An 'OpenAIChatModelSettings' object with equivalent settings.
    """
    extra_body: dict[str, Any] = {}

    if models := model_settings.get('openrouter_models'):
        extra_body['models'] = models
    if provider := model_settings.get('openrouter_provider'):
        extra_body['provider'] = provider
    if preset := model_settings.get('openrouter_preset'):
        extra_body['preset'] = preset
    if transforms := model_settings.get('openrouter_transforms'):
        extra_body['transforms'] = transforms

    base_keys = ModelSettings.__annotations__.keys()
    base_data: dict[str, Any] = {k: model_settings[k] for k in base_keys if k in model_settings}

    new_settings = OpenAIChatModelSettings(**base_data, extra_body=extra_body)

    return new_settings
```
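To make the mapping concrete, here is a small sketch of the helper's behavior: base `ModelSettings` keys pass through unchanged, while the routing-related `openrouter_` keys are gathered into `extra_body`. The preset name is hypothetical:

```python
# Sketch of the transformation; 'my-preset' is a hypothetical preset name.
settings = OpenRouterModelSettings(temperature=0.2, openrouter_preset='my-preset')
openai_settings = _openrouter_settings_to_openai_settings(settings)
assert openai_settings['temperature'] == 0.2  # base key passed through
assert openai_settings['extra_body'] == {'preset': 'my-preset'}  # OpenRouter key moved
```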
```python
def _verify_response_is_not_error(response: ChatCompletion) -> ChatCompletion:
    """Checks a pre-validation 'ChatCompletion' object for the error attribute.

    Args:
        response: The 'ChatCompletion' object to validate.

    Returns:
        The same 'ChatCompletion' object.

    Raises:
        ModelHTTPError: If the response contains an error attribute.
        UnexpectedModelBehavior: If the response does not contain an error attribute but contains an 'error' finish_reason.
    """
    if openrouter_error := getattr(response, 'error', None):
        error = OpenRouterError.model_validate(openrouter_error)
        raise ModelHTTPError(status_code=error.code, model_name=response.model, body=error.message)
    else:
        choice = response.choices[0]

        if choice.finish_reason == 'error':  # type: ignore[reportUnnecessaryComparison]
            raise UnexpectedModelBehavior(
                'Invalid response from OpenRouter chat completions endpoint, error finish_reason without error data'
            )

    return response
```
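This check exists because OpenRouter can surface an upstream failure inside an otherwise well-formed completion body. A short sketch of the error path, with illustrative payload values:

```python
# Illustrative error payload as validated by OpenRouterError above.
payload = {'code': 429, 'message': 'Provider rate-limited'}
error = OpenRouterError.model_validate(payload)
assert (error.code, error.message) == (429, 'Provider rate-limited')
# For a response carrying this payload, _verify_response_is_not_error would raise
# ModelHTTPError(status_code=429, model_name=..., body='Provider rate-limited').
```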
```python
class OpenRouterModel(OpenAIChatModel):
    """Extends OpenAIChatModel to capture extra metadata for OpenRouter."""

    def __init__(
        self,
        model_name: str,
        *,
        provider: Literal['openrouter'] | Provider[AsyncOpenAI] = 'openrouter',
        profile: ModelProfileSpec | None = None,
        settings: ModelSettings | None = None,
    ):
        """Initialize an OpenRouter model.

        Args:
            model_name: The name of the model to use.
            provider: The provider to use for authentication and API access. Currently uses OpenAI as the internal client. Can be either the string
                'openrouter' or an instance of `Provider[AsyncOpenAI]`. If not provided, a new provider will be
                created using the other parameters.
            profile: The model profile to use. Defaults to a profile picked by the provider based on the model name.
            settings: Model-specific settings that will be used as defaults for this model.
        """
        super().__init__(model_name, provider=provider, profile=profile, settings=settings)
```
> **Collaborator** (on `prepare_request`): Someone just pointed out OpenRouter supports image generation as well: #3258. If you happen to have time, feel free to include it, but if not we'll leave it for a follow-up PR -- this one is getting quite big anyway.

```python
    def prepare_request(
        self,
        model_settings: ModelSettings | None,
        model_request_parameters: ModelRequestParameters,
    ) -> tuple[ModelSettings | None, ModelRequestParameters]:
        merged_settings, customized_parameters = super().prepare_request(model_settings, model_request_parameters)
        new_settings = _openrouter_settings_to_openai_settings(cast(OpenRouterModelSettings, merged_settings or {}))
        return new_settings, customized_parameters

    def _process_response(self, response: ChatCompletion | str) -> ModelResponse:
        if not isinstance(response, ChatCompletion):
            raise UnexpectedModelBehavior(
                'Invalid response from OpenRouter chat completions endpoint, expected JSON data'
            )

        response = _verify_response_is_not_error(response)

        model_response = super()._process_response(response=response)

        provider_details: dict[str, Any] = {}

        if openrouter_provider := getattr(response, 'provider', None):  # pragma: lax no cover
            provider_details['downstream_provider'] = openrouter_provider

        choice = response.choices[0]

        if native_finish_reason := getattr(choice, 'native_finish_reason', None):  # pragma: lax no cover
            provider_details['native_finish_reason'] = native_finish_reason

        if reasoning_details := getattr(choice.message, 'reasoning_details', None):
            provider_details['reasoning_details'] = reasoning_details

            if signature := reasoning_details[0].get('signature', None):
                thinking_part = cast(ThinkingPart, model_response.parts[0])
                thinking_part.signature = signature

        model_response.provider_details = provider_details

        return model_response

    async def _map_messages(self, messages: list[ModelMessage]) -> list[ChatCompletionMessageParam]:
        """Maps a `pydantic_ai.Message` to an `openai.types.ChatCompletionMessageParam` and adds OpenRouter specific parameters."""
        openai_messages = await super()._map_messages(messages)

        for message, openai_message in zip(messages, openai_messages):
            if isinstance(message, ModelResponse):
                provider_details = cast(dict[str, Any], message.provider_details)
                if reasoning_details := provider_details.get('reasoning_details', None):  # pragma: lax no cover
                    openai_message['reasoning_details'] = reasoning_details  # type: ignore[reportGeneralTypeIssue]

        return openai_messages
```
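For reference, a minimal end-to-end sketch of how the new class could be used. The model name and settings are illustrative, and this assumes `OPENROUTER_API_KEY` is set in the environment:

```python
# Hedged usage sketch; the model name is an example, not an endorsement.
from pydantic_ai import Agent
from pydantic_ai.models.openrouter import OpenRouterModel, OpenRouterModelSettings

model = OpenRouterModel('anthropic/claude-sonnet-4')
agent = Agent(
    model,
    model_settings=OpenRouterModelSettings(openrouter_provider={'sort': 'price'}),
)
result = agent.run_sync('Briefly explain zero data retention.')
print(result.output)  # final model output
# OpenRouter metadata (e.g. downstream_provider, native_finish_reason) lands on
# the last ModelResponse's provider_details, per _process_response above.
```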