-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Canonical models for Providers #5694
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces a canonical model system to standardize model metadata across different LLM providers (Anthropic, Google, OpenAI). The system is based on OpenRouter's model schema and includes automated fetching capabilities. Key additions include data structures for model metadata (pricing, variants, capabilities), a registry for managing canonical models, utilities for version suffix stripping, example scripts for fetching and validating model mappings, and a comprehensive JSON file with model definitions.
Reviewed Changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| crates/goose/src/providers/mod.rs | Adds the canonical module to the providers module structure |
| crates/goose/src/providers/canonical/registry.rs | Implements the registry for loading, storing, and querying canonical models |
| crates/goose/src/providers/canonical/name_builder.rs | Provides utilities to strip version suffixes and build canonical model names |
| crates/goose/src/providers/canonical/model.rs | Defines core data structures for canonical models, pricing, and variants |
| crates/goose/src/providers/canonical/mod.rs | Exports public API for the canonical model system |
| crates/goose/src/providers/canonical/canonical_models.json | Contains 1620 lines of model definitions with pricing and metadata |
| crates/goose/src/providers/canonical/README.md | Documents the canonical model system architecture and usage |
| crates/goose/src/providers/base.rs | Adds trait method for providers to map their models to canonical models |
| crates/goose/src/providers/anthropic.rs | Fixes JSON field name from "models" to "data" |
| crates/goose/examples/fetch_canonical_models.rs | Example script to fetch and update canonical models from OpenRouter |
| crates/goose/examples/canonical_model_checker.rs | Example script to validate provider model mappings |
| canonical_mapping_report.json | Generated report showing unmapped models across providers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 11 out of 13 changed files in this pull request and generated no new comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 16 out of 19 changed files in this pull request and generated 3 comments.
crates/goose/src/providers/base.rs
Outdated
| let force_show_all = std::env::var("FORCE_SHOW_ALL_MODELS") | ||
| .map(|v| { | ||
| let lower = v.to_lowercase(); | ||
| lower == "true" || lower == "1" || lower == "yes" | ||
| }) | ||
| .unwrap_or(false); |
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The FORCE_SHOW_ALL_MODELS environment variable check is case-sensitive which could lead to unexpected behavior. A user setting "TRUE" or "Yes" would not enable the feature, while "true" or "yes" would. Consider documenting this behavior or handling all common case variations consistently.
| .collect(); | ||
|
|
||
| if recommended_models.is_empty() { | ||
| Ok(Some(all_models)) |
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the registry fails to load, the error is converted to a ProviderError::ExecutionError with the error message. However, if all models fail the canonical model check (resulting in an empty recommended_models list), the code falls back to returning all models. This means a registry loading failure could silently degrade to showing all models instead of failing explicitly. Consider whether this fallback behavior is desired when the registry fails to load.
| Ok(Some(all_models)) | |
| // Instead of falling back to all models, return an empty list to avoid silent degradation. | |
| Ok(Some(vec![])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can filter all models and that's just a bad experience to return nothing.
| let recommended_models: Vec<String> = all_models | ||
| .iter() | ||
| .filter(|model| { | ||
| map_to_canonical_model(provider_name, model, registry) | ||
| .and_then(|canonical_id| registry.get(&canonical_id)) | ||
| .map(|m| m.input_modalities.contains(&"text".to_string())) |
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The filter logic checks that models have "text" in input_modalities but doesn't validate other important characteristics. Models with only image input (no text) would be filtered out even if they're valid. Consider if additional modality checks are needed or if this correctly represents "recommended" models for the use case.
| let recommended_models: Vec<String> = all_models | |
| .iter() | |
| .filter(|model| { | |
| map_to_canonical_model(provider_name, model, registry) | |
| .and_then(|canonical_id| registry.get(&canonical_id)) | |
| .map(|m| m.input_modalities.contains(&"text".to_string())) | |
| // Define recommended modalities; expand as needed for your use case | |
| let recommended_modalities = vec!["text".to_string(), "image".to_string()]; | |
| let recommended_models: Vec<String> = all_models | |
| .iter() | |
| .filter(|model| { | |
| map_to_canonical_model(provider_name, model, registry) | |
| .and_then(|canonical_id| registry.get(&canonical_id)) | |
| .map(|m| { | |
| m.input_modalities.iter().any(|modality| recommended_modalities.contains(modality)) | |
| }) |
DOsinga
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the right idea for sure, but some of the code is a tad on the vibe coded side of things. quite some repetition that can be avoided and comments that don't really help
| @@ -0,0 +1,5271 @@ | |||
| { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the other one is the dupe
| @@ -0,0 +1,4413 @@ | |||
| { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these models list a lot of stuff that seems to be block specific
|
|
||
| let json = response.payload.unwrap_or_default(); | ||
| let arr = match json.get("models").and_then(|v| v.as_array()) { | ||
| let arr = match json.get("data").and_then(|v| v.as_array()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh? was this broken?
| #[derive(Deserialize, ToSchema, utoipa::IntoParams)] | ||
| pub struct ModelFilterQuery { | ||
| #[serde(default)] | ||
| pub show_all: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's not default booleans if it can be avoided. also do we need this? are we not sure about ourselves that we can just always filter (the user can still override by typing the model name)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's fair. Got rid of all of the boolean + the cli flag.
So the auto-complete will always filter based on this, but you can ignore it and just type whatever.
| use std::collections::HashMap; | ||
|
|
||
| const OPENROUTER_API_URL: &str = "https://openrouter.ai/api/v1/models"; | ||
| const ALLOWED_PROVIDERS: &[&str] = &[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we base this on our own providers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's based on all the ones I have keys for. I took off databricks because it's a bunch of block specific model names, but can potentially add that back (and a lot of the logic in the name converter specifically addresses some of the cases I saw there).
| @@ -0,0 +1,369 @@ | |||
| /// Canonical Model Checker | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be in examples?
in general how are we going to integrate this with the overall process. how long does it take to run all of this? can we add it to the CI and then fail when it is not up to date? or add it to the release process /cc @jamaedo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/cc @jamadeo (working tag)
I was thinking add to the release process. Run build_canonical_models, followed by canonical_model_checker.
So then at least we have some record in the release logs of the model diff.
pricing.rs uses a local cache (refreshes weekly), so that's also worth considering if we're worried about freshness, but I think releases should fresh enough.
| #[test] | ||
| fn test_canonical_names() { | ||
| // canonical_name tests | ||
| assert_eq!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, very repetitive - use parameterized test. also this is an insane long list of tests. strip them down to the ones that we actually need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Now a single parameterized test on just the outermost fn.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 13 out of 16 changed files in this pull request and generated 2 comments.
| async fn map_to_canonical_model( | ||
| &self, | ||
| provider_model: &str, | ||
| ) -> Result<Option<String>, ProviderError> { | ||
| use super::canonical::map_to_canonical_model; | ||
|
|
||
| let registry = CanonicalModelRegistry::bundled().map_err(|e| { | ||
| ProviderError::ExecutionError(format!("Failed to load canonical registry: {}", e)) | ||
| })?; | ||
|
|
||
| Ok(map_to_canonical_model( | ||
| self.get_name(), | ||
| provider_model, | ||
| registry, | ||
| )) | ||
| } |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The registry is loaded twice in quick succession - once in fetch_recommended_models (line 426) and again in map_to_canonical_model (line 457). Consider caching the registry at the provider level or passing it as a parameter to avoid redundant deserialize operations.
| while changed { | ||
| let before = result.clone(); | ||
| for pattern in STRIP_PATTERNS.iter() { | ||
| result = pattern.replace(&result, "").to_string(); | ||
| } | ||
| changed = result != before; |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The while loop could potentially run indefinitely if a regex pattern replacement produces a string that triggers another pattern, creating a replacement cycle. Consider adding a maximum iteration count as a safety guard against infinite loops.
| while changed { | |
| let before = result.clone(); | |
| for pattern in STRIP_PATTERNS.iter() { | |
| result = pattern.replace(&result, "").to_string(); | |
| } | |
| changed = result != before; | |
| let mut iterations = 0; | |
| const MAX_ITERATIONS: usize = 10; | |
| while changed && iterations < MAX_ITERATIONS { | |
| let before = result.clone(); | |
| for pattern in STRIP_PATTERNS.iter() { | |
| result = pattern.replace(&result, "").to_string(); | |
| } | |
| changed = result != before; | |
| iterations += 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 13 out of 16 changed files in this pull request and generated 1 comment.
| .collect(); | ||
|
|
||
| if recommended_models.is_empty() { | ||
| Ok(Some(all_models)) |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fallback to all_models when recommended_models is empty may not be the desired behavior. If filtering finds zero models that support text input, returning all models (including potentially non-text models) could lead to errors downstream. Consider logging a warning or returning an empty list instead to make the failure more explicit.
| Ok(Some(all_models)) | |
| Ok(Some(vec![])) |
* main: fix: Make datetime info message more explicit to prevent LLM confusion about current year (#6101) refactor: unify subagent and subrecipe tools into single tool (#5893) goose repo is too big for the issue solver workflow worker (#6099) fix: use system not developer role in db (#6098) Add /goose issue solver github workflow (#6068) OpenAI responses streaming (#5837) Canonical models for Providers (#5694) feat: add Inception provider for Mercury models (#6029) fix old sessions with tool results not loading (#6094) Fix community page mobile responsiveness and horizontal overflow (#6082)
* 'main' of github.com:block/goose: (22 commits) Disallow subagents with no extensions (#5825) chore(deps): bump js-yaml in /documentation (#6093) feat: external goosed server (#5978) fix: Make datetime info message more explicit to prevent LLM confusion about current year (#6101) refactor: unify subagent and subrecipe tools into single tool (#5893) goose repo is too big for the issue solver workflow worker (#6099) fix: use system not developer role in db (#6098) Add /goose issue solver github workflow (#6068) OpenAI responses streaming (#5837) Canonical models for Providers (#5694) feat: add Inception provider for Mercury models (#6029) fix old sessions with tool results not loading (#6094) Fix community page mobile responsiveness and horizontal overflow (#6082) Tool reply meta (#6074) chore: avoid accidentally using native tls again (#6086) Update vars to be capitalised to be in line with other variables in config file (#6085) docs: restructure recipe reference (#5972) docs: configure custom providers (#6044) docs: Community All-Stars Spotlight November 2025, CodeTV Hackathon edition (#6070) fix: include file attachments in queued messages (#5961) ... # Conflicts: # crates/goose-server/src/routes/agent.rs # crates/goose/src/agents/extension_manager.rs # ui/desktop/src/api/sdk.gen.ts
…sions * 'main' of github.com:block/goose: (22 commits) Disallow subagents with no extensions (#5825) chore(deps): bump js-yaml in /documentation (#6093) feat: external goosed server (#5978) fix: Make datetime info message more explicit to prevent LLM confusion about current year (#6101) refactor: unify subagent and subrecipe tools into single tool (#5893) goose repo is too big for the issue solver workflow worker (#6099) fix: use system not developer role in db (#6098) Add /goose issue solver github workflow (#6068) OpenAI responses streaming (#5837) Canonical models for Providers (#5694) feat: add Inception provider for Mercury models (#6029) fix old sessions with tool results not loading (#6094) Fix community page mobile responsiveness and horizontal overflow (#6082) Tool reply meta (#6074) chore: avoid accidentally using native tls again (#6086) Update vars to be capitalised to be in line with other variables in config file (#6085) docs: restructure recipe reference (#5972) docs: configure custom providers (#6044) docs: Community All-Stars Spotlight November 2025, CodeTV Hackathon edition (#6070) fix: include file attachments in queued messages (#5961) ... # Conflicts: # crates/goose-server/src/routes/agent.rs # crates/goose/src/agents/extension_manager.rs # ui/desktop/src/api/sdk.gen.ts
Build core canonical model implementation
Two scripts
Providers can use map_to_canonical_model to lookup this info (if the model is mapped).
Once we're settled on this will further integrate it in providers and pull + add more information to these.
Currently integrated in model filtering.
Pricing is next; The hit rate for the current system is pretty low outside of openrouter.