Skip to content

Conversation

@katzdave
Copy link
Collaborator

@katzdave katzdave commented Nov 12, 2025

Build core canonical model implementation

Two scripts

  • build_canonical_model -> pulls all models from openrouter and loads them into our json file
  • canonical_model_checker -> produces the full mappings from our providers as to which (provider, models) are being mapped into which canonical models.

Providers can use map_to_canonical_model to lookup this info (if the model is mapped).

Once we're settled on this will further integrate it in providers and pull + add more information to these.

Currently integrated in model filtering.

Pricing is next; The hit rate for the current system is pretty low outside of openrouter.

Copilot AI review requested due to automatic review settings November 12, 2025 06:56
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a canonical model system to standardize model metadata across different LLM providers (Anthropic, Google, OpenAI). The system is based on OpenRouter's model schema and includes automated fetching capabilities. Key additions include data structures for model metadata (pricing, variants, capabilities), a registry for managing canonical models, utilities for version suffix stripping, example scripts for fetching and validating model mappings, and a comprehensive JSON file with model definitions.

Reviewed Changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated no comments.

Show a summary per file
File Description
crates/goose/src/providers/mod.rs Adds the canonical module to the providers module structure
crates/goose/src/providers/canonical/registry.rs Implements the registry for loading, storing, and querying canonical models
crates/goose/src/providers/canonical/name_builder.rs Provides utilities to strip version suffixes and build canonical model names
crates/goose/src/providers/canonical/model.rs Defines core data structures for canonical models, pricing, and variants
crates/goose/src/providers/canonical/mod.rs Exports public API for the canonical model system
crates/goose/src/providers/canonical/canonical_models.json Contains 1620 lines of model definitions with pricing and metadata
crates/goose/src/providers/canonical/README.md Documents the canonical model system architecture and usage
crates/goose/src/providers/base.rs Adds trait method for providers to map their models to canonical models
crates/goose/src/providers/anthropic.rs Fixes JSON field name from "models" to "data"
crates/goose/examples/fetch_canonical_models.rs Example script to fetch and update canonical models from OpenRouter
crates/goose/examples/canonical_model_checker.rs Example script to validate provider model mappings
canonical_mapping_report.json Generated report showing unmapped models across providers

@katzdave katzdave marked this pull request as draft November 12, 2025 07:13
@katzdave katzdave marked this pull request as ready for review November 14, 2025 06:58
@katzdave katzdave requested review from DOsinga and Copilot November 14, 2025 06:58
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 11 out of 13 changed files in this pull request and generated no new comments.

Copilot AI review requested due to automatic review settings December 11, 2025 16:09
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 16 out of 19 changed files in this pull request and generated 3 comments.

Comment on lines 421 to 426
let force_show_all = std::env::var("FORCE_SHOW_ALL_MODELS")
.map(|v| {
let lower = v.to_lowercase();
lower == "true" || lower == "1" || lower == "yes"
})
.unwrap_or(false);
Copy link

Copilot AI Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The FORCE_SHOW_ALL_MODELS environment variable check is case-sensitive which could lead to unexpected behavior. A user setting "TRUE" or "Yes" would not enable the feature, while "true" or "yes" would. Consider documenting this behavior or handling all common case variations consistently.

Copilot uses AI. Check for mistakes.
.collect();

if recommended_models.is_empty() {
Ok(Some(all_models))
Copy link

Copilot AI Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the registry fails to load, the error is converted to a ProviderError::ExecutionError with the error message. However, if all models fail the canonical model check (resulting in an empty recommended_models list), the code falls back to returning all models. This means a registry loading failure could silently degrade to showing all models instead of failing explicitly. Consider whether this fallback behavior is desired when the registry fails to load.

Suggested change
Ok(Some(all_models))
// Instead of falling back to all models, return an empty list to avoid silent degradation.
Ok(Some(vec![]))

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can filter all models and that's just a bad experience to return nothing.

Comment on lines +444 to +449
let recommended_models: Vec<String> = all_models
.iter()
.filter(|model| {
map_to_canonical_model(provider_name, model, registry)
.and_then(|canonical_id| registry.get(&canonical_id))
.map(|m| m.input_modalities.contains(&"text".to_string()))
Copy link

Copilot AI Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The filter logic checks that models have "text" in input_modalities but doesn't validate other important characteristics. Models with only image input (no text) would be filtered out even if they're valid. Consider if additional modality checks are needed or if this correctly represents "recommended" models for the use case.

Suggested change
let recommended_models: Vec<String> = all_models
.iter()
.filter(|model| {
map_to_canonical_model(provider_name, model, registry)
.and_then(|canonical_id| registry.get(&canonical_id))
.map(|m| m.input_modalities.contains(&"text".to_string()))
// Define recommended modalities; expand as needed for your use case
let recommended_modalities = vec!["text".to_string(), "image".to_string()];
let recommended_models: Vec<String> = all_models
.iter()
.filter(|model| {
map_to_canonical_model(provider_name, model, registry)
.and_then(|canonical_id| registry.get(&canonical_id))
.map(|m| {
m.input_modalities.iter().any(|modality| recommended_modalities.contains(modality))
})

Copilot uses AI. Check for mistakes.
Copy link
Collaborator

@DOsinga DOsinga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the right idea for sure, but some of the code is a tad on the vibe coded side of things. quite some repetition that can be avoided and comments that don't really help

@@ -0,0 +1,5271 @@
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the other one is the dupe

@@ -0,0 +1,4413 @@
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these models list a lot of stuff that seems to be block specific


let json = response.payload.unwrap_or_default();
let arr = match json.get("models").and_then(|v| v.as_array()) {
let arr = match json.get("data").and_then(|v| v.as_array()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh? was this broken?

#[derive(Deserialize, ToSchema, utoipa::IntoParams)]
pub struct ModelFilterQuery {
#[serde(default)]
pub show_all: bool,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's not default booleans if it can be avoided. also do we need this? are we not sure about ourselves that we can just always filter (the user can still override by typing the model name)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's fair. Got rid of all of the boolean + the cli flag.

So the auto-complete will always filter based on this, but you can ignore it and just type whatever.

use std::collections::HashMap;

const OPENROUTER_API_URL: &str = "https://openrouter.ai/api/v1/models";
const ALLOWED_PROVIDERS: &[&str] = &[
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we base this on our own providers?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's based on all the ones I have keys for. I took off databricks because it's a bunch of block specific model names, but can potentially add that back (and a lot of the logic in the name converter specifically addresses some of the cases I saw there).

@@ -0,0 +1,369 @@
/// Canonical Model Checker
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be in examples?

in general how are we going to integrate this with the overall process. how long does it take to run all of this? can we add it to the CI and then fail when it is not up to date? or add it to the release process /cc @jamaedo

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/cc @jamadeo (working tag)

I was thinking add to the release process. Run build_canonical_models, followed by canonical_model_checker.

So then at least we have some record in the release logs of the model diff.

pricing.rs uses a local cache (refreshes weekly), so that's also worth considering if we're worried about freshness, but I think releases should fresh enough.

#[test]
fn test_canonical_names() {
// canonical_name tests
assert_eq!(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, very repetitive - use parameterized test. also this is an insane long list of tests. strip them down to the ones that we actually need.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Now a single parameterized test on just the outermost fn.

Copilot AI review requested due to automatic review settings December 12, 2025 19:58
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 13 out of 16 changed files in this pull request and generated 2 comments.

Comment on lines 451 to 466
async fn map_to_canonical_model(
&self,
provider_model: &str,
) -> Result<Option<String>, ProviderError> {
use super::canonical::map_to_canonical_model;

let registry = CanonicalModelRegistry::bundled().map_err(|e| {
ProviderError::ExecutionError(format!("Failed to load canonical registry: {}", e))
})?;

Ok(map_to_canonical_model(
self.get_name(),
provider_model,
registry,
))
}
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The registry is loaded twice in quick succession - once in fetch_recommended_models (line 426) and again in map_to_canonical_model (line 457). Consider caching the registry at the provider level or passing it as a parameter to avoid redundant deserialize operations.

Copilot uses AI. Check for mistakes.
Comment on lines +284 to +289
while changed {
let before = result.clone();
for pattern in STRIP_PATTERNS.iter() {
result = pattern.replace(&result, "").to_string();
}
changed = result != before;
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The while loop could potentially run indefinitely if a regex pattern replacement produces a string that triggers another pattern, creating a replacement cycle. Consider adding a maximum iteration count as a safety guard against infinite loops.

Suggested change
while changed {
let before = result.clone();
for pattern in STRIP_PATTERNS.iter() {
result = pattern.replace(&result, "").to_string();
}
changed = result != before;
let mut iterations = 0;
const MAX_ITERATIONS: usize = 10;
while changed && iterations < MAX_ITERATIONS {
let before = result.clone();
for pattern in STRIP_PATTERNS.iter() {
result = pattern.replace(&result, "").to_string();
}
changed = result != before;
iterations += 1;

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings December 12, 2025 20:37
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 13 out of 16 changed files in this pull request and generated 1 comment.

.collect();

if recommended_models.is_empty() {
Ok(Some(all_models))
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fallback to all_models when recommended_models is empty may not be the desired behavior. If filtering finds zero models that support text input, returning all models (including potentially non-text models) could lead to errors downstream. Consider logging a warning or returning an empty list instead to make the failure more explicit.

Suggested change
Ok(Some(all_models))
Ok(Some(vec![]))

Copilot uses AI. Check for mistakes.
@katzdave katzdave merged commit 5c964d2 into main Dec 12, 2025
23 checks passed
@katzdave katzdave deleted the dkatz/canonical-model branch December 12, 2025 22:25
michaelneale added a commit that referenced this pull request Dec 15, 2025
* main:
  fix: Make datetime info message more explicit to prevent LLM confusion about current year (#6101)
  refactor: unify subagent and subrecipe tools into single tool (#5893)
  goose repo is too big for the issue solver workflow worker (#6099)
  fix: use system not developer role in db (#6098)
  Add /goose issue solver github workflow (#6068)
  OpenAI responses streaming (#5837)
  Canonical models for Providers (#5694)
  feat: add Inception provider for Mercury models (#6029)
  fix old sessions with tool results not loading (#6094)
  Fix community page mobile responsiveness and horizontal overflow (#6082)
zanesq added a commit that referenced this pull request Dec 15, 2025
* 'main' of github.com:block/goose: (22 commits)
  Disallow subagents with no extensions (#5825)
  chore(deps): bump js-yaml in /documentation (#6093)
  feat: external goosed server (#5978)
  fix: Make datetime info message more explicit to prevent LLM confusion about current year (#6101)
  refactor: unify subagent and subrecipe tools into single tool (#5893)
  goose repo is too big for the issue solver workflow worker (#6099)
  fix: use system not developer role in db (#6098)
  Add /goose issue solver github workflow (#6068)
  OpenAI responses streaming (#5837)
  Canonical models for Providers (#5694)
  feat: add Inception provider for Mercury models (#6029)
  fix old sessions with tool results not loading (#6094)
  Fix community page mobile responsiveness and horizontal overflow (#6082)
  Tool reply meta (#6074)
  chore: avoid accidentally using native tls again (#6086)
  Update vars to be capitalised to be in line with other variables in config file (#6085)
  docs: restructure recipe reference (#5972)
  docs: configure custom providers (#6044)
  docs: Community All-Stars Spotlight November 2025, CodeTV Hackathon edition (#6070)
  fix: include file attachments in queued messages (#5961)
  ...

# Conflicts:
#	crates/goose-server/src/routes/agent.rs
#	crates/goose/src/agents/extension_manager.rs
#	ui/desktop/src/api/sdk.gen.ts
fbalicchia pushed a commit to fbalicchia/goose that referenced this pull request Dec 16, 2025
zanesq added a commit that referenced this pull request Dec 16, 2025
…sions

* 'main' of github.com:block/goose: (22 commits)
  Disallow subagents with no extensions (#5825)
  chore(deps): bump js-yaml in /documentation (#6093)
  feat: external goosed server (#5978)
  fix: Make datetime info message more explicit to prevent LLM confusion about current year (#6101)
  refactor: unify subagent and subrecipe tools into single tool (#5893)
  goose repo is too big for the issue solver workflow worker (#6099)
  fix: use system not developer role in db (#6098)
  Add /goose issue solver github workflow (#6068)
  OpenAI responses streaming (#5837)
  Canonical models for Providers (#5694)
  feat: add Inception provider for Mercury models (#6029)
  fix old sessions with tool results not loading (#6094)
  Fix community page mobile responsiveness and horizontal overflow (#6082)
  Tool reply meta (#6074)
  chore: avoid accidentally using native tls again (#6086)
  Update vars to be capitalised to be in line with other variables in config file (#6085)
  docs: restructure recipe reference (#5972)
  docs: configure custom providers (#6044)
  docs: Community All-Stars Spotlight November 2025, CodeTV Hackathon edition (#6070)
  fix: include file attachments in queued messages (#5961)
  ...

# Conflicts:
#	crates/goose-server/src/routes/agent.rs
#	crates/goose/src/agents/extension_manager.rs
#	ui/desktop/src/api/sdk.gen.ts
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants