Stop load openai fast model for openapi compatible custom endpoint #8644
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d7b14148cb
Compare the host exactly to api.openai.com to avoid false positives.
jh-block left a comment:
Thanks for the contribution. I agree with the need for this change, but the fix is slightly too narrow: api.openai.com is not the only OpenAI API endpoint. There is at least eu.api.openai.com (EU data residency) as well, and I would not be surprised if there are other prefixes. Maybe this would be better:
.map(|h| h == "api.openai.com" || h.ends_with(".api.openai.com"))
Agreed. It would be great if the API had a way to indicate this, but I don't think I could find one... Updated the fix.
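For illustration, a minimal runnable sketch of the agreed host check; is_openai_api_host is a hypothetical helper name, not the actual function in openai.rs:

/// True when `host` is an official OpenAI API host: api.openai.com
/// itself or any subdomain of it, such as eu.api.openai.com
/// (EU data residency).
fn is_openai_api_host(host: &str) -> bool {
    host == "api.openai.com" || host.ends_with(".api.openai.com")
}

fn main() {
    assert!(is_openai_api_host("api.openai.com"));
    assert!(is_openai_api_host("eu.api.openai.com"));
    // Custom OpenAI-compatible endpoints must not match, so no
    // default fast model gets baked in for them.
    assert!(!is_openai_api_host("localhost"));
    // Suffix matching (rather than a substring check) also rejects
    // look-alike hosts.
    assert!(!is_openai_api_host("api.openai.com.example.com"));
}

The suffix form covers regional prefixes like eu. without hard-coding each one.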
Summary
When using a custom OpenAI-compatible endpoint, gpt-4o-mini is always loaded automatically even if it doesn't exist, producing repeated WARN messages in the log.
Analysis
OpenAiProvider::from_env() read OPENAI_HOST only after unconditionally calling model.with_fast("gpt-4o-mini", ...). So any user pointing the openai provider at a custom endpoint (like http://localhost:8000) always got gpt-4o-mini baked in as the fast model, which then fails with a 404 at runtime whenever complete_fast() is called (MOIM summarization, context compaction, orchestrator).
Fix
The fix (openai.rs:73-86): read OPENAI_HOST first, then only set the default fast model when the host contains api.openai.com. For any other host, fast_model_config stays None, so use_fast_model() falls back to the main model instead of attempting a non-existent gpt-4o-mini.
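A minimal sketch of the reordered flow, using the host check agreed in the review above; ModelConfig and configure() here are simplified stand-ins, not the actual code at openai.rs:73-86:

struct ModelConfig {
    model: String,
    fast_model: Option<String>,
}

impl ModelConfig {
    fn new(model: &str) -> Self {
        Self { model: model.into(), fast_model: None }
    }

    fn with_fast(mut self, fast: &str) -> Self {
        self.fast_model = Some(fast.into());
        self
    }

    // Falls back to the main model when no fast model is configured,
    // which is what custom endpoints now get.
    fn use_fast_model(&self) -> &str {
        self.fast_model.as_deref().unwrap_or(&self.model)
    }
}

// The real code reads the host from OPENAI_HOST; it is a parameter
// here to keep the sketch self-contained.
fn configure(model: &str, host: &str) -> ModelConfig {
    let config = ModelConfig::new(model);
    // The host is known before any fast-model default is applied,
    // so gpt-4o-mini is only baked in for official OpenAI hosts.
    if host == "api.openai.com" || host.ends_with(".api.openai.com") {
        config.with_fast("gpt-4o-mini")
    } else {
        config
    }
}

fn main() {
    // Custom endpoint: fast_model stays None, so fast calls (MOIM
    // summarization, context compaction) reuse the main model.
    let local = configure("Qwen3.5-35B-A3B-FP8", "localhost");
    assert_eq!(local.use_fast_model(), "Qwen3.5-35B-A3B-FP8");

    // Official endpoint: the gpt-4o-mini default still applies.
    let official = configure("gpt-4o", "api.openai.com");
    assert_eq!(official.use_fast_model(), "gpt-4o-mini");
}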
With OPENAI_HOST: http://localhost:8000, the fast model will now silently use /models/Qwen3.5-35B-A3B-FP8 for MOIM/summarization calls instead of retrying gpt-4o-mini 3× and then falling back. If you later want a dedicated fast model on your local server, you can set it via a custom provider config with fast_model: "your-fast-model".
Testing
Tested with a local vLLM-hosted Qwen3.5-35B and the OpenAI API endpoints. The WARN messages disappeared after the fix.