Add NVIDIA provider, and improve declarative provider UX #8798

Merged
jh-block merged 1 commit into main from jhugo/nvidia-provider on Apr 24, 2026
Conversation

@jh-block (Collaborator) commented Apr 24, 2026

Fixes #8505

Summary

  • Add NVIDIA NIM as a declarative provider with NVIDIA-specific docs and setup steps.
  • Allow declarative providers to override model docs and setup metadata, and stop inheriting OpenAI config fields in the settings UI.
  • Fix canonical model limit handling so models with full-context output limits do not request the entire context as output.
  • Refresh provider state immediately after config changes so NVIDIA appears in the model picker without reopening the modal.
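The model-limit fix in the third bullet can be sketched as follows. This is an illustrative stand-in, not the crate's actual types (`CanonicalLimit` and `resolve_max_tokens` are assumed names): some registry entries use an output limit equal to the full context window as a sentinel, and the change keeps an explicit output limit only when it is strictly below the context.

```rust
/// Illustrative stand-in for a registry entry's token limits; the real
/// type lives in the goose crate and may differ (assumption).
struct CanonicalLimit {
    context: u64,
    output: Option<u64>,
}

/// Mirrors the filter added in crates/goose/src/model.rs: keep an explicit
/// output limit only when it is strictly below the context window; otherwise
/// drop it so the provider default applies instead of the full context.
fn resolve_max_tokens(limit: &CanonicalLimit) -> Option<i32> {
    limit
        .output
        .filter(|&output| output < limit.context)
        .map(|output| output as i32)
}

fn main() {
    // Normal case: an explicit 8k output cap on a 128k-context model survives.
    let normal = CanonicalLimit { context: 128_000, output: Some(8_192) };
    assert_eq!(resolve_max_tokens(&normal), Some(8_192));

    // Sentinel case: output == context would have requested the entire
    // context as output (triggering false compaction), so it is dropped.
    let sentinel = CanonicalLimit { context: 128_000, output: Some(128_000) };
    assert_eq!(resolve_max_tokens(&sentinel), None);
}
```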

Testing

  • Added unit tests for NVIDIA declarative provider deserialization and registry wiring.
  • Added coverage for backward-compatible declarative provider config deserialization.
  • Added a regression test for the model limit behavior that triggered false compaction.
  • Ran cargo fmt --check, targeted cargo test -p goose ... provider/model tests, and desktop TypeScript typecheck.

Signed-off-by: jh-block <jhugo@block.xyz>
@jh-block force-pushed the jhugo/nvidia-provider branch from f04b989 to b39ff8c on April 24, 2026 08:49
@jh-block changed the title from "Fix NVIDIA provider configuration and model defaults" to "Add NVIDIA provider, and improve declarative provider UX" on Apr 24, 2026

@chatgpt-codex-connector (Bot) left a comment
💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f04b989c88


Comment thread: crates/goose/src/model.rs, lines +161 to +165

```rust
self.max_tokens = canonical
    .limit
    .output
    .filter(|&output| output < canonical.limit.context)
    .map(|output| output as i32);
```
P2: Preserve canonical output limits above context

This filter now drops every canonical output limit that is not strictly less than context, so models with output > context in the bundled registry lose their explicit limit and fall back to max_output_tokens() = 4096. In practice that silently shrinks allowed output for affected models (for example entries in canonical_models.json with larger output caps), even though this change was intended to handle the output == context sentinel case. Restricting the skip to equality (or clamping) avoids this unintended 4k cap.
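The reviewer's suggested narrowing could be sketched like this: restrict the skip to the exact `output == context` sentinel and clamp anything larger to the context window. Names are illustrative and this is not the merged code; it only demonstrates the behavior the comment asks for.

```rust
/// Illustrative limits type, mirroring the shape used in the diff above
/// (assumption; the real registry type may differ).
struct CanonicalLimit {
    context: u64,
    output: Option<u64>,
}

/// Skip only the exact output == context sentinel; clamp an output limit
/// larger than the context instead of discarding it, so such registry
/// entries keep an explicit cap rather than falling back to the 4096 default.
fn resolve_max_tokens_clamped(limit: &CanonicalLimit) -> Option<i32> {
    limit
        .output
        .filter(|&output| output != limit.context)
        .map(|output| output.min(limit.context) as i32)
}

fn main() {
    // output > context: clamped to the context window, not dropped to 4096.
    let oversized = CanonicalLimit { context: 8_192, output: Some(16_384) };
    assert_eq!(resolve_max_tokens_clamped(&oversized), Some(8_192));

    // The sentinel is still skipped, as the PR intended.
    let sentinel = CanonicalLimit { context: 8_192, output: Some(8_192) };
    assert_eq!(resolve_max_tokens_clamped(&sentinel), None);
}
```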


@jh-block (Collaborator, Author) replied:
Will fix the canonical data and then we can look at removing this workaround anyway.

@jh-block added this pull request to the merge queue Apr 24, 2026
Merged via the queue into main with commit 4065d44, Apr 24, 2026
21 of 22 checks passed
@jh-block deleted the jhugo/nvidia-provider branch April 24, 2026 14:02
lifeizhou-ap added a commit that referenced this pull request Apr 27, 2026
* main: (29 commits)
  chore(deps): bump winreg from 0.55.0 to 0.56.0 (#8829)
  Fix grammar issue (#8669)
  colorize context window indicator (#8851)
  Refresh canonical model metadata from models.dev (#8838)
  fix(ci): prevent flaky smoke test timeouts from failing the build (#8837)
  updates: release 0.19.0 of the tui/sdk/etc (#8806)
  add a goose2 signed release flow (#8728)
  Port provider tests to typescript (#8237)
  refactor: make ACP server smaller (#8787)
  Add NVIDIA provider, and improve declarative provider UX (#8798)
  fix: removed failed provider test for deprecated providers (#8801)
  fix: only call cleanup when the pr is from same repo (#8799)
  chore: check stale for draft pr (#8803)
  fix: use _meta instead of meta in newSession request (#8796)
  fix: add missing underscore prefix in updateWorkingDir method name (#8743)
  feat: migrate session metadata storage from frontend overlay to backend (#8769)
  Add more info to BUILDING_LINUX (#8789)
  feat(acp): Align to new request patterns of ACP Streamable HTTP/WS transport (#8605)
  Dedupe and organize skills/sources (#8731)
  docs: add skills slash command (#8783)
  ...


Development

Successfully merging this pull request may close these issues.

Add support for NVIDIA API key to use NVIDIA models

2 participants