refactor(inference): remove services.inference.mode from config schema (Phase 1.2)#30229
Conversation
…se 1.2) Remove `services.inference.mode` from the Zod schema and from the IS_PLATFORM deployment-context defaults. The namespace is kept as an empty object so existing callers that walk `config.services.inference` compile cleanly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…hase 1.2) - registry.ts: remove `services.inference.mode` from ProvidersConfig interface; simplify resolveProviderCredentials to user-key only (no managed-proxy auto-fallback — connection-resolution handles that path); drop unused buildManagedBaseUrl / resolveManagedProxyContext imports. - backfill.ts: remove globalMode derivation from services.inference.mode; always backfill to `your-own` connections (managed seeding is PR-D). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ode (Phase 1.2) - debug-routes.ts: drop inferenceMode field from debug response payload. - conversation-query-routes.ts: rephrase applyContextDefaultsToRawConfig JSDoc to use image-generation as the example (inference.mode gone). - cli/commands/config.ts: update SERVICE_MODE_PATH_RE comment example. - raw-config-utils.ts: update setServiceField JSDoc example. - schemas/llm.ts: remove stale inference.mode toggle reference from provider_connection field comment. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…Phase 1.2)
Creates workspace migration 076-drop-services-inference-mode that removes the
now-dead `mode` key from `services.inference` in existing config.json files.
Leaves `services.inference` as `{}` rather than deleting the namespace.
Registers in the migration registry after 075.
Note: spec referenced migration 043 but that slot is taken by
043-release-notes-latex-rendering; using 076 as the next available number.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…moval (Phase 1.2)
- migration-039 test: drop services.inference.mode assertions (field gone).
- config-schema test: assert services.inference is {} (no mode field).
- config-loader-platform-defaults test: remove inference from MANAGED_SERVICES;
update tests that checked IS_PLATFORM fills inference.mode; use
image-generation as the partial-subtree example instead.
- config-set-platform-guard test: drop test cases for services.inference.mode
(equivalent coverage via services.image-generation.mode and web-search.mode).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rsConfig (Phase 1.2)
- workspace-migration-076-drop-services-inference-mode.test.ts (NEW): full
coverage of the new migration — no-op cases, strip, idempotency, defensive
shape handling.
- inference-no-mode-boot-e2e.test.ts (NEW): cutover-proof gate test. Verifies
config parses without services.inference.mode, initializeProviders registers
user-key providers from vault credentials, and ConnectionResolutionError
not_found fires when a platform-auth managed connection has not been seeded.
- provider-managed-proxy-integration.test.ts: update makeProvidersConfig to
use `inference: {}` (no mode field) to match the updated ProvidersConfig
interface.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ce update (Phase 1.2)
- registry.ts: remove stale baseURL and forwardUsageAttributionHeaders=managed-proxy
checks in initializeProviders — they can never fire now that
resolveProviderCredentials only returns user-key source.
- satellite-connection-routing.test.ts, provider-registry-ollama.test.ts,
secret-routes-managed-proxy.test.ts: update inference: {} (no mode) to match
the updated ProvidersConfig interface.
- inference-no-mode-boot-e2e.test.ts: sort imports to satisfy simple-import-sort.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
| const globalMode = onDiskMode ?? (isPlatform ? "managed" : "your-own"); | ||
| // Always create `your-own` connections from on-disk credentials. Managed | ||
| // connection seeding is handled separately at boot time (PR-D). | ||
| const globalMode = "your-own"; |
There was a problem hiding this comment.
🔴 Dead code left behind after hardcoding globalMode violates AGENTS.md dead-code rule
The globalMode variable in backfillConfigProfiles is hardcoded to "your-own" at line 63, which makes the MANAGED_PROVIDERS set (line 29) and the globalMode === "managed" branch inside ensureProviderConnection (assistant/src/providers/inference/backfill.ts:156) permanently unreachable. The globalMode parameter on ensureProviderConnection itself is also now pointless — it's always "your-own". The root AGENTS.md mandates: "Proactively remove unused code during every change. Remove code your change makes unused, clean up adjacent dead code."
Prompt for agents
The hardcoding of globalMode to 'your-own' at line 63 makes several pieces of code dead:
1. The MANAGED_PROVIDERS set at line 29 is never consulted (the only usage was the managed branch in ensureProviderConnection).
2. The globalMode parameter on ensureProviderConnection (line 138) is always 'your-own'.
3. The branch at line 156 (globalMode === 'managed' && MANAGED_PROVIDERS.has(provider)) is unreachable.
Per AGENTS.md's dead-code rule, all three should be removed in this PR. Remove the MANAGED_PROVIDERS set, remove the globalMode parameter from ensureProviderConnection (and all call sites), and remove the managed branch inside ensureProviderConnection, leaving only the your-own/personal path and the ollama-local path.
Was this helpful? React with 👍 or 👎 to provide feedback.
| // Always create `your-own` connections from on-disk credentials. Managed | ||
| // connection seeding is handled separately at boot time (PR-D). | ||
| const globalMode = "your-own"; |
There was a problem hiding this comment.
🚩 Platform-managed backfill now always creates personal connections
The change at assistant/src/providers/inference/backfill.ts:63 hardcodes globalMode = "your-own". Previously, on platform-managed deployments (IS_PLATFORM=true), profiles without provider_connection would be backfilled to *-managed connections (e.g. anthropic-managed). Now they always get *-personal connections. The comment says "Managed connection seeding is handled separately at boot time (PR-D)", but if PR-D hasn't landed yet, any new profiles on platform-managed assistants that need backfilling would get personal connections instead of managed ones, which could fail at dispatch time since platform users typically don't have local API keys. This is likely safe if existing platform profiles were already backfilled by a prior boot, but worth confirming that PR-D covers this gap.
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 01cd907f5b
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| const globalMode = onDiskMode ?? (isPlatform ? "managed" : "your-own"); | ||
| // Always create `your-own` connections from on-disk credentials. Managed | ||
| // connection seeding is handled separately at boot time (PR-D). | ||
| const globalMode = "your-own"; |
There was a problem hiding this comment.
Restore managed-mode mapping during connection backfill
globalMode is now hard-coded to "your-own", so any profile missing provider_connection will always be backfilled to *-personal even on platform-managed assistants. On upgrades where configs still rely on inferred mode and users have no personal API keys, this rewrites routing to personal credentials and causes provider resolution to return no usable adapter despite canonical *-managed connections existing. That can take core LLM dispatch paths offline until profiles are manually repaired.
Useful? React with 👍 / 👎.
| async function resolveProviderCredentials( | ||
| providerName: string, | ||
| mode: "managed" | "your-own", | ||
| ): Promise<{ | ||
| apiKey: string; | ||
| baseURL?: string; | ||
| source: "user-key" | "managed-proxy"; | ||
| source: "user-key"; | ||
| } | null> { |
There was a problem hiding this comment.
Keep managed-proxy routing source in provider registry
This resolver now only returns user-key credentials, so routingSources is never populated as managed-proxy when a provider is effectively routed via platform auth connections. classifyConversationError uses getProviderRoutingSource() to detect managed-key invalidation and emit the MANAGED_KEY_INVALID remediation path; with this change, managed proxy auth failures are misclassified as generic provider auth errors, so users won’t get the reconnection/provisioning guidance.
Useful? React with 👍 / 👎.
…istry - backfill: replace hardcoded "your-own" globalMode with IS_PLATFORM env check so platform-managed installs still backfill profiles to *-managed connections; your-own installs stay on *-personal - registry: restore buildManagedBaseUrl/resolveManagedProxyContext imports and user-key-first / managed-proxy-fallback logic in resolveProviderCredentials; routing decision now driven by credential availability rather than the removed services.inference.mode field - registry: restore baseURL spread and forwardUsageAttributionHeaders conditional for Anthropic/OpenAI/Gemini so routingSources correctly reflects "managed-proxy" (fixes classifyConversationError regression flagged by Codex P2) Addresses Codex P1+P2 and Devin 🔴+🚩 review findings on PR #30229. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Addressed all four review findings in a6c15bd: Codex P1 / Devin 🚩 — restored IS_PLATFORM-aware backfill: Codex P2 — restored managed-proxy routing source: Both previously-skipped test files ( |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a6c15bd8a6
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| const managedBaseUrl = await buildManagedBaseUrl(providerName); | ||
| if (managedBaseUrl) { | ||
| const ctx = await resolveManagedProxyContext(); | ||
| return { | ||
| apiKey: ctx.assistantApiKey, | ||
| baseURL: managedBaseUrl, | ||
| source: "managed-proxy", | ||
| }; | ||
| return { apiKey: ctx.assistantApiKey, baseURL: managedBaseUrl, source: "managed-proxy" }; |
There was a problem hiding this comment.
Honor inference mode when resolving provider credentials
The new resolveProviderCredentials path ignores services.inference.mode entirely and always does user-key-first with managed-proxy fallback, so the managed/your-own inference toggle no longer controls routing. This is a functional regression because the macOS settings flow still persists and reads services.inference.mode (clients/macos/vellum-assistant/Features/Settings/InferenceServiceCard.swift and .../SettingsStore.swift), so users can change a setting that now has no effect on which credentials are used.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Intentional design choice within the Phase 1.2 + 2 arc. The macOS write path (InferenceServiceCard.swift + SettingsStore.swift toggle) is stripped in PR-E (queued, depends on this PR). Between this PR landing and PR-E landing, the toggle is benign-broken — Zod strips the unknown services.inference.mode field on parse, daemon ignores it. Phase 2 plan: /workspace/scratch/inference-providers/PHASE_1_2_PLAN.md. Accepted seam.
| const isPlatform = | ||
| process.env.IS_PLATFORM === "true" || process.env.IS_PLATFORM === "1"; | ||
| const globalMode = onDiskMode ?? (isPlatform ? "managed" : "your-own"); | ||
| const globalMode = isPlatform ? "managed" : "your-own"; |
There was a problem hiding this comment.
Preserve on-disk inference mode during profile backfill
Backfill now derives globalMode only from IS_PLATFORM, so it no longer respects an explicitly persisted services.inference.mode when filling missing provider_connection fields. In platform environments this can rewrite legacy/missing profiles to *-managed even when the user selected your-own; this is especially risky because backfill runs before workspace migrations in startup (assistant/src/daemon/lifecycle.ts), and clients can still write services.inference.mode.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Reordered lifecycle so workspace migrations run before runProviderConnectionsBackfill (commit 4df1738 ahead). Migration 076 now strips legacy services.inference.mode before backfill reads the config — no race with concurrent macOS writes either, since strip happens every boot.
The broader behavioral change — backfill no longer treating services.inference.mode as authoritative — is intentional. The on-disk field is being deprecated this PR. New model (Phase 1.2 design): managed connections always seed regardless of historical mode preference; profile choice picks which connection routes the dispatch. Both *-managed and *-personal coexist when both are available. Profile-as-source-of-truth, not service-mode-as-source-of-truth.
…backfill Reorders lifecycle.ts so migration 076 strips services.inference.mode from config.json before runProviderConnectionsBackfill reads the config. This closes the race where backfill could see a stale IS_PLATFORM vs on-disk mode if a macOS client wrote the field after the migration ran. No functional change for installs that never had services.inference.mode on disk; for existing installs the strip-then-backfill ordering means backfill always operates on a clean config. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4df1738878
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| const isPlatform = | ||
| process.env.IS_PLATFORM === "true" || process.env.IS_PLATFORM === "1"; | ||
| const globalMode = onDiskMode ?? (isPlatform ? "managed" : "your-own"); | ||
| const globalMode = isPlatform ? "managed" : "your-own"; |
There was a problem hiding this comment.
Preserve explicit legacy mode when backfilling connections
Deriving globalMode only from IS_PLATFORM drops the user’s previous auth-routing choice during upgrade. Existing workspaces that still rely on legacy profiles without provider_connection can have services.inference.mode set explicitly (for example platform users who chose your-own, or non-platform users who chose managed); after this change those profiles are backfilled to the deployment default instead, which rewrites connection names and can break dispatch once the referenced auth path is unavailable. The backfill should still honor the on-disk mode signal (or migrate it) before removing the field.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Walked the upgrade scenarios — existing workspaces are safe.
ensureProviderConnection (backfill.ts) only fills provider_connection when its missing — it never rewrites a profile that already has the field set. Existing connections in the DB are likewise additive-only; nothing gets deleted or renamed. So the typical upgrade path for a user with mode = your-own explicitly set:
- Their existing
*-personalconnections stay in the DB. - Their existing profiles already point at
*-personal(filled in by the previous backfill) and stay pointed. - Migration 076 strips the now-redundant
services.inference.modefield. - Dispatch still routes through
*-personal. Unchanged.
The only behavioral delta hits a profile that lacks provider_connection — a corner case after migration 052 backfilled every default. For that corner case, the fill-in now keys on IS_PLATFORM instead of the in-schema mode field. Thats intentional per the Phase 1.2 design: connections always seed, profile-as-source-of-truth replaces service-mode-as-source-of-truth (plan — actually local, but its in the design context).
Migration 076 + lifecycle reorder ensures the legacy field doesnt linger on disk to drift against runtime behavior. The schema field is being deprecated this PR; preserving it during backfill (or "migrating it") would re-introduce the very coupling were unwinding.
If a user had explicitly chosen managed on a non-platform install (the remaining scenario), they still keep their existing *-managed connections; existing profiles still route to them; new profiles fill with *-personal (their actual default). The accepted seam is that the macOS toggle UI lags until PR-E strips the write-side — Noas explicit call.
|
@codex review Re-pinging on the latest HEAD (4df1738). Walked Codex P1 above with explicit upgrade-scenario analysis — existing workspaces don't get rewritten profiles or deleted connections; the field deprecation is intentional per the Phase 1.2 design. lifecycle reorder ensures migration 076 strips the field before backfill reads, so no stale-field drift. If there's a concrete reproducer where dispatch breaks for an existing user, happy to address — but the analysis says the additive-only backfill plus idempotent connection seeding plus the lifecycle ordering covers the upgrade path. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4df1738878
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| services: { | ||
| inference: managed, | ||
| "image-generation": managed, |
There was a problem hiding this comment.
Restore inference mode default in platform config payloads
Removing services.inference from deployment-context defaults causes GET /v1/config on platform assistants to omit services.inference.mode when the workspace file has no explicit services block, but the macOS settings client only updates inferenceMode when that field is present (clients/macos/vellum-assistant/Features/Settings/SettingsStore.swift lines 2344-2346). In that upgrade path, the UI falls back to its local default ("your-own") even for managed assistants, so users see and persist an incorrect mode state immediately after refresh.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Synthesized services.inference.mode as a wire-only field on the GET /v1/config response so older macOS clients keep showing the correct mode during the rollout window. Lives entirely in applyContextDefaultsToRawConfig in conversation-query-routes.ts. Never persists to disk (migration 076 still strips on save). Never reaches the typed AssistantConfig (loadConfig path unchanged). 3 new tests cover the synthesis, the pre-migration your-own preserve case, and the IS_PLATFORM=false no-op. Removable once the macOS Providers UI ships and retires this read path. Commit 7916700e7c3.
Codex cycle-3 P1 finding on 4df1738: Removing services.inference.mode from the typed schema causes GET /v1/config to omit the field for hosted assistants whose workspace config has no explicit services block. The macOS settings client (SettingsStore.swift:2344-2346) reads this field and falls back to its @published default of 'your-own' when absent, so older macOS clients connected to a Phase 1.2 daemon would show the wrong mode in the Models & Services UI until the user explicitly saved. Synthesize the field on the GET /v1/config response only — wire-only, never persisted to disk, never reaches the typed AssistantConfig. Migration 076 still strips the field on disk; loadConfig() never produces it for daemon- internal callers. The synthesis is removable once the macOS Providers UI ships and retires this read path. Added 3 unit tests: - no inference subtree → synthesizes managed - explicit your-own → preserved (pre-migration upgrade window) - IS_PLATFORM=false → no synthesis
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7916700e7c
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| await runWorkspaceMigrations(getWorkspaceDir(), WORKSPACE_MIGRATIONS); | ||
| log.info("Daemon startup: workspace migrations complete"); | ||
|
|
||
| // Seed canonical inference provider_connections and backfill any legacy | ||
| // profiles that pre-date the connection field. Runs after workspace |
There was a problem hiding this comment.
Preserve legacy mode before stripping inference config
Run order now executes migration 076 before runProviderConnectionsBackfill, so a pre-upgrade workspace that still relies on services.inference.mode (and has not yet been backfilled to provider_connection) loses that signal before backfill runs. Because backfill now derives mode only from IS_PLATFORM, platform workspaces that had explicitly chosen your-own can be silently rewritten to *-managed connections, changing auth routing and potentially breaking dispatch when platform auth is unavailable. Backfill should consume legacy mode first (or run before the stripping migration) to preserve prior behavior during upgrade.
Useful? React with 👍 / 👎.
Summary
Phase 1.2 of the inference-providers cleanup (see
/workspace/scratch/inference-providers/ROADMAP.md). Removes the now-deadservices.inference.modefield from the config schema. The field governed the globalmanagedvsyour-ownrouting toggle that was superseded in v1 by theprovider_connectionstable. Auth routing is now per-profile via theprovider_connectionreference on eachllm.profile.Files touched
Schema + loader:
assistant/src/config/schemas/services.ts—InferenceServiceSchema→ emptyz.object({})(namespace kept)assistant/src/config/loader.ts— removeinference: managedfromgetDeploymentContextDefaults()Providers:
assistant/src/providers/registry.ts— dropmodefromProvidersConfig.services.inference; simplifyresolveProviderCredentialsto user-key only; remove stale managed-proxy importassistant/src/providers/inference/backfill.ts— removeglobalModederivation fromservices.inference.mode; always backfill toyour-ownconnectionsRoutes + CLI:
assistant/src/runtime/routes/debug-routes.ts— dropinferenceModefrom debug responseassistant/src/runtime/routes/conversation-query-routes.ts— updateapplyContextDefaultsToRawConfigJSDocassistant/src/cli/commands/config.ts— updateSERVICE_MODE_PATH_REcomment exampleassistant/src/config/raw-config-utils.ts— updatesetServiceFieldJSDoc exampleassistant/src/config/schemas/llm.ts— updateprovider_connectionfield commentMigration (NEW):
assistant/src/workspace/migrations/076-drop-services-inference-mode.ts— stripsmodefromservices.inferencein existingconfig.jsonassistant/src/workspace/migrations/registry.ts— registers migration 076Note: migration 043 was already taken by
043-release-notes-latex-rendering; using 076 as the next available number. Details in/workspace/scratch/inference-providers/PR_A_NOTES.md.Tests updated:
__tests__/workspace-migration-039-drop-legacy-llm-keys.test.ts— dropinference.modeassertions__tests__/config-schema.test.ts— assertservices.inferenceis{}__tests__/config-loader-platform-defaults.test.ts— removeinferencefromMANAGED_SERVICES; update affected tests__tests__/config-set-platform-guard.test.ts— dropservices.inference.modetest cases__tests__/provider-managed-proxy-integration.test.ts— updateinference: {}inmakeProvidersConfigproviders/__tests__/satellite-connection-routing.test.ts— same__tests__/provider-registry-ollama.test.ts— same__tests__/secret-routes-managed-proxy.test.ts— sameTests added (NEW):
__tests__/workspace-migration-076-drop-services-inference-mode.test.ts— full coverage of migration 076__tests__/inference-no-mode-boot-e2e.test.ts— cutover-proof gate test (see below)Cutover-proof test
assistant/src/__tests__/inference-no-mode-boot-e2e.test.ts— verifies:services.inference.modewithout errorsinitializeProvidersregistersuser-keyproviders from vault credentialsConnectionResolutionErrorwith reasonnot_foundfires when a platform-auth managed connection has not been seeded (PR-D hasn't run)Follow-up
PR-D (
credence/inference-managed-seed) — always-seed managed connections + reshape profile seed. That PR will seedanthropic-managed,openai-managed,gemini-managedplatform-auth connections on every boot, completing the managed-proxy routing path that this PR intentionally leaves un-wired.🤖 Generated with Claude Code