Skip to content

Refactor LLM callsites to use provider model intents#7943

Merged
noanflaherty merged 1 commit into
mainfrom
codex/model-intent-identifiers
Feb 24, 2026
Merged

Refactor LLM callsites to use provider model intents#7943
noanflaherty merged 1 commit into
mainfrom
codex/model-intent-identifiers

Conversation

@noanflaherty
Copy link
Copy Markdown
Contributor

@noanflaherty noanflaherty commented Feb 24, 2026

Summary

  • add provider-level model intent identifiers: latency-optimized, quality-optimized, and vision-optimized
  • introduce centralized provider mappings in providers/model-intents.ts
  • normalize SendMessageOptions.config in RetryProvider so modelIntent is translated to a concrete provider-specific model (and stripped before provider SDK calls)
  • migrate key callsites from hardcoded Claude model IDs to intent-based config
  • add tests for intent mapping and model-intent normalization precedence

Validation

  • cd assistant && bunx tsc --noEmit
  • cd assistant && bun test src/__tests__/model-intents.test.ts src/__tests__/clarification-resolver.test.ts src/__tests__/contradiction-checker.test.ts src/__tests__/classifier.test.ts src/__tests__/skills.test.ts src/__tests__/entity-extractor.test.ts src/__tests__/hooks-watch.test.ts src/__tests__/no-direct-anthropic-sdk-imports.test.ts

Open with Devin

@noanflaherty noanflaherty merged commit e061d5f into main Feb 24, 2026
1 check failed
@noanflaherty noanflaherty deleted the codex/model-intent-identifiers branch February 24, 2026 15:50
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 potential issue.

View 5 additional findings in Devin Review.

Open in Devin Review

{
config: {
model: 'claude-sonnet-4-6',
modelIntent: 'quality-optimized',
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Summary generation silently upgraded from Sonnet to Opus due to intent taxonomy gap

The generateSummary function in watch-handler.ts previously used model: 'claude-sonnet-4-6' (mid-tier). The migration changes it to modelIntent: 'quality-optimized', which resolves to claude-opus-4-6 for the Anthropic provider (assistant/src/providers/model-intents.ts:17). This is a significant model upgrade — Opus is substantially more expensive and slower than Sonnet.

Root cause: intent taxonomy lacks a mid-tier mapping

The intent taxonomy only offers three tiers:

  • latency-optimized → Haiku (cheap/fast)
  • quality-optimized → Opus (expensive/best)
  • vision-optimized → Sonnet (mid-tier, but semantically tied to vision)

The original Sonnet was a deliberate choice for summary generation — good enough quality at moderate cost. The migration had no semantically appropriate intent for a "balanced/general-purpose" tier, so quality-optimized was chosen, but it maps to the most expensive model.

For comparison, the commentary generation in the same file correctly maps Haiku → latency-optimized → Haiku (1:1 equivalent at watch-handler.ts:129). But summary maps Sonnet → quality-optimized → Opus, which is NOT a 1:1 equivalent.

Impact: Every watch session summary now uses Opus instead of Sonnet, increasing API cost significantly (~5x per summary call) and potentially increasing latency for users.

Prompt for agents
In assistant/src/daemon/watch-handler.ts line 247, the modelIntent 'quality-optimized' maps to claude-opus-4-6 whereas the original code used claude-sonnet-4-6. To preserve the original behavior, either: (1) Add a new 'balanced' or 'general-purpose' intent to the ModelIntent type in assistant/src/providers/types.ts and map it to Sonnet-tier models in assistant/src/providers/model-intents.ts, then use that intent here. Or (2) If upgrading to Opus is intentional, keep this as-is but document the cost increase. Or (3) Use 'vision-optimized' which maps to Sonnet, though the name is misleading for a text summary use case.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant