Refactor LLM callsites to use provider model intents#7943
Conversation
| { | ||
| config: { | ||
| model: 'claude-sonnet-4-6', | ||
| modelIntent: 'quality-optimized', |
There was a problem hiding this comment.
🟡 Summary generation silently upgraded from Sonnet to Opus due to intent taxonomy gap
The generateSummary function in watch-handler.ts previously used model: 'claude-sonnet-4-6' (mid-tier). The migration changes it to modelIntent: 'quality-optimized', which resolves to claude-opus-4-6 for the Anthropic provider (assistant/src/providers/model-intents.ts:17). This is a significant model upgrade — Opus is substantially more expensive and slower than Sonnet.
Root cause: intent taxonomy lacks a mid-tier mapping
The intent taxonomy only offers three tiers:
latency-optimized→ Haiku (cheap/fast)quality-optimized→ Opus (expensive/best)vision-optimized→ Sonnet (mid-tier, but semantically tied to vision)
The original Sonnet was a deliberate choice for summary generation — good enough quality at moderate cost. The migration had no semantically appropriate intent for a "balanced/general-purpose" tier, so quality-optimized was chosen, but it maps to the most expensive model.
For comparison, the commentary generation in the same file correctly maps Haiku → latency-optimized → Haiku (1:1 equivalent at watch-handler.ts:129). But summary maps Sonnet → quality-optimized → Opus, which is NOT a 1:1 equivalent.
Impact: Every watch session summary now uses Opus instead of Sonnet, increasing API cost significantly (~5x per summary call) and potentially increasing latency for users.
Prompt for agents
In assistant/src/daemon/watch-handler.ts line 247, the modelIntent 'quality-optimized' maps to claude-opus-4-6 whereas the original code used claude-sonnet-4-6. To preserve the original behavior, either: (1) Add a new 'balanced' or 'general-purpose' intent to the ModelIntent type in assistant/src/providers/types.ts and map it to Sonnet-tier models in assistant/src/providers/model-intents.ts, then use that intent here. Or (2) If upgrading to Opus is intentional, keep this as-is but document the cost increase. Or (3) Use 'vision-optimized' which maps to Sonnet, though the name is misleading for a text summary use case.
Was this helpful? React with 👍 or 👎 to provide feedback.
Summary
latency-optimized,quality-optimized, andvision-optimizedproviders/model-intents.tsSendMessageOptions.configinRetryProvidersomodelIntentis translated to a concrete provider-specific model (and stripped before provider SDK calls)Validation
cd assistant && bunx tsc --noEmitcd assistant && bun test src/__tests__/model-intents.test.ts src/__tests__/clarification-resolver.test.ts src/__tests__/contradiction-checker.test.ts src/__tests__/classifier.test.ts src/__tests__/skills.test.ts src/__tests__/entity-extractor.test.ts src/__tests__/hooks-watch.test.ts src/__tests__/no-direct-anthropic-sdk-imports.test.ts