Refactor LLM callsites to use provider model intents by noanflaherty · Pull Request #7943 · vellum-ai/vellum-assistant

noanflaherty · 2026-02-24T15:50:22Z

Summary

add provider-level model intent identifiers: latency-optimized, quality-optimized, and vision-optimized
introduce centralized provider mappings in providers/model-intents.ts
normalize SendMessageOptions.config in RetryProvider so modelIntent is translated to a concrete provider-specific model (and stripped before provider SDK calls)
migrate key callsites from hardcoded Claude model IDs to intent-based config
add tests for intent mapping and model-intent normalization precedence

Validation

cd assistant && bunx tsc --noEmit
cd assistant && bun test src/__tests__/model-intents.test.ts src/__tests__/clarification-resolver.test.ts src/__tests__/contradiction-checker.test.ts src/__tests__/classifier.test.ts src/__tests__/skills.test.ts src/__tests__/entity-extractor.test.ts src/__tests__/hooks-watch.test.ts src/__tests__/no-direct-anthropic-sdk-imports.test.ts

devin-ai-integration

Devin Review found 1 potential issue.

View 5 additional findings in Devin Review.

devin-ai-integration · 2026-02-24T15:54:27Z

      {
        config: {
-          model: 'claude-sonnet-4-6',
+          modelIntent: 'quality-optimized',


🟡 Summary generation silently upgraded from Sonnet to Opus due to intent taxonomy gap

The generateSummary function in watch-handler.ts previously used model: 'claude-sonnet-4-6' (mid-tier). The migration changes it to modelIntent: 'quality-optimized', which resolves to claude-opus-4-6 for the Anthropic provider (assistant/src/providers/model-intents.ts:17). This is a significant model upgrade — Opus is substantially more expensive and slower than Sonnet.

Root cause: intent taxonomy lacks a mid-tier mapping

The intent taxonomy only offers three tiers:

latency-optimized → Haiku (cheap/fast)

quality-optimized → Opus (expensive/best)

vision-optimized → Sonnet (mid-tier, but semantically tied to vision)

The original Sonnet was a deliberate choice for summary generation — good enough quality at moderate cost. The migration had no semantically appropriate intent for a "balanced/general-purpose" tier, so quality-optimized was chosen, but it maps to the most expensive model.

For comparison, the commentary generation in the same file correctly maps Haiku → latency-optimized → Haiku (1:1 equivalent at watch-handler.ts:129). But summary maps Sonnet → quality-optimized → Opus, which is NOT a 1:1 equivalent.

Impact: Every watch session summary now uses Opus instead of Sonnet, increasing API cost significantly (~5x per summary call) and potentially increasing latency for users.

Prompt for agents

In assistant/src/daemon/watch-handler.ts line 247, the modelIntent 'quality-optimized' maps to claude-opus-4-6 whereas the original code used claude-sonnet-4-6. To preserve the original behavior, either: (1) Add a new 'balanced' or 'general-purpose' intent to the ModelIntent type in assistant/src/providers/types.ts and map it to Sonnet-tier models in assistant/src/providers/model-intents.ts, then use that intent here. Or (2) If upgrading to Opus is intentional, keep this as-is but document the cost increase. Or (3) Use 'vision-optimized' which maps to Sonnet, though the name is misleading for a text summary use case.

Was this helpful? React with 👍 or 👎 to provide feedback.

refactor: route provider models via intent identifiers

272a287

noanflaherty requested a review from siddseethepalli as a code owner February 24, 2026 15:50

noanflaherty merged commit e061d5f into main Feb 24, 2026
1 check failed

noanflaherty deleted the codex/model-intent-identifiers branch February 24, 2026 15:50

devin-ai-integration Bot reviewed Feb 24, 2026

View reviewed changes

awlevin mentioned this pull request Feb 24, 2026

docs: add provider abstraction and approval resilience rules to AGENTS.md #8279

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor LLM callsites to use provider model intents#7943

Refactor LLM callsites to use provider model intents#7943
noanflaherty merged 1 commit into
mainfrom
codex/model-intent-identifiers

noanflaherty commented Feb 24, 2026 •

edited by devin-ai-integration Bot

Loading

Uh oh!

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

devin-ai-integration Bot Feb 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

noanflaherty commented Feb 24, 2026 • edited by devin-ai-integration Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Validation

Uh oh!

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

noanflaherty commented Feb 24, 2026 •

edited by devin-ai-integration Bot

Loading