fix: use Anthropic image token estimation for OpenRouter anthropic/* models #25960
When OpenRouter is configured with an anthropic/* model, the local token estimator was passing providerName: "openrouter", which skips the dimension-based Anthropic image sizing (~width*height/750) and falls back to the generic base64/4 heuristic. A 1 MB screenshot would estimate as ~325k tokens vs. the ~1,600 Anthropic actually charges, causing shouldCompact() to trigger compaction on every turn.

Add an optional `tokenEstimationProvider` to the Provider interface. OpenRouterProvider returns "anthropic" when its default model starts with `anthropic/`, so the estimator applies the matching upstream rules. window-manager reads this via a private getter across its 7 call sites.
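To make the size of the gap concrete, here is a minimal sketch of the two heuristics named in the description. The function names, the 1,000,000-byte file size, and the 1200x1000 pixel dimensions are assumptions chosen for illustration; only the `base64/4` and `width*height/750` formulas come from the PR text, and the real estimator may round or clamp differently.

```typescript
// Hypothetical helper names; only the two formulas come from the PR description.

/** Generic fallback: roughly one token per 4 base64 characters. */
function estimateImageTokensGeneric(base64Length: number): number {
  return Math.ceil(base64Length / 4);
}

/** Anthropic-style sizing: roughly width * height / 750 tokens. */
function estimateImageTokensAnthropic(width: number, height: number): number {
  return Math.ceil((width * height) / 750);
}

/** Base64 encodes every 3 input bytes as 4 output characters (padded). */
function base64LengthForBytes(byteCount: number): number {
  return Math.ceil(byteCount / 3) * 4;
}

// Assumed inputs: a ~1 MB screenshot file measuring 1200x1000 pixels.
const screenshotBytes = 1000000;
const genericEstimate = estimateImageTokensGeneric(
  base64LengthForBytes(screenshotBytes),
);
const anthropicEstimate = estimateImageTokensAnthropic(1200, 1000);

console.log({ genericEstimate, anthropicEstimate });
```

Under these assumptions the generic heuristic gives about 333k tokens, in the same range as the ~325k quoted above, while the dimension-based rule gives 1,600. The generic estimate scales with file size rather than pixel count, which is why a single screenshot could push the context estimate past the compaction threshold.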
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a261af771c
   * the generic `base64/4` fallback.
   */
  private get estimationProviderName(): string {
    return this.provider.tokenEstimationProvider ?? this.provider.name;
Forward tokenEstimationProvider through provider wrappers
ContextWindowManager now relies on provider.tokenEstimationProvider, but runtime providers are wrapped (initializeProviders wraps OpenRouter in RetryProvider, and DaemonServer may add RateLimitProvider), and those wrapper classes only expose name/sendMessage. In that common path this expression falls back to provider.name (openrouter), so Anthropic image/PDF estimation is never selected and compaction behavior remains unchanged for OpenRouter anthropic/* conversations despite this fix.
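The forwarding problem described in this comment can be sketched as follows. This is an illustrative reduction, not the code shipped in the follow-up: the `Provider` shape is simplified, the retry logic is a stub, the free function `estimationProviderName` stands in for the private getter, and the model IDs are placeholders.

```typescript
// Hypothetical, simplified shapes; the real Provider interface has more members.
interface Provider {
  readonly name: string;
  readonly tokenEstimationProvider?: string | undefined;
  sendMessage(prompt: string): Promise<string>;
}

class OpenRouterProvider implements Provider {
  readonly name = "openrouter";
  private readonly defaultModel: string;

  constructor(defaultModel: string) {
    this.defaultModel = defaultModel;
  }

  // Report "anthropic" only when the default model is routed to Anthropic.
  get tokenEstimationProvider(): string | undefined {
    return this.defaultModel.startsWith("anthropic/") ? "anthropic" : undefined;
  }

  async sendMessage(prompt: string): Promise<string> {
    return `[${this.defaultModel}] ${prompt}`; // stub response for the sketch
  }
}

class RetryProvider implements Provider {
  private readonly inner: Provider;
  private readonly maxAttempts: number;

  constructor(inner: Provider, maxAttempts = 3) {
    this.inner = inner;
    this.maxAttempts = maxAttempts;
  }

  get name(): string {
    return this.inner.name;
  }

  // Without this getter the wrapper hides the hint from the estimator.
  get tokenEstimationProvider(): string | undefined {
    return this.inner.tokenEstimationProvider;
  }

  async sendMessage(prompt: string): Promise<string> {
    let lastError: unknown;
    for (let attempt = 0; attempt < this.maxAttempts; attempt++) {
      try {
        return await this.inner.sendMessage(prompt);
      } catch (error) {
        lastError = error;
      }
    }
    throw lastError;
  }
}

// Stand-in for the private getter in the window manager.
function estimationProviderName(provider: Provider): string {
  return provider.tokenEstimationProvider ?? provider.name;
}

// Placeholder model IDs, used only to exercise the prefix check.
const wrappedAnthropic = new RetryProvider(new OpenRouterProvider("anthropic/example-model"));
const wrappedOther = new RetryProvider(new OpenRouterProvider("other/example-model"));
console.log(estimationProviderName(wrappedAnthropic), estimationProviderName(wrappedOther));
```

With the forwarding getter in place, the wrapped Anthropic-routed provider resolves to `"anthropic"` and the other resolves to `"openrouter"`. Removing the getter from `RetryProvider` makes both resolve to `"openrouter"`, which is the regression the review describes. Each wrapper in the chain needs the same pass-through.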
Follow-up shipped in #26265, which addresses consolidated review feedback (chat concurrency mutex, dispose teardown, wake adapter timeout/ghost prevention, tokenEstimationProvider forwarding, E2E test flakiness).
Summary
- With `anthropic/*` models via OpenRouter, the local token estimator was using the generic `base64/4` image heuristic instead of Anthropic's dimension-based rules. A typical screenshot estimated at ~325k tokens vs. ~1,600 actual, so `shouldCompact()` tripped every turn.
- Adds an optional `tokenEstimationProvider` to the `Provider` interface. `OpenRouterProvider` now returns `"anthropic"` for `anthropic/*` default models; `window-manager` consumes it via a private getter across its 7 call sites.

Original prompt
OpenRouter misestimates the number of tokens images take when using Anthropic models, causing compaction to trigger constantly. Can we use the same token estimation logic we use for normal calls to Anthropic?