9 changes: 7 additions & 2 deletions README.ja.md
@@ -115,7 +115,7 @@ In the age of AI-assisted development, **tokens are the new energy**
- Real-time filtering and sorting
- Zero-flicker rendering (native Zig engine)
- **Multi-platform support** - Track usage across OpenCode, Claude Code, Codex CLI, Cursor IDE, Gemini CLI, Amp, Droid, OpenClaw, and Pi
- **Real-time pricing** - Fetches current pricing from LiteLLM with 1-hour disk cache
- **Real-time pricing** - Fetches current pricing from LiteLLM with 1-hour disk cache; automatic OpenRouter fallback and Cursor pricing support for new models
- **Detailed breakdowns** - Input, output, cache read/write, and reasoning token tracking
- **Native Rust core** - All parsing and aggregation done in Rust for 10x faster processing
- **Web visualization** - Interactive contribution graph with 2D and 3D views
@@ -295,7 +295,8 @@ tokscale pricing "claude-3-5-sonnet" --provider litellm
3. **Tier Suffix Stripping** - Removes quality tiers (`gpt-5.2-xhigh` → `gpt-5.2`)
4. **Version Normalization** - Handles version formats (`claude-3-5-sonnet` ↔ `claude-3.5-sonnet`)
5. **Provider Prefix Matching** - Tries common prefixes (`anthropic/`, `openai/`, etc.)
6. **Fuzzy Matching** - Word-boundary matching for partial model names
6. **Cursor Model Pricing** - Hardcoded pricing for models not yet in LiteLLM/OpenRouter (e.g., `gpt-5.3-codex`)
7. **Fuzzy Matching** - Word-boundary matching for partial model names

**Provider Preference:**

@@ -911,6 +912,10 @@ Session JSONL containing model_change events and assistant messages

Tokscale fetches real-time pricing from [LiteLLM's pricing database](https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json).

**Dynamic Fallback**: Models not yet available in LiteLLM (e.g., recently released models) automatically get their pricing from [OpenRouter's endpoints API](https://openrouter.ai/docs/api/api-reference/endpoints/list-endpoints).

**Cursor Model Pricing**: The newest models not yet in either LiteLLM or OpenRouter (e.g., `gpt-5.3-codex`) use hardcoded pricing sourced from [Cursor's model docs](https://cursor.com/en-US/docs/models). These overrides are checked after all upstream sources and before fuzzy matching, so real upstream pricing automatically takes precedence once it becomes available.

**Caching**: Pricing data is cached to disk with a 1-hour TTL to ensure fast startup:
- LiteLLM cache: `~/.cache/tokscale/pricing-litellm.json`
- OpenRouter cache: `~/.cache/tokscale/pricing-openrouter.json` (incremental; caches only models you've used)
9 changes: 7 additions & 2 deletions README.ko.md
@@ -115,7 +115,7 @@ In the age of AI-assisted development, **tokens are the new energy**.
- Real-time filtering and sorting
- Zero-flicker rendering (native Zig engine)
- **Multi-platform support** - Unified usage tracking across OpenCode, Claude Code, Codex CLI, Cursor IDE, Gemini CLI, Amp, Droid, OpenClaw, and Pi
- **Real-time pricing** - Fetches current prices from LiteLLM (1-hour disk cache) to compute costs
- **Real-time pricing** - Fetches current prices from LiteLLM (1-hour disk cache) to compute costs; automatic OpenRouter fallback and Cursor pricing support for new models
- **Detailed breakdowns** - Tracks input, output, cache read/write, and reasoning tokens
- **Native Rust core** - All parsing and aggregation handled in Rust for up to 10x faster performance
- **Web visualization** - Interactive contribution graph with 2D and 3D views
@@ -294,7 +294,8 @@ tokscale pricing "claude-3-5-sonnet" --provider litellm
3. **Tier Suffix Stripping** - Removes quality tiers (`gpt-5.2-xhigh` → `gpt-5.2`)
4. **Version Normalization** - Handles version formats (`claude-3-5-sonnet` ↔ `claude-3.5-sonnet`)
5. **Provider Prefix Matching** - Tries common prefixes (`anthropic/`, `openai/`, etc.)
6. **Fuzzy Matching** - Word-boundary matching for partial model names
6. **Cursor Model Pricing** - Hardcoded pricing for models not yet in LiteLLM/OpenRouter (e.g., `gpt-5.3-codex`)
7. **Fuzzy Matching** - Word-boundary matching for partial model names

**Provider Preference:**

@@ -910,6 +911,10 @@ Session JSONL containing model_change events and assistant messages

Tokscale fetches real-time pricing from [LiteLLM's pricing database](https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json).

**Dynamic Fallback**: Models not yet in LiteLLM (e.g., recently released models) automatically get their pricing from [OpenRouter's endpoints API](https://openrouter.ai/docs/api/api-reference/endpoints/list-endpoints).

**Cursor Model Pricing**: The newest models not in either LiteLLM or OpenRouter (e.g., `gpt-5.3-codex`) use hardcoded pricing sourced from [Cursor's model docs](https://cursor.com/en-US/docs/models). These overrides are checked after all upstream sources and before fuzzy matching, so they automatically yield once real upstream pricing becomes available.

**Caching**: Pricing data is cached to disk with a 1-hour TTL to ensure fast startup:
- LiteLLM cache: `~/.cache/tokscale/pricing-litellm.json`
- OpenRouter cache: `~/.cache/tokscale/pricing-openrouter.json` (incremental; caches only models you've used)
7 changes: 5 additions & 2 deletions README.md
@@ -116,7 +116,7 @@ In the age of AI-assisted development, **tokens are the new energy**. They power
- Real-time filtering and sorting
- Zero flicker rendering (native Zig engine)
- **Multi-platform support** - Track usage across OpenCode, Claude Code, Codex CLI, Cursor IDE, Gemini CLI, Amp, Droid, OpenClaw, and Pi
- **Real-time pricing** - Fetches current pricing from LiteLLM with 1-hour disk cache; automatic OpenRouter fallback for new models
- **Real-time pricing** - Fetches current pricing from LiteLLM with 1-hour disk cache; automatic OpenRouter fallback and Cursor model pricing for newly released models
- **Detailed breakdowns** - Input, output, cache read/write, and reasoning token tracking
- **Native Rust core** - All parsing and aggregation done in Rust for 10x faster processing
- **Web visualization** - Interactive contribution graph with 2D and 3D views
@@ -311,7 +311,8 @@ The pricing lookup uses a multi-step resolution strategy:
3. **Tier Suffix Stripping** - Removes quality tiers (`gpt-5.2-xhigh` → `gpt-5.2`)
4. **Version Normalization** - Handles version formats (`claude-3-5-sonnet` ↔ `claude-3.5-sonnet`)
5. **Provider Prefix Matching** - Tries common prefixes (`anthropic/`, `openai/`, etc.)
6. **Fuzzy Matching** - Word-boundary matching for partial model names
6. **Cursor Model Pricing** - Hardcoded pricing for models not yet in LiteLLM/OpenRouter (e.g., `gpt-5.3-codex`)
7. **Fuzzy Matching** - Word-boundary matching for partial model names
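Steps 3 and 4 of the cascade above can be sketched in isolation. This is an illustrative reimplementation, not Tokscale's actual code (which lives in `packages/core/src/pricing/lookup.rs`); the tier names beyond `-xhigh` are assumptions, and ASCII model IDs are assumed:

```rust
// Hypothetical sketch of tier-suffix stripping and version normalization.
// Only `-xhigh` is confirmed by the docs; the other tier names are guesses.
fn strip_tier_suffix(id: &str) -> &str {
    for tier in ["-xhigh", "-high", "-medium", "-low"] {
        if let Some(stripped) = id.strip_suffix(tier) {
            return stripped;
        }
    }
    id
}

// "claude-3-5-sonnet" -> "claude-3.5-sonnet": a '-' between two digits
// becomes a '.'; all other dashes are left alone (ASCII IDs assumed).
fn normalize_version_separator(id: &str) -> String {
    let bytes = id.as_bytes();
    let mut out = String::with_capacity(id.len());
    for (i, &b) in bytes.iter().enumerate() {
        if b == b'-'
            && i > 0
            && i + 1 < bytes.len()
            && bytes[i - 1].is_ascii_digit()
            && bytes[i + 1].is_ascii_digit()
        {
            out.push('.');
        } else {
            out.push(b as char);
        }
    }
    out
}

fn main() {
    assert_eq!(strip_tier_suffix("gpt-5.2-xhigh"), "gpt-5.2");
    assert_eq!(normalize_version_separator("claude-3-5-sonnet"), "claude-3.5-sonnet");
    // Dashes not between digits are untouched:
    assert_eq!(normalize_version_separator("gpt-5-codex"), "gpt-5-codex");
    println!("ok");
}
```

Note that `-xhigh` must be tried before `-high`, otherwise `gpt-5.2-xhigh` would be mis-stripped to `gpt-5.2-x`.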

**Provider Preference:**

@@ -948,6 +949,8 @@ Tokscale fetches real-time pricing from [LiteLLM's pricing database](https://git

**Dynamic Fallback**: For models not yet available in LiteLLM (e.g., recently released models), Tokscale automatically fetches pricing from [OpenRouter's endpoints API](https://openrouter.ai/docs/api/api-reference/endpoints/list-endpoints). This ensures you get accurate pricing from the model's author provider (e.g., Z.AI for glm-4.7) without waiting for LiteLLM updates.

**Cursor Model Pricing**: For very recently released models not yet in either LiteLLM or OpenRouter (e.g., `gpt-5.3-codex`), Tokscale includes hardcoded pricing sourced from [Cursor's model docs](https://cursor.com/en-US/docs/models). These overrides are checked after all upstream sources but before fuzzy matching, so they automatically yield once real upstream pricing becomes available.

**Caching**: Pricing data is cached to disk with 1-hour TTL for fast startup:
- LiteLLM cache: `~/.cache/tokscale/pricing-litellm.json`
- OpenRouter cache: `~/.cache/tokscale/pricing-openrouter.json` (incremental, caches only models you've used)
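A 1-hour TTL over an on-disk cache can be checked against the file's mtime. This is a sketch under assumptions: the function name is illustrative and the docs only specify the cache paths and the 1-hour TTL, not the exact freshness mechanism:

```rust
use std::fs;
use std::time::{Duration, SystemTime};

// Hypothetical mtime-based freshness check for a pricing cache file.
fn cache_is_fresh(path: &str, ttl: Duration) -> bool {
    fs::metadata(path)
        .and_then(|meta| meta.modified())
        .ok()
        .and_then(|mtime| SystemTime::now().duration_since(mtime).ok())
        .map(|age| age < ttl)
        // Missing file or an mtime in the future both force a refetch.
        .unwrap_or(false)
}

fn main() {
    let ttl = Duration::from_secs(3600); // 1-hour TTL
    // A path that does not exist is never fresh:
    assert!(!cache_is_fresh("/nonexistent/pricing-litellm.json", ttl));
    // A file written just now is fresh:
    fs::write("/tmp/tokscale-ttl-demo.json", "{}").unwrap();
    assert!(cache_is_fresh("/tmp/tokscale-ttl-demo.json", ttl));
    println!("ok");
}
```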
9 changes: 7 additions & 2 deletions README.zh-cn.md
@@ -115,7 +115,7 @@
- Real-time filtering and sorting
- Zero-flicker rendering (native Zig engine)
- **Multi-platform support** - Track usage across OpenCode, Claude Code, Codex CLI, Cursor IDE, Gemini CLI, Amp, Droid, OpenClaw, and Pi
- **Real-time pricing** - Fetches current prices from LiteLLM with 1-hour disk cache
- **Real-time pricing** - Fetches current prices from LiteLLM with 1-hour disk cache; automatic OpenRouter fallback and Cursor pricing support for new models
- **Detailed breakdowns** - Input, output, cache read/write, and reasoning token tracking
- **Native Rust core** - All parsing and aggregation done in Rust for 10x faster processing
- **Web visualization** - Interactive contribution graph with 2D and 3D views
@@ -295,7 +295,8 @@ tokscale pricing "claude-3-5-sonnet" --provider litellm
3. **Tier Suffix Stripping** - Removes quality tiers (`gpt-5.2-xhigh` → `gpt-5.2`)
4. **Version Normalization** - Handles version formats (`claude-3-5-sonnet` ↔ `claude-3.5-sonnet`)
5. **Provider Prefix Matching** - Tries common prefixes (`anthropic/`, `openai/`, etc.)
6. **Fuzzy Matching** - Word-boundary matching for partial model names
6. **Cursor Model Pricing** - Hardcoded pricing for models not yet in LiteLLM/OpenRouter (e.g., `gpt-5.3-codex`)
7. **Fuzzy Matching** - Word-boundary matching for partial model names

**Provider Preference:**

@@ -911,6 +912,10 @@ Cursor data is fetched from the Cursor API using your session token and cached locally.

Tokscale fetches real-time pricing from [LiteLLM's pricing database](https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json).

**Dynamic Fallback**: For models not yet in LiteLLM (e.g., recently released models), Tokscale automatically fetches pricing from [OpenRouter's endpoints API](https://openrouter.ai/docs/api/api-reference/endpoints/list-endpoints).

**Cursor Model Pricing**: For the newest models not yet in either LiteLLM or OpenRouter (e.g., `gpt-5.3-codex`), Tokscale uses hardcoded pricing sourced from [Cursor's model docs](https://cursor.com/en-US/docs/models). These overrides are checked after all upstream sources and before fuzzy matching, so they automatically yield once real upstream pricing becomes available.

**Caching**: Pricing data is cached to disk with a 1-hour TTL to ensure fast startup:
- LiteLLM cache: `~/.cache/tokscale/pricing-litellm.json`
- OpenRouter cache: `~/.cache/tokscale/pricing-openrouter.json` (incremental; caches only models you've used)
3 changes: 2 additions & 1 deletion packages/cli/src/cli.ts
@@ -1628,7 +1628,8 @@ async function handlePricingCommand(modelId: string, options: { json?: boolean;
},
}, null, 2));
} else {
const sourceLabel = result.source.toLowerCase() === "litellm" ? pc.blue("LiteLLM") : pc.magenta("OpenRouter");
const sourceLower = result.source.toLowerCase();
const sourceLabel = sourceLower === "litellm" ? pc.blue("LiteLLM") : sourceLower === "cursor" ? pc.yellow("Cursor") : pc.magenta("OpenRouter");
const inputCost = result.pricing.input_cost_per_token ?? 0;
const outputCost = result.pricing.output_cost_per_token ?? 0;
const cacheReadCost = result.pricing.cache_read_input_token_cost;
8 changes: 2 additions & 6 deletions packages/core/index.js
@@ -536,17 +536,13 @@ if (!nativeBinding || process.env.NAPI_RS_FORCE_WASI) {
wasiBindingError = err
}
}
if (!nativeBinding || process.env.NAPI_RS_FORCE_WASI) {
if (!nativeBinding) {
try {
wasiBinding = require('@tokscale/core-wasm32-wasi')
nativeBinding = wasiBinding
} catch (err) {
if (process.env.NAPI_RS_FORCE_WASI) {
if (!wasiBindingError) {
wasiBindingError = err
} else {
wasiBindingError.cause = err
}
wasiBindingError.cause = err
loadErrors.push(err)
}
}
87 changes: 72 additions & 15 deletions packages/core/src/pricing/lookup.rs
@@ -69,11 +69,13 @@ struct CachedResult {
pub struct PricingLookup {
litellm: HashMap<String, ModelPricing>,
openrouter: HashMap<String, ModelPricing>,
cursor: HashMap<String, ModelPricing>,
litellm_keys: Vec<String>,
openrouter_keys: Vec<String>,
litellm_lower: HashMap<String, String>,
openrouter_lower: HashMap<String, String>,
openrouter_model_part: HashMap<String, String>,
cursor_lower: HashMap<String, String>,
lookup_cache: RwLock<HashMap<String, Option<CachedResult>>>,
}

@@ -87,6 +89,7 @@ impl PricingLookup {
pub fn new(
litellm: HashMap<String, ModelPricing>,
openrouter: HashMap<String, ModelPricing>,
cursor: HashMap<String, ModelPricing>,
) -> Self {
let mut litellm_keys: Vec<String> = litellm.keys().cloned().collect();
litellm_keys.sort_by(|a, b| b.len().cmp(&a.len()));
@@ -111,14 +114,21 @@
}
}

let mut cursor_lower = HashMap::with_capacity(cursor.len());
for key in cursor.keys() {
cursor_lower.insert(key.to_lowercase(), key.clone());
}

Self {
litellm,
openrouter,
cursor,
litellm_keys,
openrouter_keys,
litellm_lower,
openrouter_lower,
openrouter_model_part,
cursor_lower,
lookup_cache: RwLock::new(HashMap::with_capacity(64)),
}
}
@@ -230,6 +240,15 @@
}
}

if let Some(result) = self.exact_match_cursor(model_id) {
return Some(result);
}
if let Some(version_normalized) = normalize_version_separator(model_id) {
if let Some(result) = self.exact_match_cursor(&version_normalized) {
return Some(result);
}
}

if !is_fuzzy_eligible(model_id) {
return None;
}
@@ -353,6 +372,28 @@
None
}

fn exact_match_cursor(&self, model_id: &str) -> Option<LookupResult> {
if let Some(key) = self.cursor_lower.get(model_id) {
return Some(LookupResult {
pricing: self.cursor.get(key).unwrap().clone(),
source: "Cursor".into(),
matched_key: key.clone(),
});
}
if let Some(model_part) = model_id.split('/').last() {
if model_part != model_id {
if let Some(key) = self.cursor_lower.get(model_part) {
return Some(LookupResult {
pricing: self.cursor.get(key).unwrap().clone(),
source: "Cursor".into(),
matched_key: key.clone(),
});
}
}
}
None
}
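The second branch of `exact_match_cursor` above retries the lookup with only the segment after the last `/`, so provider-prefixed IDs still hit the Cursor table. A standalone sketch of that base-name step (the function name here is illustrative):

```rust
// Hypothetical helper mirroring the `split('/').last()` fallback above:
// "openai/gpt-5.3-codex" -> "gpt-5.3-codex"; un-prefixed IDs pass through.
fn base_name(model_id: &str) -> &str {
    model_id.split('/').last().unwrap_or(model_id)
}

fn main() {
    assert_eq!(base_name("openai/gpt-5.3-codex"), "gpt-5.3-codex");
    assert_eq!(base_name("gpt-5.3-codex"), "gpt-5.3-codex"); // no prefix: unchanged
    println!("ok");
}
```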

fn prefix_match_litellm(&self, model_id: &str) -> Option<LookupResult> {
for prefix in PROVIDER_PREFIXES {
let key = format!("{}{}", prefix, model_id);
@@ -451,19 +492,35 @@
None => return 0.0,
};

let p = &result.pricing;
let safe_price =
|opt: Option<f64>| opt.filter(|v| v.is_finite() && *v >= 0.0).unwrap_or(0.0);

let input_cost = input as f64 * safe_price(p.input_cost_per_token);
let output_cost = (output + reasoning) as f64 * safe_price(p.output_cost_per_token);
let cache_read_cost = cache_read as f64 * safe_price(p.cache_read_input_token_cost);
let cache_write_cost = cache_write as f64 * safe_price(p.cache_creation_input_token_cost);

input_cost + output_cost + cache_read_cost + cache_write_cost
compute_cost(
&result.pricing,
input,
output,
cache_read,
cache_write,
reasoning,
)
}
}

pub fn compute_cost(
pricing: &ModelPricing,
input: i64,
output: i64,
cache_read: i64,
cache_write: i64,
reasoning: i64,
) -> f64 {
let safe_price = |opt: Option<f64>| opt.filter(|v| v.is_finite() && *v >= 0.0).unwrap_or(0.0);

let input_cost = input as f64 * safe_price(pricing.input_cost_per_token);
let output_cost = (output + reasoning) as f64 * safe_price(pricing.output_cost_per_token);
let cache_read_cost = cache_read as f64 * safe_price(pricing.cache_read_input_token_cost);
let cache_write_cost = cache_write as f64 * safe_price(pricing.cache_creation_input_token_cost);

input_cost + output_cost + cache_read_cost + cache_write_cost
}
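Since `compute_cost` is now a free function, it can be exercised on its own. The snippet below re-declares a minimal stand-in for `ModelPricing` (just the four cost fields) so it compiles standalone; the per-token rates are illustrative numbers, not real prices:

```rust
// Minimal stand-in for ModelPricing (field names mirror lookup.rs).
struct ModelPricing {
    input_cost_per_token: Option<f64>,
    output_cost_per_token: Option<f64>,
    cache_read_input_token_cost: Option<f64>,
    cache_creation_input_token_cost: Option<f64>,
}

// Same formula as the compute_cost function above: missing, negative, or
// non-finite rates are treated as zero; reasoning tokens bill at output rate.
fn compute_cost(
    pricing: &ModelPricing,
    input: i64,
    output: i64,
    cache_read: i64,
    cache_write: i64,
    reasoning: i64,
) -> f64 {
    let safe_price = |opt: Option<f64>| opt.filter(|v| v.is_finite() && *v >= 0.0).unwrap_or(0.0);
    input as f64 * safe_price(pricing.input_cost_per_token)
        + (output + reasoning) as f64 * safe_price(pricing.output_cost_per_token)
        + cache_read as f64 * safe_price(pricing.cache_read_input_token_cost)
        + cache_write as f64 * safe_price(pricing.cache_creation_input_token_cost)
}

fn main() {
    let p = ModelPricing {
        input_cost_per_token: Some(3e-6),   // $3 per 1M input tokens (illustrative)
        output_cost_per_token: Some(15e-6), // $15 per 1M output tokens (illustrative)
        cache_read_input_token_cost: Some(3e-7),
        cache_creation_input_token_cost: Some(3.75e-6),
    };
    // 1M input + 100k output: 1_000_000 * 3e-6 + 100_000 * 15e-6 = 3.0 + 1.5
    let cost = compute_cost(&p, 1_000_000, 100_000, 0, 0, 0);
    assert!((cost - 4.5).abs() < 1e-9);
    println!("{:.2}", cost); // 4.50
}
```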

fn extract_model_family(model_id: &str) -> String {
let lower = model_id.to_lowercase();

@@ -1069,7 +1126,7 @@ mod tests {
}

fn create_lookup() -> PricingLookup {
PricingLookup::new(mock_litellm(), mock_openrouter())
PricingLookup::new(mock_litellm(), mock_openrouter(), HashMap::new())
}

// =========================================================================
@@ -1468,7 +1525,7 @@
);
// Note: gpt-5-codex is NOT in the pricing data

let lookup = PricingLookup::new(litellm, HashMap::new());
let lookup = PricingLookup::new(litellm, HashMap::new(), HashMap::new());

// Looking up gpt-5-codex should fall back to gpt-5
let result = lookup.lookup("gpt-5-codex").unwrap();
@@ -1495,7 +1552,7 @@
},
);

let lookup = PricingLookup::new(litellm, HashMap::new());
let lookup = PricingLookup::new(litellm, HashMap::new(), HashMap::new());

// gpt-5-codex-high should strip -high first, then fall back from gpt-5-codex to gpt-5
let result = lookup.lookup("gpt-5-codex-high").unwrap();
@@ -1531,7 +1588,7 @@
},
);

let lookup = PricingLookup::new(litellm, HashMap::new());
let lookup = PricingLookup::new(litellm, HashMap::new(), HashMap::new());

// Should use the exact match, not fall back
let result = lookup.lookup("gpt-5-codex").unwrap();
@@ -1635,7 +1692,7 @@
},
);

let lookup = PricingLookup::new(litellm, HashMap::new());
let lookup = PricingLookup::new(litellm, HashMap::new(), HashMap::new());
let result = lookup.lookup("grok-code").unwrap();

// Must prefer xai (original provider) over azure_ai (reseller)