Conversation
Signed-off-by: Aaron Choo <achoo30@bloomberg.net>
internal/apischema/openai/openai.go
Outdated
// Cached tokens present in the prompt.
CachedTokens int `json:"cached_tokens,omitzero"`
// Tokens written to the cache.
CachedWriteTokens int `json:"cached_write_tokens,omitzero"`
LiteLLM names it cache_creation_input_tokens:
https://docs.litellm.ai/docs/completion/prompt_caching
AWS calls it cacheWriteInputTokens and Anthropic calls it cache_creation_input_tokens. We can opt to name it "creation" instead of "write".
I think either cache_creation_input_tokens or cache_write_input_tokens is fine.
Updated to use "cache creation" everywhere.
Codecov Report
@@            Coverage Diff             @@
##             main    #1719      +/-   ##
==========================================
+ Coverage   81.01%   81.04%   +0.03%
==========================================
  Files         147      147
  Lines       13288    13341      +53
==========================================
+ Hits        10765    10812      +47
- Misses       1872     1876       +4
- Partials      651      653       +2
  translator := NewAnthropicToAnthropicTranslator("", "")
  require.NotNil(t, translator)
- const responseBody = `{"model":"claude-sonnet-4-5-20250929","id":"msg_01J5gW6Sffiem6avXSAooZZw","type":"message","role":"assistant","content":[{"type":"text","text":"Hi! 👋 How can I help you today?"}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":9,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":16,"service_tier":"standard"}}`
+ const responseBody = `{"model":"claude-sonnet-4-5-20250929","id":"msg_01J5gW6Sffiem6avXSAooZZw","type":"message","role":"assistant","content":[{"type":"text","text":"Hi! 👋 How can I help you today?"}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":9,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cached_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":16,"service_tier":"standard"}}`
It should be cache_creation, per the Anthropic docs:
https://platform.claude.com/docs/en/build-with-claude/prompt-caching#1-hour-cache-duration
internal/apischema/openai/openai.go
Outdated
// Cached tokens present in the prompt.
CachedTokens int `json:"cached_tokens,omitzero"`
// Tokens written to the cache.
CachedCreationTokens int `json:"cached_creation_input_tokens,omitzero"`
I think you are trying to align with the name cached_tokens, but cache_creation_input_tokens reads better.
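For reference, a minimal sketch of what the suggested naming could look like. The struct name PromptTokensDetails, the Go field name CacheCreationInputTokens, and the helper encodeDetails are assumptions for illustration, not the code in this PR. The sketch uses omitempty rather than omitzero so it also builds on older Go toolchains; for int fields both options drop a zero value.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PromptTokensDetails is a hypothetical stand-in for the struct under review.
type PromptTokensDetails struct {
	// Cached tokens present in the prompt (cache reads).
	CachedTokens int `json:"cached_tokens,omitempty"`
	// Tokens written to the cache, using the name suggested in the review.
	CacheCreationInputTokens int `json:"cache_creation_input_tokens,omitempty"`
}

// encodeDetails marshals the details to a JSON string.
func encodeDetails(d PromptTokensDetails) (string, error) {
	b, err := json.Marshal(d)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := encodeDetails(PromptTokensDetails{CachedTokens: 9, CacheCreationInputTokens: 128})
	if err != nil {
		panic(err)
	}
	// Zero-valued fields are omitted, so providers without cache writes emit nothing extra.
	fmt.Println(s)
}
```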
Description
Anthropic prices cache writes differently from cache reads, so the cost calculation needs to account for writes and reads separately. This PR adds a new cost type for cache writes and applies the same change for AWS.
https://platform.claude.com/docs/en/build-with-claude/prompt-caching
https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_TokenUsage.html
Vertex AI and OpenAI do not report cache writes in their responses, so cache writes are set to 0 for them.
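A rough sketch of the per-provider mapping described above, under stated assumptions: CacheUsage, fromAnthropic, fromBedrock, and fromReadOnly are hypothetical names, not the gateway's actual types. Only the JSON field names come from the linked Anthropic and AWS docs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CacheUsage is a hypothetical provider-neutral view of cache token counts.
type CacheUsage struct {
	ReadTokens  int
	WriteTokens int
}

// anthropicUsage holds the cache fields of an Anthropic usage object.
type anthropicUsage struct {
	CacheReadInputTokens     int `json:"cache_read_input_tokens"`
	CacheCreationInputTokens int `json:"cache_creation_input_tokens"`
}

// bedrockUsage holds the cache fields of an AWS Bedrock TokenUsage object.
type bedrockUsage struct {
	CacheReadInputTokens  int `json:"cacheReadInputTokens"`
	CacheWriteInputTokens int `json:"cacheWriteInputTokens"`
}

// fromAnthropic maps Anthropic's "creation" naming onto cache writes.
func fromAnthropic(raw []byte) (CacheUsage, error) {
	var u anthropicUsage
	if err := json.Unmarshal(raw, &u); err != nil {
		return CacheUsage{}, err
	}
	return CacheUsage{ReadTokens: u.CacheReadInputTokens, WriteTokens: u.CacheCreationInputTokens}, nil
}

// fromBedrock maps AWS's "write" naming onto the same field.
func fromBedrock(raw []byte) (CacheUsage, error) {
	var u bedrockUsage
	if err := json.Unmarshal(raw, &u); err != nil {
		return CacheUsage{}, err
	}
	return CacheUsage{ReadTokens: u.CacheReadInputTokens, WriteTokens: u.CacheWriteInputTokens}, nil
}

// fromReadOnly covers providers that report only cache reads; writes stay 0.
func fromReadOnly(cachedTokens int) CacheUsage {
	return CacheUsage{ReadTokens: cachedTokens, WriteTokens: 0}
}

func main() {
	a, err := fromAnthropic([]byte(`{"input_tokens":9,"cache_creation_input_tokens":50,"cache_read_input_tokens":100}`))
	if err != nil {
		panic(err)
	}
	b, err := fromBedrock([]byte(`{"inputTokens":9,"cacheReadInputTokens":100,"cacheWriteInputTokens":50}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(a, b, fromReadOnly(100))
}
```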
Changes
Dynamic metadata now includes cache writes.
Cache reads and cache writes are tracked separately.
The usage returned to the user now includes cache writes.
Updated tests to match (hopefully I caught them all), along with every place I saw CachedInputTokens.
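To illustrate why separating reads from writes matters for cost, here is a minimal sketch. Usage, Rates, and Cost are hypothetical names, the rates are made-up integer price units per token, and InputTokens is assumed to count uncached input only. None of this is the gateway's actual cost API.

```go
package main

import "fmt"

// Usage is a hypothetical token breakdown for one request.
// InputTokens is assumed to exclude cached tokens.
type Usage struct {
	InputTokens      int64
	CacheReadTokens  int64
	CacheWriteTokens int64
	OutputTokens     int64
}

// Rates holds made-up per-token prices in arbitrary integer units.
type Rates struct {
	Input      int64
	CacheRead  int64
	CacheWrite int64
	Output     int64
}

// Cost charges each token class at its own rate, so cache writes
// are no longer billed as if they were cache reads.
func Cost(u Usage, r Rates) int64 {
	return u.InputTokens*r.Input +
		u.CacheReadTokens*r.CacheRead +
		u.CacheWriteTokens*r.CacheWrite +
		u.OutputTokens*r.Output
}

func main() {
	u := Usage{InputTokens: 9, CacheReadTokens: 100, CacheWriteTokens: 50, OutputTokens: 16}
	r := Rates{Input: 10, CacheRead: 1, CacheWrite: 12, Output: 50}
	// 9*10 + 100*1 + 50*12 + 16*50 = 1590
	fmt.Println(Cost(u, r))
}
```

With a single cached-token rate, the 50 written tokens in this example would be priced at the read rate (50 units) instead of the write rate (600 units), which is the gap the new cost type is meant to close.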