
feat: include cache creation/read tokens for AWS Bedrock explicit caching #1721

Merged
mathetake merged 7 commits into envoyproxy:main from yuzisun:cache_point
Jan 5, 2026


@yuzisun yuzisun commented Jan 5, 2026

Description
Include cache creation and cache read (hit) tokens in the total input token count, while keeping separate fields for cache miss/hit accounting. This unifies the usage response returned to the user for both implicit and explicit caching, because the input token counts reported for GPT and Gemini models already include the cached tokens.
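
The accounting described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the `bedrockUsage` and `unifiedUsage` types and the `toUnifiedUsage` function are hypothetical names, and the sketch assumes that Bedrock reports uncached input tokens separately from cache read and cache write tokens, as the description implies.

```go
package main

import "fmt"

// bedrockUsage is a hypothetical stand-in for the token counters that a
// Bedrock response carries. It is NOT the gateway's actual struct.
type bedrockUsage struct {
	InputTokens           int // assumed to exclude cached tokens
	OutputTokens          int
	CacheReadInputTokens  int // tokens served from the cache (hit)
	CacheWriteInputTokens int // tokens written to the cache (creation)
}

// unifiedUsage is a hypothetical OpenAI-style usage payload in which the
// prompt total already includes cached tokens.
type unifiedUsage struct {
	PromptTokens        int
	CompletionTokens    int
	TotalTokens         int
	CachedTokens        int // kept separately for cache-hit accounting
	CacheCreationTokens int // kept separately for cache-creation accounting
}

// toUnifiedUsage folds cache read and cache creation tokens into the prompt
// total while preserving both counters as separate fields.
func toUnifiedUsage(u bedrockUsage) unifiedUsage {
	prompt := u.InputTokens + u.CacheReadInputTokens + u.CacheWriteInputTokens
	return unifiedUsage{
		PromptTokens:        prompt,
		CompletionTokens:    u.OutputTokens,
		TotalTokens:         prompt + u.OutputTokens,
		CachedTokens:        u.CacheReadInputTokens,
		CacheCreationTokens: u.CacheWriteInputTokens,
	}
}

func main() {
	u := toUnifiedUsage(bedrockUsage{
		InputTokens:           20,
		OutputTokens:          50,
		CacheReadInputTokens:  1000,
		CacheWriteInputTokens: 200,
	})
	fmt.Printf("%+v\n", u)
}
```

With this shape, a client that reads only the prompt total sees a number comparable to what GPT or Gemini would report, while the two cache counters remain available for cost attribution.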

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
@yuzisun yuzisun requested a review from a team as a code owner January 5, 2026 02:49
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Jan 5, 2026

codecov-commenter commented Jan 5, 2026

Codecov Report

❌ Patch coverage is 98.11321% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 81.10%. Comparing base (488e668) to head (89ac98c).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/translator/openai_awsbedrock.go | 95.45% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1721      +/-   ##
==========================================
+ Coverage   81.05%   81.10%   +0.04%     
==========================================
  Files         147      147              
  Lines       13319    13327       +8     
==========================================
+ Hits        10796    10809      +13     
+ Misses       1873     1869       -4     
+ Partials      650      649       -1     


Signed-off-by: Dan Sun <dsun20@bloomberg.net>
@mathetake mathetake enabled auto-merge (squash) January 5, 2026 19:12
@mathetake mathetake merged commit bcf4cdf into envoyproxy:main Jan 5, 2026
32 checks passed
mathetake pushed a commit that referenced this pull request Jan 7, 2026
…hing (#1721)

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
hustxiayang pushed a commit to hustxiayang/ai-gateway that referenced this pull request Jan 7, 2026
…hing (envoyproxy#1721)

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: yxia216 <yxia216@bloomberg.net>
hustxiayang pushed a commit to hustxiayang/ai-gateway that referenced this pull request Jan 13, 2026
…hing (envoyproxy#1721)

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: yxia216 <yxia216@bloomberg.net>
hustxiayang pushed a commit to hustxiayang/ai-gateway that referenced this pull request Jan 29, 2026
…hing (envoyproxy#1721)

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
hustxiayang pushed a commit to hustxiayang/ai-gateway that referenced this pull request Feb 2, 2026
…hing (envoyproxy#1721)

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
hustxiayang pushed a commit to hustxiayang/ai-gateway that referenced this pull request Feb 5, 2026
…hing (envoyproxy#1721)

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
