feat: add InvokeModel API support for claude models in aws bedrock#1648
Merged
yuzisun merged 16 commits into envoyproxy:main (Feb 7, 2026)
Conversation
Codecov Report: additional details and impacted files:
@@ Coverage Diff @@
## main #1648 +/- ##
========================================
Coverage 83.48% 83.48%
========================================
Files 123 124 +1
Lines 16335 16438 +103
========================================
+ Hits 13637 13723 +86
+ Misses 1818 1814 -4
- Partials 880 901 +21
Contributor
@hustxiayang can you help resolve the conflicts?
Force-pushed from 467446a to 7d4ec3f
yuzisun reviewed Feb 1, 2026
Force-pushed from cd4cf90 to e5b0450
…nvoyproxy#1607) **Description** This PR fixes the following issues:
1. Reasoning content in the request was also missing for GCP Anthropic.
2. The reasoning output of GCP Anthropic was not parsed out.
With these fixes, reasoning Claude models share a unified interface.
Other fixes: the assistant message handling for GCP Anthropic did not cover the array case. Fixed.
---------
Signed-off-by: yxia216 <yxia216@bloomberg.net>
**Description**
This replaces encoding/json with bytedance/sonic for faster json
operations. This makes the data plane benchmark results drastically
better, especially for the translation code path. One thing to note is
that we could have used goccy/go-json instead, but it comes with some
incompatibilities (no omitzero tag support, etc.). On the other hand,
bytedance/sonic is 99% compatible with the current behavior except for
the struct field order in the output, which semantically doesn't matter
in practice.
```
goos: linux
goarch: amd64
pkg: github.com/envoyproxy/ai-gateway/tests/data-plane
cpu: AMD Ryzen 9 7940HS w/ Radeon 780M Graphics
│ old.txt │ new.txt │
│ sec/op │ sec/op vs base │
ChatCompletions/OpenAI_-_small-16 135.0µ ± 1% 107.7µ ± 2% -20.24% (p=0.000 n=10)
ChatCompletions/OpenAI_-_medium-16 2.546m ± 1% 1.456m ± 1% -42.83% (p=0.000 n=10)
ChatCompletions/OpenAI_-_large-16 28.50m ± 7% 17.41m ± 4% -38.91% (p=0.000 n=10)
ChatCompletions/OpenAI_-_xlarge-16 141.24m ± 3% 71.07m ± 7% -49.68% (p=0.000 n=10)
ChatCompletions/AWS_Bedrock_-_small-16 155.5µ ± 1% 120.4µ ± 1% -22.57% (p=0.000 n=10)
ChatCompletions/AWS_Bedrock_-_medium-16 3.475m ± 1% 2.004m ± 1% -42.33% (p=0.000 n=10)
ChatCompletions/AWS_Bedrock_-_large-16 37.01m ± 2% 24.08m ± 3% -34.93% (p=0.000 n=10)
ChatCompletions/AWS_Bedrock_-_xlarge-16 346.1m ± 3% 100.2m ± 5% -71.04% (p=0.000 n=10)
ChatCompletions/GCP_VertexAI_-_small-16 172.3µ ± 1% 141.9µ ± 2% -17.63% (p=0.000 n=10)
ChatCompletions/GCP_VertexAI_-_medium-16 4.371m ± 1% 2.827m ± 1% -35.31% (p=0.000 n=10)
ChatCompletions/GCP_VertexAI_-_large-16 51.21m ± 3% 32.67m ± 3% -36.19% (p=0.000 n=10)
ChatCompletions/GCP_VertexAI_-_xlarge-16 344.9m ± 2% 102.3m ± 2% -70.33% (p=0.000 n=10)
ChatCompletions/GCP_AnthropicAI_-_small-16 232.4µ ± 1% 198.4µ ± 1% -14.64% (p=0.000 n=10)
ChatCompletions/GCP_AnthropicAI_-_medium-16 9.098m ± 1% 7.513m ± 2% -17.43% (p=0.000 n=10)
ChatCompletions/GCP_AnthropicAI_-_large-16 102.14m ± 4% 83.27m ± 3% -18.48% (p=0.000 n=10)
ChatCompletions/GCP_AnthropicAI_-_xlarge-16 2.014 ± 3% 1.450 ± 2% -27.99% (p=0.000 n=10)
ChatCompletionsStreaming/OpenAI_Streaming-16 13.11m ± 0% 13.12m ± 1% ~ (p=0.190 n=10)
ChatCompletionsStreaming/AWS_Streaming-16 13.11m ± 0% 13.10m ± 0% ~ (p=0.353 n=10)
geomean 11.33m 7.424m -34.50%
```
---------
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
**Description** Anthropic cache writes cost differently from cache reads, so the cost calculation is updated to account for writes vs reads by adding a new cost type. Updated similarly for AWS. Vertex AI and OpenAI do not report cache writes in their responses, so cache writes are set to 0 for them.
https://platform.claude.com/docs/en/build-with-claude/prompt-caching
https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_TokenUsage.html
**Changes** Dynamic metadata now includes cache writes, separating cache reads from cache writes, and the user is returned the new cached-writes usage field. Tests were updated to match, as was every place CachedInputTokens appeared.
---------
Signed-off-by: Aaron Choo <achoo30@bloomberg.net>
…hing (envoyproxy#1721) **Description** Include cache-creation and cache-hit tokens in total input tokens, while keeping separate fields for cache miss/hit accounting. This unifies the usage response to the user for both implicit and explicit caching, since the input tokens reported for GPT and Gemini already include the cache tokens.
---------
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
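The accounting described in the two commits above can be sketched as follows. The `usage` struct and rates are hypothetical illustrations of the idea, not the PR's actual types: cache reads and writes carry distinct per-token rates, and the total input count folds both back in to match GPT/Gemini reporting conventions.

```go
package main

import "fmt"

// usage is a hypothetical mirror of an Anthropic-style usage payload:
// InputTokens counts only cache misses, while cache reads and writes
// are reported separately.
type usage struct {
	InputTokens         int // cache-miss input tokens
	CacheReadTokens     int // prompt-cache hits (cheaper)
	CacheCreationTokens int // prompt-cache writes (more expensive)
}

// totalInput unifies the report with GPT/Gemini conventions, where
// reported input tokens already include cached tokens.
func totalInput(u usage) int {
	return u.InputTokens + u.CacheReadTokens + u.CacheCreationTokens
}

// cost applies distinct per-token rates for cache reads vs writes,
// the distinction the commit adds; the rates here are made-up examples.
func cost(u usage, base, readRate, writeRate float64) float64 {
	return float64(u.InputTokens)*base +
		float64(u.CacheReadTokens)*readRate +
		float64(u.CacheCreationTokens)*writeRate
}

func main() {
	u := usage{InputTokens: 100, CacheReadTokens: 400, CacheCreationTokens: 50}
	fmt.Println(totalInput(u)) // 550
	fmt.Printf("%.2f\n", cost(u, 1.0, 0.1, 1.25)) // 100 + 40 + 62.5 = 202.50
}
```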
Signed-off-by: yxia216 <yxia216@bloomberg.net>
Force-pushed from 3f4ea09 to cc439e2
Signed-off-by: yxia216 <yxia216@bloomberg.net>
Force-pushed from cc439e2 to 0b36399
Contributor (Author)
/retest
yuzisun
reviewed
Feb 5, 2026
internal/filterapi/filterconfig.go (Outdated)
// Used for Gemini models hosted on Google Cloud Vertex AI.
APISchemaGCPVertexAI APISchemaName = "GCPVertexAI"
// APISchemaGCPAnthropic represents the Google Cloud Anthropic API schema.
// APISchemaGCPAnthropic represents the schema from OpenAI API to Google Cloud Anthropic API.
Contributor
This is not correct: it is the same schema when the Messages API is used directly, not always translated from the OpenAI API.
Force-pushed from fca0997 to bfdcb40
Contributor (Author)
/retest

1 similar comment

Contributor (Author)
/retest
yuzisun
approved these changes
Feb 7, 2026
changminbark pushed a commit to changminbark/ai-gateway that referenced this pull request on Feb 9, 2026:
…nvoyproxy#1648) **Description** Add InvokeModel API support for Claude models in AWS Bedrock. The motivation is to provide consistent services across providers; see envoyproxy#1644 for more details. Other changes: common code related to Anthropic is moved into `anthropic_helper.go` so that both AWS and GCP can share it.
---------
Signed-off-by: yxia216 <yxia216@bloomberg.net>
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Signed-off-by: Aaron Choo <achoo30@bloomberg.net>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Co-authored-by: Aaron Choo <achoo30@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Description
Add InvokeModel API support for Claude models in AWS Bedrock. The motivation is to provide consistent services across providers; see #1644 for more details.
Other changes:
Common code related to Anthropic is moved into `anthropic_helper.go`, so that both AWS and GCP can share it.
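For context on what an InvokeModel call for a Claude model looks like, the sketch below builds the request path and body with only the standard library. It reflects the public Bedrock API shape (the model ID lives in the URI path `/model/{modelId}/invoke`, and the body requires `anthropic_version`), but the `anthropicBody` and `invokeModelRequest` names and the model ID are illustrative, not taken from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// anthropicBody is a minimal sketch of the Claude Messages payload that
// Bedrock's InvokeModel accepts. Unlike the native Anthropic API, the
// model is not in the body, and anthropic_version is required.
type anthropicBody struct {
	AnthropicVersion string    `json:"anthropic_version"`
	MaxTokens        int       `json:"max_tokens"`
	Messages         []message `json:"messages"`
}

type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// invokeModelRequest returns the URI path and JSON body for a
// non-streaming InvokeModel call; the modelID value is illustrative.
func invokeModelRequest(modelID string, msgs []message) (path string, body []byte, err error) {
	path = "/model/" + modelID + "/invoke"
	body, err = json.Marshal(anthropicBody{
		AnthropicVersion: "bedrock-2023-05-31",
		MaxTokens:        1024,
		Messages:         msgs,
	})
	return path, body, err
}

func main() {
	p, b, _ := invokeModelRequest("anthropic.claude-3-5-sonnet-20240620-v1:0",
		[]message{{Role: "user", Content: "hello"}})
	fmt.Println(p)
	fmt.Println(string(b))
}
```

Sharing the body-construction and response-parsing halves of this shape between the AWS and GCP translators is what motivates the `anthropic_helper.go` extraction: both providers speak the same Messages dialect and differ mainly in the envelope.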