feat: implement GCP Gemini request and response translation by sukumargaonkar · Pull Request #819 · envoyproxy/ai-gateway

sukumargaonkar · 2025-07-02T21:12:43Z

Description

This PR implements request and response translation for gcp-gemini models.

Related Issues/PRs (if applicable)

Issue: #609

Special notes for reviewers (if applicable)

This PR only supports basic requests with text and images.
Future PRs will add support for tools, streaming requests, etc.

…I messages Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

internal/extproc/translator/util.go

internal/extproc/translator/openai_gcpvertexai.go

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

mathetake

Can you also add e2e test cases here: https://github.com/envoyproxy/ai-gateway/blob/main/tests/extproc/testupstream_test.go#L140

internal/extproc/translator/gemini_helper.go

- Add period in comments. - fix unnecessary variable exports. - remove role from systemInstruction. Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

api/v1alpha1/registry.go

internal/extproc/translator/openai_gcpvertexai.go

yuzisun · 2025-07-03T00:11:55Z

internal/extproc/translator/gemini_helper.go

+			devMsg := systemMsgToDeveloperMsg(msg)
+			inst, err := fromDeveloperMsg(devMsg)


I understand you are trying to reuse fromDeveloperMsg, but it looks a bit unnecessary to go from system message -> developer message -> system instruction.

True, the alternative is to have two functions, fromDeveloperMsg and fromSystemMsg, which have pretty much identical function bodies.

Do you prefer that approach?

Yes, can you go ahead and have two different functions instead of obfuscating the actual logic by going through an unnecessary code path?
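One way to get two explicit entry points without duplicating the body is to have both delegate to a small private helper. The sketch below is illustrative only: `systemMsg`, `developerMsg`, `systemInstruction`, and `textToInstruction` are simplified, hypothetical stand-ins rather than the ai-gateway or Gemini SDK types, and the real signatures of `fromSystemMsg` and `fromDeveloperMsg` may differ.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// systemMsg and developerMsg are simplified, hypothetical stand-ins for the
// OpenAI system and developer message types.
type systemMsg struct{ Content []string }
type developerMsg struct{ Content []string }

// systemInstruction is a hypothetical stand-in for the Gemini system
// instruction. It intentionally carries no role.
type systemInstruction struct{ Text string }

// textToInstruction holds the logic shared by both message kinds.
func textToInstruction(parts []string) (*systemInstruction, error) {
	if len(parts) == 0 {
		return nil, errors.New("message has no text content")
	}
	return &systemInstruction{Text: strings.Join(parts, "\n")}, nil
}

// fromSystemMsg converts a system message directly, with no detour through
// a developer message.
func fromSystemMsg(msg systemMsg) (*systemInstruction, error) {
	return textToInstruction(msg.Content)
}

// fromDeveloperMsg converts a developer message using the same helper.
func fromDeveloperMsg(msg developerMsg) (*systemInstruction, error) {
	return textToInstruction(msg.Content)
}

func main() {
	inst, err := fromSystemMsg(systemMsg{Content: []string{"Be concise.", "Answer in English."}})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", inst.Text)
}
```

Each call site then names the message kind it handles, while the conversion logic lives in one place.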

- Remove unnecessary comment updates - avoid extra copy when json-parsing req-body Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

# Conflicts: # internal/controller/rotators/gcp_oidc_token_rotator.go # internal/extproc/translator/gemini_helper.go # internal/extproc/translator/openai_gcpvertexai.go

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

mathetake · 2025-07-04T04:12:17Z

#752 @sukumargaonkar could you address the remaining comments in that PR as well before this one? It shouldn't take many cycles.

sukumargaonkar · 2025-07-08T14:22:10Z

yes, working on addressing #819 (review)

was having issues setting it up locally

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

mathetake · 2025-07-08T17:43:28Z

I am not going through the details, but one general question: how are you going to translate the "reasoning_effort" parameter? It is either low, medium, or high (https://platform.openai.com/docs/api-reference/responses-streaming/response/incomplete), and I think it should be translated to the corresponding thinking_budget parameter in Vertex AI Gemini.

FWIW, according to their documentation (https://ai.google.dev/gemini-api/docs/openai), Gemini on AI Studio's OpenAI-compatible endpoint accepts "low", "medium", and "high", which map to 1,024, 8,192, and 24,576 tokens, respectively.

Can we do exactly the same thing, or do you have a different idea? We would love to see the parameter supported.
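The mapping suggested here could be sketched as a small lookup. This is not part of this PR; the function name `thinkingBudgetForEffort` and its `(budget, ok)` return shape are hypothetical, and only the three token values come from the comment above.

```go
package main

import "fmt"

// thinkingBudgetForEffort maps an OpenAI-style reasoning_effort value to a
// thinking token budget, using the numbers quoted in the review comment.
// The name and return shape are hypothetical.
func thinkingBudgetForEffort(effort string) (int32, bool) {
	switch effort {
	case "low":
		return 1024, true
	case "medium":
		return 8192, true
	case "high":
		return 24576, true
	default:
		// Unknown or empty effort: report "not set" and let the caller
		// decide, for example by omitting the budget entirely.
		return 0, false
	}
}

func main() {
	for _, effort := range []string{"low", "medium", "high", ""} {
		budget, ok := thinkingBudgetForEffort(effort)
		fmt.Printf("effort=%q budget=%d ok=%t\n", effort, budget, ok)
	}
}
```

Returning a boolean alongside the budget keeps "no effort requested" distinct from a budget of zero, which may carry its own meaning upstream.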

sukumargaonkar · 2025-07-08T18:55:40Z

This PR only handles basic text and image input.
Future PRs will address tools and thinking/reasoning requests.

I broke it down to make reviewing easier.

But good point, I will keep your comment in mind for future PRs.

mathetake · 2025-07-08T21:03:51Z

internal/extproc/translator/openai_gcpvertexai.go

+	var openAIRespBytes []byte
+	if len(gcpResp.Candidates) > 0 {


I guess that if we keep the if block here, openAIRespBytes stays zero-length when there are no candidates, so the resulting body mutation is also nil, which means the raw GCP response is returned to the downstream client, who is likely to be using the OpenAI SDK. That seems problematic. So the questions are:

When does len(gcpResp.Candidates) == 0 happen?

If that case can happen, or we cannot be certain, should we make sure that an empty OpenAI response is constructed in an else block here so that the downstream OpenAI SDK client won't receive the raw GCP response?

good point.
removed the if condition
updated the corresponding test-case
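The behavior after removing the condition can be sketched as follows: the OpenAI-shaped body is always built, so zero candidates produce an empty `choices` array rather than no body mutation. All types and the `toOpenAIResponse` name here are simplified, hypothetical stand-ins, not the actual ai-gateway or Gemini types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// gcpCandidate and gcpResponse are simplified, hypothetical stand-ins for
// the Gemini response types.
type gcpCandidate struct {
	Text string `json:"text"`
}

type gcpResponse struct {
	Candidates []gcpCandidate `json:"candidates"`
}

// openAIMessage, openAIChoice, and openAIResponse are simplified stand-ins
// for the OpenAI chat completion response.
type openAIMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type openAIChoice struct {
	Index   int           `json:"index"`
	Message openAIMessage `json:"message"`
}

type openAIResponse struct {
	Object  string         `json:"object"`
	Choices []openAIChoice `json:"choices"`
}

// toOpenAIResponse always returns an OpenAI-shaped body, even when the
// upstream response has no candidates.
func toOpenAIResponse(gcpResp gcpResponse) ([]byte, error) {
	// A non-nil empty slice marshals as [] rather than null.
	choices := make([]openAIChoice, 0, len(gcpResp.Candidates))
	for i, c := range gcpResp.Candidates {
		choices = append(choices, openAIChoice{
			Index:   i,
			Message: openAIMessage{Role: "assistant", Content: c.Text},
		})
	}
	return json.Marshal(openAIResponse{Object: "chat.completion", Choices: choices})
}

func main() {
	body, err := toOpenAIResponse(gcpResponse{})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(body))
}
```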

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

mathetake · 2025-07-08T21:18:12Z

internal/extproc/translator/gemini_helper.go

 // buildGCPRequestMutations creates header and body mutations for GCP requests
 // It sets the ":path" header, the "content-length" header and the request body.
-func buildGCPRequestMutations(path string, reqBody []byte) (*ext_procv3.HeaderMutation, *ext_procv3.BodyMutation) {
+func buildGCPRequestMutations(path *string, reqBody []byte) (*ext_procv3.HeaderMutation, *ext_procv3.BodyMutation) {


People usually do not pass a pointer to a string. It unnecessarily makes the string header (a pair of the buffer pointer and its length) escape to the heap. You can just use len(path) != 0 to check whether the string is empty, since I believe an empty path is invalid anyway.
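A minimal sketch of the suggested shape, assuming an empty path means "do not touch :path". The `header` type and `buildRequestHeaders` function are hypothetical stand-ins for the Envoy ext_proc mutation types and for buildGCPRequestMutations, and whether content-length should be skipped for an empty body is also an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// header is a hypothetical stand-in for an Envoy header value option.
type header struct{ Key, Value string }

// buildRequestHeaders takes path by value. An empty path is treated as
// "leave :path unchanged", so no pointer is needed to express absence.
func buildRequestHeaders(path string, reqBody []byte) []header {
	var headers []header
	if len(path) != 0 {
		headers = append(headers, header{Key: ":path", Value: path})
	}
	if len(reqBody) != 0 {
		headers = append(headers, header{Key: "content-length", Value: strconv.Itoa(len(reqBody))})
	}
	return headers
}

func main() {
	for _, h := range buildRequestHeaders("/v1/models/example:generateContent", []byte(`{"contents":[]}`)) {
		fmt.Printf("%s: %s\n", h.Key, h.Value)
	}
}
```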

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

yuzisun · 2025-07-08T22:09:41Z

internal/extproc/translator/openai_gcpvertexai.go

+				// TODO: Parse GCP error response and convert to OpenAI error format.
+				// For now, just return error response as-is.


Can we prioritize this TODO? I think it is an important translation for delivering error responses to users.

plan to do this in the next PR
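For context, the follow-up translation could look roughly like the sketch below. It is not part of this PR. It assumes the upstream body follows the common Google API error envelope (`error.code`, `error.message`, `error.status`) and targets a simplified OpenAI-style error object; the struct names, `translateGCPError`, and the fallback type `upstream_error` are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// gcpError models the assumed Google API error envelope.
type gcpError struct {
	Error struct {
		Code    int    `json:"code"`
		Message string `json:"message"`
		Status  string `json:"status"`
	} `json:"error"`
}

// openAIErrorBody and openAIError are simplified, hypothetical stand-ins for
// an OpenAI-style error response.
type openAIErrorBody struct {
	Message string `json:"message"`
	Type    string `json:"type"`
	Code    string `json:"code"`
}

type openAIError struct {
	Error openAIErrorBody `json:"error"`
}

// translateGCPError converts an upstream error body into the OpenAI-style
// shape. If the body is not the expected JSON, the raw text is used as the
// message so the client still receives a well-formed error object.
func translateGCPError(statusCode int, body []byte) ([]byte, error) {
	out := openAIError{Error: openAIErrorBody{
		Message: string(body),
		Type:    "upstream_error", // hypothetical fallback type
		Code:    strconv.Itoa(statusCode),
	}}
	var in gcpError
	if err := json.Unmarshal(body, &in); err == nil && in.Error.Message != "" {
		out.Error.Message = in.Error.Message
		if in.Error.Status != "" {
			out.Error.Type = in.Error.Status
		}
	}
	return json.Marshal(out)
}

func main() {
	upstream := []byte(`{"error":{"code":400,"message":"bad request","status":"INVALID_ARGUMENT"}}`)
	body, err := translateGCPError(400, upstream)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(body))
}
```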

mathetake · 2025-07-09T15:37:01Z

internal/controller/rotators/gcp_oidc_token_rotator.go

+//  1. Obtaining an OIDC token from the configured provider.
+//  2. Exchanging the OIDC token for a GCP STS token.
+//  3. Using the STS token to impersonate a GCP service account.
+//  4. Storing the resulting access token in a Kubernetes secret.


Can you revert this?

internal/extproc/translator/gemini_helper.go

mathetake · 2025-07-09T15:52:22Z

almost there!

Copilot

Pull Request Overview

Adds support for translating OpenAI ChatCompletion requests and responses to and from GCP Gemini (Vertex AI) models.

Introduces tests for GCP Vertex AI backend in tests/extproc
Implements translation logic in internal/extproc/translator
Updates Envoy test configuration to route to the new GCP Vertex AI upstream

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.

Show a summary per file

| File | Description |
| --- | --- |
| tests/extproc/testupstream_test.go | Adds GCP Vertex AI test cases and expected headers/host handling |
| tests/extproc/extproc_test.go | Defines `fakeGCPAuthToken` and GCP Vertex AI schema/backend |
| tests/extproc/envoy.yaml | Configures new `testupstream-gcp-vertexai` cluster and routes |
| internal/extproc/translator/util.go | Extracted `parseDataURI` helper for image handling |
| internal/extproc/translator/openai_gcpvertexai.go | Implements request/response translation for GCP Gemini |
| internal/extproc/translator/gemini_helper.go | Helper functions for building and parsing Gemini messages |
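To illustrate what a data-URI helper for image input typically does, here is a minimal sketch. The signature and error messages are assumptions and are not taken from the PR's `parseDataURI`; it handles only base64-encoded data URIs.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// parseDataURI splits a base64 data URI such as "data:image/png;base64,..."
// into its media type and decoded bytes. The signature is hypothetical.
func parseDataURI(uri string) (mediaType string, data []byte, err error) {
	if !strings.HasPrefix(uri, "data:") {
		return "", nil, errors.New("not a data URI")
	}
	rest := strings.TrimPrefix(uri, "data:")
	comma := strings.IndexByte(rest, ',')
	if comma < 0 {
		return "", nil, errors.New("data URI has no comma separator")
	}
	meta, payload := rest[:comma], rest[comma+1:]
	if !strings.HasSuffix(meta, ";base64") {
		return "", nil, errors.New("only base64 data URIs are supported")
	}
	data, err = base64.StdEncoding.DecodeString(payload)
	if err != nil {
		return "", nil, fmt.Errorf("decoding base64 payload: %w", err)
	}
	return strings.TrimSuffix(meta, ";base64"), data, nil
}

func main() {
	mediaType, data, err := parseDataURI("data:image/png;base64,aGVsbG8=")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(mediaType, len(data))
}
```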

Comments suppressed due to low confidence (2)

tests/extproc/testupstream_test.go:17

The test uses fmt.Sprintf but 'fmt' is not imported. Please add "fmt" to the import block.

	"strconv"

internal/extproc/translator/gemini_helper.go:487

The package alias ext_procv3 is not imported; the import should use the same alias (extprocv3) as elsewhere. Update the import to extprocv3 "github.com/envoyproxy/go-control-plane/envoy/service/ext_proc/v3" and adjust references accordingly.

func buildGCPRequestMutations(path string, reqBody []byte) (*ext_procv3.HeaderMutation, *ext_procv3.BodyMutation) {

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

mathetake

I am not going through the actual translation logic, but LGTM on the other parts generally! Thanks! I will defer to @yuzisun for the final stamping.

mathetake · 2025-07-09T23:35:52Z

On second thought, it seems like there would be a conflict with @alexagriffith's PR #838, so I am going ahead and merging to unblock you guys to work on subsequent stuff.

…xy#819) **Description** This PR request and response translation for gcp-gemini models. **Related Issues/PRs (if applicable)** Issue: envoyproxy#609 --------- Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net> Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net>

feat: implement GCP Gemini request and response translation for OpenA…

cc0f6a8

…I messages Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

sukumargaonkar requested a review from a team as a code owner July 2, 2025 21:12

mathetake reviewed Jul 2, 2025

internal/extproc/translator/util.go (outdated, resolved)

mathetake reviewed Jul 2, 2025

internal/extproc/translator/openai_gcpvertexai.go (outdated, resolved)

refactor: remove unnecessary string manipulation for GCP model prefix

cb12d44

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

mathetake reviewed Jul 2, 2025

internal/extproc/translator/gemini_helper.go (outdated, resolved)

internal/extproc/translator/gemini_helper.go (outdated, resolved)

internal/extproc/translator/gemini_helper.go (outdated, resolved)

sukumargaonkar mentioned this pull request Jul 2, 2025

Support GCP models #609

Closed

sukumargaonkar added 2 commits July 2, 2025 18:16

address pr comments

2088925

- Add period in comments. - fix unnecessary variable exports. - remove role from systemInstruction. Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

remove unnecessary nil check

da5ec61

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

mathetake reviewed Jul 2, 2025

api/v1alpha1/registry.go (resolved)

mathetake reviewed Jul 2, 2025

internal/extproc/translator/openai_gcpvertexai.go (outdated, resolved)

yuzisun reviewed Jul 3, 2025

sukumargaonkar added 3 commits July 3, 2025 11:01

address PR comments

0b2f21f

- Remove unnecessary comment updates - avoid extra copy when json-parsing req-body Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

Merge remote-tracking branch 'upstream/main' into gcp-basic-requests

df58fe8

# Conflicts: # internal/controller/rotators/gcp_oidc_token_rotator.go # internal/extproc/translator/gemini_helper.go # internal/extproc/translator/openai_gcpvertexai.go

add missing periods to comments

a10251e

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

Merge remote-tracking branch 'upstream/main' into gcp-basic-requests

06821c7

sukumargaonkar added 3 commits July 8, 2025 10:28

add GCP testcase in textWithUpstream

03d28bb

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

fix local config

4879b4e

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

fix testcase

ddc602e

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

sukumargaonkar added 2 commits July 8, 2025 15:39

Merge remote-tracking branch 'upstream/main' into gcp-basic-requests

806a2f8

Merge remote-tracking branch 'upstream/main' into gcp-basic-requests

5b5419e

mathetake reviewed Jul 8, 2025

handle empty gcp response

13e8926

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

mathetake reviewed Jul 8, 2025

mathetake requested review from wengyao04 and yuzisun July 8, 2025 21:25

mathetake requested a review from aabchoo July 8, 2025 21:25

address PR comment

39656fc

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

yuzisun reviewed Jul 8, 2025

mathetake reviewed Jul 9, 2025

internal/extproc/translator/gemini_helper.go (resolved)

mathetake requested a review from Copilot July 9, 2025 15:52

Copilot AI reviewed Jul 9, 2025

sukumargaonkar added 5 commits July 9, 2025 14:02

address PR comment

5300537

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

Merge remote-tracking branch 'upstream/main' into gcp-basic-requests

1febd58

fix test case

8393b25

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

refactor: rename conversion functions for clarity and consistency

2fb7051

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>

Merge remote-tracking branch 'upstream/main' into gcp-basic-requests

f61c4e7

mathetake approved these changes Jul 9, 2025

mathetake assigned yuzisun Jul 9, 2025

mathetake requested a review from yuzisun July 9, 2025 23:27

mathetake merged commit 7d293fa into envoyproxy:main Jul 9, 2025
24 checks passed

4 participants