feat: anthropic endpoint support and translation for openai backend #1878

Merged

nacx merged 12 commits into envoyproxy:main from changminbark:anthropic-support-for-openai on Mar 2, 2026

Conversation

changminbark (Contributor) commented Feb 19, 2026

Description

This commit adds a translator that converts requests sent to the /anthropic/v1/messages and /v1/messages endpoints for OpenAI schema backends. It does not matter whether the OpenAI schema backend natively supports the endpoint (e.g. vLLM does), since translation is a light and fast process. This approach is also more versatile and future-proof than simply passing the Anthropic Messages request through to a backend that natively supports it, and it follows the existing structure for adding translators, path processor factories, and schema translation.

A major example use case is using AI Gateway to route requests from Claude Code to multiple AI backends, such as locally hosted vLLM models with LoRA adapters.

NOTE: vLLM is only used for local testing, as I do not have access to compute. The goal of this PR is to support any OpenAI-compatible backend or service through an Anthropic interface.
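At a high level, the request-side translation the description refers to can be sketched as follows. This is a simplified illustration under stated assumptions: the struct shapes and the `translate` helper below are hypothetical, not the PR's actual apischema types, and the real translator also rewrites the path to /v1/chat/completions and handles tools, streaming, and stop-reason mapping.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified, hypothetical shapes -- the PR defines its own apischema types.
type anthropicMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type anthropicRequest struct {
	Model     string             `json:"model"`
	MaxTokens int                `json:"max_tokens"`
	Messages  []anthropicMessage `json:"messages"`
}

type openAIMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type openAIRequest struct {
	Model     string          `json:"model"`
	MaxTokens int             `json:"max_tokens"`
	Messages  []openAIMessage `json:"messages"`
}

// translate maps an Anthropic Messages request onto an OpenAI
// chat-completions request. Only the common fields are shown here;
// a real translator must also cover tools, system prompts, etc.
func translate(in anthropicRequest) openAIRequest {
	out := openAIRequest{Model: in.Model, MaxTokens: in.MaxTokens}
	for _, m := range in.Messages {
		out.Messages = append(out.Messages, openAIMessage(m))
	}
	return out
}

func main() {
	raw := `{"model":"Qwen/Qwen2.5-0.5B-Instruct","max_tokens":100,"messages":[{"role":"user","content":"Say hello!"}]}`
	var req anthropicRequest
	if err := json.Unmarshal([]byte(raw), &req); err != nil {
		panic(err)
	}
	b, _ := json.Marshal(translate(req))
	fmt.Println(string(b))
}
```

Because the two message shapes are structurally identical here, the per-message copy is a direct struct conversion; the cost of the whole translation is one decode and one encode, which is the "light/fast enough" claim above.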

Related Issues/PRs (if applicable)

Fixes #1372
Fixes #1867

Special notes for reviewers (if applicable)
Claude Code was used to write most of the tests, but they were verified. It would also be nice if the maintainers could review PR #1843, as some of the Anthropic apischema here can be updated once it is merged.

Functional Test Results

Test of the Anthropic endpoints against an OpenAI schema backend that natively supports them:

$ curl -v http://localhost:8080/v1/messages   -H "Content-Type: application/json"   -d '{
    "model": "Qwen/Qwen2.5-0.5B-Instruct",
    "messages": [
      {"role": "user", "content": "Say hello!"}
    ],
    "max_tokens": 100
  }'
* Host localhost:8080 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:8080...
* Connected to localhost (::1) port 8080
> POST /v1/messages HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.5.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 143
> 
< HTTP/1.1 200 OK
< date: Fri, 20 Feb 2026 18:46:04 GMT
< server: uvicorn
< content-type: application/json
< content-length: 331
< 
* Connection #0 to host localhost left intact
{"id":"chatcmpl-36ec5b3d-4273-41e9-966b-ed742f7a93d1","type":"message","role":"assistant","content":[{"type":"text","text":"Hello! How can I assist you today?"}],"model":"Qwen/Qwen2.5-0.5B-Instruct","stop_reason":"end_turn","usage":{"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"input_tokens":32,"output_tokens":10}}
$ curl -v http://localhost:8080/anthropic/v1/messages   -H "Content-Type: application/json"   -d '{
    "model": "Qwen/Qwen2.5-0.5B-Instruct",
    "messages": [
      {"role": "user", "content": "Say hello!"}
    ],
    "max_tokens": 100
  }'
* Host localhost:8080 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:8080...
* Connected to localhost (::1) port 8080
> POST /anthropic/v1/messages HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.5.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 143
> 
< HTTP/1.1 200 OK
< date: Fri, 20 Feb 2026 18:46:44 GMT
< server: uvicorn
< content-type: application/json
< content-length: 331
< 
* Connection #0 to host localhost left intact
{"id":"chatcmpl-f639ff32-4f89-48c5-b5b1-56878e641da6","type":"message","role":"assistant","content":[{"type":"text","text":"Hello! How can I assist you today?"}],"model":"Qwen/Qwen2.5-0.5B-Instruct","stop_reason":"end_turn","usage":{"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"input_tokens":32,"output_tokens":10}}

Port Forward logs

$ kubectl port-forward -n envoy-gateway-system svc/$ENVOY_SERVICE 8080:80
Forwarding from 127.0.0.1:8080 -> 10080
Forwarding from [::1]:8080 -> 10080
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080

vLLM Logs (for both requests)

(APIServer pid=141923) INFO:     Started server process [141923]
(APIServer pid=141923) INFO:     Waiting for application startup.
(APIServer pid=141923) INFO:     Application startup complete.
(APIServer pid=141923) INFO:     172.18.0.2:46854 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=141923) INFO 02-20 13:46:05 [loggers.py:257] Engine 000: Avg prompt throughput: 3.2 tokens/s, Avg generation throughput: 1.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 0.0%
(APIServer pid=141923) INFO 02-20 13:46:15 [loggers.py:257] Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 0.0%
(APIServer pid=141923) INFO:     172.18.0.2:47216 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=141923) INFO 02-20 13:46:45 [loggers.py:257] Engine 000: Avg prompt throughput: 3.2 tokens/s, Avg generation throughput: 1.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 25.0%
(APIServer pid=141923) INFO 02-20 13:46:55 [loggers.py:257] Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 25.0%

@changminbark changminbark requested a review from a team as a code owner February 19, 2026 20:48
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Feb 19, 2026
@changminbark changminbark marked this pull request as draft February 19, 2026 20:48
Signed-off-by: Chang Min <changminbark@gmail.com>
@changminbark changminbark force-pushed the anthropic-support-for-openai branch from b3b16e3 to 50907fb Compare February 19, 2026 20:52
@changminbark changminbark marked this pull request as ready for review February 20, 2026 19:06
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Feb 20, 2026
Signed-off-by: Chang Min <changminbark@gmail.com>
ehfd commented Feb 21, 2026

Is it possible to check whether tool calling works across various models? I think this would be the key potential bug point.

changminbark (Contributor Author)

Sure, what would be a simple setup that would help emulate this?

ehfd commented Feb 21, 2026

I think it would be the usage of Claude Code on several kinds of models (Kimi-K2, GLM-4.7/5, MiniMax-M2 series, Qwen3, etc.). It is apparent when it fails.

changminbark (Contributor Author)

@ehfd I do not have access to any compute or GPUs, so I'm not sure how to test this.

ehfd commented Feb 21, 2026

I think we can test it, we'll take a look.

changminbark (Contributor Author)

@ehfd Thank you! Let me know how it goes.

groundsada added a commit to groundsada/ai-gateway that referenced this pull request Feb 23, 2026
server.Register(path.Join(flags.rootPrefix, endpointPrefixes.Anthropic, "/v1/messages"), extproc.NewFactory(
messagesMetricsFactory, tracing.MessageTracer(), endpointspec.MessagesEndpointSpec{}))
// These are for OpenAI schema backends that support /v1/messages endpoint (no endpoint prefix as OpenAI prefix is '/')
server.Register(path.Join(flags.rootPrefix, endpointPrefixes.OpenAI, "/v1/messages"), extproc.NewFactory(
The root path / is intended for unified API endpoints, which work for heterogeneous backends by translating between API schemas. For /v1/messages we currently only support pass-through, hence the anthropic prefix; you could configure this root path on the client side to be compatible with the gateway path.

yuzisun (Contributor) commented Feb 23, 2026
Ah, I see you want Claude Code to work with models using the OpenAI API, but I think vLLM actually supports /v1/messages natively now.


#1867

Please follow through the conversations here.

changminbark (Contributor Author)

@yuzisun Yes, I discussed the implementation details in the tagged issue. Please take a look and let me know what you think.


Hi @changminbark and @yuzisun
I work with @ehfd, just tested the PR in our cluster with curl and claude code, works perfectly! Looks great!

gavrissh (Contributor) commented Feb 25, 2026

I believe the /v1/messages -> /v1/chat/completions translation will work with the existing registered endpoint, so the current change in this file is not required:

	server.Register(path.Join(flags.rootPrefix, endpointPrefixes.Anthropic, "/v1/messages"), extproc.NewFactory(
		messagesMetricsFactory, tracing.MessageTracer(), endpointspec.MessagesEndpointSpec{}))

https://github.com/envoyproxy/ai-gateway/pull/1878/changes#diff-33280abdb776271e54b6d3fea30758bc5ea3f458cdf3cce1b4d57e3be8caa853R310

The anthropic prefix can be set to an empty string (anthropic: "") in values.yaml if you don't need it in the access URL.

ehfd commented Feb 25, 2026

ai-gateway-extproc time=2026-02-14T07:39:50.432Z level=ERROR msg="error processing request message" error="rpc error: code = Internal desc = cannot set backend: failed to create translator for backend nrp-llm/envoy-ai-gateway-nrp-glm-4/route/envoy-ai-gateway-nrp-glm/rule/1/ref/0: /v1/messages endpoint only supports backends that return native Anthropic format (Anthropic, GCPAnthropic, AWSAnthropic). Backend OpenAI uses different model format"

At its core, we want to get rid of this and not have to duplicate APISchemaAnthropic, BackendSecurityPolicyAnthropicAPIKey, and AnthropicAPIKey with the OpenAI schema. So the correct implementation is a translator like those in https://github.com/envoyproxy/ai-gateway/tree/main/internal/translator?


Yes, adding an OpenAI translator for the existing anthropic /v1/messages endpoint is the correct approach.

changminbark (Contributor Author)

So I should get rid of registering the route with the OpenAI prefix?

ehfd commented Feb 25, 2026

dd9b9a9

Available here. I think this is resolved.

codecov-commenter commented Feb 23, 2026

Codecov Report

❌ Patch coverage is 86.62420% with 63 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.30%. Comparing base (c07e7a8) to head (1be9091).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
internal/translator/openai_helper.go 86.68% 24 Missing and 21 partials ⚠️
internal/translator/anthropic_openai.go 86.15% 9 Missing and 9 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1878      +/-   ##
==========================================
+ Coverage   84.20%   84.30%   +0.10%     
==========================================
  Files         126      128       +2     
  Lines       17075    17545     +470     
==========================================
+ Hits        14378    14792     +414     
- Misses       1803     1827      +24     
- Partials      894      926      +32     


nacx (Member) commented Feb 24, 2026

Thanks!
This PR will need some end-to-end tests that validate that the translation happens properly. The testupstream_test.go file contains many end-to-end tests that verify both the input request and the request sent to the upstream, confirming that translation and extproc behave properly e2e.

Can you please add all the relevant cases to that, so that we are confident this is properly tested e2e?

ehfd commented Feb 25, 2026

After further validation, the conversion from /v1/messages on the frontend to /v1/chat/completions on the backend doesn't work properly for tool calling in Claude Code. Almost no models work with tool calling.

Signed-off-by: Chang Min <changminbark@gmail.com>
changminbark (Contributor Author) commented Feb 25, 2026

@ehfd I just fixed a bug while creating e2e tests. Basically, some OpenAI SSE events were being passed back to the client by Envoy even though they had no Anthropic SSE equivalent. Do you mind trying it now?

Example of malformed streaming response (tool call) before:

Response body: event: message_start
data: {"type":"message_start","message":{"stop_sequence":null,"id":"chatcmpl-stream","type":"message","role":"assistant","content":[],"model":"gpt-4o","usage":{"input_tokens":0,"output_tokens":0},"stop_reason":null}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hi"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" there!"}}

data: {"id":"chatcmpl-stream","model":"gpt-4o","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":null}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_sequence":null,"stop_reason":"end_turn"},"usage":{"output_tokens":3}}

event: message_stop
data: {"type":"message_stop"}

data: [DONE]

Example of correct streaming response (tool call) after:

Response body: event: message_start
data: {"type":"message_start","message":{"id":"chatcmpl-tool","type":"message","role":"assistant","content":[],"model":"gpt-4o","stop_reason":null,"stop_sequence":null,"usage":{"input_tokens":0,"output_tokens":0}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"tool_use","id":"call_abc","name":"get_weather","input":{}}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"{\"location\":"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"\"Paris\"}"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"tool_use","stop_sequence":null},"usage":{"output_tokens":15}}

event: message_stop
data: {"type":"message_stop"}
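The fix amounts to suppressing upstream SSE frames that have no Anthropic equivalent, such as the raw chatcmpl chunk and the [DONE] sentinel visible in the malformed output. The sketch below is a rough illustration of that kind of filtering, not the PR's actual extproc code; it assumes frames arrive as individual `data:` lines and uses a crude string heuristic where a real implementation would parse the JSON.

```go
package main

import (
	"fmt"
	"strings"
)

// filterStream drops SSE data lines that have no Anthropic equivalent:
// raw OpenAI chunks (recognizable here by their "choices" field) and the
// OpenAI-only [DONE] sentinel. Illustrative only -- production code should
// decode each frame instead of substring-matching.
func filterStream(events []string) []string {
	var out []string
	for _, ev := range events {
		data := strings.TrimPrefix(ev, "data: ")
		if data == "[DONE]" || strings.Contains(data, `"choices"`) {
			continue // OpenAI-only frame: swallow instead of forwarding
		}
		out = append(out, ev)
	}
	return out
}

func main() {
	in := []string{
		`data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hi"}}`,
		`data: {"id":"chatcmpl-stream","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}`,
		`data: [DONE]`,
	}
	for _, ev := range filterStream(in) {
		fmt.Println(ev)
	}
}
```

With this kind of filtering in place, the client only ever sees well-formed Anthropic event pairs (event: line plus data: line), which is what the corrected stream above shows.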

ehfd commented Feb 25, 2026

@changminbark Looking into it, thanks.

ehfd commented Feb 25, 2026

With MiniMax-M2; this was only reproduced through the PR, not when calling vLLM directly.

⏺ Web Search("Super Bowl LX winner")
  ⎿  Did 0 searches in 2s                                                         
                                                          
⏺ The search isn't returning any results. This could be because:                  
  1. The web search tool isn't functioning properly                                           
  2. The information isn't available online               
                                                                                              
  Given that the system date is February 25, 2026, and Super Bowl 60 (LX) is typically played 
  in early February, it would have just happened recently. However, I don't have reliable     
  search results to provide an answer.                                                        
                  
  Let me be honest with the user about this limitation.


  I'm having trouble retrieving the search results. Based on today's date (February 25, 2026),
   the most recent Super Bowl would be Super Bowl LX (60), which is typically played in early
  February. However, I'm not getting search results to confirm the winner.

  Based on my knowledge, Super Bowl 60 (played February 2026) would have just concluded
  recently. Without working search results, I cannot provide definitive information on who
  won.

  Sources: None available due to search failure

✻ Cooked for 5m 1s

changminbark (Contributor Author)

@ehfd I'll try looking into this more, but do you have access to Claude Code's internal logs? They may be helpful for debugging this.

changminbark (Contributor Author) commented Feb 25, 2026

I tried doing this on my local machine.

vLLM

Directly calling /v1/messages using vLLM

$ curl http://localhost:8000/v1/messages   -H "Content-Type: application/json"   -d '{"model":"Qwen/Qwen3-0.6B","max_tokens":1024,"messages":[{"role":"user","content":"search the web for the year america was founded"}],"tools":[{"name":"search","description":"search the web","input_schema":{"type":"object","properties":{"query":{"type":"string"}},"required":["query"]}}]}' | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1519  100  1231  100   288   1510    353 --:--:-- --:--:-- --:--:--  1863
{
  "id": "chatcmpl-b0753bfdda7893af",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "<think>\nOkay, the user is asking to search the web for the year America was founded. Let me think about how to approach this.\n\nFirst, I need to figure out the correct historical date. America was founded on July 4, 1776. But the user wants to know the year, not the date. So, the answer should be 1776. However, the user is asking for a web search, so I should use the search function provided.\n\nI should construct a query to search for \"year america was founded\" to get relevant information. The function requires a \"query\" parameter, so I'll input that. Let me make sure there are no typos. The function name is \"search\", and the parameters are a JSON object with \"query\". \n\nI think that's all. Just need to format the tool call correctly within the XML tags. Let me double-check the syntax to ensure it's valid JSON. Once confirmed, the tool call should return the result.\n</think>\n\n"
    },
    {
      "type": "tool_use",
      "id": "chatcmpl-tool-a02c6f995701b000",
      "name": "search",
      "input": {
        "query": "year america was founded"
      }
    }
  ],
  "model": "Qwen/Qwen3-0.6B",
  "stop_reason": "tool_use",
  "usage": {
    "input_tokens": 152,
    "output_tokens": 229
  }
}

Streaming

$ curl http://localhost:8000/v1/messages \
  -H "Content-Type: application/json" \
  -d '{"model":"Qwen/Qwen3-0.6B","max_tokens":1024,"stream":true,"messages":[{"role":"user","content":"search the web for the year america was founded"}],"tools":[{"name":"search","description":"search the web","input_schema":{"type":"object","properties":{"query":{"type":"string"}},"required":["query"]}}]}'
event: message_start
data: {"type":"message_start","message":{"id":"chatcmpl-854bb730eed1434f","content":[],"model":"Qwen/Qwen3-0.6B","usage":{"input_tokens":152,"output_tokens":0}}}

event: content_block_start
data: {"type":"content_block_start","content_block":{"type":"text","text":""},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"<think>"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"\n"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"Okay"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":","},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" the"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" user"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" wants"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" to"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" know"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" the"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" year"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" America"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" was"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" founded"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"."},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" Let"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" me"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" think"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"."},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" I"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" need"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" to"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" search"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" the"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" web"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" for"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" that"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" information"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"."},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" The"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" available"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" tool"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" is"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" the"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" '"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"search"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"'"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" function"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":","},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" which"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" takes"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" a"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" query"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" parameter"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"."},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" The"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" query"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" here"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" would"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" be"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" \""},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"year"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" america"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" was"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" founded"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"\"."},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" I"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" should"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" make"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" sure"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" to"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" use"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" the"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" correct"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" function"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" name"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" and"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" parameters"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"."},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" Let"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" me"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" construct"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" the"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" tool"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" call"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" properly"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":".\n"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"</think>"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"\n\n"},"index":0}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: content_block_start
data: {"type":"content_block_start","content_block":{"type":"tool_use","id":"chatcmpl-tool-b5b2602282b39cb5","name":"search","input":{}},"index":1}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"input_json_delta","partial_json":"{\"query\": \""},"index":1}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"input_json_delta","partial_json":"year"},"index":1}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"input_json_delta","partial_json":" america"},"index":1}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"input_json_delta","partial_json":" was"},"index":1}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"input_json_delta","partial_json":" founded"},"index":1}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"input_json_delta","partial_json":"\"}"},"index":1}

event: content_block_stop
data: {"type":"content_block_stop","index":1}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"tool_use"},"usage":{"input_tokens":152,"output_tokens":101}}

event: message_stop
data: {"type":"message_stop"}

data: [DONE]

AI Gateway

$ curl http://localhost:8000/v1/messages   -H "Content-Type: application/json"   -d '{"model":"Qwen/Qwen2.5-0.5B-Instruct","max_tokens":1024,"messages":[{"role":"user","content":"search the web for the year america was founded"}],"tools":[{"name":"search","description":"search the web","input_schema":{"type":"object","properties":{"query":{"type":"string"}},"required":["query"]}}]}' | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   636  100   337  100   299   2378   2109 --:--:-- --:--:-- --:--:--  4510
{
  "id": "chatcmpl-be92bf95eb4f5d6d",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": ""
    },
    {
      "type": "tool_use",
      "id": "chatcmpl-tool-8f41db8c125e33eb",
      "name": "search",
      "input": {
        "query": "year america was founded"
      }
    }
  ],
  "model": "Qwen/Qwen2.5-0.5B-Instruct",
  "stop_reason": "tool_use",
  "usage": {
    "input_tokens": 168,
    "output_tokens": 22
  }
}
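The non-streaming response above is the result of translating an OpenAI chat-completion choice into an Anthropic message. A minimal sketch of that mapping (in Python for illustration; the actual PR implements this in the gateway's Go translator, and the input dict below is a hypothetical OpenAI response shaped like the one this request would produce):

```python
import json

# Hypothetical OpenAI chat-completion choice, mirroring the tool call
# shown in the Anthropic-format response above.
openai_choice = {
    "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [{
            "id": "chatcmpl-tool-8f41db8c125e33eb",
            "type": "function",
            "function": {"name": "search",
                         "arguments": "{\"query\": \"year america was founded\"}"},
        }],
    },
    "finish_reason": "tool_calls",
}

content = []
msg = openai_choice["message"]
if msg.get("content"):
    content.append({"type": "text", "text": msg["content"]})
for tc in msg.get("tool_calls", []):
    content.append({
        "type": "tool_use",
        "id": tc["id"],
        "name": tc["function"]["name"],
        # OpenAI sends arguments as a JSON string; Anthropic expects an object.
        "input": json.loads(tc["function"]["arguments"]),
    })

# OpenAI finish_reason "tool_calls" maps to Anthropic stop_reason "tool_use".
stop_reason = {"tool_calls": "tool_use", "stop": "end_turn"}.get(
    openai_choice["finish_reason"], "end_turn")
print(stop_reason, content)
```

This sketch skips empty text blocks, whereas the response above includes one; the core mapping (JSON-string arguments decoded into an `input` object, `finish_reason` remapped) is the same.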
$ curl http://localhost:8000/v1/messages \
  -H "Content-Type: application/json" \
  -d '{"model":"Qwen/Qwen2.5-0.5B-Instruct","max_tokens":1024,"stream":true,"messages":[{"role":"user","content":"search the web for the year america was founded"}],"tools":[{"name":"search","description":"search the web","input_schema":{"type":"object","properties":{"query":{"type":"string"}},"required":["query"]}}]}'
event: message_start
data: {"type":"message_start","message":{"id":"chatcmpl-a2a79c35b0a88cfe","content":[],"model":"Qwen/Qwen2.5-0.5B-Instruct","usage":{"input_tokens":168,"output_tokens":0}}}

event: content_block_start
data: {"type":"content_block_start","content_block":{"type":"tool_use","id":"chatcmpl-tool-b846494e0a70f02e","name":"search","input":{}},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"input_json_delta","partial_json":"{\"query\": \""},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"input_json_delta","partial_json":"America"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"input_json_delta","partial_json":" was"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"input_json_delta","partial_json":" founded"},"index":0}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"input_json_delta","partial_json":"\"}"},"index":0}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"tool_use"},"usage":{"input_tokens":168,"output_tokens":21}}

event: message_stop
data: {"type":"message_stop"}

data: [DONE]
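A client consumes the streaming tool call above by concatenating the `input_json_delta` fragments for a block index and parsing the result at `content_block_stop`. A minimal sketch, using event dicts copied from the stream shown above (this is illustrative client-side logic, not code from this PR):

```python
import json

# Events mirroring the SSE stream above, already decoded from their
# "data:" lines.
events = [
    {"type": "content_block_start",
     "content_block": {"type": "tool_use", "id": "chatcmpl-tool-b846494e0a70f02e",
                       "name": "search", "input": {}}, "index": 0},
    {"type": "content_block_delta",
     "delta": {"type": "input_json_delta", "partial_json": "{\"query\": \""}, "index": 0},
    {"type": "content_block_delta",
     "delta": {"type": "input_json_delta", "partial_json": "America"}, "index": 0},
    {"type": "content_block_delta",
     "delta": {"type": "input_json_delta", "partial_json": " was"}, "index": 0},
    {"type": "content_block_delta",
     "delta": {"type": "input_json_delta", "partial_json": " founded"}, "index": 0},
    {"type": "content_block_delta",
     "delta": {"type": "input_json_delta", "partial_json": "\"}"}, "index": 0},
    {"type": "content_block_stop", "index": 0},
]

blocks = {}  # index -> {"name": ..., "json": accumulated fragments}
for ev in events:
    if ev["type"] == "content_block_start" and ev["content_block"]["type"] == "tool_use":
        blocks[ev["index"]] = {"name": ev["content_block"]["name"], "json": ""}
    elif ev["type"] == "content_block_delta" and ev["delta"]["type"] == "input_json_delta":
        blocks[ev["index"]]["json"] += ev["delta"]["partial_json"]
    elif ev["type"] == "content_block_stop" and ev["index"] in blocks:
        # Only at stop is the accumulated string guaranteed to be valid JSON.
        blocks[ev["index"]]["input"] = json.loads(blocks[ev["index"]]["json"])

print(blocks[0]["name"], blocks[0]["input"])
# -> search {'query': 'America was founded'}
```

The key point is that individual `partial_json` fragments are not valid JSON on their own; only the concatenation at `content_block_stop` is.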

@changminbark
Contributor Author

changminbark commented Feb 25, 2026

@ehfd It seems that the format Envoy is outputting is identical to the format that vLLM is outputting, so I am not sure why your tool call is not working. Isn't Claude Code also executing the tool call in the screenshot that you showed? Doesn't that mean it is properly processing the tool call request from the model? Maybe it's a problem with Claude Code?

My thoughts: I think the web search tool in Claude Code requires a specific streaming SSE format, as seen below. However, other tool calls (like ones that you define yourself) should work as intended. You can see that the type of the content block in the SSE for the web search tool is server_tool_use instead of tool_use here.

Note the type being server_tool_use instead of tool_use, and the last content_block_start having type web_search_tool_result; compare this to the output for non-web-search tool usage here.

event: content_block_start
data: {"type":"content_block_start","index":1,"content_block":{"type":"server_tool_use","id":"srvtoolu_014hJH82Qum7Td6UV8gDXThB","name":"web_search","input":{}}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"{\"query"}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"\":"}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":" \"weather"}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":" NY"}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"C to"}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"day\"}"}}

event: content_block_stop
data: {"type":"content_block_stop","index":1 }

event: content_block_start
data: {"type":"content_block_start","index":2,"content_block":{"type":"web_search_tool_result","tool_use_id":"srvtoolu_014hJH82Qum7Td6UV8gDXThB","content":[{"type":"web_search_result","title":"Weather in New York City in May 2025 (New York) - detailed Weather Forecast for a month","url":"https://world-weather.info/forecast/usa/new_york/may-2025/","encrypted_content":"Ev0DCioIAxgCIiQ3NmU4ZmI4OC1k...","page_age":null},...]}}
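The distinction above matters because the client dispatches on the content block's `type`: `server_tool_use` blocks (and their `web_search_tool_result` follow-ups) are produced server-side by Anthropic models only, while plain `tool_use` blocks are handed back to the client to execute. A hypothetical sketch of that dispatch, not code from Claude Code or this PR:

```python
# Illustrative only: classify Anthropic content blocks the way a client
# like Claude Code conceptually distinguishes them.
def classify_block(content_block: dict) -> str:
    block_type = content_block.get("type")
    if block_type == "server_tool_use":
        # Executed on Anthropic's side (e.g. web_search); an OpenAI backend
        # never emits this type, which is why web search cannot work here.
        return "executed server-side by Anthropic"
    if block_type == "tool_use":
        return "returned to the client for local execution"
    return "other content"

print(classify_block({"type": "server_tool_use", "name": "web_search", "input": {}}))
print(classify_block({"type": "tool_use", "name": "search", "input": {}}))
```

Since a translated OpenAI backend can only ever emit `tool_use`, the `server_tool_use` path is unreachable, matching the behavior discussed below.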

@changminbark
Contributor Author

changminbark commented Feb 25, 2026

@ehfd It looks like the web search tool is an Anthropic-specific tool that only Claude models can generate as part of their output SSE, so even if the request from Claude Code says this tool is available, the OpenAI backend cannot generate the web-search-specific SSE response that Claude Code needs to perform the search. Claude Code can only conduct web searches with this specific tool defined by Anthropic. The way LiteLLM works around this is by intercepting a user-defined (not Anthropic-defined) web search tool and executing it server-side, as described here. Please correct me if my understanding is wrong.

@groundsada
Contributor

@changminbark Okay, I ran some tests. You're 100% right: this is a Claude Code-specific issue, and web search fails on both vLLM and Envoy. IMO, the PR looks great to merge. You can always ask CC to use a DDG MCP server and disable the web search tool, and it works flawlessly. One thing that is happening: on vLLM, web search fails and Claude Code moves on to different tools (fetch), but via Envoy it remains stuck in a loop of failing.

@ehfd

ehfd commented Feb 26, 2026

One thing that is happening: on vLLM, web search fails and Claude Code moves on to different tools (fetch), but via Envoy it remains stuck in a loop of failing.

This part does look worth addressing, though.

Thanks for all your efforts troubleshooting! @changminbark @groundsada

@changminbark
Contributor Author

One thing that is happening: on vLLM, web search fails and Claude Code moves on to different tools (fetch), but via Envoy it remains stuck in a loop of failing.

This part does look worth addressing, though.

It might be better to open up a new issue for this. I think this PR is getting quite big.

Thanks for all your efforts troubleshooting! @changminbark @groundsada

Thank you!

@ehfd

ehfd commented Feb 26, 2026

It might be better to open up a new issue for this. I think this PR is getting quite big.

OK, if possible we could get this merged first. Thank you!

@changminbark
Contributor Author

@ehfd #1843 still has to get merged first, because I introduced some updates to the Anthropic messages apischema there and need to validate the JSON marshaling/unmarshaling.

@ehfd

ehfd commented Feb 26, 2026

xref vllm-project/vllm#34887

Member

@nacx nacx left a comment

Thanks!

Overall LGTM. Just a couple minor comments and one comment about the test completeness.

@johnugeorge @yuzisun would you wanna do another review?

…ropic-openai-local.yaml

Signed-off-by: Chang Min <changminbark@gmail.com>
@changminbark changminbark requested a review from nacx February 26, 2026 23:41
@changminbark
Contributor Author

@nacx I think the failing e2e tests are not related to my changes. Please correct me if I'm wrong.

@nacx
Member

nacx commented Feb 28, 2026

/retest

@johnugeorge
Contributor

LGTM

Copy link
Contributor

@gavrissh gavrissh left a comment

lgtm

@nacx nacx enabled auto-merge (squash) March 2, 2026 17:01
@nacx nacx merged commit 4ad94b0 into envoyproxy:main Mar 2, 2026
34 checks passed

Labels

size:XXL This PR changes 1000+ lines, ignoring generated files.

8 participants