Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Codecov Report
❌ Patch coverage is
❌ Your project status has failed because the head coverage (80.90%) is below the target coverage (86.00%). You can increase the head coverage or adjust the target coverage.

```
@@ Coverage Diff @@
##             main    #1701      +/-   ##
==========================================
- Coverage   80.91%   80.90%   -0.01%
==========================================
  Files         146      147       +1
  Lines       13374    13378       +4
==========================================
+ Hits        10821    10823       +2
- Misses       1890     1891       +1
- Partials      663      664       +1
```
```
// The full text of the Apache license is available in the LICENSE file at
// the root of the repo.

package json
```
centralize the json functions here so that we can switch the impl later (if we ever want to) easily
nacx left a comment:
This is nice and LGTM!
Is there a linter rule we can configure to prevent using encoding/json and make sure we always use our internal package? It would be very easy to start using encoding/json again by mistake in the future.
Already added a linter rule ;)
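Such a rule can be expressed, for example, with golangci-lint's depguard linter. This is a hypothetical sketch of what that configuration could look like; the rule actually added in the repo may use a different linter or package path.

```yaml
# .golangci.yml (sketch)
linters-settings:
  depguard:
    rules:
      main:
        deny:
          - pkg: encoding/json
            desc: use the internal json package instead
```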
# Conflicts:
#	tests/internal/testupstreamlib/server_test.go
#	tests/internal/testupstreamlib/testupstream/main.go
Oh, true, I missed it :)
**Description**
This replaces encoding/json with bytedance/sonic for faster JSON operations, which drastically improves the data plane benchmark results, especially on the translation code path. One thing to note is that we could have used goccy/go-json instead, but it comes with some incompatibilities (no omitzero tag support, etc.). bytedance/sonic, on the other hand, is 99% compatible with the current behavior; the only difference is field order, which semantically doesn't matter in practice.
```
goos: linux
goarch: amd64
pkg: github.com/envoyproxy/ai-gateway/tests/data-plane
cpu: AMD Ryzen 9 7940HS w/ Radeon 780M Graphics
│ old.txt │ new.txt │
│ sec/op │ sec/op vs base │
ChatCompletions/OpenAI_-_small-16 135.0µ ± 1% 107.7µ ± 2% -20.24% (p=0.000 n=10)
ChatCompletions/OpenAI_-_medium-16 2.546m ± 1% 1.456m ± 1% -42.83% (p=0.000 n=10)
ChatCompletions/OpenAI_-_large-16 28.50m ± 7% 17.41m ± 4% -38.91% (p=0.000 n=10)
ChatCompletions/OpenAI_-_xlarge-16 141.24m ± 3% 71.07m ± 7% -49.68% (p=0.000 n=10)
ChatCompletions/AWS_Bedrock_-_small-16 155.5µ ± 1% 120.4µ ± 1% -22.57% (p=0.000 n=10)
ChatCompletions/AWS_Bedrock_-_medium-16 3.475m ± 1% 2.004m ± 1% -42.33% (p=0.000 n=10)
ChatCompletions/AWS_Bedrock_-_large-16 37.01m ± 2% 24.08m ± 3% -34.93% (p=0.000 n=10)
ChatCompletions/AWS_Bedrock_-_xlarge-16 346.1m ± 3% 100.2m ± 5% -71.04% (p=0.000 n=10)
ChatCompletions/GCP_VertexAI_-_small-16 172.3µ ± 1% 141.9µ ± 2% -17.63% (p=0.000 n=10)
ChatCompletions/GCP_VertexAI_-_medium-16 4.371m ± 1% 2.827m ± 1% -35.31% (p=0.000 n=10)
ChatCompletions/GCP_VertexAI_-_large-16 51.21m ± 3% 32.67m ± 3% -36.19% (p=0.000 n=10)
ChatCompletions/GCP_VertexAI_-_xlarge-16 344.9m ± 2% 102.3m ± 2% -70.33% (p=0.000 n=10)
ChatCompletions/GCP_AnthropicAI_-_small-16 232.4µ ± 1% 198.4µ ± 1% -14.64% (p=0.000 n=10)
ChatCompletions/GCP_AnthropicAI_-_medium-16 9.098m ± 1% 7.513m ± 2% -17.43% (p=0.000 n=10)
ChatCompletions/GCP_AnthropicAI_-_large-16 102.14m ± 4% 83.27m ± 3% -18.48% (p=0.000 n=10)
ChatCompletions/GCP_AnthropicAI_-_xlarge-16 2.014 ± 3% 1.450 ± 2% -27.99% (p=0.000 n=10)
ChatCompletionsStreaming/OpenAI_Streaming-16 13.11m ± 0% 13.12m ± 1% ~ (p=0.190 n=10)
ChatCompletionsStreaming/AWS_Streaming-16 13.11m ± 0% 13.10m ± 0% ~ (p=0.353 n=10)
geomean 11.33m 7.424m -34.50%
```
---------
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>