feat: use bytedance/sonic instead of stdlib#1701

Merged
mathetake merged 5 commits into main from sonic
Dec 30, 2025

Conversation

@mathetake
Member

@mathetake mathetake commented Dec 29, 2025

Description

This replaces encoding/json with bytedance/sonic for faster JSON operations. It drastically improves the data-plane benchmark results, especially on the translation code path. One thing to note: we could use goccy/go-json instead, but it comes with some incompatibilities (no omitzero tag support, etc.). In contrast, bytedance/sonic is 99% compatible with the current behavior except for field ordering, which is semantically irrelevant in practice.

goos: linux
goarch: amd64
pkg: github.com/envoyproxy/ai-gateway/tests/data-plane
cpu: AMD Ryzen 9 7940HS w/ Radeon 780M Graphics     
                                             │   old.txt    │               new.txt               │
                                             │    sec/op    │   sec/op     vs base                │
ChatCompletions/OpenAI_-_small-16               135.0µ ± 1%   107.7µ ± 2%  -20.24% (p=0.000 n=10)
ChatCompletions/OpenAI_-_medium-16              2.546m ± 1%   1.456m ± 1%  -42.83% (p=0.000 n=10)
ChatCompletions/OpenAI_-_large-16               28.50m ± 7%   17.41m ± 4%  -38.91% (p=0.000 n=10)
ChatCompletions/OpenAI_-_xlarge-16             141.24m ± 3%   71.07m ± 7%  -49.68% (p=0.000 n=10)
ChatCompletions/AWS_Bedrock_-_small-16          155.5µ ± 1%   120.4µ ± 1%  -22.57% (p=0.000 n=10)
ChatCompletions/AWS_Bedrock_-_medium-16         3.475m ± 1%   2.004m ± 1%  -42.33% (p=0.000 n=10)
ChatCompletions/AWS_Bedrock_-_large-16          37.01m ± 2%   24.08m ± 3%  -34.93% (p=0.000 n=10)
ChatCompletions/AWS_Bedrock_-_xlarge-16         346.1m ± 3%   100.2m ± 5%  -71.04% (p=0.000 n=10)
ChatCompletions/GCP_VertexAI_-_small-16         172.3µ ± 1%   141.9µ ± 2%  -17.63% (p=0.000 n=10)
ChatCompletions/GCP_VertexAI_-_medium-16        4.371m ± 1%   2.827m ± 1%  -35.31% (p=0.000 n=10)
ChatCompletions/GCP_VertexAI_-_large-16         51.21m ± 3%   32.67m ± 3%  -36.19% (p=0.000 n=10)
ChatCompletions/GCP_VertexAI_-_xlarge-16        344.9m ± 2%   102.3m ± 2%  -70.33% (p=0.000 n=10)
ChatCompletions/GCP_AnthropicAI_-_small-16      232.4µ ± 1%   198.4µ ± 1%  -14.64% (p=0.000 n=10)
ChatCompletions/GCP_AnthropicAI_-_medium-16     9.098m ± 1%   7.513m ± 2%  -17.43% (p=0.000 n=10)
ChatCompletions/GCP_AnthropicAI_-_large-16     102.14m ± 4%   83.27m ± 3%  -18.48% (p=0.000 n=10)
ChatCompletions/GCP_AnthropicAI_-_xlarge-16      2.014 ± 3%    1.450 ± 2%  -27.99% (p=0.000 n=10)
ChatCompletionsStreaming/OpenAI_Streaming-16    13.11m ± 0%   13.12m ± 1%        ~ (p=0.190 n=10)
ChatCompletionsStreaming/AWS_Streaming-16       13.11m ± 0%   13.10m ± 0%        ~ (p=0.353 n=10)
geomean                                         11.33m        7.424m       -34.50%

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
@codecov-commenter

codecov-commenter commented Dec 29, 2025

Codecov Report

❌ Patch coverage is 83.33333% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.90%. Comparing base (1737691) to head (532e994).

Files with missing lines | Patch % | Lines
internal/json/json.go | 50.00% | 1 Missing and 1 partial ⚠️

❌ Your project status has failed because the head coverage (80.90%) is below the target coverage (86.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1701      +/-   ##
==========================================
- Coverage   80.91%   80.90%   -0.01%     
==========================================
  Files         146      147       +1     
  Lines       13374    13378       +4     
==========================================
+ Hits        10821    10823       +2     
- Misses       1890     1891       +1     
- Partials      663      664       +1     

☔ View full report in Codecov by Sentry.

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
@mathetake
Member Author

🤞

// The full text of the Apache license is available in the LICENSE file at
// the root of the repo.

package json
Member Author


centralize the json functions here so that we can switch the impl later (if we ever want to) easily
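
The centralization pattern can be sketched as follows. This is an illustrative, self-contained sketch, not the actual internal/json file: the real package points these aliases at github.com/bytedance/sonic, while the sketch points them at the stdlib so it compiles without third-party dependencies.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// In the real internal/json package, these aliases point at
// bytedance/sonic (sonic.Marshal / sonic.Unmarshal). Call sites
// import the internal package instead of encoding/json, so swapping
// the underlying implementation later is a change in one file.
var (
	Marshal   = json.Marshal
	Unmarshal = json.Unmarshal
)

func main() {
	type msg struct {
		Role    string `json:"role"`
		Content string `json:"content"`
	}
	// Encode through the centralized alias rather than the stdlib directly.
	b, err := Marshal(msg{Role: "user", Content: "hi"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))

	// Decode through the centralized alias as well.
	var m msg
	if err := Unmarshal(b, &m); err != nil {
		panic(err)
	}
	fmt.Println(m.Role, m.Content)
}
```

Because callers only see the internal package's `Marshal`/`Unmarshal`, the sonic-vs-stdlib choice stays a private implementation detail.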

@mathetake mathetake marked this pull request as ready for review December 29, 2025 20:11
@mathetake mathetake requested a review from a team as a code owner December 29, 2025 20:11
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Dec 29, 2025
Member

@nacx nacx left a comment


This is nice and LGTM!

Is there a linter rule we can configure to prevent using encoding/json and ensure we always use our internal package? It would be very easy to start using encoding/json by mistake in the future.

@mathetake
Member Author

Already added a linter rule ;)
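
For reference, one common way to enforce this with golangci-lint is a depguard rule that denies direct imports of encoding/json; the rule name and message below are illustrative, and the actual rule in the repo may differ:

```yaml
linters-settings:
  depguard:
    rules:
      internal-json:
        deny:
          - pkg: encoding/json
            desc: "use the internal json package (which wraps bytedance/sonic) instead"
```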

# Conflicts:
#	tests/internal/testupstreamlib/server_test.go
#	tests/internal/testupstreamlib/testupstream/main.go
@nacx
Member

nacx commented Dec 30, 2025

Already added a linter rule ;)

Oh, true, I missed it :)

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
@mathetake mathetake enabled auto-merge (squash) December 30, 2025 16:48
@mathetake mathetake merged commit 3c0951e into main Dec 30, 2025
32 checks passed
@mathetake mathetake deleted the sonic branch December 30, 2025 16:57
hustxiayang pushed a commit to hustxiayang/ai-gateway that referenced this pull request Jan 29, 2026
hustxiayang pushed a commit to hustxiayang/ai-gateway that referenced this pull request Feb 2, 2026
hustxiayang pushed a commit to hustxiayang/ai-gateway that referenced this pull request Feb 5, 2026

Labels

size:L This PR changes 100-499 lines, ignoring generated files.


3 participants