Improve large buffers and demonstrate with OpenAI protocol support by grcevski · Pull Request #1353 · open-telemetry/opentelemetry-ebpf-instrumentation

grcevski · 2026-02-24T01:22:37Z

This PR improves large buffer support by also capturing responses in large buffers, and demonstrates this by implementing the first GenAI protocol: OpenAI.

There are a couple of things I had to do to make this happen:

We now delay HTTPS requests just like HTTP requests. I need to see if we can cleanly pass our SSL test suite. I believe we had resolved all issues with finding the end of TLS requests, but we'll see.
We now count the request and response sizes correctly.
I enabled capturing buffers larger than 32K by splitting them and shipping more than one.
OpenAI responds with gzip bodies, so I had to add generic parsing of compressed HTTP payloads (gzip, brotli, deflate and zstd).
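
To illustrate the last item: before a compressed body can be parsed, a decoder has to be picked from the Content-Encoding header. The sketch below maps the standard content-coding tokens (gzip, br, deflate, zstd) to an enum. The names content_encoding and classify_encoding are hypothetical and not taken from this PR. The sketch assumes a single, already trimmed token rather than a comma-separated list, and it leaves the actual decompression to whatever library userspace uses.

```c
#include <stddef.h>  /* NULL */
#include <strings.h> /* strcasecmp (POSIX) */

/* Hypothetical enum, for illustration only; not the PR's actual types. */
enum content_encoding {
    ENC_IDENTITY = 0, /* no compression, or header absent */
    ENC_GZIP,
    ENC_BROTLI,
    ENC_DEFLATE,
    ENC_ZSTD,
    ENC_UNKNOWN /* a coding this sketch does not recognize */
};

/* Map one Content-Encoding token to a decoder choice.
 * Content-coding names are case-insensitive, so compare without case. */
enum content_encoding classify_encoding(const char *value) {
    if (value == NULL || value[0] == '\0' || strcasecmp(value, "identity") == 0) {
        return ENC_IDENTITY;
    }
    if (strcasecmp(value, "gzip") == 0) {
        return ENC_GZIP;
    }
    if (strcasecmp(value, "br") == 0) {
        return ENC_BROTLI;
    }
    if (strcasecmp(value, "deflate") == 0) {
        return ENC_DEFLATE;
    }
    if (strcasecmp(value, "zstd") == 0) {
        return ENC_ZSTD;
    }
    return ENC_UNKNOWN;
}
```

Returning a distinct ENC_UNKNOWN lets the caller skip body parsing for codings it cannot decode instead of treating compressed bytes as plain text.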

And finally, since GenAI support in OTel SDKs is generally in its infancy, we can really spearhead the OTel support and get cross-language GenAI observability with OBI. We can extend what I did with OpenAI to Anthropic, AWS and others, and we'll have a pretty complete solution as more and more GenAI workloads are being developed and used.

This PR adds only traces support. GenAI spec Metrics will follow.

A big chunk of this PR is just tests. I had to create a mock OpenAI server and wrapper client programs to ensure we capture the payloads correctly.

Relates to #1134

codecov · 2026-02-24T02:23:53Z

Codecov Report

❌ Patch coverage is 36.31436% with 235 lines in your changes missing coverage. Please review.
✅ Project coverage is 43.67%. Comparing base (08677b8) to head (f5a8208).
⚠️ Report is 4 commits behind head on main.

Files with missing lines	Patch %	Lines
...tegration/components/ai/openai/mock-server/main.go	0.00%	145 Missing ⚠️
pkg/appolly/app/request/span_getters.go	0.00%	18 Missing ⚠️
pkg/appolly/app/request/span.go	67.34%	16 Missing ⚠️
pkg/ebpf/common/http/openai.go	55.55%	13 Missing and 3 partials ⚠️
pkg/export/otel/tracesgen/tracesgen.go	68.29%	11 Missing and 2 partials ⚠️
pkg/ebpf/common/http_transform.go	52.00%	10 Missing and 2 partials ⚠️
pkg/ebpf/common/http/responses.go	73.17%	9 Missing and 2 partials ⚠️
pkg/ebpf/common/tcp_large_buffer.go	0.00%	2 Missing ⚠️
internal/test/integration/red_test_python_aws.go	0.00%	1 Missing ⚠️
pkg/internal/ebpf/generictracer/generictracer.go	66.66%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1353      +/-   ##
==========================================
- Coverage   43.75%   43.67%   -0.08%     
==========================================
  Files         308      311       +3     
  Lines       33495    33851     +356     
==========================================
+ Hits        14656    14786     +130     
- Misses      17897    18116     +219     
- Partials      942      949       +7

Flag	Coverage Δ
integration-test	`21.74% <5.08%> (+0.07%)`	⬆️
integration-test-arm	`0.00% <0.00%> (ø)`
integration-test-vm-x86_64-5.15.152	`0.00% <0.00%> (ø)`
integration-test-vm-x86_64-6.10.6	`0.00% <0.00%> (ø)`
k8s-integration-test	`2.31% <0.00%> (-0.02%)`	⬇️
oats-test	`0.00% <0.00%> (ø)`
unittests	`44.56% <40.55%> (-0.04%)`	⬇️

NimrodAvni78 · 2026-02-24T08:51:08Z

@grcevski this is really great!
do you have an example of a span with all the attributes we add?
i can try to infer from the tests but an example will really help see all of it

grcevski · 2026-02-24T14:55:59Z

Sure, let me paste some screenshots here:

grcevski · 2026-02-25T21:00:01Z

@rafaelroquetto @mmat11 I believe I've addressed the feedback, please check again when you can. I added a unit test for the loop. The test run now comes back with:

========================================
Test Summary
========================================
Total Tests:  55
Passed:       55
Failed:       0
========================================
✓ All tests passed!

rafaelroquetto

Sorry, I missed a few details since last review.

rafaelroquetto · 2026-02-25T21:05:34Z


 #pragma once

+#include <bpfcore/utils.h>


One last nit: this may be passing only because protocol_http.h is being indirectly included, but theoretically this include needs to go below vmlinux.h, since that is what defines types such as u16.

I'd just move this after line 9.

rafaelroquetto · 2026-02-25T21:17:36Z

+    // limit by the userspace requested size
+    if (available_bytes > http_buffer_size) {
+        available_bytes = http_buffer_size;
    }


I might be misunderstanding so please bear with me.

http_buffer_size is always meant to be less than k_large_buf_payload_max_size (i.e. k_large_buf_payload_max_size is a ceiling).

So capping available_bytes to http_buffer_size means that you will always end up sending a single large buffer (niter == 1). I am assuming the intent here is to slice available_bytes into N large buffers, so I think this block should be removed. Then see below.

grcevski · 2026-02-25

Not necessarily. It's set by userspace, and while there's a cap on the config setting, I don't want to leave it up to userspace to decide.

User space can set 2K, or 200K. If it's 2K we should only send 2K. If it's 200K, it should send 64K.
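
The rule described in this reply can be sketched as a double clamp: never ship more than is available, more than userspace requested, or more than a fixed ceiling. The 64K ceiling is taken from the example above. The names LARGE_BUF_CEILING and capture_size are hypothetical, and this is plain C for illustration, not the eBPF code from the PR.

```c
#include <stdint.h>

/* Assumed value for illustration: the reply above mentions 64K. */
#define LARGE_BUF_CEILING (64u * 1024u)

/* Return how many bytes to ship: at most what is available,
 * at most what userspace requested, and at most the ceiling. */
uint32_t capture_size(uint32_t available, uint32_t requested) {
    uint32_t n = available;
    if (n > requested) {
        n = requested; /* respect the userspace setting */
    }
    if (n > LARGE_BUF_CEILING) {
        n = LARGE_BUF_CEILING; /* do not trust userspace beyond the ceiling */
    }
    return n;
}
```

With 300000 bytes available, a 2K request yields 2048 bytes and a 200K request yields 65536 bytes, matching the two cases in the reply.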

rafaelroquetto · 2026-02-25T21:22:11Z

-    req->has_large_buffers = true;
+    int b = 0;
+    for (; b < niter; b++) {
+        const u32 offset = b * k_large_buf_payload_max_size;


and then this becomes

Suggested change

-        const u32 offset = b * k_large_buf_payload_max_size;
+        const u32 offset = b * http_buffer_size;

otherwise your stride is potentially larger than http_buffer_size and you skip bytes. I think this only worked so far because we are consistently using k_large_buf_payload_max_size to read, meaning we are not respecting http_buffer_size and always sending the maximum number of bytes
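
To make the stride concern concrete, the sketch below counts how many payload bytes are never copied when each iteration reads read_size bytes starting at b * stride. The function skipped_bytes is a hypothetical helper written for this illustration and is not part of the PR; it assumes sizes small enough that the 32-bit offsets do not wrap.

```c
#include <stdint.h>

/* Count bytes of a `total`-byte payload that no iteration copies when
 * iteration b reads `read_size` bytes starting at offset b * stride. */
uint32_t skipped_bytes(uint32_t total, uint32_t stride, uint32_t read_size) {
    uint32_t skipped = 0;
    uint32_t covered_end = 0; /* first byte not yet copied */

    if (stride == 0) {
        return total; /* invalid stride: treat as nothing copied */
    }
    for (uint32_t offset = 0; offset < total; offset += stride) {
        if (offset > covered_end) {
            skipped += offset - covered_end; /* gap before this read */
        }
        uint32_t end = offset + read_size;
        if (end > total) {
            end = total;
        }
        if (end > covered_end) {
            covered_end = end;
        }
    }
    if (total > covered_end) {
        skipped += total - covered_end; /* tail after the last read */
    }
    return skipped;
}
```

For a 100-byte payload, a stride of 64 with 32-byte reads leaves 36 bytes uncopied, while a stride equal to the read size leaves none.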

rafaelroquetto · 2026-02-25T21:24:01Z

+    int b = 0;
+    for (; b < niter; b++) {
+        const u32 offset = b * k_large_buf_payload_max_size;
+        if (offset >= k_large_buffer_read_limit) {


Suggested change

-        if (offset >= k_large_buffer_read_limit) {
+        if (offset + read_size >= k_large_buffer_read_limit) {

if we can read at most k_large_buffer_read_limit in total, we need to account for the bytes already read (i.e. offset bytes) + the bytes we are about to read, otherwise we can overflow.
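
A minimal sketch of that check, in plain C rather than eBPF: the loop below stops as soon as the end of the next read would pass the limit, instead of testing only where the read starts. The name bounded_copy_total and its parameters are hypothetical. Note that this sketch uses a strict > so a read ending exactly at the limit is still allowed, whereas the suggestion above uses >=, which also rejects that case.

```c
#include <stdint.h>

/* Return the total number of bytes a chunked copy transfers when each
 * chunk holds at most `chunk` bytes and reads may not go past `limit`. */
uint32_t bounded_copy_total(uint32_t available, uint32_t chunk, uint32_t limit) {
    uint32_t copied = 0;

    if (chunk == 0) {
        return 0;
    }
    for (uint32_t offset = 0; offset < available; offset += chunk) {
        uint32_t read_size = available - offset;
        if (read_size > chunk) {
            read_size = chunk;
        }
        /* Check the end of the read, not just its start. */
        if (offset + read_size > limit) {
            break;
        }
        copied += read_size;
    }
    return copied;
}
```

With 100 bytes available, 40-byte chunks and a limit of 90, the third read would span bytes 80 to 100, so the loop stops at 80 bytes. A start-only check (80 >= 90 is false) would have let that read through and gone past the limit.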

rafaelroquetto

Discussed offline, follow up PRs to come.

grcevski added 10 commits February 19, 2026 19:43

WIP: Improvements to large buffer handling

8e872fb

implement support for compressed responses

4004bc7

add traces support for openAI

e5c2621

add test code

8aad6d4

add support for conversations, items, data

e4269bf

more fixes

7cf980b

make sensitive attributes optional, add tests

b966704

more tests

d56db26

integration test

8d60460

lints and fixes

51985af

grcevski requested a review from a team as a code owner February 24, 2026 01:22

grcevski added 7 commits February 24, 2026 09:58

Merge branch 'main' into improve_large_buffers

14a3af9

fix bugs

2bb9d35

update config schema

80a40e2

update notices

9c3cd13

fix merge issue

18af86a

fix icoverage grep failure

5ef78b6

better fix

eb58d99

mmat11 reviewed Feb 25, 2026

Comment thread pkg/internal/ebpf/generictracer/generictracer.go

rafaelroquetto reviewed Feb 25, 2026

Comment thread bpf/generictracer/protocol_http.h Outdated

Comment thread pkg/ebpf/common/http/openai.go

Comment thread pkg/ebpf/common/http/openai.go Outdated

grcevski added 2 commits February 25, 2026 15:17

fix bug, unit test, code review comments

1b19f9f

improve loop

ef580fb

mmat11 reviewed Feb 25, 2026

grcevski and others added 3 commits February 25, 2026 15:46

Update pkg/ebpf/common/http/responses.go

dc3966c

Co-authored-by: Mattia Meleleo <melmat@tuta.io>

more review feedback

7b44dbb

Merge branch 'improve_large_buffers' of github.com:grcevski/opentelemetry-ebpf-instrumentation into improve_large_buffers

7056ae9

grcevski added 2 commits February 25, 2026 15:55

review feedback

9a41f80

fix

8fd63df

remove binary file

5d81bcf

mmat11 reviewed Feb 25, 2026

Comment thread pkg/config/ebpf_tracer.go Outdated

mmat11 reviewed Feb 25, 2026

Comment thread bpf/generictracer/protocol_http.h

mmat11 approved these changes Feb 25, 2026

update comment

bf6ec0e

rafaelroquetto reviewed Feb 25, 2026

update schema

9ebf022

rafaelroquetto approved these changes Feb 25, 2026

fix parsing issue

f5a8208

grcevski merged commit ac770dc into open-telemetry:main Feb 25, 2026
82 of 84 checks passed

grcevski deleted the improve_large_buffers branch February 25, 2026 23:04

MrAlias added this to the v0.6.0 milestone Mar 2, 2026

MrAlias mentioned this pull request Mar 5, 2026

Release v0.6.0 #1478

Merged
