Improve large buffers and demonstrate with OpenAI protocol support #1353
Codecov Report
@@ Coverage Diff @@
## main #1353 +/- ##
==========================================
- Coverage 43.75% 43.67% -0.08%
==========================================
Files 308 311 +3
Lines 33495 33851 +356
==========================================
+ Hits 14656 14786 +130
- Misses 17897 18116 +219
- Partials 942 949 +7
@grcevski this is really great!
Sure, let me paste some screenshots here:
Co-authored-by: Mattia Meleleo <melmat@tuta.io>
…etry-ebpf-instrumentation into improve_large_buffers
@rafaelroquetto @mmat11 I believe I've addressed the feedback, please check again when you can. I added a unit test for the loop. Comes back now with:
rafaelroquetto left a comment
Sorry, I missed a few details since last review.
#pragma once

#include <bpfcore/utils.h>
one last nit - this may be passing because protocol_http.h is being indirectly included, but theoretically this needs to go under vmlinux.h as it is what defines types such as u16 and what not.
I'd just move this after line 9
// limit by the userspace requested size
if (available_bytes > http_buffer_size) {
    available_bytes = http_buffer_size;
}
I might be misunderstanding so please bear with me.
http_buffer_size is always meant to be less than k_large_buf_payload_max_size (i.e. k_large_buf_payload_max_size is a ceiling).
So capping available_bytes to http_buffer_size means that you will always end up sending a single large buffer (niter == 1) and I am assuming the intent here is to slice available_bytes into N large buffers, so I think this block should be removed - then see below.
not necessarily, it's set by userspace and while there's a cap on the config setting, I don't want to leave it up to userspace to decide.
User space can set 2K, or 200K. If it's 2K we should only send 2K. If it's 200K, it should send 64K.
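The clamping described in this reply can be sketched in plain C roughly as follows. This is only an illustration: the helper name `clamp_available_bytes` and the constant `k_kernel_cap` are hypothetical, and the 64 KiB value is taken from the "200K should send 64K" example above rather than from the PR's source.

```c
#include <stdint.h>

/* Hypothetical kernel-side ceiling. 64 KiB comes from the "200K -> 64K"
 * example in the comment above, not from the actual PR code. */
enum { k_kernel_cap = 64 * 1024 };

/* Returns how many bytes to send: never more than what is available,
 * what userspace requested, or the kernel-side ceiling. */
static uint32_t clamp_available_bytes(uint32_t available_bytes,
                                      uint32_t http_buffer_size) {
    /* limit by the userspace requested size */
    if (available_bytes > http_buffer_size) {
        available_bytes = http_buffer_size;
    }
    /* do not rely on userspace to stay below the ceiling */
    if (available_bytes > k_kernel_cap) {
        available_bytes = k_kernel_cap;
    }
    return available_bytes;
}
```

With these assumed values, a 2K setting yields at most 2K and a 200K setting yields at most 64K, matching the behaviour described in the reply.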
req->has_large_buffers = true;
int b = 0;
for (; b < niter; b++) {
    const u32 offset = b * k_large_buf_payload_max_size;
and then this becomes
- const u32 offset = b * k_large_buf_payload_max_size;
+ const u32 offset = b * http_buffer_size;
otherwise your stride is potentially larger than http_buffer_size and you skip bytes. I think this only worked so far because we are consistently using k_large_buf_payload_max_size to read, meaning we are not respecting http_buffer_size and always sending the maximum number of bytes
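To make the skipped-bytes concern concrete, here is a small userspace sketch that counts how many bytes a chunked loop actually covers when the stride and the per-iteration read size differ. The helper `bytes_covered` and the numbers in the usage note are hypothetical, and the sketch assumes the stride is at least as large as the read size so that chunks never overlap.

```c
#include <stdint.h>

/* Counts how many of `total` bytes are copied when each iteration reads up
 * to `read_size` bytes starting at offset b * stride.
 * Hypothetical helper; assumes stride >= read_size (no overlapping chunks). */
static uint32_t bytes_covered(uint32_t total, uint32_t read_size,
                              uint32_t stride) {
    uint32_t covered = 0;
    if (stride == 0) {
        return 0; /* a zero stride would never advance */
    }
    for (uint32_t offset = 0; offset < total; offset += stride) {
        const uint32_t remaining = total - offset;
        /* the last chunk may be shorter than read_size */
        covered += remaining < read_size ? remaining : read_size;
    }
    return covered;
}
```

For example, `bytes_covered(10000, 2048, 2048)` covers all 10000 bytes, while `bytes_covered(10000, 2048, 4096)` covers only 5904, because every other 2048-byte window is stepped over.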
int b = 0;
for (; b < niter; b++) {
    const u32 offset = b * k_large_buf_payload_max_size;
    if (offset >= k_large_buffer_read_limit) {
- if (offset >= k_large_buffer_read_limit) {
+ if (offset + read_size >= k_large_buffer_read_limit) {
if we can read at most k_large_buffer_read_limit in total, we need to account for the bytes already read (i.e. offset bytes) + the bytes we are about to read, otherwise we can overflow.
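The difference between the two checks can be seen in a small userspace sketch that totals the bytes read by a loop of fixed-size reads under each condition. The function `total_read`, its `account_for_read` flag, and the numbers in the usage note are hypothetical and only illustrate the comment above; they are not the PR's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Total bytes read by a loop of `niter` fixed-size reads under a total limit.
 * account_for_read == false: stop when offset >= limit (the original check).
 * account_for_read == true:  stop when offset + read_size >= limit
 *                            (the check suggested in the review). */
static uint32_t total_read(uint32_t niter, uint32_t read_size, uint32_t limit,
                           bool account_for_read) {
    uint32_t total = 0;
    for (uint32_t b = 0; b < niter; b++) {
        const uint32_t offset = b * read_size;
        const uint32_t end = account_for_read ? offset + read_size : offset;
        if (end >= limit) {
            break; /* this read is not allowed under the chosen check */
        }
        total += read_size;
    }
    return total;
}
```

With a limit of 10000 and 4096-byte reads, the original check lets the loop read 12288 bytes in total, while the suggested check stops at 8192. Note that the suggested `>=` also rejects a read that would end exactly at the limit.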
rafaelroquetto left a comment
Discussed offline, follow up PRs to come.



This PR improves large buffer support by capturing responses in large buffers too, and demonstrates this by implementing the first GenAI protocol: OpenAI.
There are a couple of things that I had to do to make this happen:
And finally, since GenAI support in OTel SDKs is in general still in its infancy, we can really spearhead the OTel support and get cross-language GenAI observability with OBI. We can extend what I did with OpenAI to Anthropic, AWS and others, and we'll have a pretty complete solution as more and more GenAI workloads are being developed and used.
This PR adds only traces support. GenAI spec Metrics will follow.
A big chunk of this PR is just tests. I had to create a mock OpenAI server and wrapper client programs to ensure we capture the payloads correctly.
Relates to #1134