Add support for python asyncio#966
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #966 +/- ##
==========================================
+ Coverage 43.10% 43.12% +0.01%
==========================================
Files 299 299
Lines 32120 32134 +14
==========================================
+ Hits 13846 13858 +12
- Misses 17380 17383 +3
+ Partials 894 893 -1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
de423f7 to
f0310ae
Compare
|
@marctc, I think what would really help is some simple diagram showing what happens at what point and how do we track the context, or an explanation step by step. The reason that I'm confused is that the original logs shown by the bpftrace program that Aaron wrote, things are not the same here in this implementation. What I understood from those logs is that:
|
| int BPF_UPROBE(obi_uprobe_context_run, void *context) { | ||
| (void)ctx; | ||
|
|
||
| u64 id = bpf_get_current_pid_tgid(); |
There was a problem hiding this comment.
| u64 id = bpf_get_current_pid_tgid(); | |
| const u64 id = bpf_get_current_pid_tgid(); |
There was a problem hiding this comment.
Looks like it is not resolved
1ae8db3 to
71e2500
Compare
be92eb8 to
13af79b
Compare
rafaelroquetto
left a comment
There was a problem hiding this comment.
Please correct me if I am wrong, but my understanding is that you try to build a parent/child relationship between python threads.
If that's the case, I think it may be possible to rely only on bpf_get_pid_tgid() aka id.
So in context_new_from_vars_ret (I am assuming this is where a new context is created in the parent thread) you can associated the new context with the parent, i.e.
map 1 <ctx_ptr, id (of the parent)>
then when the context runs (in context_run) this is where you map the context pointer to its own thread id, so:
map 2 <id, ctx_ptr>
then to find the parent trace:
- get the current thread id via
bpf_get_pid_tgid()(akaid) - build t_key and query server_traces
- if nothing yields, then go one level deep: get the
ctx_ptrusingid(map 1), then use thisctx_ptrto get theidof the parent (map 2). Then goto 2.
Hopefully I got it right.
| if (context) { | ||
| bpf_dbg_printk( | ||
| "extra_runtime_id: LOOKUP python_current_context[host_id=%llx] = %llx", id, *context); | ||
| return (u64)(*context); |
There was a problem hiding this comment.
| return (u64)(*context); | |
| return *context; |
| #include <maps/python_thread_context.h> | ||
| #include <maps/python_context_trace.h> |
There was a problem hiding this comment.
nit: includes are ordered alphabetically (at least in a group)
| static __always_inline u64 extra_runtime_id() { | ||
| const u64 id = bpf_get_current_pid_tgid(); | ||
|
|
||
| u64 *context = bpf_map_lookup_elem(&python_current_context, &id); |
There was a problem hiding this comment.
I reckon this is only relevant for Python, so I'd move this into make_trace_key in python.c and leave this alone.
Rationale:
- improves readability by making it explicit these are orthogonal
- improves performance by not wasting time on map lookups when extra_runtime_id is called for non-python processes
- even though it should be impossible in practice, it would conceptually rule out a non python process from interacting with
python_current_context
| if (context) { | ||
| t_key.extra_id = (u64)context; | ||
| } else { | ||
| t_key.extra_id = extra_runtime_id(); |
There was a problem hiding this comment.
so with regards to my comments above, here you can lookup python_current_context explicitly, and then fall back to extra_runtime_id() if it is null
|
|
||
| #include <pid/pid.h> | ||
|
|
||
| static __always_inline trace_key_t make_trace_key(void *context) { |
There was a problem hiding this comment.
| static __always_inline trace_key_t make_trace_key(void *context) { | |
| static __always_inline trace_key_t make_trace_key(const void *context) { |
| int BPF_UPROBE(obi_uprobe_context_run, void *context) { | ||
| (void)ctx; | ||
|
|
||
| u64 id = bpf_get_current_pid_tgid(); |
There was a problem hiding this comment.
Looks like it is not resolved
| bpf_map_update_elem(&python_thread_context, &t_key, &context, BPF_ANY); | ||
| bpf_map_update_elem(&python_current_context, &id, &context, BPF_ANY); |
There was a problem hiding this comment.
do we really need 2 maps here? id is basically tid_tgid (so unique) mapped to a context pointer.
But then t_key is basically the same information (the thread/task id encoded differently) mapped to a context (and more confusingly the extra_id is the context itself) - so basically, to look up the value in this map (the context) you need a key that contains the context -> so logically you don't need this map because you already know the context.
I hope that makes sense, perhaps I am misunderstanding it.
|
|
||
| SEC("uprobe/libpython3.so:context_new_from_vars_ret") | ||
| int obi_uprobe_context_new_from_vars_ret(struct pt_regs *ctx) { | ||
| u64 id = bpf_get_current_pid_tgid(); |
There was a problem hiding this comment.
| u64 id = bpf_get_current_pid_tgid(); | |
| const u64 id = bpf_get_current_pid_tgid(); |
| return 0; | ||
| } | ||
|
|
||
| const trace_key_t t_key = make_trace_key(0); |
There was a problem hiding this comment.
If I understand this correctly, this is where you associate this context with its parent?
If so, couldn't you do something like:
- bpf_map_lookup_elem(&python_thread_context, id, ...) to lookup the context for this thread? I feel this is exactly what the current implementation ends up doing once it calls
extra_runtime_id, just in a more convoluted fashion.
| } | ||
|
|
||
| // Not this thread's server request, try Python context chain | ||
| u64 *context_ptr = bpf_map_lookup_elem(&python_thread_context, &key); |
There was a problem hiding this comment.
ok so here instead of key, I believe we could just pass id (pid_tgid)
|
Closing for now. Other approached rathen this are being discussed and this doesn't seem to be the more solid. |
This PR adds support to track context creation of python asyncio framework in order to add this information of trace context propagation.
For this case it only uses basic asyncio with aiohttp.
Thanks @aabmass for the guidance.
Paired with @grcevski