-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Drop hello_world return 0 #139
Comments
this has been done. closing ticket. |
Actually, this is probably working on accident. I was delinquent in not responding to this issue with an explanation earlier, and I didn't notice this in the code change a few days back. It is not advised to return an undefined value, since it will be used by the different kernel hookpoints (socket, kprobe, etc.) in different ways. In the case of socket, the return value is used to denote whether to capture the packet to userspace or not. In the case of kprobe, the return value is used to determine whether the perf event that may be attached to the same kprobe should be recorded or not. So, an undefined return value may cause quirky perf behavior, if they are running concurrently. I would propose that a void return value should only be allowed for function types that have special handling to mangle the return value to be 0 or whichever sane default is appropriate for that function type. I can add this support for kprobes. |
Not sure I understand what you mean: so the k[ret]probe__ functions could be made void safe, but others should return something properly? |
Let me rephrase. All BPF programs must leave a valid value in R0. All kernel hooks that run a BPF program interpret R0 in some way. If we let BCC allow void programs, it will have to internally stuff in a default return value, since we don't want undefined behavior. The choice of 0 for the default value is the most universal one, and since we are focusing on probes I feel comfortable allowing this default for those types of programs (non-zero is the error/skip case for probes), but not as comfortable for the others. For instance, for SCHED_CLS, the return value currently has a specific meaning depending on how the program is stitched in the tc qdisc/filter/action chain. Additionally, non-kprobe program types usually aren't one-liners, so it isn't a great burden to keep program return type required. |
I'm fine with returning 0 for now, eg, changing hello_world (and others) back to: from bcc import BPF
BPF(text='int kprobe__sys_clone(void *ctx) { bpf_trace_printk("Hello, World!\\n"); return 0; }').trace_print() We'd just need to be clear that it has value and what it does. I'm still not clear what happens if I return 1 in this example. hello_world.py needs to be 100% explainable, and make 100% sense. :) |
I just got a better explanation from Alexei, so now hopefully my explanation makes more sense as well. Let's try an example with hello_world. $ python -c "from bcc import BPF; BPF(text='int kprobe__sys_clone(void *ctx) { return 0; }').trace_print()" Now in another shell: $ perf list | grep sys_clone
kprobes:p_sys_clone [Tracepoint event]
$ perf record -e kprobes:p_sys_clone -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.360 MB perf.data ]
$ perf report --stdio
Error:
The perf.data file has no samples! Return 1 instead. $ python -c "from bcc import BPF; BPF(text='int kprobe__sys_clone(void *ctx) { return 1; }').trace_print()" $ perf report --stdio
...
# Samples: 47 of event 'kprobes:p_sys_clone'
...
78.72% byobu-status [kernel.vmlinux] [k] sys_clone
17.02% tmux [kernel.vmlinux] [k] sys_clone
4.26% bash [kernel.vmlinux] [k] sys_clone So what changed? Similar to attaching a bpf program to a socket, a kprobe bpf program acts like a filter for other subscribers to the same event. Returning 0 tells the kprobe infrastructure to return immediately after running the bpf program, whereas non-zero allows the other subscribers to run, thus paying the cost of allocating space in the trace buffer for this event and submitting it (for details see If we are using bpf programs in a standalone mode, where they do all of the work and coordinate with userspace purely through bpf maps and trace_printks, then it is not necessary or performant to pay the cost of submitting the event to the buffer only to discard it later. Currently this is the case for all of our tools and examples. Having said that, there is a great future opportunity to use this return value in conjunction with other perf events to intelligently trace with the full power of perf. I don't think this answer gets towards a 100% explainable and 100% sensical user story, but hopefully it makes sense to you. How should we spin it for the one-liner case without getting into the technical weeds? |
I'm going to return the "return 0"s, given the explanations above, and close this ticket. |
Add a USDT module collecting metrics from JVM thread starts/stops. This module does not require any specific Java parameters to work. Could be used as an example / reference or for collecting metrics. Wrt the return value in the BPF program even in the unlikely case that lookup fails, iovisor/bcc#139 has explanation on BPF program values and it should be OK for BCC PMDA modules to use 0 (by default, an admin could of course change it).
* create `do_exit` specific probes that guarantee exit_exit() * add a function to grab inode information from task_struct * grab deeper nested paths, up to 150 directories deep * return 0 in all cases per iovisor/bcc#139
From examples/hello_world.py:
Is the int ... return 0 necessary? I don't see how it's handled. If it isn't necessary, it would be good to drop it from the hello world example, and make this void.
The text was updated successfully, but these errors were encountered: