bpf_probe_write_user helper function is locked down since linux kernel 5.14-rc6
#290
Comments
Problem also encountered and reported here: #237 |
Yes, this is definitely an issue and we should add code to make sure context propagation is disabled when the Linux security lockdown is set to anything other than [none]. It's really only a problem with context propagation, because that is the only place where the bpf_probe_write_user helper is used. This is typically an issue when SecureBoot is enabled, which is why most users don't see it in regular VM environments: the Linux kernel automatically enters integrity mode when SecureBoot is enabled. I haven't thought deeply about how we can fix this, but one way would be to add an |
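A minimal sketch of the lockdown check described above, assuming the active mode is read from the securityfs file /sys/kernel/security/lockdown, where the kernel marks the active mode with square brackets (for example `none [integrity] confidentiality`). The names `parseLockdownMode` and `contextPropagationAllowed` are hypothetical and not part of opentelemetry-go-instrumentation, and treating a missing file as "no lockdown" is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// lockdownPath is the securityfs file that exposes the lockdown LSM state.
const lockdownPath = "/sys/kernel/security/lockdown"

// parseLockdownMode extracts the active mode from content such as
// "none [integrity] confidentiality". The active mode is the bracketed field.
// The boolean is false when no bracketed field is present.
func parseLockdownMode(content string) (string, bool) {
	for _, field := range strings.Fields(content) {
		if strings.HasPrefix(field, "[") && strings.HasSuffix(field, "]") {
			return strings.Trim(field, "[]"), true
		}
	}
	return "", false
}

// contextPropagationAllowed is a hypothetical helper: it reports whether
// bpf_probe_write_user based context propagation should be attempted.
// Assumption: if the file cannot be read (lockdown LSM not built in, or
// securityfs not mounted), lockdown is treated as not active.
func contextPropagationAllowed(path string) bool {
	data, err := os.ReadFile(path)
	if err != nil {
		return true
	}
	mode, ok := parseLockdownMode(string(data))
	return !ok || mode == "none"
}

func main() {
	fmt.Println("context propagation allowed:", contextPropagationAllowed(lockdownPath))
}
```

With a check like this, startup could skip loading the programs that call bpf_probe_write_user instead of failing with a verifier error.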
From the 3-Mar-24 SIG meeting: @grcevski started a thread on the Linux kernel mailing list to see about unlocking this: https://www.uwsg.indiana.edu/hypermail/linux/kernel/2403.0/03026.html It doesn't look like that thread has any replies yet. Is there any way to follow up on it, @grcevski? |
From SIG call today:
|
Based on this article, it seems that the lockdown LSM policies are static and cannot be modified or configured: https://lwn.net/Articles/791863/ Proposals to make them more configurable appear to have been rejected. |
Relevant very new patch proposal https://lore.kernel.org/bpf/[email protected]/ |
Based on the latest comments on the thread I previously posted, it appears that the proposal for the new helper will not be accepted. However, as of kernel 6.9 (yet to be released) there's a new feature called |
Thanks @grcevski for the updates! I mentioned this on the call, but I think it's clear that we need to find an alternative approach. At this point I would consider the old helper DOA |
I wonder if it would be possible instead to use a uprobe in some place like If we need to write into data structures, it may be possible to use a ptrace attached routine to push additional data in. That will require permissions (as the current case) but I don't think it should be prevented by secureboot / lsm (but it is more restrictive because it is properly a debugging interface). That's a bit more work, but I think it is doable. |
I apologize if I misunderstood the context, but this already happens, except that the current code reads the header information after TLS decryption, in a TLS-agnostic way. Essentially, the headers are read after they are parsed from the incoming request, regardless of TLS.
This is a very interesting idea. I don't think it will be limited by SecureBoot, since we already use ptrace to attach a shared memory segment to the instrumented process. I'd like to find out more about how you think this might work. If I understand correctly, it would mean not using eBPF at all to inject the header values, but using an injected function hook to do the work? |
Yes, exactly. |
After a lot of research I believe we might have a partial solution to this problem. There are two approaches we can implement here, which will give us partial coverage for the inability to write memory with
Approach without TLS
This is a lot more complex than what we do today, but it's a way forward. We discussed this approach with @damemi, @MrAlias and some of the kernel maintainers at KubeCon NA 2024, and they suggested that the approach described above without TLS is the right way to go. |
This will mean that the auto-instrumented service will not be part of distributed traces with other non-auto-instrumented services for things like HTTPS. Can we fall back to using bpf_probe_write_user when TLS is used? Is kTLS still a possibility? It looks like there is an accepted proposal to add this to the Go stdlib as a debug option: golang/go#44506 |
Yes, absolutely. We can try to use bpf_probe_write_user and fall back to the other approach if it's not allowed.
I think so, kTLS should work with the proposed approach 1. |
@grcevski Thanks for describing the solutions.
|
👋 Looks like this may have stalled. I am hearing reports of more folks having trouble using the auto-instrumentation in the operator because lockdown integrity mode is enabled.
This check was suggested higher in the thread and seems like it might be a reasonable starting point to unblock folks now, and allow time to determine the best long term solution for this scenario. What do you think? |
To clarify, I think the safe version if possible might be disabling context propagation entirely for folks with lockdown integrity, if it means allowing startup and instrumentation. |
Yes you are right, I will work on a PR. The work has been done for HTTP, but we need to apply the same approach for gRPC and Kafka. |
Describe the bug
Since this commit in the Linux kernel repository, bpf_probe_write_user is locked down, which results in an
unknown func bpf_probe_write_user
error during opentelemetry-go-instrumentation startup on systems with lockdown integrity enabled.
Environment
To Reproduce
Steps to reproduce the behavior:
Expected behavior
opentelemetry-go-instrumentation starts up without any issues.
Additional context
bpf_probe_write_user should probably be retired from opentelemetry-go-instrumentation, since for many cloud providers and on-demand use cases editing startup kernel parameters is not possible.