-
Notifications
You must be signed in to change notification settings - Fork 444
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
backends/ebpf: Track header offset in bytes rather than bits. #4327
Conversation
Can you point out the concrete follow-ups? @tatry @osinstom tagging you for this PR since there is no dedicated maintainer for the eBPF core. Let me know if this code is not part of your responsibility. |
I can give these a quick review if you factor them out. |
47a9de1
to
32ff0db
Compare
18cd9e9
to
c8e6301
Compare
With the latest update, I believe that all issues in the code have been fixed now. The only remaining CI failure is a transient network error which is unrelated to the changes here.
|
@thomascalvert-xlnx I poked Github CI to force it to re-run the failed job. Hopefully it will work this time. |
@vbnogueira Could you please check if this impacts the TC tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code changes look good to me.
I would prefer to wait for @vbnogueira to confirm that TC tests work fine before we merge this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.template output from TC backend is empty in many cases. It is not expected to e empty. Can you please look into this @thomascalvert-xlnx ?
@usha1830 could you please provide a reproducer cmdline? In my |
I see empty files in this PR only. |
c8e6301
to
2409ad5
Compare
@usha1830 good spot! In fact the files aren't empty, the issue is that for some reason the commit sets their executable bit (without modifying contents) and that is Github's way of showing this permissions change. I'm not sure why this happened, sorry, I have edited the last commit to clear executable bit on all template files. Note that there are still a handful of |
Yes I did 'View file' and saw those files had some content but wasn't sure why it was showing empty. Thanks for clarifying! |
2409ad5
to
13dc467
Compare
Thanks for looking Usha. Rebased to top of main branch - there were a couple minor merge conflicts around the |
I'll run the tests and check here |
There are some failures in the latest test runs - these are new p4tc tests added since the rebase, I can easily update their outputs too. However when doing so I noticed the template permissions thing cropping up again, and traced it down to this bit of code which explicitly sets their executable bit: https://github.com/p4lang/p4c/blob/main/backends/tc/backend.cpp#L128-L132 so I guess it would be correct to update all the expected files to have the executable bit set. This happens as a matter of course with the new |
This patch also sets the executable bit on all .template files. This is done by the TC backend when they are created, however for some reason never made it to the reference outputs, and the checker script doesn't look at permissions.
Head branch was pushed to by a user without write access
8235fe0
to
b3a7860
Compare
Rebased to top of tree. |
From overview, I dont this change would affect functionality of P4TC but we wanted to do some quick testing before the merge and look at the assembler diff. No harm done, we can respond if issues surface. |
I'll apologize, I misread vbnoguiera's comment as approval. I can revert this PR and reopen it. The process is as you would expect. Only once every stakeholder approves code can get merged. We can add a CODEOWNERS file to actually enforce this: https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners But this can lead to a lot of email spam. |
No need. If we notice anything unusual we will open another PR. So far it does look from the assembler that the change removes about 1 instruction and should have no impact on functionality. But we will test.
True. Maybe just tag the stakeholders in the issue and they will get email notifications? |
Who should we tag on PRs concerning P4TC and the eBPF back end? |
Okay, we could add code owners for the respective folders but without requiring reviews. |
|
Argh, we are now finding this broke us. Sorry we were busy on other things to catch this sooner. The better solution is to probably to restrict this change to just the ebpf backend (and leave the tc side alone). @vbnogueira can we try reverting this on the tc side and see if that resolves our issue? @usha1830 @komaljai the end results in most tests is the change chops off the payload and only emits the headers. We do generate the parser as a separate ebpf program whereas the ebpf backend has everything in one ebpf program - perhaps thats why it wasnt obvious earlier. |
@thomascalvert-xlnx there is a regression unfortunately. We can revert this commit and try to patch things separately. Or we could try to remove the parts affecting the TC back end.
I can help with getting this bootstrapped. :) |
Appreciated! |
@jhsmt it's not obvious to me how this change could cause the symptom you describe. Could you please provide more details and ideally a reproducer?
Don't think that's easy because the TC backend calls the eBPF backend directly for the code generation. If you look at the changes in this PR, the stuff in In my opinion the best solution would be to find and fix the cause of the aforementioned regression. Ideally even add a (automated!) test for it. I am happy to help but would need a reproducer in order to investigate. |
If you take a look at, for example, the |
We also are having issues loading the generated calculator parser program after this patch (https://github.com/p4lang/p4c/blob/main/testdata/p4tc_samples_outputs/calculator_parser.c). Reproducing it is simple, just compile the tc actions add action bpf obj generated/calculator_parser.o section p4tc/parse Here is a sample of the verifier output:
|
@vbnogueira explained the issue. The simple_exact_example prog diff may show why the payload chopping happens. The parser example is showing a different sympton. |
BTW, neither @vbnogueira nor myself mentioned this earlier: But reverting the patch resolves the issues for us |
@vbnogueira thank you for the explanation. Does the following patch help solve the issue? --- a/backends/tc/ebpfCodeGen.cpp
+++ b/backends/tc/ebpfCodeGen.cpp
@@ -268,6 +268,9 @@ void TCIngressPipelinePNA::emit(EBPF::CodeBuilder *builder) {
builder->newline();
builder->emitIndent();
builder->appendFormat("unsigned %s = hdrMd->%s;", offsetVar.c_str(), offsetVar.c_str());
+ builder->newline();
+ builder->emitIndent();
+ builder->appendFormat("%s = %s + BYTES(%s);", headerStartVar.c_str(), packetStartVar.c_str(), offsetVar.c_str());
}
builder->newline();
emitHeadersFromCPUMAP(builder); It results in adding one line near the start of the unsigned ebpf_packetOffsetInBits = hdrMd->ebpf_packetOffsetInBits;
hdr_start = pkt + BYTES(ebpf_packetOffsetInBits); <--- THIS LINE ADDED |
Yes, it does. |
Excellent - thanks for testing. I believe that I have a fix for the calculator verifier failure too, just trying to validate it now. Will post a PR with both patches once done. |
Issue reported in p4lang#4327.
Issue reported in p4lang#4327.
In summary, this is an optimisation which replaces the parser/deparser's
unsigned ebpf_packetOffsetInBits
variable (which tracks the current field offset from start of packet) with au8* ebpf_headerStart
pointer (which points to the start of the current header). There are a couple advantages to this approach:Note that restricting headers to be byte-aligned is not new, in fact it's already enforced by the eBPF backend.
The eBPF backend code is leveraged by other backends: uBPF & TC. For uBPF it was not possible to convert similarly due to it supporting
advance()
calls with arbitrary bit values - not sure whether it would be acceptable to restrict this too. TC looks like it would be amenable to these changes however since it is in active development I wanted to avoid the potential for conflicts. If we do want to port these changes to those backends it should be fairly mechanical since the code looks almost the same.