-
Notifications
You must be signed in to change notification settings - Fork 430
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use rustc
directly instead of cargo
#52
Conversation
Looks like this got hit by some merge conflicts :-( @joshtriplett I'm curious what your view on this is. (I'll try to get a real review later today) |
You merged #44 right on time! ;-) Let me solve it, and see if I got all the CI changes correct.
No rush! I tried to ease the pain with the description -- (see the "Things to note" section for Rust-related bits). |
16f6e04
to
ac19154
Compare
Done -- rebased and CI passing. |
Yes, I would like to take a look, but should first concentrate on my exams ;-) Hopefully have time to check on the weekend. |
I really like the idea of using I'm getting the following error when building this (without any changes on my side):
|
Thanks, and for trying it out too!
Hmm... that's the other magic symbol like Could you please try to define it in |
I'm using
It works if I define it as a C function (extern "C", no mangle). Any idea why you don't get the same? While experimenting with the function definition here I noticed something else: |
Thanks for testing (although it should work without
We have a different
Sorry, I should have warned you. Yeah, that is expected at the moment, since the |
How hard is it to fix this before merging? I'm afraid not detecting the need to recompile a file will make development on top of this too challenging. |
Let me give it a go and let you know :-) |
ac19154
to
4d5b10e
Compare
Pushed v2:
[*] This last one relies on the same mechanism as the C side, which means it isn't perfect; but it works well enough (and, in our case, most people working in the Rust side shouldn't be touching too much of the C side at the same time). Things are more An example of how it looks now:
|
While trying to use this, I ran into some unexpected cases when libs were not being rebuilt even though I did make changes to them. This is one way to reproduce it:
Touching other files like For reference, here is the output of running step 3 for me:
And here's the output of step 4, with no changes to
|
My bad, I never tried breaking changes; I was testing the different scenarios with
Yeah... |
Thanks! Another question. I see that you've added debugging symbols, but it doesn't seem to be working when I set CONFIG_DEBUG_INFO=y, I get the following:
Do I need to do anything else to get the right symbols? When I was experimenting with shoving a staticlib into the kernel build, I was getting proper symbols for rust code. For example, I was getting the following (which is also what I was hoping for this):
|
That's because your GDB doesn't know how to demangle the new mangling scheme (which is used for the |
I guess I can put together a GDB fork with support for it that we can use meanwhile. I will also ask the GDB maintainers if they are planning to import the changes soon (i.e. a week or two). |
Indeed, without Don't worry about it if it's too much work, as it's not really a regression introduced by this PR. |
Yeah, I think it is a good idea to add an option to the Rust hacking menu for that. I will add a mention to this to the guide, too.
AFAIU I only need to sync a few files, so it shouldn't be a big deal. |
Actually, the debug info comes with demangled symbols. At least, my vanilla 10.1 gdb is able to show the demangled one from DWARF even if it doesn't know to demangle them itself. It seems in your case the first snippet you posted was for the Anyway, if you want a version with proper demangling for all cases (e.g. when debug info is disabled) and/or or you want the support in binutils etc., I backported the commit on top of the latest releases:
e.g. No debug info, vanilla GDB 10.1:
No debug info, patched GDB 10.1:
Vanilla binutils 2.35.1 (either with or without debug info):
Patched binutils 2.35.1 (either with or without debug info):
|
4d5b10e
to
d0e0681
Compare
Pushed v3:
|
d0e0681
to
3739ce7
Compare
Rebased after the v5.10 sync merge. |
@bjorn3 By the way, in case it is not clear (written technical language can be rough at times), all your comments are welcome and I am enjoying the discussion a lot! We setup from time to time an informal group call to discuss technical bits "face to face", you are welcome to join us! (also consider subscribing to the ML too). |
Thanks for letting me know!
Protonmail doesn't seem to have a way to use plain-text emails instead of MIME without using the paid bridge. I prefer to not use my personal mail though. |
I don't know if those MLs are supposed to work with MIME or not. We could ask to see what can be done; I'd assume you are not the first user hitting that problem. A partial solution is polling the archive (it is public). |
The subscribe bot interpreted every line of the MIME message as a separate command, so it doesn't work.
From what I gathered, they really don't want you to use MIME messages.
👍 |
A quick question, but is |
It is an experimental option so far. |
9ea67cb
to
15a406b
Compare
15a406b
to
ccc71df
Compare
Pushed v7:
|
With v7 things should be pretty good now that even more people gave it a look since the last call + the CI is now testing even more variations. Since already at least a couple people are basing their work on top of this PR, I think I will merge this in a couple days or so to simplify life for everyone and go back to small PRs. There are a handful of things still to improve around this PR, but those can wait. |
This is a big PR, but most of it is interdependent to the rest. - Shared Rust infrastructure: `libkernel`, `libmodule`, `libcore`, `liballoc`, `libcompiler_builtins`. + The Rust modules are now much smaller since they do not contain several copies of those libraries. Our example `.ko` on release is just 12 KiB, down from 1.3 MiB. For reference: `vmlinux` on release w/ Rust is 23 MiB (compressed: 2.1 MiB) `vmlinux` on release w/o Rust is 22 MiB (compressed: 1.9 MiB) i.e. the bulk is now shared. + Multiple builtin modules are now supported since their symbols do not collide against each other (fixes #9). + Faster compilation (less crates to compile & less repetition). + We achieve this by compiling all the shared code to `.rlib`s (and the `.so` for the proc macro). For loadable modules, we need to rely on the upcoming v0 Rust mangling scheme, plus we need to export the Rust symbols needed by the `.ko`s. - Simpler, flat file structure: now a small driver may only need a single file like `drivers/char/rust_example.rs`, like in C. + All the `rust/*` and `driver/char/rust_example/*` files moved to fit in the new structure: less files around. - Only `rust-lang/{rust,rust-bindgen,compiler-builtins}` as dependencies. + Also helps with the faster compilation. - Dependency handling integration with `Kbuild`/`fixdep`. + Changes to the Rust standard library, kernel headers (bindings), `rust/` source files, `.rs` changes, command-line changes, flag changes, etc. all trigger recompilation as needed. + Works as expected with parallel support (`-j`). - Automatic generation of the `exports.c` list: + Instead of manually handling the list, all non-local functions available in `core`, `alloc` and `kernel` are exported, so all modules should work, regardless of what they need, and without failing linking due to symbols in the manual list not existing (e.g. due to differences in config options). + They are a lot, though: * ~6k Rust symbols vs. ~4k C symbols in release. * However, 4k of those are `bindings_raw` (i.e. duplicated C ones), which shouldn't be exported. Thus we should look into making `bindings_raw` private to the crate (at the moment, the (first) Rust example requires `<kernel::bindings...::miscdevice as Default>::default`). + Licensing: * `kernel`'s symbols are exported as GPL. * `core`'s and `alloc`'s symbols are exported as non-GPL so that third-parties can build Rust modules as long as they write their own kernel support infrastructure, i.e. without taking advantage of `kernel`. This seemed to make the most sense compared to other exports from the kernel, plus it follows more closely the original licence of the crates. - Support for GCC-compiled kernels: + The generated bindings do not have meaningful differences in our release config, between GCC 10.1 and Clang 11. + Other configs (e.g. our debug one) may add/remove types and functions. That is fine unless we use them form our bindings. + However, there are config options that may not work (e.g. the randstruct GCC plugin if we use one of those structs). - Support for `arm64` architecture: + Added to the CI: BusyBox is cross-compiled on the fly (increased timeout a bit to match). + Requires weakening of a few compiler builtins and adding `copy_{from,to}_user` helpers. - Support for custom `--sysroot` via `KRUSTCFLAGS`. - Proper `make clean` support. - Offline builds by default (there is no "online compilation" anymore; fixes #17). - No interleaved Cargo output (fixes #29). - No nightly dependency on Cargo's `build-std`; since now we manage the cross-compilation ourselves (should fix #27). - "Big" kallsyms symbol support: + I already raised ksym names from 128 to 256 back when I wrote the first integration. However, Rust symbols can be huge in debug/non-optimized, so I increased it again to 512; plus the module name from 56 to 248. + In turn, this required tuning the table format to support 2-byte lengths for ksyms. Compression at generation and kernel decompression is covered, although it may be the case that some script/tool also requires changes to understand the new table format. - Since now a kernel can be "Rust-enabled", a new `CONFIG_RUST` option is added to enable/disable it manually, regardless of whether one has `rustc` available or not (`CONFIG_HAS_RUST`). - Improved handling of `rustc` flags (`opt-level`, `debuginfo`, etc.), by default following what the user selected for C, but customizable through a Kconfig menu. As well as options for tweaking overflow checks and debug assertions. - This rewrite of the Kbuild support is cleaner, i.e. less hacks in general handling paths (e.g. no more `shell readlink` for `O=`). - Duplicated the example driver 3 times so that we can test in the CI that 2 builtins and 2 loadables work, all at the same time. - Do not export any helpers' symbols. - Updated the quick start guide. - Updated CI: + Now we always test with 2 builtins and 2 loadables Rust example drivers, removing the matrix test for builtin/loadable. + Added `toolchain` to matrix: now we test building with GCC, Clang or a full LLVM toolchain. + Added `arch` to matrix: now we test both arm64 and x86_64. + Added `rustc` to matrix: now we test with a very recent nightly as well. + Only build `output == build` once to reduce the number of combinations. + Debug x86_64 config: more things enabled (debuginfo, kgdb, unit testing, etc.) that mimic more what a developer would have. Running the CI will be slightly slower, but should be OK. Also enable `-C opt-level=0` to test that such an extreme works and also to see how much bloated everything becomes. + Release x86_64 config: disabled `EXPERT` and changed a few things to make it look more like a normal desktop configuration, although it is still pretty minimal. + The configs for arm64 are `EXPERT` and `EMBEDDED` ones, very minimal, for the particular CPU we are simulating. + Update configs to v5.10. + Use `$GITHUB_ENV` to simplify. - Less `extern crate`s needed since we pass it via `rustc` (closer to idiomatic 2018 edition Rust code). Things to note: - There is two more nightly features used: + The new Rust mangling scheme: we know it will be stable (and the default on, later on). + The binary dep-info output: if we remove all other nightly features, this one can easily go too. - The hack at `exports.c` to export symbols to loadable modules. - The hack at `allocator.rs` to get the `__rust_*()` functions. - The hack to get the proper flags for bindgen on GCC builds. Signed-off-by: Miguel Ojeda <[email protected]>
ccc71df
to
7c6155b
Compare
Pushed a small rewording of the "Arch Support" docs page I forgot to do to avoid someone having to review it twice later. |
It allows us to save a bit of space, ignore the duplicate object files, and the archiving steps; e.g.: 643544 libcompiler_builtins.rlib 64171752 libcore.rlib vs. 530004 libcompiler_builtins.rmeta 63679866 libcore.rmeta We couldn't do it right away in [1] because `rustc` required a fix [2,3]. The fix is now in [4] and available since the 2021-01-21 nightly, so now we can go ahead and make the change. Fixes #75. [1] #52 [2] rust-lang/rust#81117 [3] rust-lang/rust#81118 [4] rust-lang/rust@f9275e1 Suggested-by: bjorn3 Signed-off-by: Miguel Ojeda <[email protected]>
It allows us to save a bit of space, ignore the duplicate object files, and the archiving steps; e.g.: 643544 libcompiler_builtins.rlib 64171752 libcore.rlib vs. 530004 libcompiler_builtins.rmeta 63679866 libcore.rmeta We couldn't do it right away in [1] because `rustc` required a fix [2,3]. The fix is now in [4] and available since the 2021-01-21 nightly, so now we can go ahead and make the change. Fixes #75. [1] #52 [2] rust-lang/rust#81117 [3] rust-lang/rust#81118 [4] rust-lang/rust@f9275e1 Suggested-by: bjorn3 Signed-off-by: Miguel Ojeda <[email protected]>
// | ||
// This requires the Rust's new/future `v0` mangling scheme because the default | ||
// one ("legacy") 1) uses a hash suffix which cannot be predicted across | ||
// compiler versions and 2) uses invalid characters for C identifiers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed this, but the v0
mangling scheme is also unpredictable across compiler versions. The symbol hash at the end has been replaced with a crate id near the start of the symbol. For example in _RNvCs21hi0yVfW1J_8rust_out4main
the crate id is Cs21hi0yVfW1J
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I will remove the first part of the sentence. It is actually a bit misleading, because the "hard" requirement is the second one: mixing rustc
compiler versions is not something we will support. IIRC I was thinking about possible patch level fixes to rustc
stable versions when I wrote this.
@bjorn3 helpfully noted that symbols do not stay the same across `rustc` compiler versions. The overall sentence is anyway a bit misleading, because the "hard" requirement is the second one: mixing `rustc` compiler versions is not something we will support. IIRC I was thinking about possible patch level fixes to `rustc` stable versions when I wrote that. Link: Rust-for-Linux#52 (comment) Signed-off-by: Miguel Ojeda <[email protected]>
This sounds weird to me, as splitting into multiple crates make compilation more parallel. If the amount of Rust code grows I think each driver would probably have to be in their own crate, otherwise compilation time would suffer. |
That comment was not about parallelization but about having less total code (previously, each driver had the entirety of Also, please note it is already the case that each module is its own crate (and they are built in parallel, as usual). |
The inline assembly for arm64's cmpxchg_double*() implementations use a +Q constraint to hazard against other accesses to the memory location being exchanged. However, the pointer passed to the constraint is a pointer to unsigned long, and thus the hazard only applies to the first 8 bytes of the location. GCC can take advantage of this, assuming that other portions of the location are unchanged, leading to a number of potential problems. This is similar to what we fixed back in commit: fee960b ("arm64: xchg: hazard against entire exchange variable") ... but we forgot to adjust cmpxchg_double*() similarly at the same time. The same problem applies, as demonstrated with the following test: | struct big { | u64 lo, hi; | } __aligned(128); | | unsigned long foo(struct big *b) | { | u64 hi_old, hi_new; | | hi_old = b->hi; | cmpxchg_double_local(&b->lo, &b->hi, 0x12, 0x34, 0x56, 0x78); | hi_new = b->hi; | | return hi_old ^ hi_new; | } ... which GCC 12.1.0 compiles as: | 0000000000000000 <foo>: | 0: d503233f paciasp | 4: aa0003e4 mov x4, x0 | 8: 1400000e b 40 <foo+0x40> | c: d2800240 mov x0, #0x12 // #18 | 10: d2800681 mov x1, #0x34 // #52 | 14: aa0003e5 mov x5, x0 | 18: aa0103e6 mov x6, x1 | 1c: d2800ac2 mov x2, #0x56 // #86 | 20: d2800f03 mov x3, #0x78 // #120 | 24: 48207c82 casp x0, x1, x2, x3, [x4] | 28: ca050000 eor x0, x0, x5 | 2c: ca060021 eor x1, x1, x6 | 30: aa010000 orr x0, x0, x1 | 34: d2800000 mov x0, #0x0 // #0 <--- BANG | 38: d50323bf autiasp | 3c: d65f03c0 ret | 40: d2800240 mov x0, #0x12 // #18 | 44: d2800681 mov x1, #0x34 // #52 | 48: d2800ac2 mov x2, #0x56 // #86 | 4c: d2800f03 mov x3, #0x78 // #120 | 50: f9800091 prfm pstl1strm, [x4] | 54: c87f1885 ldxp x5, x6, [x4] | 58: ca0000a5 eor x5, x5, x0 | 5c: ca0100c6 eor x6, x6, x1 | 60: aa0600a6 orr x6, x5, x6 | 64: b5000066 cbnz x6, 70 <foo+0x70> | 68: c8250c82 stxp w5, x2, x3, [x4] | 6c: 35ffff45 cbnz w5, 54 <foo+0x54> | 70: d2800000 mov x0, #0x0 // #0 <--- BANG | 74: d50323bf autiasp | 78: d65f03c0 ret Notice that at the lines with "BANG" comments, GCC has assumed that the higher 8 bytes are unchanged by the cmpxchg_double() call, and that `hi_old ^ hi_new` can be reduced to a constant zero, for both LSE and LL/SC versions of cmpxchg_double(). This patch fixes the issue by passing a pointer to __uint128_t into the +Q constraint, ensuring that the compiler hazards against the entire 16 bytes being modified. With this change, GCC 12.1.0 compiles the above test as: | 0000000000000000 <foo>: | 0: f9400407 ldr x7, [x0, #8] | 4: d503233f paciasp | 8: aa0003e4 mov x4, x0 | c: 1400000f b 48 <foo+0x48> | 10: d2800240 mov x0, #0x12 // #18 | 14: d2800681 mov x1, #0x34 // #52 | 18: aa0003e5 mov x5, x0 | 1c: aa0103e6 mov x6, x1 | 20: d2800ac2 mov x2, #0x56 // #86 | 24: d2800f03 mov x3, #0x78 // #120 | 28: 48207c82 casp x0, x1, x2, x3, [x4] | 2c: ca050000 eor x0, x0, x5 | 30: ca060021 eor x1, x1, x6 | 34: aa010000 orr x0, x0, x1 | 38: f9400480 ldr x0, [x4, #8] | 3c: d50323bf autiasp | 40: ca0000e0 eor x0, x7, x0 | 44: d65f03c0 ret | 48: d2800240 mov x0, #0x12 // #18 | 4c: d2800681 mov x1, #0x34 // #52 | 50: d2800ac2 mov x2, #0x56 // #86 | 54: d2800f03 mov x3, #0x78 // #120 | 58: f9800091 prfm pstl1strm, [x4] | 5c: c87f1885 ldxp x5, x6, [x4] | 60: ca0000a5 eor x5, x5, x0 | 64: ca0100c6 eor x6, x6, x1 | 68: aa0600a6 orr x6, x5, x6 | 6c: b5000066 cbnz x6, 78 <foo+0x78> | 70: c8250c82 stxp w5, x2, x3, [x4] | 74: 35ffff45 cbnz w5, 5c <foo+0x5c> | 78: f9400480 ldr x0, [x4, #8] | 7c: d50323bf autiasp | 80: ca0000e0 eor x0, x7, x0 | 84: d65f03c0 ret ... sampling the high 8 bytes before and after the cmpxchg, and performing an EOR, as we'd expect. For backporting, I've tested this atop linux-4.9.y with GCC 5.5.0. Note that linux-4.9.y is oldest currently supported stable release, and mandates GCC 5.1+. Unfortunately I couldn't get a GCC 5.1 binary to run on my machines due to library incompatibilities. I've also used a standalone test to check that we can use a __uint128_t pointer in a +Q constraint at least as far back as GCC 4.8.5 and LLVM 3.9.1. Fixes: 5284e1b ("arm64: xchg: Implement cmpxchg_double") Fixes: e9a4b79 ("arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU") Reported-by: Boqun Feng <[email protected]> Link: https://lore.kernel.org/lkml/Y6DEfQXymYVgL3oJ@boqun-archlinux/ Reported-by: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Mark Rutland <[email protected]> Cc: [email protected] Cc: Arnd Bergmann <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Steve Capper <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
This is a big PR, but most of it is interdependent to the rest.
Shared Rust infrastructure:
libkernel
,libmodule
,libcore
,liballoc
,libcompiler_builtins
..ko
on release is just 12 KiB, down from 1.3 MiB. For reference (i.e. the bulk is now shared):vmlinux
on release with Rust is 23 MiB (compressed: 2.1 MiB)vmlinux
on release without Rust is 22 MiB (compressed: 1.9 MiB).rlib
s (and the.so
for the proc macro). For loadable modules, we need to rely on the upcoming v0 Rust mangling scheme, plus we need to export the Rust symbols needed by the.ko
s.Simpler, flat file structure: now a small driver may only need a single file like
drivers/char/rust_example.rs
, like in C.rust/*
anddriver/char/rust_example/*
files moved to fit in the new structure: less files around.Only
rust-lang/{rust,rust-bindgen,compiler-builtins}
as dependencies.Dependency handling integration with
Kbuild
/fixdep
.rust/
source files,.rs
changes, command-line changes, flag changes, etc. all trigger recompilation as needed.-j
).Support for custom
--sysroot
viaKRUSTCFLAGS
.Proper
make clean
support.Offline builds by default (there is no "online compilation" anymore; fixes Create a way for offline building #17).
No interleaved Cargo output (fixes Makefile ui #29).
No nightly dependency on Cargo's
build-std
; since now we manage the cross-compilation ourselves (should fix Compilation error: can't find crate forstd
; thex86_64-linux-kernel
target may not be installed #27).Since now a kernel can be "Rust-enabled", a new
CONFIG_RUST
option is added to enable/disable it manually, regardless of whether one hasrustc
available or not (CONFIG_HAS_RUST
).Improved handling of
rustc
flags (opt-level
,debuginfo
, etc.), following what the user selected for C (no Cargo profiles).Added Kconfig menu for tweaking relevant
rustc
options, like overflow checks, debug assertions, optimization level, etc.This rewrite of the Kbuild support is cleaner, i.e. less hacks in general handling paths (e.g. no more
shell readlink
forO=
).Duplicated the example driver 3 times so that we can test in the CI that 2 builtins and 2 loadables work, all at the same time.
Updated the quick start guide.
Updated CI
.config
s:EXPERT
and changed a few things to make it look more like a normal configuration.(I could have split a few of these ones off into another PR, but anyway it is for the CI only and I had already done it).
Less
extern crate
s needed since we pass it viarustc
(closer to idiomatic 2018 edition Rust code).Things to note:
There is two more nightly features used:
The hack at
exports.c
to export symbols to loadable modules.The hack at
allocator.rs
to get the__rust_*()
functions.Signed-off-by: Miguel Ojeda [email protected]