Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use global_asm! instead of external assembly files #4306

Merged
merged 6 commits into from
Jun 27, 2022

Conversation

alexcrichton
Copy link
Member

This commit moves the external assembly files of the wasmtime-fiber
crate into global_asm! blocks defined in Rust. The motivation for
doing this is not very strong at this time, but the points in favor of
this are:

  • One less tool needed to cross-compile Wasmtime. A linker is still
    needed but perhaps one day that will improve as well.
  • A "modern" assembler, built-in to LLVM, is used instead of whatever
    appears on the system.

The first point hasn't really cropped up that much and typically getting
an assembler is just as hard as getting a linker nowadays. The second
point though has us using hint #xx in aarch64 assembly instead of the
actual instructions for assembler compatibility, and I believe that's no
longer necessary because the LLVM assembler supports the modern
instruction names.

The translation of the x86/x86_64 assembly has been done to Intel
syntax as well as opposed to the old AT&T syntax since that's Rust's
default. Additionally s390x still remains in an external assembler file
because global_asm! is still unstable in Rust on that platform.

This commit moves the external assembly files of the `wasmtime-fiber`
crate into `global_asm!` blocks defined in Rust. The motivation for
doing this is not very strong at this time, but the points in favor of
this are:

* One less tool needed to cross-compile Wasmtime. A linker is still
  needed but perhaps one day that will improve as well.
* A "modern" assembler, built-in to LLVM, is used instead of whatever
  appears on the system.

The first point hasn't really cropped up that much and typically getting
an assembler is just as hard as getting a linker nowadays. The second
point though has us using `hint #xx` in aarch64 assembly instead of the
actual instructions for assembler compatibility, and I believe that's no
longer necessary because the LLVM assembler supports the modern
instruction names.

The translation of the x86/x86_64 assembly has been done to Intel
syntax as well as opposed to the old AT&T syntax since that's Rust's
default. Additionally s390x still remains in an external assembler file
because `global_asm!` is still unstable in Rust on that platform.
Comment on lines -24 to -46
// We need to tell whatever loads the following code (e.g. the dynamic linker)
// that it is compatible with BTI, so that the corresponding executable memory
// pages have the necessary attribute set (if supported by the environment). To
// this end, we follow the ELF for the Arm® 64-bit Architecture standard, and
// use a special metadata section. Further details are in section 6.2 of the
// specification:
//
// https://github.com/ARM-software/abi-aa/blob/2022Q1/aaelf64/aaelf64.rst#program-property
//
// We also set the PAuth (PAC) property, even though it is optional, for the
// sake of completeness.
.pushsection .note.gnu.property, "a";
.p2align 3;
.word 4;
.word 16;
.word 5;
.asciz "GNU";
.word 0xc0000000; // GNU_PROPERTY_AARCH64_FEATURE_1_AND
.word 4;
.word 3; // GNU_PROPERTY_AARCH64_FEATURE_1_BTI | GNU_PROPERTY_AARCH64_FEATURE_1_PAC
.word 0;
.popsection
#endif
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akirilov-arm I have temporarily dropped this block in the hopes that it's only required for external assembly files rather than global_asm! blocks in Rust itself. I am not super familiar with global_asm! though so I could be wrong here. Could you help by taking a peek at the artifacts of this PR (when they're available) and see if everything looks correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the sake of other people who might look at this PR - we are also discussing this issue in the Zulip chat.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok while very roundabout I have confirmed that I don't believe it's necessary to include this section in the global_asm! anywhere. The global_asm! gets shoved directly into the final object file and the object files coming out of rustc otherwise have this section already configured with the -Zbranch-protection=bti flag.

As I mentioned in Zulip there's a whole mess of other reason why the note won't show up but they're all unrelated to this stanza in this assembly (the other issues being that (a) your local rust precompiled standard library needs the note and (b) your local glibc toolchain with crt*.o also needs the note, neither of which have it by default)

@@ -0,0 +1,82 @@
// A WORD OF CAUTION
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that since PR #3799 Cranelift no longer supports the 32-bit Arm architecture, do we need to keep this file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's true yeah, although knowing that I originally wrote this file and the x86.S file when I first made the wasmtime-fiber crate. My thinking was that I had all the context of fibers booted into my head and the CFI directives and such in particular are really tricky so it was easy for me to bang out some other "maybe one day we'll support these" architectures here. These aren't actually tested on CI at all and there's no actual way to use them unless you use the wasmtime-fiber crate directly (and why would you).

Personally I'd err on the side of leaving them here in the sense that I've tested them to make sure they work (even with this PR) and otherwise it's one less thing to add if anyone ever wants to add support for arm/x86. If anyone else spends even a crumb of thought on updating these files though they should definitely be deleted immediately as they're not worth that much.

Comment on lines -24 to -46
// We need to tell whatever loads the following code (e.g. the dynamic linker)
// that it is compatible with BTI, so that the corresponding executable memory
// pages have the necessary attribute set (if supported by the environment). To
// this end, we follow the ELF for the Arm® 64-bit Architecture standard, and
// use a special metadata section. Further details are in section 6.2 of the
// specification:
//
// https://github.com/ARM-software/abi-aa/blob/2022Q1/aaelf64/aaelf64.rst#program-property
//
// We also set the PAuth (PAC) property, even though it is optional, for the
// sake of completeness.
.pushsection .note.gnu.property, "a";
.p2align 3;
.word 4;
.word 16;
.word 5;
.asciz "GNU";
.word 0xc0000000; // GNU_PROPERTY_AARCH64_FEATURE_1_AND
.word 4;
.word 3; // GNU_PROPERTY_AARCH64_FEATURE_1_BTI | GNU_PROPERTY_AARCH64_FEATURE_1_PAC
.word 0;
.popsection
#endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the sake of other people who might look at this PR - we are also discussing this issue in the Zulip chat.

@alexcrichton
Copy link
Member Author

I just went down a really long rabbit hole to justify the addition of 6a02949 here, namely adding .cfi_def_cfa_offset 0 to all the wasmtime_fiber_start functions. Long story short this assembly file:

my_function:
        .cfi_startproc simple
        .cfi_rel_offset w29, -16
        ret
        .cfi_endproc

foo:
        .cfi_startproc
        .cfi_def_cfa_offset 16
        ret
        .cfi_endproc

is a massive simplification of the wasmtime_fiber crate. The my_function here is equivalent to wasmtime_fiber_start. When compiling this with clang and looking at the unwind tables though:

$ clang -c bad.s -o bad.o2 && objdump -W bad.o2
...
0000003c 0000000000000010 00000018 FDE cie=00000028 pc=0000000000000000..0000000000000004
  DW_CFA_offset: r29 (x29) at cfa-32
  DW_CFA_nop

Here the cfa-32 is unexpected, that's supposed to be -16. This appears to be affected by the .cfi_def_cfa_offset 16 directive in the foo function. The reason this only showed up on CI is because oddly enough the bug only arose when incremental compliation was turned off. This shuffled around CGUs such that the global asm bits added here ended up getting shoved into a CGU with another function, which internally in LLVM had this .cfi_def_cfa_offset directive.

Overall I don't know why there's this "global state" here or what this offset is. I suppose the word "global" in "global_asm" should have tipped me off. Anyway my hope here was that with the addition of .cfi_def_cfa_offset 0 directives we can work around this to "reset" the directive at the start of the function. Afterwards things appear to work.

This is probably a bug in LLVM though. The same assembly file above going through gcc yields:

$ gcc -c bad.s -o bad.o2 && objdump -W bad.o2

bad.o2:     file format elf64-littleaarch64

Contents of the .eh_frame section:


00000000 0000000000000010 00000000 CIE
  Version:               1
  Augmentation:          "zR"
  Code alignment factor: 4
  Data alignment factor: -8
  Return address column: 30
  Augmentation data:     1b
  DW_CFA_offset: r29 (x29) at cfa-16
  DW_CFA_nop

00000014 0000000000000010 00000018 FDE cie=00000000 pc=0000000000000000..0000000000000004
  DW_CFA_nop
  DW_CFA_nop
  DW_CFA_nop

...

Here it looks like gcc folded the FDE entry directly into the CIE because only one FDE references the CIE (I guess? unsure?). In any case the cfa-16 is what we want, unlike the cfa-32 which is what clang produces. I don't really have the energy though to file an issue on LLVM at this time though.

crates/fiber/src/unix.rs Outdated Show resolved Hide resolved
crates/fiber/src/unix/aarch64.rs Outdated Show resolved Hide resolved
crates/fiber/src/unix/aarch64.rs Show resolved Hide resolved
Copy link
Contributor

@akirilov-arm akirilov-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, but note that out of the architecture-specific bits I have looked only at the AArch64 ones.

Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@alexcrichton alexcrichton merged commit 4543a07 into bytecodealliance:main Jun 27, 2022
@alexcrichton alexcrichton deleted the global-asm branch June 27, 2022 18:20
@cfallin
Copy link
Member

cfallin commented Jun 27, 2022

This unfortunately seems to have broken compilation on macOS/aarch64 with latest stable Rust (1.61.0):

   Compiling num-rational v0.2.4
   Compiling wasmtime-fiber v0.39.0 (/Users/cfallin/work/wasmtime2/crates/fiber)
   Compiling proptest v1.0.0
   Compiling egg v0.6.0
   Compiling capstone-sys v0.13.0
   Compiling shuffling-allocator v1.1.2
   Compiling cranelift-bforest v0.86.0 (/Users/cfallin/work/wasmtime2/cranelift/bforest)
   Compiling unicode-linebreak v0.1.2
error: macro expansion ends with an incomplete expression: expected expression
  --> crates/fiber/src/unix/aarch64.rs:41:5
   |
25 |         macro_rules! paciasp { () => (); }
   |                                      -- in this macro arm
...
41 |     paciasp!(),
   |     ^^^^^^^^^^ expected expression
   |
   = note: the macro call doesn't expand to an expression, but it can expand to a statement
help: add `;` to interpret the expansion as a statement
   |
41 |     paciasp!();,
   |               +

error: macro expansion ends with an incomplete expression: expected expression
  --> crates/fiber/src/unix/aarch64.rs:42:5
   |
23 |         macro_rules! cfi_window_save { () => (); }
   |                                              -- in this macro arm
...
42 |     cfi_window_save!(),
   |     ^^^^^^^^^^^^^^^^^^ expected expression
   |
   = note: the macro call doesn't expand to an expression, but it can expand to a statement
help: add `;` to interpret the expansion as a statement
   |
42 |     cfi_window_save!();,
   |                       +

error: macro expansion ends with an incomplete expression: expected expression
  --> crates/fiber/src/unix/aarch64.rs:78:5
   |
26 |         macro_rules! autiasp { () => (); }
   |                                      -- in this macro arm
...
78 |     autiasp!(),
   |     ^^^^^^^^^^ expected expression
   |
   = note: the macro call doesn't expand to an expression, but it can expand to a statement
help: add `;` to interpret the expansion as a statement
   |
78 |     autiasp!();,
   |               +

error: macro expansion ends with an incomplete expression: expected expression
  --> crates/fiber/src/unix/aarch64.rs:79:5
   |
23 |         macro_rules! cfi_window_save { () => (); }
   |                                              -- in this macro arm
...
79 |     cfi_window_save!(),
   |     ^^^^^^^^^^^^^^^^^^ expected expression
   |
   = note: the macro call doesn't expand to an expression, but it can expand to a statement
help: add `;` to interpret the expansion as a statement
   |
79 |     cfi_window_save!();,
   |                       +

error: macro expansion ends with an incomplete expression: expected expression
   --> crates/fiber/src/unix/aarch64.rs:117:5
    |
24  |         macro_rules! pacia1716 { () => (); }
    |                                        -- in this macro arm
...
117 |     pacia1716!(),
    |     ^^^^^^^^^^^^ expected expression
    |
    = note: the macro call doesn't expand to an expression, but it can expand to a statement
help: add `;` to interpret the expansion as a statement
    |
117 |     pacia1716!();,
    |                 +

error: macro expansion ends with an incomplete expression: expected expression
   --> crates/fiber/src/unix/aarch64.rs:148:5
    |
23  |         macro_rules! cfi_window_save { () => (); }
    |                                              -- in this macro arm
...
148 |     cfi_window_save!(),
    |     ^^^^^^^^^^^^^^^^^^ expected expression
    |
    = note: the macro call doesn't expand to an expression, but it can expand to a statement
help: add `;` to interpret the expansion as a statement
    |
148 |     cfi_window_save!();,
    |                       +

error: could not compile `wasmtime-fiber` due to 6 previous errors
warning: build failed, waiting for other jobs to finish...

I haven't dug into this at all but I can play with it more or test changes tomorrow if needed to help...

@cfallin
Copy link
Member

cfallin commented Jun 27, 2022

(N.B.: all errors from ./ci/run-tests.sh; just cargo build for wasmtime-cli works fine still, I guess because it's not async?)

The following patch seems to push it further:

diff --git a/crates/fiber/src/unix/aarch64.rs b/crates/fiber/src/unix/aarch64.rs
index a0b4201b9..28b257974 100644
--- a/crates/fiber/src/unix/aarch64.rs
+++ b/crates/fiber/src/unix/aarch64.rs
@@ -20,10 +20,10 @@

 cfg_if::cfg_if! {
     if #[cfg(target_os = "macos")] {
-        macro_rules! cfi_window_save { () => (); }
-        macro_rules! pacia1716 { () => (); }
-        macro_rules! paciasp { () => (); }
-        macro_rules! autiasp { () => (); }
+        macro_rules! cfi_window_save { () => (""); }
+        macro_rules! pacia1716 { () => (""); }
+        macro_rules! paciasp { () => (""); }
+        macro_rules! autiasp { () => (""); }
     } else {
         macro_rules! cfi_window_save { () => (".cfi_window_save\n"); }
         macro_rules! pacia1716 { () => ("pacia1716\n"); }

unfortunately it then bottoms out at:

error: unknown AArch64 fixup kind!
   |
note: instantiated into assembly here
  --> <inline asm>:53:9
   |
53 |         adr x17, _wasmtime_fiber_start
   |

@alexcrichton
Copy link
Member Author

Oh oops this reminds me that I meant to poke you before I merged this and see if it compiles on your mac and completely forgot. I'll dig in tomorrow but feel free to revert in the meantime.

@cfallin
Copy link
Member

cfallin commented Jun 27, 2022

No worries, as long as we dig in (I'm happy to help test) I don't think there's a need to revert immediately. Hopefully something simple to fix...

@cfallin
Copy link
Member

cfallin commented Jun 27, 2022

OK, the following is sufficient to get everything to build and pass tests again (conditionalizing this just for macOS left as exercise for the reader):

diff --git a/crates/fiber/src/unix/aarch64.rs b/crates/fiber/src/unix/aarch64.rs
index a0b4201b9..f242a0bc1 100644
--- a/crates/fiber/src/unix/aarch64.rs
+++ b/crates/fiber/src/unix/aarch64.rs
@@ -20,10 +20,10 @@
 
 cfg_if::cfg_if! {
     if #[cfg(target_os = "macos")] {
-        macro_rules! cfi_window_save { () => (); }
-        macro_rules! pacia1716 { () => (); }
-        macro_rules! paciasp { () => (); }
-        macro_rules! autiasp { () => (); }
+        macro_rules! cfi_window_save { () => (""); }
+        macro_rules! pacia1716 { () => (""); }
+        macro_rules! paciasp { () => (""); }
+        macro_rules! autiasp { () => (""); }
     } else {
         macro_rules! cfi_window_save { () => (".cfi_window_save\n"); }
         macro_rules! pacia1716 { () => ("pacia1716\n"); }
@@ -112,7 +112,8 @@ asm_func!(
         .cfi_startproc
         hint #34 // bti c
         sub x16, x0, #16
-        adr x17, ", asm_sym!("wasmtime_fiber_start"), "
+        adrp x17, ", asm_sym!("wasmtime_fiber_start"), "@PAGE
+        add x17, x17, ", asm_sym!("wasmtime_fiber_start"), "@PAGEOFF
     ",
     pacia1716!(),
     "

basically I figured that the linker is probably putting the functions in different sections now (whereas the .S file put everything contiguously in one section) so we can't rely on the 21-bit adr relocs now, we need the +/- 4GiB range of adrp + an imm12. I'd actually be surprised if this doesn't silently (or just in a delayed way) become an issue on aarch64-linux too, depending on the link order one happens to get...

@alexcrichton
Copy link
Member Author

Ok I ended up cooking up #4341 which I'm hoping will work

afonso360 pushed a commit to afonso360/wasmtime that referenced this pull request Jun 30, 2022
…e#4306)

* Use `global_asm!` instead of external assembly files

This commit moves the external assembly files of the `wasmtime-fiber`
crate into `global_asm!` blocks defined in Rust. The motivation for
doing this is not very strong at this time, but the points in favor of
this are:

* One less tool needed to cross-compile Wasmtime. A linker is still
  needed but perhaps one day that will improve as well.
* A "modern" assembler, built-in to LLVM, is used instead of whatever
  appears on the system.

The first point hasn't really cropped up that much and typically getting
an assembler is just as hard as getting a linker nowadays. The second
point though has us using `hint #xx` in aarch64 assembly instead of the
actual instructions for assembler compatibility, and I believe that's no
longer necessary because the LLVM assembler supports the modern
instruction names.

The translation of the x86/x86_64 assembly has been done to Intel
syntax as well as opposed to the old AT&T syntax since that's Rust's
default. Additionally s390x still remains in an external assembler file
because `global_asm!` is still unstable in Rust on that platform.

* Simplify alignment specification

* Temporarily disable fail-fast

* Add `.cfi_def_cfa_offset 0` to fix CI

* Turn off fail-fast

* Review comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants