New ABI: "wasm"

Lang team issue
Initial implementation PR
Tracking Rust issue
Watch the recording

Background - multi-value

When WebAssembly was first introduced to the world (the "MVP" of WebAssembly) functions could have any number of parameters but could have at most one result. Recently, in the last year, WebAssembly has stabilized the multi-value proposal which means that functions are now capable of returning multiple return values. While purportedly supported by all modern engines adoption of multiple returns has not yet reached C/Rust.

This feature of returning multiple values is not the default everywhere and notably is not on by default in LLVM (but there is an LLVM +multivalue feature for it). Additionally the original C ABI for WebAssembly was defined without multi-value support. There has been a small amount of dicussion about a new C ABI which leverages multiple return values, but I'm at least not personally aware of much progress on this front or what will be happening here in the future.

Background - ABI mismatch

Another important piece of background information (which is orthogonal to the multi-value above) is that Rust's wasm32-unknown-unknown target does not implement the same C ABI that clang implements. When the wasm32-unknown-unknown target was first introduced I didn't really understand what I was doing and then accidentally started relying on what was implemented. The disagreement between the wasm32-unknown-unknown ABI and the clang definition of an ABI largely boils down to how aggregates are passed as parameters. For example:

#[repr(C)]
pub struct A(i32, i32);

#[no_mangle]
pub extern fn foo(_: A) {}

will differ on wasm32-unknown-unknown and other wasm targets:

$ rustc foo.rs --crate-type cdylib --target wasm32-unknown-unknown
$ wasm2wat foo.wasm | grep foo
  (func $foo (type 0) (param i32 i32)
  (export "foo" (func $foo))

$ rustc foo.rs --crate-type cdylib --target wasm32-wasi
$ wasm2wat foo.wasm | grep foo
  (func $foo (type 0) (param i32)
  (export "foo" (func $foo))

Notably here the wasm32-unknown-unknown target passes A as two i32 parameters, where the wasm32-wasi target (and other wasm targets like wasm32-unknown-emscripten) will pass A by reference indirectly.

The consequence of this ABI mismatch is that it is more difficult to mix C and Rust code in the same wasm project. The same signature won't agree when languages call each other which can produce linking errors. The reason that this has not been fixed is that the wasm-bindgen project relies on the behavior of wasm32-unknown-unknown and is not easily fixed.

Future of `extern "C"`

It's expected that at some point in the future the canonical definition of the C ABI will be updated to incorporate multiple return-values into it. Users of C/C++ will, presumably, have the desire to write functions that have multiple return values and this'll get implemented. Some uncertainty, though, about this still remains:

It's not clear when this will happen. I don't really follow the C side of things too too closely and it's mostly Googlers who have been maintaining and evolving LLVM and the impression I get is that this isn't super high on their priority list.
It's not clear what a new C ABI might look like. Historically decisions in clang (I believe) have been evaluated by benchmarking Emscripten output in v8. Given this it's not obvious (to me at least) that a new C ABI will suit the needs of all wasm users if it's driven by performance rather than expressibility.

Overall I am personally pretty uncertain about what will happen with the C ABI in the future. The resolution of a "wasm" ABI for Rust might be "Alex just go talk to people and figure out what C will do first".

Motivation of `extern "wasm"`

If, for a moment, we assume that the C ABI will either move too slowly or will not suit Rust's wasm needs in the future, then this leaves an expressibility gap that needs to be filled. This was the original motivation for the wasm ABI in Rust. Notably:

The "wasm" ABI would be defined exclusively in terms of WebAssembly signatures. There would be a definition from types in Rust and how they would appear in the WebAssembly function signature when the function is compiled (both for multiple parameters and multiple results)
The "wasm" ABI would enable switching the wasm32-unknown-unknown target to match C's default ABI without breaking projects like wasm-bindgen. The wasm-bindgen project would migrate to use "wasm" for its purposes and then by switching the target's defaults it would hopefully fix more than it would break in the ecosystem.

As mentioned above in the "future of extern "C"", however, this motivation is all moot if the C ABI magically changes tomorrow to incorporate multiple returns as well as changes to suit wasm-bindgen.

Definition of `extern "wasm"`

The actual literal definition of extern "wasm" is a bit up in the air. Right now it's a bit hand-wavy and simply says "what if all values were splatted into their components and then that was all passed by value". This means that Rust would still have to define things like what the ABI for u128 is, and the definition would be Rust-specific. It's expected that this could largely lift from the existing ABI for C in terms of definition but would likely need Rust-specific bits as well.

Another unclear aspect of a hypothetical "wasm" ABI would be whether any other language would ever want to use it. C users might be content in waiting for whatever the multi-result support looks like in C, and other languages may just match that rather than doing something specific like Rust.

Notes from meeting

What makes wasm-bindgen reliant on the existing C ABI?
- Wasm-bindgen is all about communication between JS
- JS and tool need to agree
- The way this works today:
  - There is a macro that lowers Rust types down to simple types that JS understands
  - Code assumes that we can use traits to map rust types to simpler Rust types that the ABI natively understands
- Changing the compiler will mean that the bindgen version and the compiler version must be tightly coordinated
- Why can't you use #[cfg(version)]?
  - Because the JS code is produced by the CLI
  - The wasm-bindgen CLI would have to understand what version of rust made the binary
- Goal: to have a version of Rust that supports both old and new so that we can make the transition happen
Rough plan:
- We would make wasm-bindgen use extern "wasm" once it hits stable; this would only be compatible with newer compilers, but you'd get a nice clean error.
Would you change back to the C ABI later, once it is updated to match clang?
- No, clang's behavior is not desirable.
Regarding breaking the ecosystem:
- only Alex cares ;)
- changing the C ABI would be a bug fix
pnkfelix: To be concrete, is the proposal that the code example from the top would change to this:
```
#[repr(C)]
pub struct A(i32, i32);

#[no_mangle]
pub extern "wasm" fn foo(_: A) {}
```

in order to get the same effect for the wasm32-unknown-unknown target?

Will the compiler actually then have the same output on both wasm32-unknown-unknown and wasm32-wasi, both with extern "wasm" and the way it was written up above? Or will wasm32-wasi just not support extern "wasm"?

Yes! That would be how the ABI works.
Felix: Isn't that what extern C would do?
- Alex: clang passes all structs by reference
- Alex: but rustc presently does splat for wasm32-unknown-unknown
  - every other wasm target matches clang (e.g., wasm32-emscripten and friends)
Precise proposal:
- "C" would always match clang
- "wasm" would do the thing Alex wants (splat, multiple return values)
pnkfelix: I’ll claim ignorance: Can you summarize different purposes of wasm32-unknown-unknown vs wasm32-wasi targets?
- Only difference is the stdlib
- unknown-unknown ignores prints, returns errors, etc
- wasm32-wasi is built against the WASI APIs for interacting with the "operating system"
Josh: Is the alternative of a different target (that changes extern "C") worth exploring? That would allow one Rust version to support both ABIs, as well. And that would have different stability guarantees, in that we can deprecate targets.
- e.g., wasmwasm-unknown-unknown could make "C" use the proposed "wasm" abi
- Alex: Disadvantage would be that C and Rust would not agree
  - So projects that want to intermix code generated by clang and Rust would be unable to do so
- Josh: So goal is not just to be able to have both ABIs be around for a transition
  - It's that we want both ABIs for interop between C and wasm-bindgen-generated code
Josh: Do we expect that extern "wasm" can continue to evolve with WebAssembly as WebAssembly gains more features, or will it become static as people use it? For instance, if WebAssembly gains native support for passing u128, can we evolve extern "wasm" safely over time, without breaking existing users? How can we make sure extern "wasm" doesn't become stale just as extern "C" has?
- extern wasm ABI could be limited to things that translate directly to WASM, along with aggregates
- things like u128 for example would not be supported
- this way, when extern wasm adds u128, we can add support for those
- if we were to stabilize extern wasm, this would be a conservative and suitable approach
- scottmcm: like the ctypes lint, but different rules, hard error?
  - alex: yes.
- josh: so you can add new items when wasm/Rust/wasm-bindgen agree?
  - Alex: new items would probably not be added unless there is no future path to support it
  - e.g., we might permit u8 because wasm will never add a type for that
  - but we would avoid u128 since wasm might do it
- niko: and you would expect rust users to map e.g. u128 to u64
  - alex: yes. but aggregates would work.
- cramertj: you'd be checking the field types of aggregates too, then
  - alex: yes. I imagine we'd require repr(C), though we could add repr(wasm)
  - cramertj: repr(wasm) would be good because it's an add'l semver requirement that one cannot add non-wasm-enabled types to your type
  - alex: field order matters, so repr(C) makes sense, but I'm not sure if repr(wasm) is needed; we would be checking generics post-monomorphization
  - cramertj: my point is that changing a "C" struct would cause wasm things to just plain stop compiling, since you're using a type that is not wasm-compatible
  - alex: that's true. If you have e.g. a generic function that has a WASM ABI, then some instantiations will error out, but others won't. That's funky.
Niko: is it an accurate summary that the motivation to add extern "wasm" is:
- extern "C" is unlikely to be updated on a reasonable timeframe
- we can migrate folks over gradually to "extern wasm" and then modify or deprecate the old behavior of the C ABI to match clang
- Alex: I don't really know how likely it is for "C" to change
- pnkfelix: Even if it did change, would it change how we think is best?
- Alex: unclear. But if C does have an ABI that supports multi-value, wasm-bindgen should probably match it eventually.
Taylor: what types would we expect to see allowed in the signatures of extern "wasm" functions? #[repr(C)] types? #[repr(wasm) types? Is there potential for mismatch of aggregate representations for any #[repr(C)] types?
- already answered
Taylor: what level of documentation and stability guarantees do we want to offer for our extern "wasm" ABI? Is it "something that magically works with bindgen", and bindgen is lifted to some kind of "priveleged" library? Or do we provide a full specification of the ABI that other folks could rely on?
- Alex: The goal of extern wasm is that, given a Rust signature, you can precisely predict the wasm signature. Therefore I would expect us to document this.
- Alex: This would be Rust specific.
- cramertj: so e.g. someone else could write code against our definition of extern wasm and we would expect it to work going forward?
- Alex: Yes. This is why it's useful to wasm-bindgen.
pnkfelix: to clarify, things like u8 would be supported
- but some things like u128 that may be supported in the future would be rejected?
- alex: maybe. maybe that doesn't work.
- alex: goal is still to have most Rust code generally work, hence the support for u8. Things like u128 are much more unusual.
Mark: Are we expecting "C" and "wasm" to eventually become synonymous? Will there be repr(wasm) of some kind on structs, alongside lints?
- Answered. no.
Josh: Will extern "wasm" exist on other targets? Might it make sense to use it as a "portable safe ABI" in the future, to handle things like slices and strings safely?
- No.
- Josh: I can imagine that people might want to have two different targets where the only difference is the ABI. That's kind of a pain.
- Alex: I've done this with macros, but that would be a downside.
- Josh: we should document the expected pattern.
Scott: Is there a use for making this be extern "splatted" and available on things other than WASM?
- Alex: I don't want to be the one to add a new ABI just because I want splatting... but if it really is useful for others, we could do that.
Mark: the predominant use case is bridging javascript, right? how often would you want this for functions that might be C on other platforms?
- Alex: Well, if interface types makes progress, then you might define native plugins using it, and the same ABI you're using for wasm might be used for native modules too. So you might want the same function to be compiled both to wasm and to native code.
- Alex: so e.g. you might want to write a plugin that can be compiled to either wasm or to C ABI. But if we make it an error to use wasm on other targets to start
Emmanuel: Will the C ABI and wasm ABI merge in the future (like when the C ABI gets support for multiple returns)?
- Discussed, but likely no.
- If they do merge we can get rid of wasm.
Multiple returns: second motivation for the wasm ABI, unsupported by C
- Would be a way for us to roll out multiple return support in Rust w/o new targets, ABIs, etc
- We have arm targets for "softfloat" and "hardfloat" for example, because whole world has to agree
- But 'wasm' and 'wasm-multivalue' doesn't make sense as everyone will eventually use the latter
- It's hard to know how clang will approach this; they might add a codegen flag
- pnkfelix: isn't the analogous situation to soft/hard-float what version of the VM you are using?
  - Alex: yes, it's just that we expect all VMs to support multivalue return
- e.g., if we bind WASI and WASI has functions that return multiple things
  - we would use wasm to match that
Scott: when rust does multiple return values, what happens?
- Alex: if you return a Rust tuple in a "C" ABI, you get a warning
- LLVM supports it by returning an aggregate by value, which LLVM translates to a multivalue return
- We could say that we define returning a tuple like -> (T, U) is how we translate multivalue returns
Josh: Alternative solutions?
- Alex: the main one is to push the C ABI to match what we want and support multivalue
- Mark: doesn't fit the desire for a simple translation that hard errors on weird stuff
- Alex: true. It gives us less flexibility when wasm evolves in the future.
  - answer from the C side is "it breaks"
  - by separating wasm and C, we insulate users from some amount of breakage
Scott: Would there be a use for repr(wasm) to go with this? Are there any cases where repr(C) would have a different definition of which things are aggregates than what wasm would want? Would it make sense to define that to allow enums to be able to be passed, like we defined repr(C, u8) on enums?
- Alex: maybe. Discussed previously.
- Sounds like it's more repr(linear) than needing anything particularly wasm-specific, if it's up to the lint/monomorphization anyway to catch the problems?
- pnkfelix: without repr(wasm) we could get earlier errors...
- niko: ...but then if you have a struct that wants to pass over an FFI boundary you will have to know which target you're using
- pnkfelix: but with repr attribute you have more flexibility, not as hard, you can use cfg_attr
- cramertj: it's kind of an extra set of requirements, a subset of C
- cramertj: I do like that repr(wasm) means libraries exposing types with private fields are promising to support wasm compilation
  - alex: it's already an ABI breaking change
  - cramertj: right, but now it's a compilation error as well, so recompiling the world will cause problems

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

2021-04-21-wasm-abi.md

2021-04-21-wasm-abi.md

New ABI: "wasm"

Background - multi-value

Background - ABI mismatch

Future of `extern "C"`

Motivation of `extern "wasm"`

Definition of `extern "wasm"`

Notes from meeting

Files

2021-04-21-wasm-abi.md

Latest commit

History

2021-04-21-wasm-abi.md

File metadata and controls

New ABI: "wasm"

Background - multi-value

Background - ABI mismatch

Future of extern "C"

Motivation of extern "wasm"

Definition of extern "wasm"

Notes from meeting

Future of `extern "C"`

Motivation of `extern "wasm"`

Definition of `extern "wasm"`