Should there be an extern "wasm" ABI? #29
Comments
I see why you are stating that structs can't pass across FFI boundaries: wasm can't hold any values other than i32, i64, f32, f64. |
Other ABIs have defined rules for how to split structs up, and how their parts cross the boundary. Both sides of FFI can agree on how to split up and paste back together structs. AIUI, wasm does not have any such rules, and so you cannot pass structs across the boundary because there are no guarantees that both sides will agree on what is being passed and how. For wasm, it would be incorrect for the compiler to allow structs to be used in FFI. |
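To make the "split up and paste back together" point concrete, here is a minimal sketch of two sides agreeing by hand on how a struct crosses the boundary. The names `Rect`, `split_rect`, `join_rect` and `rect_area` are hypothetical and only illustrate the idea; nothing here is a rule defined by wasm or by rustc.

```rust
/// A plain struct that cannot itself be a wasm value.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rect {
    pub width: f32,
    pub height: f32,
}

/// The rule agreed on in this sketch: a `Rect` crosses as `(width, height)`.
pub fn split_rect(r: Rect) -> (f32, f32) {
    (r.width, r.height)
}

/// The receiving side pastes the parts back together using the same rule.
pub fn join_rect(width: f32, height: f32) -> Rect {
    Rect { width, height }
}

/// Boundary-facing function: only wasm value types (here `f32`) appear in
/// the signature, so both sides agree on what is being passed.
pub extern "C" fn rect_area(width: f32, height: f32) -> f32 {
    let r = join_rect(width, height);
    r.width * r.height
}
```

If the two sides disagreed on the order of the fields, nothing would catch it, which is the hazard described above.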
Interesting, I thought contrary to that (at least, in cases where on both sides code is produced by LLVM). |
An interesting question! Makes me wonder a bit more in general about ABI implications in Rust right now. At the base level @pepyakin's right, we can only deal with some integers/floats, but another aspect we're not exposing today is the "two level import" where you import from a module and then an item in that module. That leads me to two questions:
There's also the question of ABI strings ( |
See also rust-lang/rust#47599 |
This is what I'd expect. |
But what if I would like to actually link to a C code? |
@pepyakin that's true but I also personally have no model for what that even means. Today wasm (in rust at least) has no compilation units, but linking to C code implies some degree of compilation units. In that sense it's difficult to envision what's happening here... One perhaps reasonable way to rationalize this could be:
I imagine that would help solve communicating with C (assuming everything works out; again, that doesn't really exist yet!) while also being clear about what you're importing from the environment. Perhaps then...
#[module = "foo"] // required for `extern "wasm"`
extern "wasm" {
fn bar() -> i32; // imported from module "foo"
// fn baz() -> i8; // illegal
}
extern "C" {
fn another(); // generates a link error if unresolved and final `.wasm` file is created
} |
Yes; we are expecting a stable C ABI to emerge for wasm eventually. In fact, most of the C ABI details in clang are pretty stable already. The one upcoming C ABI change that I know of is that when wasm gets multiple return values, clang will likely switch to passing and returning "small-ish" structs by value (rather than byval/sret in LLVM). After that, it's likely that wasm's C ABI will be stable. |
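For readers unfamiliar with the sret convention mentioned above, the following sketch writes the lowering out by hand: the by-value function is what the source says, and the `_sret` variant shows the shape of a lowering where the caller supplies a pointer for the result. `DivMod`, `div_mod` and `div_mod_sret` are hypothetical names, and the compiler picks the real lowering itself; this only illustrates the idea.

```rust
/// A small two-field struct, the kind of "small-ish" value discussed above.
#[repr(C)]
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct DivMod {
    pub quot: i32,
    pub rem: i32,
}

/// Source-level signature: the struct is returned by value.
/// (No zero check on `b`; this is only a sketch.)
pub extern "C" fn div_mod(a: i32, b: i32) -> DivMod {
    DivMod { quot: a / b, rem: a % b }
}

/// Hand-written sret-style equivalent: the caller passes a pointer to
/// storage for the result, and only scalars and a pointer cross the boundary.
///
/// Safety: `out` must be valid for writing one `DivMod`.
pub unsafe extern "C" fn div_mod_sret(out: *mut DivMod, a: i32, b: i32) {
    unsafe { out.write(div_mod(a, b)) }
}
```

With multiple return values, the by-value form could instead come back as two `i32` results, with no pointer needed.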
Sounds reasonable!
There is some demand for linking against bare wast files (e.g. produced by |
@sunfishcode ah while you're here, do you know if there's an idea of how to specify wasm imports from C? (imports that get hooked into the ES module system, that is) |
If there is we could see this as an opportunity to remove the requirement on the |
@alexcrichton Yeah, there are ongoing discussions. The rough theory is:
|
@sunfishcode ok cool, thanks! That actually aligns pretty well with what I had in mind already, and should work perfectly for us! |
To add to this: I personally tend to avoid writing out the "C": |
I've now posted https://reviews.llvm.org/D42520 which is an LLVM patch implementing |
Awesome, thanks for the update @sunfishcode! |
Thinking about this a bit, I'd like to bikeshed some possibilities for how we can expose this in Rust. I think there's a big difference in the "desired end state" from what we have today, where the wasm module imports/exports are conflated with imports/exports from codegen units. In other words, I think we'll want to consider a sort of "two level namespace" along the lines of:
#[no_mangle]
pub extern fn foo() {} // exports today, will not export tomorrow
extern {
    fn bar(); // import from the module today, won't tomorrow.
}
So with that in mind I'd like to bikeshed two syntaxes for this. Add a
|
Would we allow pointers and
Are you imagining this with the future dynamic wasm linking, or with statically linking multiple Given what is supported now, I would expect this
to turn into something like this:
#[wasm_import] // we expect to import this at instantiation time, not link to another `.wasm` providing it.
extern "wasm" {
fn bar();
}
Other than the above confusion, this is 💯 |
Perhaps yeah! I'd personally lean initially towards the ultra-conservative stance of "no" but I'd be fine going in either direction.
Ah, so to clarify: wasm-the-specification supports something which we're not exporting in Rust right now, which is two-level imports. Sort of like in JS where you can do:
import { foo } from 'bar';
the module you're importing from here is So today you'd write:
extern {
    fn foo();
}
which in JS would be equivalent to:
import { foo } from 'env';
which isn't actually what we want! Instead what I'm thinking we'd do is:
#[wasm_module = "bar"]
extern {
    fn foo();
}
which is equivalent to our original JS:
import { foo } from 'bar';
Does that help clear things up? |
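The two-level lookup described above can be modelled in a few lines. This is only a sketch of the idea, assuming a host that keeps its imports in a table keyed by a (module, field) pair; `ImportKey`, `build_imports` and `resolve` are hypothetical names and not part of any real wasm runtime API.

```rust
use std::collections::HashMap;

/// An import is identified by two names: the module, then the field in it.
pub type ImportKey = (&'static str, &'static str);

/// Hypothetical host-side import table with two functions that share the
/// field name `foo` but live in different modules.
pub fn build_imports() -> HashMap<ImportKey, fn() -> i32> {
    fn env_foo() -> i32 { 1 }
    fn bar_foo() -> i32 { 2 }

    let mut imports: HashMap<ImportKey, fn() -> i32> = HashMap::new();
    imports.insert(("env", "foo"), env_foo); // what a bare `extern { fn foo(); }` names today
    imports.insert(("bar", "foo"), bar_foo); // what `import { foo } from 'bar'` names
    imports
}

/// Resolve an import by both levels of its name.
pub fn resolve(
    imports: &HashMap<ImportKey, fn() -> i32>,
    module: &'static str,
    field: &'static str,
) -> Option<fn() -> i32> {
    imports.get(&(module, field)).copied()
}
```

The field name alone is not enough to pick a function, which is why the module name needs to be expressible on the Rust side.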
Yep! |
I feel conflicted on this. On the one hand I like the first one more because
#[no_mangle]
pub extern "wasm" fn foo() {}
is consistent with our current code doing:
#[no_mangle]
pub extern "C" fn foo() {}
I would think we would want to err on the side of consistency for now because it would "just make sense" given what Rustaceans already know. That being said, I find that
#[wasm_export]
pub fn foo() {}
has less noisy syntax and more clearly states what's going on in the code, especially for people who might be new to Rust and aren't set in the current FFI conventions. I don't know what I want yet personally out of those choices. Maybe we could do a compromise:
#[wasm_export]
pub fn foo() {}
implies:
#[no_mangle]
pub extern "wasm" fn foo() {}
with both as valid choices, letting the user choose how they want to define it. This does lead to more maintenance burden, but we could have #[wasm_export] just be a proc_macro that desugars to the I do think that this:
#[wasm_module = "bar"]
extern {
    fn foo();
}
would be great to avoid putting everything in |
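The desugaring idea in the comment above can be sketched with a declarative macro. Note the substitutions: the comment proposes a proc_macro attribute, whereas this uses `macro_rules!` to stay self-contained, and it expands to `extern "C"` because `extern "wasm"` is only a proposal in this thread. `wasm_export!` and `add` are hypothetical names.

```rust
/// Hypothetical stand-in for the proposed `#[wasm_export]` attribute.
/// It rewrites a plain `pub fn` into a `pub extern "C" fn`.
macro_rules! wasm_export {
    (
        $(#[$meta:meta])*
        pub fn $name:ident ( $($arg:ident : $ty:ty),* $(,)? ) $(-> $ret:ty)? $body:block
    ) => {
        $(#[$meta])*
        // A real export would also carry the no_mangle attribute; it is
        // left out so that this sketch builds unchanged on any edition.
        pub extern "C" fn $name($($arg: $ty),*) $(-> $ret)? $body
    };
}

wasm_export! {
    /// Written like an ordinary Rust function by the user.
    pub fn add(a: i32, b: i32) -> i32 {
        a + b
    }
}
```

The user-facing syntax stays quiet while the expansion keeps the explicit ABI, which is the compromise being weighed in the comment.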
Just to clarify a few things:
#[no_mangle]
pub extern "C" fn foo(...) { ... }
will continue to work just like it does right now, right? Otherwise this proposal will completely wreck my use case. |
@mgattozzi yeah one thing I'm also not sure about with the
@CryZe I do think we will break that one way or another, yes. Right now we're mixing up linker-like symbols with imports/exports from the wasm module. The intention of this issue/proposal is to make it more explicit what's happening at the wasm import/export layer (and not conflate it with symbols). The functionality isn't being lost though! While today you'd write:
#[no_mangle]
pub extern "C" fn foo() {}
that'd be equivalent to tomorrow's
#[wasm_export]
pub fn foo() {}
(depending on the syntax we settle on) |
I have a C API that I compile to various targets (39 targets including the WASM target atm). Requiring me to have a full on copy of this C API just for the WASM target is not maintainable for me (that's like 500 exported symbols). So it would be preferable to have a solution that works across various targets, like keeping the behavior of the extern "C" functions, or allowing extern "C" with #[wasm_export] and ignoring that attribute for the other targets. |
Aha, an excellent point @CryZe, thanks for the info! To me that's the nail in the coffin to do:
#[wasm_export] // implies `no_mangle`, only works with `"C"` ABI functions
pub extern fn foo() {}
#[wasm_module = "baz"]
extern { // must be "C" ABI
    fn bar();
}
We could ignore the
@CryZe note that it's also being considered that we should validate signatures with |
@alexcrichton I think we should do at least some kind of validation if we can. Unlike #[no_std] removing the std library, there's not much we can do to prevent people from using primitives they shouldn't beyond stopping the compiler from finishing the compile. I think this is good behavior to have. I have a feeling we should allow pointers, as that's usually how to deal with passing foreign types to other languages, but maybe we can add that at a later date once we get the actual ABI (I think we can call it that in this case) started and hash out the specifics more. We generally go for a conservative implementation first, then iterate. |
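As a rough picture of what the conservative validation could look like, here is a toy signature checker that accepts only the four wasm value types named earlier in the thread. `Ty`, `is_wasm_value_type` and `check_signature` are hypothetical names for illustration; this is not how rustc represents or checks types.

```rust
/// A toy set of types that might appear in a boundary signature.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Ty {
    I8,
    U8,
    I32,
    I64,
    F32,
    F64,
    Struct,
}

/// Only the four wasm value types pass the conservative check.
pub fn is_wasm_value_type(ty: Ty) -> bool {
    match ty {
        Ty::I32 | Ty::I64 | Ty::F32 | Ty::F64 => true,
        _ => false,
    }
}

/// Reject a signature as soon as one parameter or the return type is not a
/// wasm value type, reporting which one was at fault.
pub fn check_signature(params: &[Ty], ret: Option<Ty>) -> Result<(), String> {
    for (index, &ty) in params.iter().enumerate() {
        if !is_wasm_value_type(ty) {
            return Err(format!("parameter {} has non-wasm type {:?}", index, ty));
        }
    }
    if let Some(ty) = ret {
        if !is_wasm_value_type(ty) {
            return Err(format!("return type {:?} is not a wasm value type", ty));
        }
    }
    Ok(())
}
```

Under this rule `fn bar() -> i32` passes and `fn baz() -> i8` is rejected, matching the earlier example. Allowing pointers later would amount to adding one more accepted variant.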
I am sympathetic to making cross platform externs work. I think #[cfg_attr(target_arch = "wasm32", wasm_import)] should do the trick:
Note that this works today, so it is more about if we want it not to work anymore. Also, despite using |
So, after thinking about this some more, there are two different ways of handling
So because the conversion depends upon the host environment, my current thinking is that Rust should simply forbid passing If the programmer knows that the host environment supports native And in the case of JavaScript the programmer can use This also gives more flexibility: if the JavaScript code does a conversion from But if the JavaScript code expects a non-negative Number (without conversion), then the programmer should use |
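The types elided from the comment above cannot be recovered here, so the following sketch assumes `u32` purely as an example of why the conversion depends on the host. Wasm's `i32` carries no signedness, so the same bits can be read as a negative or a non-negative number; routing the value through `f64` is one way to keep it non-negative on the other side. All function names are hypothetical.

```rust
/// Reinterpret an unsigned value as the signed 32-bit pattern that a host
/// reading the wasm `i32` as a signed integer would observe.
pub fn to_boundary_i32(value: u32) -> i32 {
    value as i32
}

/// Undo the reinterpretation on the way back in.
pub fn from_boundary_i32(raw: i32) -> u32 {
    raw as u32
}

/// Alternative: widen to `f64`, which represents every `u32` exactly and
/// keeps the value non-negative.
pub fn to_boundary_f64(value: u32) -> f64 {
    f64::from(value)
}
```

Which of these is right depends on what the code on the other side expects, which is the reason given above for leaving the choice to the programmer.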
This commit adds a new attribute to the Rust compiler specific to the wasm target (and no other targets). The `#[wasm_import_module]` attribute is used to specify the module that a name is imported from, and is used like so:
#[wasm_import_module = "./foo.js"]
extern {
    fn some_js_function();
}
Here the import of the symbol `some_js_function` is tagged with the `./foo.js` module in the wasm output file. Wasm-the-format includes two fields on all imports, a module and a field. The field is the symbol name (`some_js_function` above) and the module has historically unconditionally been `"env"`. I'm not sure if this `"env"` convention has asm.js or LLVM roots, but regardless we'd like the ability to configure it!
The proposed ES module integration with wasm (aka a wasm module is "just another ES module") requires that the import module of wasm imports is interpreted as an ES module import, meaning that you'll need to encode paths, NPM packages, etc. As a result, we'll need this to be something other than `"env"`!
Unfortunately neither our version of LLVM nor LLD supports custom import modules (aka anything not `"env"`). My hope is that by the time LLVM 7 is released both will have support, but in the meantime this commit adds some primitive encoding/decoding of wasm files to the compiler. This way rustc postprocesses the wasm module that LLVM emits to ensure it's got all the imports we'd like to have in it.
Eventually I'd ideally like to unconditionally require this attribute to be placed on all `extern { ... }` blocks. For now though it seemed prudent to add it as an unstable attribute, so for now it's not required (as that'd force usage of a feature gate). Hopefully it doesn't take too long to "stabilize" this!
cc rust-lang-nursery/rust-wasm#29
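To show what "a module and a field" means at the byte level, here is a small sketch that encodes a single function import entry the way the wasm binary format lays it out: two length-prefixed names, a kind byte, and a type index. This is not the code from the commit; `write_leb128_u32`, `write_name` and `encode_func_import` are hypothetical helpers written for illustration.

```rust
/// Append `value` in unsigned LEB128, the variable-length integer encoding
/// used throughout the wasm binary format.
pub fn write_leb128_u32(out: &mut Vec<u8>, mut value: u32) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte); // last byte: continuation bit clear
            break;
        }
        out.push(byte | 0x80); // more bytes follow
    }
}

/// A name is its byte length followed by its UTF-8 bytes.
pub fn write_name(out: &mut Vec<u8>, name: &str) {
    write_leb128_u32(out, name.len() as u32);
    out.extend_from_slice(name.as_bytes());
}

/// Encode one function import entry: module name, field name, kind, type index.
pub fn encode_func_import(module: &str, field: &str, type_index: u32) -> Vec<u8> {
    let mut out = Vec::new();
    write_name(&mut out, module);
    write_name(&mut out, field);
    out.push(0x00); // import kind 0x00 means "function"
    write_leb128_u32(&mut out, type_index);
    out
}
```

Changing the import module from `"env"` to `"./foo.js"` only changes the first name in the entry, which is why a postprocessing pass over the emitted file can do it.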
I've sent a PR to rust-lang/rust to implement |
I just found this. It's a part of the host bindings proposal, so it's not standardized yet, but it provides a mechanism for wasm to import/export So that makes me even more convinced that right now we shouldn't allow for |
It's now possible in nightly to specify the module of an import via the I'm personally sort of tempted to close this as "done" otherwise rather than implement a check for types in the ABI and perhaps investigate a lint at a later date |
@alexcrichton I was going through this tutorial and noticed the note at the top. Does the recent PR change anything for the tutorial here https://rustwasm.github.io/book/js-ffi.html? What does |
For example, is it possible to add/use |
@DrSensor, essentially, it is whether the generated JS glue looks like
// default
const importedThing = window.importedThing;
vs
// using #[wasm_bindgen(wasm_import_module = "./foo")]
import { importedThing } from "./foo";
For importing |
It's been a while since this has come up again and I think in the meantime we've ended up with what seems like a good balance between the various concerns here. Notably the current state of play is:
Otherwise in the meantime tools like With all that, I'm gonna close this! |
I wonder if this should be reconsidered. The reason is that I think it's useful to use the wasm ABI for building FFIs to GC'ed languages. It's possible to write completely safe FFI APIs that express ownership/lifetimes and are ergonomic to write. An example would look like this:
use std::marker::PhantomData;
use std::ptr::NonNull;

#[repr(C)]
pub struct Str<'a> {
    ptr: NonNull<u8>,
    len: usize,
    marker: PhantomData<&'a str>,
}

#[no_mangle]
pub extern "C" fn ffi_new_greeter(greeting: Str<'_>) -> FfiResult<Box<Greeter>> {
    catch_panic(move || {
        Ok(Box::new(Greeter::new(greeting.as_ref().to_string())))
    })
}
However, when binding it on the other end in GC'ed languages it requires two allocations to call this method, one to allocate the byte buffer, and one to allocate the on wasm32, especially interesting with |
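The snippet above relies on `Greeter`, `FfiResult` and `catch_panic`, which are not shown in the thread. As a self-contained sketch of just the borrowed-string part, the following fills in one plausible shape for `Str`; the `new` and `as_str` methods and the `ffi_greeting_len` function are assumptions made for illustration, not the commenter's actual API.

```rust
use std::marker::PhantomData;
use std::ptr::NonNull;

/// A borrowed string as a pointer/length pair. The lifetime ties it to the
/// `&str` it was created from, so it cannot outlive the underlying bytes.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Str<'a> {
    ptr: NonNull<u8>,
    len: usize,
    marker: PhantomData<&'a str>,
}

impl<'a> Str<'a> {
    /// Borrow an existing string without copying it.
    pub fn new(s: &'a str) -> Self {
        // A `&str` never has a null data pointer, even when it is empty.
        let ptr = NonNull::new(s.as_ptr() as *mut u8).expect("str pointer is non-null");
        Str { ptr, len: s.len(), marker: PhantomData }
    }

    /// View the bytes as a `&str` again.
    pub fn as_str(&self) -> &'a str {
        // Safe only because `new` is the sole constructor in this sketch,
        // so `ptr` and `len` always describe valid UTF-8 that lives for `'a`.
        unsafe {
            let bytes = std::slice::from_raw_parts(self.ptr.as_ptr(), self.len);
            std::str::from_utf8_unchecked(bytes)
        }
    }
}

/// Hypothetical boundary function taking the struct by value.
pub extern "C" fn ffi_greeting_len<'a>(greeting: Str<'a>) -> usize {
    greeting.as_str().len()
}
```

A foreign caller would construct the pointer/length pair itself, so in practice the invariant on `ptr` and `len` becomes part of the documented contract rather than something Rust can check.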
@dvc94ch looks like it may have been. |
interesting. for now I decided to go with parsing and generating code for "rust header" files after studying various projects for generating ffi interfaces and bindings. I expect ffi-gen to be released in the next couple of weeks. |
Just released the first version of ffi-gen with support for dart/flutter/node/browser. It's still a little light on documentation, there are cases that can miscompile if you aren't familiar with the internals, and the errors are very poor. However, I think it's currently the best solution out there if you want to create a Rust library with bindings to multiple languages. We've already migrated our API to ffi-gen. |
It would make compile-time errors if you tried to pass something that can't be passed across FFI boundaries in wasm, but can in C, such as structs.
Maybe this already exists and I'm ignorant of it?
Anyways, it feels weird to expose wasm APIs with extern "C".