Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Take another step towards interface types #395

Merged
merged 5 commits into from
Feb 18, 2021

Conversation

alexcrichton
Copy link
Contributor

This commit is another large refactoring on the tail of #391 which is
intended to help march one step closer to using pure interface types for
specifying WASI and using *.witx files. Contained here is a large
amount of refactoring of the *.witx files, the internals of the witx
crate, and how code generators are supposed to work.

At the *.witx level, some notable changes have been made:

  • All functions returning results now return (expected ok? (error $errno)). This helps signify that the intention of all functions is
    that they return a packaged value which is either a successful result
    or the erroneous error code on failure. ABI-wise nothing has changed,
    but this is intended to model more closely what the interface-types
    API of this would be.

  • The flags type in *.witx now desugars as a struct-of-bools rather
    than simply an int. This doesn't have any effect on the ABI today,
    but it does change the AST and type-level representation of these types.

All existing *.witx files in this repository have been updated for all
phases with these new forms. To reiterate, though, no details of the ABI
have changed (or at least not intentionally). Everything should still be
ABI-compatible with before.

Under the hood for code generators this is a very large breaking change.
The intention of this commit is to start moving code generators relying
on witx into the direction that we'll end up with interface types.
Namely the coretypes module is entirely replaced with a new abi
module with the intention of handling lots more high-level details of
translation than the previous module did.

The InterfaceFunc type now sports two new methods: call_wasm and
call_interface. These two methods represent the two halves of
lifting/lowering operations performed by code generators. The
call_wasm function takes interface types values (or whatever they are
represented as in the source language) and converts them to wasm types
to call a wasm import. The wasm import's return values are then
"deserialized" back into the language's interface types values too. The
call_interface function is the opposite, it assumes that you're coming
from wasm values and calling, e.g., a host function with interface values.

The abi module is refactored with an Instruction enum which lists
out what are the current set of "adapter instructions". These adapter
instructions are intended to somewhat model the eventual idea of adapter
functions and their instructions, but for now they're pretty specific to
WASI as-is today. All instructions currently model what WASI currently
does, sometimes in very specific manners. It's expected that the
Instruction type will be in flux as we tweak the ABI of WASI over
time, but the current set of instructions will likely only be expanded
because of maintaining compatibility with the current snapshot.

The expected usage of Instruction is that code generators will
implement how to generate code for each Instruction. This should be a
very low-level operation (generating code per instruction) and should be
in theory quite easy to implement. Not all code generators need to
implement all instructions depending on their use case. Additionally
WASI as-is today doesn't always exercise all types of instructions.

The next steps after a PR like this include starting to define a new ABI
for WASI which supports all types in all positions rather than the
current ABI which supports only a limited subset of types in some
positions. The intention is that this new ABI may add a new instruction
or two but will generally not be a large breaking change for all
existing code generators. Some new additions are expected but other than
that existing code generators updated to use this PR will require little
effort to support new ABIs.

This commit is another large refactoring on the tail of WebAssembly#391 which is
intended to help march one step closer to using pure interface types for
specifying WASI and using `*.witx` files. Contained here is a large
amount of refactoring of the `*.witx` files, the internals of the `witx`
crate, and how code generators are supposed to work.

At the `*.witx` level, some notable changes have been made:

* All functions returning results now return `(expected ok? (error
  $errno))`. This helps signify that the intention of all functions is
  that they return a packaged value which is either a successful result
  or the erroneous error code on failure. ABI-wise nothing has changed,
  but this is intended to model more closely what the interface-types
  API of this would be.

* The `flags` type in `*.witx` now desugars as a struct-of-bools rather
  than simply an `int`. This doesn't have any effect on the ABI today,
  but it does change the AST and type-level representation of these types.

All existing `*.witx` files in this repository have been updated for all
phases with these new forms. To reiterate, though, no details of the ABI
have changed (or at least not intentionally). Everything should still be
ABI-compatible with before.

Under the hood for code generators this is a very large breaking change.
The intention of this commit is to start moving code generators relying
on `witx` into the direction that we'll end up with interface types.
Namely the `coretypes` module is entirely replaced with a new `abi`
module with the intention of handling lots more high-level details of
translation than the previous module did.

The `InterfaceFunc` type now sports two new methods: `call_wasm` and
`call_interface`. These two methods represent the two halves of
lifting/lowering operations performed by code generators. The
`call_wasm` function takes interface types values (or whatever they are
represented as in the source language) and converts them to wasm types
to call a wasm import. The wasm import's return values are then
"deserialized" back into the language's interface types values too. The
`call_interface` function is the opposite, it assumes that you're coming
from wasm values and calling, e.g., a host function with interface values.

The `abi` module is refactored with an `Instruction` `enum` which lists
out what are the current set of "adapter instructions". These adapter
instructions are intended to somewhat model the eventual idea of adapter
functions and their instructions, but for now they're pretty specific to
WASI as-is today. All instructions currently model what WASI currently
does, sometimes in very specific manners. It's expected that the
`Instruction` type will be in flux as we tweak the ABI of WASI over
time, but the current set of instructions will likely only be expanded
because of maintaining compatibility with the current snapshot.

The expected usage of `Instruction` is that code generators will
implement how to generate code for each `Instruction`. This should be a
very low-level operation (generating code per instruction) and should be
in theory quite easy to implement. Not all code generators need to
implement all instructions depending on their use case. Additionally
WASI as-is today doesn't always exercise all types of instructions.

The next steps after a PR like this include starting to define a new ABI
for WASI which supports all types in all positions rather than the
current ABI which supports only a limited subset of types in some
positions. The intention is that this new ABI may add a new instruction
or two but will generally not be a large breaking change for all
existing code generators. Some new additions are expected but other than
that existing code generators updated to use this PR will require little
effort to support new ABIs.
@alexcrichton
Copy link
Contributor Author

For a frame of reference, I've updated three different code generators for this refactoring of witx:

  • witx-bindgen - generating Rust bindings for a *.witx file to be called from Rust. This is intended for Rust code that's compiled to WebAssembly itself.
  • wasi-headers - generating a C header file (and now a sidecar source file) to call *.witx-defined APIs from C. This is intende for C code that's compiled to WebAssembly.
  • wiggle - generating Rust source code to implement a *.witx file. This is intended for host code that runs on a native platform.

My hope is that by updating a number of locations this stresses this implementation in a few ways. It first ensures that the actual ABI of all functions has not changed. Additionally it enables running tests and diffs to ensure that generated code has not radically changed shape and still does roughly the same thing. Finally it tests both sides of calling and defining WASI APIs.

The PRs to all those repos do not exist just yet since the code is in a bit of a messy state. Additionally I'd like to perform another pass in this repository to document things before merging (it's just getting late here today). If desired though I can push up the work-in-progress state of each of the branches for reviewing if desired.

@sbc100
Copy link
Member

sbc100 commented Feb 17, 2021

This seems great! Will it be possible to have the C headers be completely unchanged by this? (That would give a fairly high degree of confidence that the ABI didn't break).

@pchickey
Copy link
Contributor

Last time we made this sort of change (#220) we weren't able to keep the text of the headers the same but I used static assertions of size/align to show that the data layout was identical: WebAssembly/wasi-libc#165

Copy link
Contributor

@pchickey pchickey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a great read, I really love how this is unfolding

@alexcrichton
Copy link
Contributor Author

Oh I'm not actually really that worried about this accidentally breaking the ABI we have today. I've changed how the header file is generated and the APIs it generates are indeed different, but that's intentional because shim functions are used to delegate to the real functions (and the real functions don't have a changed ABI). I mostly just wanted to emphasize that all the type-level and *.witx-level changes are not intended to be ABI-breaking, and if they are that's just a bug to fix.

I've also finished up documenting remaining TODO items and adding some more tests I wanted to be sure to get to. @pchickey would you be ok merging this and publishing this in this state? If you'd prefer I'm also happy to clean up and publish the changes to the code generators to review first.

It's worth mentioning that the code generators are seeing breaking changes in the code they generate, namely:

  • In C the APIs in the header are different because they're no longer hooked up directly to the raw import. Instead they're hooked up to generated shims which do parameter conversion as necessary. Currently the only difference is that string lengths are not taken as parameters and the adapters instead call strlen internally (to match C's native representation of what a string is).
  • In Rust enum types are now wrapped up and cannot be forged. The Errno type directly in WASI now implements the Error trait instead of having a separate and wrapping Error type.
  • In wiggle some errors which previously returned EINVAL now return traps, notably if you pass a bad return pointer or something like that.

Note that in C's case the adapters are currently simple but it's expected that they'll get more complicated in the future. For example a next step would be to add debug-mode (or maybe always-on?) code which asserts that outgoing values are all valid. For example witx enums are represented in C as integers, but this means that invalid values could be passed to a consuming API. This error should ideally be caught in C rather than on the other side of the API to reflect the intended semantics of interface types.

alexcrichton added a commit to alexcrichton/rust-wasi that referenced this pull request Feb 17, 2021
This pulls in the new `witx` 0.9 crate which notably includes
WebAssembly/WASI#395. This is a large refactoring of how witx signatures
are processed and a large update to the actual syntax of the witx ABIs
themselves.

This commit adjusts to all the new `witx` APIs and also tweaks idioms in
a few locations as well. Code generation is now less custom to Rust and
instead tries to match almost exactly what the `witx` crate tells us to
generate. Additionally the representation of types is chosen to more
closely align with interface types in the future where it's impossible
to pass invalid values to the outside world. Notable changes here are:

* `enum`-like values are now newtype wrappers around integers to prevent
  invalid values from flowing out.
* Error traits are now implement directly for the `Errno` wrapper.
* The module containing the raw functions now exclusively uses i32/i64
  types and doesn't try to use convenience types in Rust.
alexcrichton added a commit to alexcrichton/rust-wasi that referenced this pull request Feb 17, 2021
This pulls in the new `witx` 0.9 crate which notably includes
WebAssembly/WASI#395. This is a large refactoring of how witx signatures
are processed and a large update to the actual syntax of the witx ABIs
themselves.

This commit adjusts to all the new `witx` APIs and also tweaks idioms in
a few locations as well. Code generation is now less custom to Rust and
instead tries to match almost exactly what the `witx` crate tells us to
generate. Additionally the representation of types is chosen to more
closely align with interface types in the future where it's impossible
to pass invalid values to the outside world. Notable changes here are:

* `enum`-like values are now newtype wrappers around integers to prevent
  invalid values from flowing out.
* Error traits are now implement directly for the `Errno` wrapper.
* The module containing the raw functions now exclusively uses i32/i64
  types and doesn't try to use convenience types in Rust.
@pchickey
Copy link
Contributor

I'll take a look at the code generator changes today!

alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Feb 17, 2021
This commit updates to the 0.9 version of the witx crate implemented in
WebAssembly/WASI#395. This new version drastically changes code
generation and how we interface with the crate. The intention is to
abstract the code generation aspects and allow code generators to
implement much more low-level instructions to enable more flexible APIs
in the future. Additionally a bunch of `*.witx` files were updated in
the WASI repository.

It's worth pointing out, however, that `wasi-common` does not change as
a result of this change. The shape of the APIs that we need to implement
are effectively the same and the only difference is that the shim
functions generated by wiggle are a bit different.
@alexcrichton
Copy link
Contributor Author

For posterity, these are the three other changes for this PR:

Same as records, they use pointers
@sunfishcode
Copy link
Member

For completeness, POSIX generally requires EINVAL (or EFAULT) errors at least for things that are pointers at the POSIX level, however I think trapping is ok: on one hand the POSIX behavior can still be emulated by libc if we decide we really want it, and on the other, it seems unlikely that we'll want it. There isn't a lot of value in being able to pass an invalid pointer into the API, because presumably the application also wouldn't be able to dereference that pointer either.

@alexcrichton
Copy link
Contributor Author

FWIW another motivation for switching to trapping is that it reflects the intended final semantics of interface types themselves where adapter functions would be doing reads/writes when passing values between modules, and the adapter function would trap given an invalid pointer and such.

Ok though I think this is in a good enough spot I'm going to merge with @pchickey's previous approval.

@alexcrichton
Copy link
Contributor Author

Aha or not, I do not have the Green Button!

@pchickey would you be ok merging/publishing for me?

alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Feb 18, 2021
This commit updates to the 0.9 version of the witx crate implemented in
WebAssembly/WASI#395. This new version drastically changes code
generation and how we interface with the crate. The intention is to
abstract the code generation aspects and allow code generators to
implement much more low-level instructions to enable more flexible APIs
in the future. Additionally a bunch of `*.witx` files were updated in
the WASI repository.

It's worth pointing out, however, that `wasi-common` does not change as
a result of this change. The shape of the APIs that we need to implement
are effectively the same and the only difference is that the shim
functions generated by wiggle are a bit different.
@pchickey pchickey merged commit ef8c1a5 into WebAssembly:main Feb 18, 2021
@pchickey
Copy link
Contributor

published. @sunfishcode can you please add alex to committers here?

abrown pushed a commit to bytecodealliance/wasmtime that referenced this pull request Feb 18, 2021
This commit updates to the 0.9 version of the witx crate implemented in
WebAssembly/WASI#395. This new version drastically changes code
generation and how we interface with the crate. The intention is to
abstract the code generation aspects and allow code generators to
implement much more low-level instructions to enable more flexible APIs
in the future. Additionally a bunch of `*.witx` files were updated in
the WASI repository.

It's worth pointing out, however, that `wasi-common` does not change as
a result of this change. The shape of the APIs that we need to implement
are effectively the same and the only difference is that the shim
functions generated by wiggle are a bit different.
alexcrichton added a commit to bytecodealliance/wasi-rs that referenced this pull request Feb 19, 2021
* Update witx to 0.9

This pulls in the new `witx` 0.9 crate which notably includes
WebAssembly/WASI#395. This is a large refactoring of how witx signatures
are processed and a large update to the actual syntax of the witx ABIs
themselves.

This commit adjusts to all the new `witx` APIs and also tweaks idioms in
a few locations as well. Code generation is now less custom to Rust and
instead tries to match almost exactly what the `witx` crate tells us to
generate. Additionally the representation of types is chosen to more
closely align with interface types in the future where it's impossible
to pass invalid values to the outside world. Notable changes here are:

* `enum`-like values are now newtype wrappers around integers to prevent
  invalid values from flowing out.
* Error traits are now implement directly for the `Errno` wrapper.
* The module containing the raw functions now exclusively uses i32/i64
  types and doesn't try to use convenience types in Rust.

* Update submodule management

* Fix no-std-compat

* Fix wasi-ephemeral

* Regenerate lib_generated

* Update to main branch

* Update manifest directives
peterhuene pushed a commit to peterhuene/wasmtime that referenced this pull request Feb 19, 2021
This commit updates to the 0.9 version of the witx crate implemented in
WebAssembly/WASI#395. This new version drastically changes code
generation and how we interface with the crate. The intention is to
abstract the code generation aspects and allow code generators to
implement much more low-level instructions to enable more flexible APIs
in the future. Additionally a bunch of `*.witx` files were updated in
the WASI repository.

It's worth pointing out, however, that `wasi-common` does not change as
a result of this change. The shape of the APIs that we need to implement
are effectively the same and the only difference is that the shim
functions generated by wiggle are a bit different.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants