Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor the AST of types in the witx crate #391

Merged
merged 10 commits into from
Feb 11, 2021

Conversation

alexcrichton
Copy link
Contributor

This series of commits is intended to start a refactoring of the witx crate's AST to get, yet again, closer to interface types. With a recently rereshed explainer interface types can almost entirely encompass what WASI uses right now with a smaller set of types and some more shorthands. The purpose of this PR was to simplify all types to just those specified in the interface types specification right now, namely:

  • Array is renamed to List
  • Struct is renamed to Record
  • int typedefs are refactored into simple builtin types and a separatetly-stored list of constants
  • Flags as-is are redefined as integers with associated constants for each value. Note that this is different from the proposed elaboration of flags to a struct-of-bools. While we may want to go in this direction eventually I opted to do the simple thing to keep ABI stability for now.
  • string is recast as list[char]
  • Union is entirely defined in terms of Variant
  • Types like Char8 and Usize are now hints on U8 and U32

The general goal is to get the type-level AST to exactly match interface types. The eventual intention is to continue refactoring internals and exposing more functionality around "what is the wasm signature for this function" along with "how do I convert a language value to this wasm signature?" or "how do I convert the wasm signature to a language value?".

This is not intended to have any ABI-breaking changes at this time. It's a large breaking change for tooling, though, since lots got renamed. I've been updating the wasi crate as part of this work to ensure that the same ABI is generated as before.

I've added a number of locations in definitions of flags/enums as well that have specific @witx annotations. These are mostly around tag sizes for now, to specifically say that the size of the tag on a Variant is required to be a particular value. Eventually interface types will abstract the need for this to be specified at all, and more closely in the future I hope to start proposing a stable ABI mapping from interface types used by witx to wasm signatures, where the stable mapping would mean we could eschew manually sizing tags as well.

I hope to follow-up with subsequent PRs to this as well. Namely I would like to explore updating the internals to centralize more details about wasm-to-language and language-to-wasm conversions in this repository. I'm not 100% sure how this will look yet but the intention is that it should be easy to use the witx crate to be used for any language binding generator to convert signatures back and forth.

Move constants to a separate list in `Document`
Remove the `Type::Flags` variant in favor of a witx-specific way to
define a bitflags with a number of constants.
Instead of having a separate variant for `Char8` and `Usize` instead
store boolean flags on `U8` and `U32` whether they're intended for these
language-specific purposes. This additionally changes to require `@witx`
when parsing `char8` to ensure usage is flagged as witx-specific.
Copy link
Member

@sunfishcode sunfishcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks great!

tools/witx/src/ast.rs Show resolved Hide resolved
tools/witx/src/ast.rs Show resolved Hide resolved
tools/witx/src/docs/ast.rs Outdated Show resolved Hide resolved
(union (@witx tag $eventtype)
$subscription_clock
$subscription_fd_readwrite
$subscription_fd_readwrite
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not immediately clear why some unions such as event_u are migrated to variant and others, such as subscription_u here, are left as union.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The desugaring is a bit tricky with the current ABI, but I generally tried to keep things as union where possible. By using @witx tag the union's types must all mattch up exactly with the variants of the tag (which should be an enum).

One place this was tripped up though was the empty case parsing for unions. That was where one entry of the union didn't have any payload. This didn't have any translation to this updated union syntax where it must be a set of types that are union'd, so anything using empty was switch to the explicit variant.

In general it doesn't matter too too much except from a readability perspective since this all desugars to the same thing in the AST, which is to say a tagged variant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is variant a superset of the functionality of union? If so, I think it's ok to migrate to variant over time. To help readers, I do want to aim for a single syntax, but we can finish the migration later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, yeah, mostly because variant is the actual thing in the AST whereas union is just syntax sugar as seen here .

This will hopefully become less important over time as @witx tag is migrated away from perhaps? Otherwise though there could perhaps be more syntax sugar for a union where one of the cases is empty.

@sunfishcode sunfishcode merged commit cbce369 into WebAssembly:main Feb 11, 2021
@alexcrichton alexcrichton deleted the refactor-type-ast branch February 11, 2021 15:16
alexcrichton added a commit to alexcrichton/WASI that referenced this pull request Feb 16, 2021
This commit is another large refactoring on the tail of WebAssembly#391 which is
intended to help march one step closer to using pure interface types for
specifying WASI and using `*.witx` files. Contained here is a large
amount of refactoring of the `*.witx` files, the internals of the `witx`
crate, and how code generators are supposed to work.

At the `*.witx` level, some notable changes have been made:

* All functions returning results now return `(expected ok? (error
  $errno))`. This helps signify that the intention of all functions is
  that they return a packaged value which is either a successful result
  or the erroneous error code on failure. ABI-wise nothing has changed,
  but this is intended to model more closely what the interface-types
  API of this would be.

* The `flags` type in `*.witx` now desugars as a struct-of-bools rather
  than simply an `int`. This doesn't have any effect on the ABI today,
  but it does change the AST and type-level representation of these types.

All existing `*.witx` files in this repository have been updated for all
phases with these new forms. To reiterate, though, no details of the ABI
have changed (or at least not intentionally). Everything should still be
ABI-compatible with before.

Under the hood for code generators this is a very large breaking change.
The intention of this commit is to start moving code generators relying
on `witx` into the direction that we'll end up with interface types.
Namely the `coretypes` module is entirely replaced with a new `abi`
module with the intention of handling lots more high-level details of
translation than the previous module did.

The `InterfaceFunc` type now sports two new methods: `call_wasm` and
`call_interface`. These two methods represent the two halves of
lifting/lowering operations performed by code generators. The
`call_wasm` function takes interface types values (or whatever they are
represented as in the source language) and converts them to wasm types
to call a wasm import. The wasm import's return values are then
"deserialized" back into the language's interface types values too. The
`call_interface` function is the opposite, it assumes that you're coming
from wasm values and calling, e.g., a host function with interface values.

The `abi` module is refactored with an `Instruction` `enum` which lists
out what are the current set of "adapter instructions". These adapter
instructions are intended to somewhat model the eventual idea of adapter
functions and their instructions, but for now they're pretty specific to
WASI as-is today. All instructions currently model what WASI currently
does, sometimes in very specific manners. It's expected that the
`Instruction` type will be in flux as we tweak the ABI of WASI over
time, but the current set of instructions will likely only be expanded
because of maintaining compatibility with the current snapshot.

The expected usage of `Instruction` is that code generators will
implement how to generate code for each `Instruction`. This should be a
very low-level operation (generating code per instruction) and should be
in theory quite easy to implement. Not all code generators need to
implement all instructions depending on their use case. Additionally
WASI as-is today doesn't always exercise all types of instructions.

The next steps after a PR like this include starting to define a new ABI
for WASI which supports all types in all positions rather than the
current ABI which supports only a limited subset of types in some
positions. The intention is that this new ABI may add a new instruction
or two but will generally not be a large breaking change for all
existing code generators. Some new additions are expected but other than
that existing code generators updated to use this PR will require little
effort to support new ABIs.
pchickey pushed a commit that referenced this pull request Feb 18, 2021
* Take another step towards interface types

This commit is another large refactoring on the tail of #391 which is
intended to help march one step closer to using pure interface types for
specifying WASI and using `*.witx` files. Contained here is a large
amount of refactoring of the `*.witx` files, the internals of the `witx`
crate, and how code generators are supposed to work.

At the `*.witx` level, some notable changes have been made:

* All functions returning results now return `(expected ok? (error
  $errno))`. This helps signify that the intention of all functions is
  that they return a packaged value which is either a successful result
  or the erroneous error code on failure. ABI-wise nothing has changed,
  but this is intended to model more closely what the interface-types
  API of this would be.

* The `flags` type in `*.witx` now desugars as a struct-of-bools rather
  than simply an `int`. This doesn't have any effect on the ABI today,
  but it does change the AST and type-level representation of these types.

All existing `*.witx` files in this repository have been updated for all
phases with these new forms. To reiterate, though, no details of the ABI
have changed (or at least not intentionally). Everything should still be
ABI-compatible with before.

Under the hood for code generators this is a very large breaking change.
The intention of this commit is to start moving code generators relying
on `witx` into the direction that we'll end up with interface types.
Namely the `coretypes` module is entirely replaced with a new `abi`
module with the intention of handling lots more high-level details of
translation than the previous module did.

The `InterfaceFunc` type now sports two new methods: `call_wasm` and
`call_interface`. These two methods represent the two halves of
lifting/lowering operations performed by code generators. The
`call_wasm` function takes interface types values (or whatever they are
represented as in the source language) and converts them to wasm types
to call a wasm import. The wasm import's return values are then
"deserialized" back into the language's interface types values too. The
`call_interface` function is the opposite, it assumes that you're coming
from wasm values and calling, e.g., a host function with interface values.

The `abi` module is refactored with an `Instruction` `enum` which lists
out what are the current set of "adapter instructions". These adapter
instructions are intended to somewhat model the eventual idea of adapter
functions and their instructions, but for now they're pretty specific to
WASI as-is today. All instructions currently model what WASI currently
does, sometimes in very specific manners. It's expected that the
`Instruction` type will be in flux as we tweak the ABI of WASI over
time, but the current set of instructions will likely only be expanded
because of maintaining compatibility with the current snapshot.

The expected usage of `Instruction` is that code generators will
implement how to generate code for each `Instruction`. This should be a
very low-level operation (generating code per instruction) and should be
in theory quite easy to implement. Not all code generators need to
implement all instructions depending on their use case. Additionally
WASI as-is today doesn't always exercise all types of instructions.

The next steps after a PR like this include starting to define a new ABI
for WASI which supports all types in all positions rather than the
current ABI which supports only a limited subset of types in some
positions. The intention is that this new ABI may add a new instruction
or two but will generally not be a large breaking change for all
existing code generators. Some new additions are expected but other than
that existing code generators updated to use this PR will require little
effort to support new ABIs.

* Add more documentation and some more tests for shorthand syntax

* Bump to 0.9.0 and bump wast dependency

* Support variants as arguments

Same as records, they use pointers

* Fix a typo
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants