Refactor the AST of types in the `witx` crate #391

alexcrichton · 2021-02-08T22:37:32Z

This series of commits is intended to start a refactoring of the witx crate's AST to get, yet again, closer to interface types. With a recently rereshed explainer interface types can almost entirely encompass what WASI uses right now with a smaller set of types and some more shorthands. The purpose of this PR was to simplify all types to just those specified in the interface types specification right now, namely:

Array is renamed to List
Struct is renamed to Record
int typedefs are refactored into simple builtin types and a separatetly-stored list of constants
Flags as-is are redefined as integers with associated constants for each value. Note that this is different from the proposed elaboration of flags to a struct-of-bools. While we may want to go in this direction eventually I opted to do the simple thing to keep ABI stability for now.
string is recast as list[char]
Union is entirely defined in terms of Variant
Types like Char8 and Usize are now hints on U8 and U32

The general goal is to get the type-level AST to exactly match interface types. The eventual intention is to continue refactoring internals and exposing more functionality around "what is the wasm signature for this function" along with "how do I convert a language value to this wasm signature?" or "how do I convert the wasm signature to a language value?".

This is not intended to have any ABI-breaking changes at this time. It's a large breaking change for tooling, though, since lots got renamed. I've been updating the wasi crate as part of this work to ensure that the same ABI is generated as before.

I've added a number of locations in definitions of flags/enums as well that have specific @witx annotations. These are mostly around tag sizes for now, to specifically say that the size of the tag on a Variant is required to be a particular value. Eventually interface types will abstract the need for this to be specified at all, and more closely in the future I hope to start proposing a stable ABI mapping from interface types used by witx to wasm signatures, where the stable mapping would mean we could eschew manually sizing tags as well.

I hope to follow-up with subsequent PRs to this as well. Namely I would like to explore updating the internals to centralize more details about wasm-to-language and language-to-wasm conversions in this repository. I'm not 100% sure how this will look yet but the intention is that it should be easy to use the witx crate to be used for any language binding generator to convert signatures back and forth.

Move constants to a separate list in `Document`

Remove the `Type::Flags` variant in favor of a witx-specific way to define a bitflags with a number of constants.

Instead of having a separate variant for `Char8` and `Usize` instead store boolean flags on `U8` and `U32` whether they're intended for these language-specific purposes. This additionally changes to require `@witx` when parsing `char8` to ensure usage is flagged as witx-specific.

sunfishcode

Overall looks great!

tools/witx/src/ast.rs

tools/witx/src/docs/ast.rs

sunfishcode · 2021-02-08T23:24:18Z

phases/ephemeral/witx/typenames.witx

+  (union (@witx tag $eventtype)
+    $subscription_clock
+    $subscription_fd_readwrite
+    $subscription_fd_readwrite


It's not immediately clear why some unions such as event_u are migrated to variant and others, such as subscription_u here, are left as union.

The desugaring is a bit tricky with the current ABI, but I generally tried to keep things as union where possible. By using @witx tag the union's types must all mattch up exactly with the variants of the tag (which should be an enum).

One place this was tripped up though was the empty case parsing for unions. That was where one entry of the union didn't have any payload. This didn't have any translation to this updated union syntax where it must be a set of types that are union'd, so anything using empty was switch to the explicit variant.

In general it doesn't matter too too much except from a readability perspective since this all desugars to the same thing in the AST, which is to say a tagged variant.

Is variant a superset of the functionality of union? If so, I think it's ok to migrate to variant over time. To help readers, I do want to aim for a single syntax, but we can finish the migration later.

It is, yeah, mostly because variant is the actual thing in the AST whereas union is just syntax sugar as seen here .

This will hopefully become less important over time as @witx tag is migrated away from perhaps? Otherwise though there could perhaps be more syntax sugar for a union where one of the cases is empty.

This commit is another large refactoring on the tail of WebAssembly#391 which is intended to help march one step closer to using pure interface types for specifying WASI and using `*.witx` files. Contained here is a large amount of refactoring of the `*.witx` files, the internals of the `witx` crate, and how code generators are supposed to work. At the `*.witx` level, some notable changes have been made: * All functions returning results now return `(expected ok? (error $errno))`. This helps signify that the intention of all functions is that they return a packaged value which is either a successful result or the erroneous error code on failure. ABI-wise nothing has changed, but this is intended to model more closely what the interface-types API of this would be. * The `flags` type in `*.witx` now desugars as a struct-of-bools rather than simply an `int`. This doesn't have any effect on the ABI today, but it does change the AST and type-level representation of these types. All existing `*.witx` files in this repository have been updated for all phases with these new forms. To reiterate, though, no details of the ABI have changed (or at least not intentionally). Everything should still be ABI-compatible with before. Under the hood for code generators this is a very large breaking change. The intention of this commit is to start moving code generators relying on `witx` into the direction that we'll end up with interface types. Namely the `coretypes` module is entirely replaced with a new `abi` module with the intention of handling lots more high-level details of translation than the previous module did. The `InterfaceFunc` type now sports two new methods: `call_wasm` and `call_interface`. These two methods represent the two halves of lifting/lowering operations performed by code generators. The `call_wasm` function takes interface types values (or whatever they are represented as in the source language) and converts them to wasm types to call a wasm import. The wasm import's return values are then "deserialized" back into the language's interface types values too. The `call_interface` function is the opposite, it assumes that you're coming from wasm values and calling, e.g., a host function with interface values. The `abi` module is refactored with an `Instruction` `enum` which lists out what are the current set of "adapter instructions". These adapter instructions are intended to somewhat model the eventual idea of adapter functions and their instructions, but for now they're pretty specific to WASI as-is today. All instructions currently model what WASI currently does, sometimes in very specific manners. It's expected that the `Instruction` type will be in flux as we tweak the ABI of WASI over time, but the current set of instructions will likely only be expanded because of maintaining compatibility with the current snapshot. The expected usage of `Instruction` is that code generators will implement how to generate code for each `Instruction`. This should be a very low-level operation (generating code per instruction) and should be in theory quite easy to implement. Not all code generators need to implement all instructions depending on their use case. Additionally WASI as-is today doesn't always exercise all types of instructions. The next steps after a PR like this include starting to define a new ABI for WASI which supports all types in all positions rather than the current ABI which supports only a limited subset of types in some positions. The intention is that this new ABI may add a new instruction or two but will generally not be a large breaking change for all existing code generators. Some new additions are expected but other than that existing code generators updated to use this PR will require little effort to support new ABIs.

* Take another step towards interface types This commit is another large refactoring on the tail of #391 which is intended to help march one step closer to using pure interface types for specifying WASI and using `*.witx` files. Contained here is a large amount of refactoring of the `*.witx` files, the internals of the `witx` crate, and how code generators are supposed to work. At the `*.witx` level, some notable changes have been made: * All functions returning results now return `(expected ok? (error $errno))`. This helps signify that the intention of all functions is that they return a packaged value which is either a successful result or the erroneous error code on failure. ABI-wise nothing has changed, but this is intended to model more closely what the interface-types API of this would be. * The `flags` type in `*.witx` now desugars as a struct-of-bools rather than simply an `int`. This doesn't have any effect on the ABI today, but it does change the AST and type-level representation of these types. All existing `*.witx` files in this repository have been updated for all phases with these new forms. To reiterate, though, no details of the ABI have changed (or at least not intentionally). Everything should still be ABI-compatible with before. Under the hood for code generators this is a very large breaking change. The intention of this commit is to start moving code generators relying on `witx` into the direction that we'll end up with interface types. Namely the `coretypes` module is entirely replaced with a new `abi` module with the intention of handling lots more high-level details of translation than the previous module did. The `InterfaceFunc` type now sports two new methods: `call_wasm` and `call_interface`. These two methods represent the two halves of lifting/lowering operations performed by code generators. The `call_wasm` function takes interface types values (or whatever they are represented as in the source language) and converts them to wasm types to call a wasm import. The wasm import's return values are then "deserialized" back into the language's interface types values too. The `call_interface` function is the opposite, it assumes that you're coming from wasm values and calling, e.g., a host function with interface values. The `abi` module is refactored with an `Instruction` `enum` which lists out what are the current set of "adapter instructions". These adapter instructions are intended to somewhat model the eventual idea of adapter functions and their instructions, but for now they're pretty specific to WASI as-is today. All instructions currently model what WASI currently does, sometimes in very specific manners. It's expected that the `Instruction` type will be in flux as we tweak the ABI of WASI over time, but the current set of instructions will likely only be expanded because of maintaining compatibility with the current snapshot. The expected usage of `Instruction` is that code generators will implement how to generate code for each `Instruction`. This should be a very low-level operation (generating code per instruction) and should be in theory quite easy to implement. Not all code generators need to implement all instructions depending on their use case. Additionally WASI as-is today doesn't always exercise all types of instructions. The next steps after a PR like this include starting to define a new ABI for WASI which supports all types in all positions rather than the current ABI which supports only a limited subset of types in some positions. The intention is that this new ABI may add a new instruction or two but will generally not be a large breaking change for all existing code generators. Some new additions are expected but other than that existing code generators updated to use this PR will require little effort to support new ABIs. * Add more documentation and some more tests for shorthand syntax * Bump to 0.9.0 and bump wast dependency * Support variants as arguments Same as records, they use pointers * Fix a typo

alexcrichton added 9 commits February 8, 2021 07:15

Rename Array to List

3bcf82b

Rename structs to records

1fc1dd2

Add initial support for variant type

e6451fb

Remove IntDatatype

ce0c8c1

Move constants to a separate list in `Document`

Implement enums in terms of Variant

01360e1

Define Flags in terms of integers

616dac9

Remove the `Type::Flags` variant in favor of a witx-specific way to define a bitflags with a number of constants.

Represent strings as lists of char internally

de0b5ff

Implement Union in terms of Variant

cf29081

sunfishcode reviewed Feb 8, 2021

View reviewed changes

Address some review feedback

2cdd6be

alexcrichton mentioned this pull request Feb 10, 2021

Add a new style of test suite for the witx tooling similar to *.wast #392

Merged

sunfishcode approved these changes Feb 11, 2021

View reviewed changes

sunfishcode merged commit cbce369 into WebAssembly:main Feb 11, 2021

alexcrichton deleted the refactor-type-ast branch February 11, 2021 15:16

alexcrichton mentioned this pull request Feb 16, 2021

Take another step towards interface types #395

Merged

thunderseethe mentioned this pull request Mar 31, 2021

Upgrade to witx 0.9 jedisct1/witx-codegen#4

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor the AST of types in the `witx` crate #391

Refactor the AST of types in the `witx` crate #391

alexcrichton commented Feb 8, 2021

sunfishcode left a comment

sunfishcode Feb 8, 2021

alexcrichton Feb 9, 2021

sunfishcode Feb 11, 2021

alexcrichton Feb 11, 2021

Refactor the AST of types in the witx crate #391

Refactor the AST of types in the witx crate #391

Conversation

alexcrichton commented Feb 8, 2021

sunfishcode left a comment

Choose a reason for hiding this comment

sunfishcode Feb 8, 2021

Choose a reason for hiding this comment

alexcrichton Feb 9, 2021

Choose a reason for hiding this comment

sunfishcode Feb 11, 2021

Choose a reason for hiding this comment

alexcrichton Feb 11, 2021

Choose a reason for hiding this comment

Refactor the AST of types in the `witx` crate #391

Refactor the AST of types in the `witx` crate #391