Proposal: Await #1345

Closed
kripken opened this issue May 18, 2020 · 96 comments

@kripken
Member

kripken commented May 18, 2020

@RReverser and I would like to put forward a new proposal for WebAssembly: Await.

The motivation is to help "synchronous" code compiled to WebAssembly, such as code that reads from a file:

fread(buffer, 1, num, file);
// the data is ready to be used right here, synchronously

This code can't easily be implemented in a host environment that is primarily asynchronous and would implement "read from a file" asynchronously. For example, on the Web:

const result = fetch("http://example.com/data.dat");
// result is a Promise; the data is not ready yet!

In other words, the goal is to help with the sync/async issue that is so common with wasm on the Web.

The sync/async issue is a serious problem. While new code can be written with it in mind, large existing codebases often cannot be refactored to work around it, which means they cannot run on the Web. We do have Asyncify which instruments a wasm file to allow pausing and resuming, and which has allowed some such codebases to be ported, so we are not completely blocked here. However, instrumenting the wasm has significant overhead, something like a 50% increase to code size and a 50% slowdown on average (but sometimes much worse), because we add instructions to write out / read back in the local state and call stack and so forth. That overhead is a big limitation and it rules out Asyncify in many cases!

This proposal's goal is to allow pausing and resuming execution in an efficient way (in particular, without overhead like Asyncify has) so that all applications that encounter the sync/async problem can easily avoid it. Personally we intend this primarily for the Web, where it can help WebAssembly integrate better with Web APIs, but use cases outside the Web may be relevant as well.

The idea in brief

The core problem here is the mismatch between synchronous wasm code and an asynchronous host environment. Our approach is therefore focused on the boundary between a wasm instance and the outside. Conceptually, when a new await instruction is executed, the wasm instance "waits" for something from the outside. What "wait" means would differ between platforms, and may not be relevant on all of them (just as not all platforms may find the wasm atomics proposal relevant), but on the Web platform specifically, the wasm instance would wait on a Promise and pause until that Promise resolves or rejects. For example, a wasm instance could pause on a fetch network operation, and be written something like this in .wat:

;; call an import which returns a promise
call $do_fetch
;; wait for the promise just pushed to the stack
await
;; do stuff with the result just pushed to the stack

Note the general similarity to await in JS and other languages. While this is not identical to them (see details below) the key benefit is that it allows writing synchronous-looking code (or rather, to compile synchronous-looking code into wasm).

The details

Core wasm spec

The changes to the core wasm spec are very minimal:

  • Add a waitref type.
  • Add an await instruction.

A type is specified for each await instruction (like call_indirect), for example:

;; elaborated wat from earlier, now with full types

(type $waitref_=>_i32 (func (param waitref) (result i32)))
(import "env" "do_fetch" (func $do_fetch (result waitref)))

;; call an import which returns a promise
call $do_fetch
;; wait for the promise just pushed to the stack
await (type $waitref_=>_i32)
;; do stuff with the result just pushed to the stack

The type must receive a waitref, and can return any type (or nothing).

await is only defined in terms of making the host environment do something. It is similar in that sense to the unreachable instruction, which on the Web makes the host throw a RuntimeError, but that isn't in the core spec. Likewise, the core wasm spec only says await is meant to wait for something from the host environment, but not how we would actually do so, which might be very different in different host environments.

That's it for the core wasm spec!

Wasm JS spec

The changes to the wasm JS spec (which affect only JS environments like the Web) are more interesting:

  • A valid waitref value is a JS Promise.
  • When an await is executed on a Promise, the entire wasm instance pauses and waits for that Promise to resolve or reject.
  • If the Promise resolves, the instance resumes execution after pushing to the stack the value received from the Promise (if there is one)
  • If the Promise rejects, we resume execution and throw a wasm exception from the location of the await.

By "the entire wasm instance pauses" we mean all local state is preserved (the call stack, local values, etc.) so that we can resume the current execution later. While we wait, the JS event loop functions normally, and other things can happen. When we resume later (if the Promise resolves; if it rejects, an exception is thrown instead) we continue exactly where we left off, as if we never paused, although global state may have changed in the meantime (for example, the Memory may have been written to).

What does it look like to JS when it calls a wasm instance which then pauses? To explain that, let's first take a look at a common example encountered when porting native applications to wasm, an event loop:

void event_loop_iteration() {
  // ..
  while (auto task = getTask()) {
    task.run(); // this *may* be a network fetch
  }
  // ..
}

Imagine that this function is called once per requestAnimationFrame. It executes the tasks given to it, which might include: rendering, physics, audio, and network fetching. If we have a network fetch event, then and only then do we end up running an await instruction on the fetch's Promise. We may do that 0 times for one call of event_loop_iteration, or 1 time, or many times. We only know whether we end up doing so during the execution of this wasm - not before, and in particular not in the JS caller of this wasm export. So that caller must be ready for the instance to either pause or not.

A somewhat analogous situation can happen in pure JavaScript:

function foo(bar) {
  // ..
  let result = bar(42);
  // ..
}

foo gets a JS function bar and calls it with some data. In JS bar may be an async function or it may be a normal one. If it's async, it returns a Promise, and only finishes execution later. If it's normal, it executes before returning and returns the actual result. foo can either assume that it knows which kind bar is (no type is written in JS, in fact bar may not even be a function!), or it can handle both types of functions to be fully general.

Now, normally you know exactly what set of functions bar might be! For example, you may have written foo and the possible bars in coordination, or documented exactly what the expectations are. But the wasm/JS interaction we are talking about here is actually more similar to the case where you don't have such a tight coupling between things, and where in fact you need to handle both cases. As mentioned earlier, the event_loop_iteration example requires that. But even more generally, often the wasm is your compiled application while the JS is generic "runtime" code, so that JS has to handle all cases. JS can easily do so, of course, for example using result instanceof Promise to check the result, or use JS await:

async function runEventLoopIteration() {
  // await in JavaScript can handle Promises as well as regular synchronous values
  // in the same way, so the log is guaranteed to be written out consistently after
  // the operation has finished (note: this handles 0 or 1 iterations, but could be
  // generalized)
  await wasm.event_loop_iteration();
  console.log("the event loop iteration is done");
}

(note that if we don't need that console.log then we wouldn't need the JS await in this example, and would have just a normal call to a wasm export)

To summarize the above, we propose that the behavior of a pausing wasm instance be modeled on the JS case of a function that may or may not be async, which we can state as:

  • When an await is executed, the wasm instance immediately exits back out to whoever called into it (typically that would be JS calling a wasm export, but see notes later). The caller receives a Promise which it can use to know when execution of the wasm concludes, and to get a result if there is one.
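As a rough illustration of that rule (a sketch, not part of the proposal and not a real WebAssembly API), the behavior can be modeled in plain JavaScript. Here a generator function stands in for a wasm export and `yield somePromise` stands in for the await instruction. The names `runExport` and `eventLoopIteration` are hypothetical.

```javascript
// Model of the proposed semantics only. A generator plays the role of a wasm
// export; `yield promise` plays the role of the proposed `await` instruction.
function runExport(genFn, ...args) {
  const it = genFn(...args);

  // Advance the generator one step, either feeding in a value ("next")
  // or throwing an error at the paused `yield` ("throw").
  function step(method, input) {
    const { value, done } = it[method](input);
    if (done) return { done: true, value };
    return { done: false, promise: Promise.resolve(value) };
  }

  function continueFrom(state) {
    return state.done ? state.value : resume(state);
  }

  function resume(state) {
    return state.promise.then(
      (v) => continueFrom(step("next", v)),  // resolved: resume with the value
      (e) => continueFrom(step("throw", e))  // rejected: throw at the pause point
    );
  }

  const first = step("next", undefined);
  // No pause happened: the caller gets the plain result synchronously.
  if (first.done) return first.value;
  // A pause happened: the caller immediately gets a Promise for the result.
  return resume(first);
}

// Hypothetical export: runs tasks, pausing only for those that return a Promise.
function* eventLoopIteration(tasks) {
  let total = 0;
  for (const task of tasks) {
    let result = task();
    if (result instanceof Promise) {
      result = yield result; // models `await`
    }
    total += result;
  }
  return total;
}

const noPause = runExport(eventLoopIteration, [() => 1, () => 2]);
const paused = runExport(eventLoopIteration, [() => 1, () => Promise.resolve(2)]);
console.log(noPause);                   // a plain number
console.log(paused instanceof Promise); // true
```

The caller cannot tell in advance which of the two shapes it will get, which is why the JS side either checks the result or simply uses JS await on it.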

Toolchain / library support

In our experience with Asyncify and related tools it is easy (and fun!) to write a little JS to handle a waiting wasm instance. Aside from the options mentioned earlier, a library could do one of the following:

  1. Wrap around a wasm instance to make its exports always return a Promise. That gives a nice simple interface to the outside (however, it adds overhead to quick calls into wasm that do not pause). This is what the standalone Asyncify helper library does, for example.
  2. Write some global state when an instance pauses and check that from the JS that called into the instance. That is what Emscripten's Asyncify integration does, for example.
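A minimal sketch of option 1, assuming nothing beyond standard JavaScript: `wrapExportsAsAsync` is a hypothetical helper name, and `fakeExports` is a plain object standing in for `instance.exports`. This is not the Asyncify helper library's actual code.

```javascript
// Wrap every function export so that callers always receive a Promise,
// whether or not the underlying call paused. Non-function exports
// (memories, tables, globals) are passed through unchanged.
function wrapExportsAsAsync(exportsObj) {
  const wrapped = {};
  for (const [name, value] of Object.entries(exportsObj)) {
    wrapped[name] =
      typeof value === "function"
        // The Promise constructor also turns a synchronous throw into a rejection.
        ? (...args) => new Promise((resolve) => resolve(value(...args)))
        : value;
  }
  return wrapped;
}

// Stand-in for instance.exports, for illustration only.
const fakeExports = {
  add: (a, b) => a + b,                   // never pauses
  fetchSize: () => Promise.resolve(1024), // models an export that paused
  memory: { buffer: new ArrayBuffer(8) },
};

const api = wrapExportsAsAsync(fakeExports);
api.add(1, 2).then((v) => console.log("add:", v));
api.fetchSize().then((v) => console.log("fetchSize:", v));
```

The cost mentioned above is visible here: even `add`, which never pauses, now allocates a Promise and delivers its result on a later microtask.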

A lot more can be built on top of such approaches, or other ones. We prefer to leave all that to toolchains and libraries to avoid complexity in the proposal and in VMs.

Implementation and Performance

Several factors should help keep VM implementations simple:

  1. A pause/resume occurs only on an await, and we know their locations statically inside each function.
  2. When we resume we continue exactly from where we left things, and we only do so once. In particular, we never "fork" execution: nothing here returns twice, unlike C's setjmp or a coroutine in a system that allows cloning/forking.
  3. It is acceptable if the speed of an await is slower than a normal call out to JS, since we will be waiting on a Promise, which at minimum implies a Promise was allocated and that we wait on the event loop (which has minimum overhead plus potentially waiting for other things currently running). That is, the use cases here do not demand that VM implementers find ways to make await blazingly-fast. We only want await to be efficient compared to the requirements here, and in particular expect it to be far faster than Asyncify's large overhead.

Given the above, a natural implementation is to copy the stack when we pause. While that has some overhead, given the performance expectations here it should be very reasonable. And if we copy the stack only when we pause then we can avoid doing extra work to prepare for pausing. That is, there should be no extra general overhead (which is very different from Asyncify!)

Note that while copying the stack is a natural approach here, it is not a completely trivial operation, as the copy may not be a simple memcpy, depending on the VM's internals. For example, if the stack contains pointers to itself, then those would either need to be adjusted, or for the stack to be relocatable. Alternatively, it might be possible to copy the stack back to its original position before resuming it, since as mentioned earlier it is never "forked".

Note also that nothing in this proposal requires copying the stack. Perhaps some implementations can do other things, thanks to the simplifying factors mentioned in the 3 points from earlier in this section. The observable behavior here is fairly simple, and explicit stack handling is not part of it.

We are very interested to hear VM implementor feedback on this section!

Clarifications

This proposal only pauses WebAssembly execution back out to the caller of the wasm instance. It does not allow pausing host (JS or browser) stack frames. await operates on a wasm instance, only affecting stack frames inside it.

It is ok to call into the WebAssembly instance while a pause has occurred, and multiple pause/resume events can be in flight at once. (Note that if the VM takes the approach of copying the stack then this does not mean a new stack must be allocated each time we enter the module, as we only need to copy it if we actually pause.)

Connection to other proposals

Exceptions

Promise rejection throwing an exception means that this proposal depends on the wasm exceptions proposal.

Coroutines

Andreas Rossberg's coroutines proposal also deals with pausing and resuming execution. However, while there is some conceptual overlap, we don't think the proposals compete. Both are useful because they are focused on different use cases. In particular, the coroutines proposal allows coroutines to be switched between inside wasm, while the await proposal allows an entire instance to wait for the outside environment. And the way in which both things are done leads to different characteristics.

Specifically, the coroutines proposal handles stack creation in an explicit manner (instructions are provided to create a coroutine, to pause one, etc.). The await proposal only talks about pausing and resuming and therefore the stack handling is implicit. Explicit stack handling is appropriate when you know you are creating specific coroutines, while implicit is appropriate when you only know you need to wait for something during execution (see the example from before with event_loop_iteration).

The performance characteristics of those two models may be very different. If for example we created a coroutine every time we ran code that might pause (again, often we don't know in advance) that might allocate memory unnecessarily. The observed behavior of await is simpler than what general coroutines can do and so it may be simpler to implement.

Another significant difference is that await is a single instruction that provides all a wasm module needs in order to fix the sync/async mismatch wasm has with the Web (see the first .wat example from the very beginning). It is also very easy to use on the JS side which can just provide and/or receive a Promise (while a little library code may be useful to add, as mentioned earlier, it can be very minimal).

In theory the two proposals could be designed to be complementary. Perhaps await could be one of the instructions in the coroutines proposal somehow? Another option is to allow an await to operate on a coroutine (basically giving a wasm instance an easy way to wait on coroutine results).

WASI#276

By coincidence WASI #276 was posted by @tqchen just as we were finishing writing this up. We're very happy to see that, as it shares our belief that coroutines and async support are separate functionalities.

We believe that an await instruction could help implement something very similar to what is proposed there (option C3), with the difference that there would not need to be special async syscalls, but rather some syscalls could return a waitref which can then be await-ed.

For JavaScript we defined waiting as pausing a wasm instance, which makes sense because we can have multiple instances as well as JavaScript on the page. However, in some server environments there might only be the host and a single wasm instance, and in that case, waiting can be much simpler, perhaps literally waiting on a file descriptor or on the GPU. Or waiting could pause the entire wasm VM but keep running an event loop. We don't have specific ideas here ourselves, but based on the discussion in that issue there may be interesting possibilities here, and we're curious what people think!

Corner case: wasm instance => wasm instance => await

In a JS environment, when a wasm instance pauses it returns immediately to whoever called it. We described what happens if the caller is from JS, and the same thing happens if the caller is the browser (for example, if we did a setTimeout on a wasm export that pauses; but nothing interesting happens there, as the returned Promise is just ignored). But there is another case, of the call coming from wasm, that is, where wasm instance A directly calls an export from instance B, and B pauses. The pause makes us immediately exit out of B and return a Promise.

When the caller is JavaScript, a dynamic language, this is less of an issue, and in fact it's reasonable to expect the caller to check the type as discussed earlier. When the caller is WebAssembly, which is statically typed, this is awkward. If we don't do something in the proposal for this then the value will be cast, in our example from a Promise to whatever instance A expects (if an i32, it would be cast to a 0). Instead, we suggest that an error occur:

  • If a wasm instance calls (directly or using call_indirect) a function from another wasm instance, and while running in the other instance an await is executed, then a RuntimeError exception is thrown from the location of the await.

Importantly, this could be done with no overhead unless pausing, that is, keeping normal wasm instance -> wasm instance calls at full speed, by checking the stack only when doing a pause.
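To make the "check only when pausing" idea concrete, here is a toy model in JavaScript. It is a sketch under stated assumptions: the `frames` array, `withFrame`, and `assertPauseAllowed` are all hypothetical, and the explicit push/pop exists only because JS cannot inspect a native stack. In a real VM the frames are already on the stack, so ordinary calls would do no extra work.

```javascript
// Model only: each entry describes who owns an active frame,
// e.g. { kind: "wasm", instance: "A" } or { kind: "js" }.
const frames = [];

function withFrame(frame, fn) {
  frames.push(frame);
  try {
    return fn();
  } finally {
    frames.pop();
  }
}

// Runs only at the moment an `await` would pause the current instance.
function assertPauseAllowed() {
  const current = frames[frames.length - 1];
  if (!current) return;
  // Walk outwards past the pausing instance's own frames.
  for (let i = frames.length - 2; i >= 0; i--) {
    const f = frames[i];
    if (f.kind === "wasm" && f.instance === current.instance) continue;
    if (f.kind === "wasm") {
      // The direct caller is a different wasm instance: the suggested error case.
      throw new WebAssembly.RuntimeError(
        "await executed while called directly from another wasm instance"
      );
    }
    return; // the nearest outside caller is JS: pausing is allowed
  }
}

const A = { kind: "wasm", instance: "A" };
const B = { kind: "wasm", instance: "B" };
const JS = { kind: "js" };

// A -> JS -> B: allowed, because JS sits between the two instances.
withFrame(A, () => withFrame(JS, () => withFrame(B, assertPauseAllowed)));

// A -> B directly: the model throws a RuntimeError.
try {
  withFrame(A, () => withFrame(B, assertPauseAllowed));
} catch (e) {
  console.log(e instanceof WebAssembly.RuntimeError, e.message);
}
```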

Note that users that do want something like a wasm instance to call another and have the latter pause can do so, but they need to add some JS in between the two.

Another option here is for a pause to propagate to the calling wasm as well, that is, all wasm would pause all the way out to JS, potentially spanning multiple wasm instances. This has some advantages, like wasm module boundaries stop mattering, but also downsides, like the propagation being less intuitive (the calling instance's author may not expect such behavior) and that adding JS in the middle could change the behavior (also potentially unexpectedly). Requiring that users have JS in between, as mentioned earlier, seems less risky.

Another option might be for some wasm exports to be marked async while others are not, and then we could know statically what is what, and not allow improper calls; but see the event_loop_iteration example from earlier which is a common case that would not be solved by marking exports, and there are also indirect calls, so we can't avoid the issue that way.

Alternative approaches considered

Perhaps we don't need a new await instruction at all, if wasm pauses whenever a JS import returns a Promise? The problem is that right now when JS returns a Promise that is not an error. Such a backwards-incompatible change would mean wasm can no longer receive a Promise without pausing, but that might be useful too.

Another option we considered is to mark imports somehow to say "this import should pause if it returns a Promise". We thought about various options for how to mark them, on either the JS or the wasm side, but didn't find anything that felt right. For example, if we mark imports on the JS side then the wasm module would not know if a call to an import pauses or not until the link step, when imports arrive. That is, calls to imports and pausing would be "mixed together". It seems like the most straightforward thing is to just have a new instruction for this, await, which is explicit about waiting. In theory such a capability may be useful outside the Web as well (see notes earlier), so having an instruction for everyone may make things more consistent overall.

Previous related discussions

#1171
#1252
#1294
#1321

Thank you for reading, feedback is welcome!

@devsnek
Member

devsnek commented May 18, 2020

Excellent write-up! I like the idea of host-controlled suspension. @rossberg's proposal also discusses functional effect systems, and I admittedly am not an expert with them, but at first glance it seems like those could fulfill the same non-local control flow need.

@sbc100
Member

sbc100 commented May 18, 2020

Regarding: "Given the above, a natural implementation is to copy the stack when we pause." How would this work for the execution stack? I imagine that most JIT engines share the native C execution stack between JS and wasm so I'm not sure what saving and restoring would mean in this context. Does this proposal mean that the wasm execution stack would need to be somehow virtualized? IIUC avoiding the use of the C stack like this was pretty tricky when Python tried to do something similar: https://github.com/stackless-dev/stackless/wiki.

@lachlansneff

I share a similar worry to @sbc100. Copying the stack is inherently quite a difficult operation, especially if your VM doesn't already have a GC implementation.

@kripken
Member Author

kripken commented May 18, 2020

@sbc100

Does this proposal mean that wasm execution stack would need to be somehow virtualized?

I have to leave this to VM implementers as I'm not an expert on it. And I don't understand the connection to stackless python, but perhaps I don't know what that is well enough to understand the connection, sorry!

But in general: various coroutine approaches work by manipulating the stack pointer at a low level. Those approaches may be an option here. We wanted to point out that even if the stack has to be copied as part of such an approach, doing so has acceptable overhead in this context.

(We are not certain if those approaches can work in wasm VMs or not - hoping to hear from implementers if yes or no, and whether there are better options!)

@lachlansneff

Can you please explain in more detail what you mean by GC making things easier? I don't follow.

@lachlansneff

lachlansneff commented May 18, 2020

@kripken GCs often (but not always) have the ability to walk a stack, which is necessary if you need to rewrite pointers on the stack to point to the new stack. I believe JSC doesn't have that ability, so I don't believe it would be possible to deep copy stacks with their VM. Perhaps someone who knows more about JSC can confirm or deny this.

@kripken
Member Author

kripken commented May 18, 2020

@lachlansneff

Thanks, now I see what you're saying.

We do not suggest that walking the stack in such a full way (identifying each local all the way up, etc.) is necessary to do this. (For other possible approaches, see the link in my last comment about low-level coroutine implementation methods.)

I apologize for the terminology of "copy the stack" in the proposal - I see that it was not clear enough, based on your and @sbc100 's feedback. Again, we don't want to suggest a specific VM implementation approach. We just wanted to say that if copying the stack is necessary in some approach, that would not be a problem for speed.

Rather than suggest a specific implementation approach, we hope to hear from VM people how they think this could be done!

@acfoltzer

I'm very excited to see this proposal. Lucet has had yield and resume operators for a while now, and we use them precisely for interacting with async code running in the Rust host environment.

This was fairly straightforward to add to Lucet, since our design already committed to maintaining a separate stack for Wasm execution, but I could imagine it could present some implementation difficulties for VMs that don't.

@syrusakbary

This proposal sounds great! We have been trying for a while to find a good way of managing async code in wasmer-js (since we have no access to the VM internals in a browser context).

Rather than suggest a specific implementation approach, we hope to hear from VM people how they think this could be done!

I think perhaps using the callback strategy for async functions might be the easiest way to get things rolling and also in a language-agnostic way.

@malbarbo

It seems .await can be called on a JsPromise inside a Rust function using wasm-bindgen-futures? How can this work without the await instruction proposed here? I'm sorry for my ignorance, I'm looking for solutions to call fetch inside wasm and I'm learning about Asyncify, but it seems that the Rust solution is simpler. What am I missing here? Can someone make it clear for me?

@tqchen

tqchen commented May 18, 2020

I am very excited about this proposal. Its main advantage is its simplicity: we can build APIs that are synchronous from wasm's POV, which makes it much easier to port applications without having to explicitly think about callbacks and async/await. It would enable us to bring WASM and WebGPU based machine learning to native wasm VMs using a single native API that runs on both web and native.

@tqchen

tqchen commented May 18, 2020

One thing that I think is worth discussing is the signature of functions that potentially call await. Imagine that we have the following function:

int test() {
   await();
   return 1;
}

The signature of the corresponding function is () => i32. Under the new proposal, a call into test could return either an i32 or a Promise<i32>. Note that it is hard to ask the user to statically declare a new signature (because of the cost of porting code, and because there could be indirect calls inside the function that we don't know will call await).

Should we have a separate call mode into the exported function(e.g. async call) to indicate await is permitted during runtime?

Terminology-wise, the proposed operation is like a yield operation in operating systems, since it yields control to the OS (in this case the wasm VM) to wait for the syscall to finish.

@RossTate

If I understand this proposal correctly, I think it's roughly equivalent to removing the restriction that the await in JS be only usable in async functions. That is, on the wasm side waitref could be externref and rather than an await instruction you could have an imported function $await : [externref] -> [], and on the JS side you could supply foo(promise) => await promise as the function to import. In the other direction, if you were JS code that wanted to await on a Promise outside of async function, you could supply that promise to a wasm module that simply calls await on the input. Is that a correct understanding?

@binji
Member

binji commented May 18, 2020

@RossTate Not quite, AIUI. The wasm code can await a promise (call it promise1), but only the wasm execution will yield, not the JS. The wasm code will return a different promise (call it promise2) to the JS caller. When promise1 resolves, then the wasm execution continues. Finally, when that wasm code exits normally, then promise2 will resolve with the wasm function's result.
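The promise1 / promise2 relationship can be sketched in plain JavaScript, using an async function as a stand-in for the pausing wasm export. This is only an analogy: `wasmExportModel` is a hypothetical name, and unlike the proposal an async function always returns a Promise, even when it never waits.

```javascript
const log = [];

// Stand-in for a wasm export that executes `await` on promise1.
async function wasmExportModel(promise1) {
  log.push("wasm: running up to the await");
  const value = await promise1; // only this function is suspended, not its caller
  log.push("wasm: resumed after promise1 resolved");
  return value * 2;             // becomes the resolution value of promise2
}

let resolvePromise1;
const promise1 = new Promise((resolve) => {
  resolvePromise1 = resolve;
});

// The JS caller gets promise2 back immediately and keeps running.
const promise2 = wasmExportModel(promise1);
log.push("js: caller continues with promise2 in hand");

promise2.then((result) => {
  log.push("js: promise2 resolved with " + result);
  console.log(log.join("\n"));
});

// Later, the host resolves promise1, which lets the "wasm" side continue.
resolvePromise1(21);
```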

@kripken
Member Author

kripken commented May 19, 2020

@tqchen

Should we have a separate call mode into the exported function(e.g. async call) to indicate await is permitted during runtime?

Interesting - where do you see the benefit? As you said, there is really no way to be sure if an export will end up doing an await or not, in common porting situations, so at best it could only be used sometimes. Would this help VMs internally though maybe?

@tqchen

tqchen commented May 19, 2020

Having an explicit declaration might ensure that users state their intent clearly, and the VM could throw a proper error message if an await is reached during a call that was not intended to run asynchronously.

From the user's POV it also makes writing code more consistent. For example, the user could write the following code even if test does not call await, and the system interface would return Promise.resolve(test()) automatically.

await inst.exports_async.test();

@RReverser
Member

RReverser commented May 19, 2020

It seems .await can be called on a JsPromise inside a Rust function using wasm-bindgen-futures? How can this work without the await instruction proposed here? I'm sorry for my ignorance, I'm looking for solutions to call fetch inside wasm and I'm learning about Asyncify, but it seems that the Rust solution is simpler. What am I missing here? Can someone make it clear for me?

@malbarbo There is little overlap between the two despite the similar use cases; what Rust is doing is essentially full coroutines, which are more in the scope of the other linked proposal.

It's more flexible, but also requires more involvement and overhead from both the language and the codebase - it has to have a concept of async functions natively, as well as mark every single function in the call chain as such.

What this proposal is trying to achieve instead is a way to wait for host-provided syscalls, where them being async is only an implementation detail, and so such functions can be called from anywhere in an existing codebase in a backwards-compatible way, without having to rewrite how the whole app operates. (An example is file I/O, which source languages including C / C++ / Rust normally expect to be available synchronously, but which isn't on the Web, for example.)

@RReverser
Member

RReverser commented May 19, 2020

From the user's POV it also makes writing code more consistent. For example, the user could write the following code even if test does not call await, and the system interface would return Promise.resolve(test()) automatically.

@tqchen Note that the user can already do this, as shown in the example in the proposal text. That is, JavaScript already supports and handles both synchronous and asynchronous values in an await operator in the same fashion.

If the suggestion is to enforce a single static type, then we believe this can be done on either lint or type system level or a JavaScript wrapper level without introducing complexity on the core WebAssembly side or restricting implementers of such wrappers.

@RossTate

Ah, thanks for the correction, @binji.

In that case, is the following roughly equivalent? Add a WebAssembly.instantiateAsync(moduleBytes, imports, "name1", "name2") function to the JS API. Suppose moduleBytes has a number of imports plus an additional import of the form (import "name1" "name2" (func (param externref))). Then this function instantiates the imports with the values given by imports and instantiates the additional import with what is conceptually await. When exported functions are created from this module, they get guarded so that when this await is called it walks up the stack to find the first guard and then copies the contents of the stack over into a new Promise that is then immediately returned.

Would that work? My sense is that this proposal can be done solely by modifying the JS API without need to modify WebAssembly itself. Of course, even then it still adds a lot of useful functionality.

@Pauan

Pauan commented May 19, 2020

@kripken How would the start function be handled? Would it statically disallow await, or would it somehow interact with Wasm instantiation?

@malbarbo wasm-bindgen-futures allows you to run async code in Rust. That means you have to write your program in an async way: you have to mark your functions as async, and you need to use .await. But this proposal allows you to run async code without using async or .await, instead it looks like a regular synchronous function call.

In other words, you cannot currently use synchronous OS APIs (like std::fs) because the web only has async APIs. But with this proposal you could use synchronous OS APIs: they would internally use Promises, but they would look synchronous to Rust.

Even if this proposal is implemented, wasm-bindgen-futures will still exist and will still be useful, because it's handling a different use case (running async functions). And async functions are useful because they can be easily parallelized.

@RReverser
Member

RReverser commented May 19, 2020

@RossTate It seems your suggestion is quite similar to one covered in "Alternative approaches considered":

Another option we considered is to mark imports somehow to say "this import should pause if it returns a Promise". We thought about various options for how to mark them, on either the JS or the wasm side, but didn't find anything that felt right. For example, if we mark imports on the JS side then the wasm module would not know if a call to an import pauses or not until the link step, when imports arrive. That is, calls to imports and pausing would be "mixed together". It seems like the most straightforward thing is to just have a new instruction for this, await, which is explicit about waiting. In theory such a capability may be useful outside the Web as well (see notes earlier), so having an instruction for everyone may make things more consistent overall.

@RReverser
Member

RReverser commented May 19, 2020

How would the start function be handled? Would it statically disallow await, or would it somehow interact with Wasm instantiation?

@Pauan We didn't cover this specifically, but I think there's nothing stopping us from allowing await in start as well. In this case the Promise returned from instantiate{Streaming} would still naturally resolve/reject when the start function has finished executing completely, with the only difference being that it would wait for awaited promises.

That said, same limitations as today apply and for now it wouldn't be too useful for cases that require access to e.g. the exported memory.

@Pauan

Pauan commented May 19, 2020

@RReverser How would that work for the synchronous new WebAssembly.Instance (which is used in workers)?

@kripken
Member Author

kripken commented May 19, 2020

Interesting point @Pauan about start!

Yeah, for synchronous instantiation it seems risky - if await is allowed, it's odd if someone calls into the exports while it's paused. Disallowing await there may be simplest and safest. (Perhaps also in async start for consistency, as there don't seem to be important use cases that this would prevent? Needs more thought.)

@RReverser
Member

(which is used in workers)?

Hmm good point; I don't think it has to be used in Workers, but since this API already exists, perhaps it could return a Promise? I've seen this as a semi-popular emerging pattern to return thenables from a constructor of various libraries, although not sure if it's a good idea to do this in a standard API.

I agree disallowing it in start (as in trapping) is safest for now, and we can always change that in the future in a backwards-compatible way should something change.

@Kangz

Kangz commented May 19, 2020

Maybe I missed something, but there is no discussion of what happens when the WASM execution is paused with an await instruction and a promise returned to JS, then JS calls back into WASM without waiting on the promise.

Is that a valid use case? If it is, then it could allow "main loop" applications to receive input events without yielding to the browser manually. Instead they could yield back by awaiting on a promise that's resolved immediately.

@chicoxyzzy
Member

What about cancellation? It's not implemented in JS promises and this causes some issues.

@kripken
Member Author

kripken commented May 19, 2020

@Kangz

Maybe I missed something, but there is no discussion of what happens when the WASM execution is paused with an await instruction and a promise returned to JS, then JS calls back into WASM without waiting on the promise.

Is that a valid use case? If it is, then it could allow "main loop" applications to receive input events without yielding to the browser manually. Instead they could yield back by awaiting on a promise that's resolved immediately.

The current text is perhaps not clear enough on that. For the first paragraph, yes, that is allowed, see the "Clarifications" section: It is ok to call into the WebAssembly instance while a pause has occurred, and multiple pause/resume events can be in flight at once.

For the second paragraph, no - you can't get events earlier, and you can't make JS resolve a Promise earlier than it would. Let me try to summarize things in another way:

  • When wasm pauses on Promise A, it exits back out to whatever called it, and returns a new Promise B.
  • Wasm resumes when Promise A resolves. That happens at the normal time, which means everything is normal in the JS event loop.
  • After wasm resumes and also finishes running, only then is Promise B resolved.

So in particular Promise B has to resolve after Promise A. You can't get the result of Promise A earlier than JS can get it.
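The ordering described in the three bullets can be observed with ordinary JavaScript. In this sketch an async function stands in for a wasm export that executes await (an assumption, since the instruction does not exist), and the names wasmExport, promiseA and promiseB are illustrative only.

```javascript
const log = [];

// Promise A: the thing the "wasm" code waits on. It resolves on a later
// turn of the event loop, as a fetch or a timer would.
const promiseA = new Promise((resolve) => setTimeout(() => resolve("data"), 0));

// Plain JS that is also waiting on Promise A, registered first.
promiseA.then(() => log.push("A resolved"));

// Stand-in for a wasm export that pauses on Promise A.
async function wasmExport() {
  log.push("wasm starts");
  const value = await promiseA; // pause: control exits back to the caller
  log.push("wasm resumes");
  return value.length;
}

// The caller immediately gets Promise B and keeps running.
const promiseB = wasmExport();
log.push("caller continues");

const done = promiseB.then(() => {
  log.push("B resolved");
  return log;
});
// Once `done` settles, log is:
// ["wasm starts", "caller continues", "A resolved", "wasm resumes", "B resolved"]
```

The JS observer of Promise A runs before the paused code resumes, and Promise B resolves only after that code has finished, so pausing gives no earlier access to the result.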

To put it another way: this proposal's behavior can be polyfilled by Asyncify + some JS that uses Promises around it.

@RossTate

@RReverser, I don't think those are the same, but first I think we need to clarify something (if it hasn't already been clarified, in which case I'm sorry for missing it).

There can be multiple calls from JS into the same wasm instance on the same stack at the same time. If await gets executed by the instance, which call gets paused and returns a promise?

@taralx

taralx commented May 24, 2020

I'm not suggesting it as a long-term solution. I'm suggesting a polyfill that uses it could be used to see if a non-reentrant solution will work for people.

@kripken
Member Author

kripken commented May 25, 2020

@taralx Oh, ok, now I see, thanks.

@rossberg
Member

@taralx:

I think you can get convenient JS integration without whole program transformation if you don't allow the module to be re-entered.

That would be bad. It means that merging multiple modules could break their behaviour. That would be the antithesis to modularity.

As a general design principle, operational behaviour should never be dependent on module boundaries (other than simple scoping). Modules are merely a grouping and scoping mechanism in Wasm, and you want to maintain the ability to regroup stuff (link/merge/split modules) without that changing the behaviour of a program.

@carlopi

carlopi commented May 25, 2020

@rossberg: this is generalizable as blocking access to any Wasm module, as proposed earlier. But then it's probably too limiting.

@taralx

taralx commented May 25, 2020

That would be bad. It means that merging multiple modules could break their behaviour. That would be the antithesis to modularity.

That was my point with the polyfilling argument - atomic.wait doesn't break modularity, so this shouldn't either.

@rossberg
Member

@taralx, atomic.wait references a specific location in a specific memory. Which memory and location would await blocking use, and how would one control which modules share that memory?

@taralx

taralx commented May 25, 2020

@rossberg can you elaborate on a scenario you think this breaks? I suspect we have different ideas on how the non-reentrant version would work.

@rossberg
Member

@taralx, consider loading two modules A and B, each providing some export function, say A.f and B.g. Both might perform await when called. Two pieces of client code are each passed one of these functions, respectively, and they call them independently. They don't interfere or block one another. Then somebody merges or refactors A and B into C, without changing anything about the code. Suddenly both pieces of client code could start blocking each other unexpectedly. Spooky action at a distance through hidden shared state.
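The merge scenario can be modelled in plain JavaScript. This is a sketch of one possible reading of the non-reentrant rule, namely that a call into a module that is currently paused queues until that module resumes; the thread does not pin down the exact semantics. The helper makeModule and the timings are assumptions for illustration, and async functions stand in for wasm exports that await.

```javascript
// Hypothetical model: each module has one hidden lock, so a call into a
// module that is paused in an await has to wait until that module is done.
function makeModule(functions) {
  let lock = Promise.resolve(); // tail of this module's call queue
  const exports = {};
  for (const [name, fn] of Object.entries(functions)) {
    exports[name] = (...args) => {
      const result = lock.then(() => fn(...args));
      lock = result.catch(() => {}); // keep the queue alive after errors
      return result;
    };
  }
  return exports;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
const f = async () => { await sleep(30); return "f"; }; // slow await
const g = async () => { await sleep(0); return "g"; };  // fast await

// Two independent clients each call one function; record who finishes first.
async function completionOrder(callF, callG) {
  const order = [];
  await Promise.all([
    callF().then((v) => order.push(v)),
    callG().then((v) => order.push(v)),
  ]);
  return order;
}

// Separate modules A and B: the two calls do not interfere.
const moduleA = makeModule({ f });
const moduleB = makeModule({ g });
const separate = completionOrder(moduleA.f, moduleB.g); // resolves to ["g", "f"]

// Same code merged into one module C: g now waits behind f.
const moduleC = makeModule({ f, g });
const merged = completionOrder(moduleC.f, moduleC.g); // resolves to ["f", "g"]
```

Nothing in f or g changed between the two runs; only the grouping into modules did, which is the hidden shared state being objected to.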

@taralx

taralx commented May 25, 2020

That makes sense. But allowing re-entry risks concurrency in modules that don't expect it, so it's spooky action at a distance either way.

@sbc100
Member

sbc100 commented May 25, 2020

But modules are already re-enter-able, no? Whenever a module calls an import, the external code can re-enter the module, which could change global state before returning. I can't see how re-entry during the proposed await is any more spooky or concurrent than calling an imported function. Maybe I'm missing something?
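Re-entry through an import can be shown with today's WebAssembly JS API. The tiny module below is hand-assembled for illustration only; its names (env.cb, inc, run) are made up for this sketch, and the text form is given in the comment.

```javascript
// Text form of the module encoded below:
// (module
//   (import "env" "cb" (func $cb))
//   (global $counter (mut i32) (i32.const 0))
//   (func (export "inc")
//     (global.set $counter (i32.add (global.get $counter) (i32.const 1))))
//   (func (export "run") (result i32)
//     (call $cb)              ;; the import may re-enter this module
//     (global.get $counter)))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic and version
  0x01, 0x08, 0x02, 0x60, 0x00, 0x00, 0x60, 0x00, 0x01, 0x7f, // types: ()->(), ()->i32
  0x02, 0x0a, 0x01, 0x03, 0x65, 0x6e, 0x76, 0x02, 0x63, 0x62, 0x00, 0x00, // import env.cb
  0x03, 0x03, 0x02, 0x00, 0x01, // two functions: inc, run
  0x06, 0x06, 0x01, 0x7f, 0x01, 0x41, 0x00, 0x0b, // mutable i32 global = 0
  0x07, 0x0d, 0x02, 0x03, 0x69, 0x6e, 0x63, 0x00, 0x01, // export "inc"
  0x03, 0x72, 0x75, 0x6e, 0x00, 0x02, // export "run"
  0x0a, 0x12, 0x02, // code section, two bodies
  0x09, 0x00, 0x23, 0x00, 0x41, 0x01, 0x6a, 0x24, 0x00, 0x0b, // inc body
  0x06, 0x00, 0x10, 0x00, 0x23, 0x00, 0x0b, // run body
]);

const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes), {
  env: {
    // The imported function calls straight back into the same instance.
    cb: () => instance.exports.inc(),
  },
});

// run() is still on the stack when inc() executes, so the global it reads
// after the call has already been changed by the re-entrant call.
const observed = instance.exports.run(); // 1, not the initial 0
```

Here the global changes underneath run() without any await involved, which is the sense in which calling an import is already a re-entrancy point.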

@taralx

taralx commented May 25, 2020

(edited)

Hm, yes. Okay, so an imported function could re-enter the module. I clearly need to think harder about this.

@devsnek
Member

devsnek commented May 25, 2020

When code is running, and it calls a function, there are two possibilities: It knows that the function will not call random things, or the function might call random things. In the latter case, re-entrancy is always possible. The same rules apply to await.

@taralx

taralx commented May 25, 2020

(edited my comment above)

@kripken
Member Author

kripken commented May 28, 2020

Thanks everyone for the discussion so far!

To summarize, it sounds like there is general interest here, but there are big open questions, such as whether this should be 100% on the JS side or just 99%. It sounds like the former would remove the major worries some people have, and that would be fine for the Web case, so that is probably ok. Another big open question is how feasible this would be to implement in VMs, which we need more info about.

I'll suggest an agenda item for the next CG meeting in 2 weeks to discuss this proposal and consider it for stage 1, which would mean opening a repo and discussing the open questions in separate issues in more detail there. (I believe that's the right process, but please correct me if I'm wrong.)

@fgmccabe

fgmccabe commented May 28, 2020 via email

@kripken
Member Author

kripken commented May 28, 2020

@fgmccabe

We should discuss that for sure.

In general though, unless your proposal focuses on the JS side, I'm guessing it wouldn't make this one moot (which is 99%-100% on the JS side).

@RossTate

RossTate commented Jun 1, 2020

Now that discussion on implementation details has concluded, I would like to reraise a higher-level concern I expressed earlier but dropped for the sake of having one discussion at a time.

A program is made up of many components. From a software-engineering perspective, it is important that splitting components into parts or merging components together does not significantly change the behavior of the program. This is the reasoning behind the module-composition principle discussed at the last in-person CG meeting, and it's implicit in the design of many languages.

In the case of web programs, now with WebAssembly these different components might even be written in different languages: JS or wasm. In fact, many components could just as well be written in either language; I'll refer to these as "ambivalent" components. Right now, most ambivalent components are written in JS, but I imagine we're all hoping that more and more of them will be rewritten into wasm. To facilitate this "code migration", we should try to ensure that rewriting a component in this fashion does not change how it interacts with the environment. As a toy example, whether a particular "apply" program component (f, x) => f(x) is written in JS or in wasm should not affect the behavior of the overall program. This is a code-migration principle.

Unfortunately, all of the variants of this proposal seem to violate either the module-composition principle or the code-migration principle. The former is violated when await captures the stack up to where the current wasm module was most recently entered, because this boundary changes as modules are split apart or combined together. The latter is violated when await captures the stack up to where wasm was most recently entered, because this boundary changes as code is migrated from JS to wasm (so that migrating something as simple as (f, x) => f(x) from JS to wasm can significantly change the behavior of the overall program).

I don't think these violations are due to poor design choices of this proposal. Rather, the problem seems to be that this proposal is trying to avoid indirectly making JS any more powerful, and that goal is forcing it to impose artificial boundaries that violate these principles. I totally understand that goal, but I suspect this problem will come up more and more: adding functionality to WebAssembly in a manner that respects these principles will often require indirectly adding functionality to JS due to JS being the embedding language. My preference would be to tackle that issue head on (which I really have no idea how to resolve). If not that, then my secondary preference would be to make this change solely in the JS API, because it is JS that is the limiting factor here, rather than add instructions to WebAssembly that wasm has no interpretation for.

@kripken
Member Author

kripken commented Jun 1, 2020

I don't think these violations are due to poor design choices of this proposal. Rather, the problem seems to be that this proposal is trying to avoid indirectly making JS any more powerful

That is important, but it is not the main reason for the design here.

The main reason for this design is that while I fully agree that the principle of composition makes sense for wasm, the fundamental problem we have on the Web is that in fact JS and wasm are not equivalent in practice. We have handwritten JS that is async and ported wasm that is sync. In other words, the boundary between them is actually the exact problem we are trying to address. Overall I am not sure I agree the principle of composition should be applied to wasm and JS (but maybe it should, could be an interesting debate).

@kripken
Member Author

kripken commented Jun 3, 2020

I was hoping to have more discussion publicly here, but to save time I reached out to some VM implementers directly, as few have engaged here so far. Given their feedback together with the discussion here, sadly I think we should pause this proposal.

Await has much simpler observable behavior than general coroutines or stack switching, but the VM people I talked to agree with @rossberg that the VM work in the end would probably be similar for both. And at least some VM people believe we will get coroutines or stack switching anyhow, and that we can support await's use cases using that. That will mean creating a new coroutine/stack on each call into the wasm (unlike with this proposal), but at least some VM people think that could be made fast enough.

In addition to the lack of interest from VM people, we have had some strong objections to this proposal here from @fgmccabe and @RossTate, as discussed above. We disagree on some things, but I appreciate those points of view and the time that went into explaining them.

In conclusion, overall it feels like it would be a waste of everyone's time to try to move forward here. But thank you to everyone that participated in the discussion! And hopefully at least this motivates prioritizing coroutines / stack switching.

Note that the JS part of this proposal may be relevant in the future, as JS sugar basically for convenient Promise integration. We'll need to wait for stack switching or coroutines and see if this could work on top of that. But I don't think it's worth keeping the issue open for that, so closing.
