Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Panic on multithreaded browser Wasm #89

Closed
kettle11 opened this issue Jan 23, 2024 · 6 comments · Fixed by #108
Closed

Panic on multithreaded browser Wasm #89

kettle11 opened this issue Jan 23, 2024 · 6 comments · Fixed by #108

Comments

@kettle11
Copy link

This segment of code has assumptions that are incorrect for projects making use of multithreaded Wasm:

async-executor/src/lib.rs

Lines 276 to 288 in 6c70369

fn state(&self) -> &Arc<State> {
#[cfg(not(target_family = "wasm"))]
{
return self.state.get_or_init_blocking(|| Arc::new(State::new()));
}
// Some projects use this on WASM for some reason. In this case get_or_init_blocking
// doesn't work. Just poll the future once and panic if there is contention.
#[cfg(target_family = "wasm")]
future::block_on(future::poll_once(
self.state.get_or_init(|| async { Arc::new(State::new()) }),
))
.expect("encountered contention on WASM")

Multithreaded Wasm is possible with Rust, but the chief limitation is that the main thread panics if it tries to block/wait. However it is allowed to busy-loop as a form of waiting. Crates like wasm_sync reimplement synchronization primitives and use busy-looping instead of blocking on the main thread.

So here's the puzzle: projects like Bevy depend on async-executor, and right now I'm trying to make Bevy multithreaded. There needs to be a workaround to prevent outright crashing if contention is encountered on the main thread due to the above snippet of code.

It could be modified to work like this on Wasm (preferably only on the main thread):

loop {
    if let Ok(state) = self
        .state
        .get_or_try_init_blocking::<()>(|| Ok(Arc::new(State::new())))
    {
        return state;
    }
}

Alternatively the fix could go into the async-lock project.

Would this project be willing to take on a fix like that to unblock upstream users of multithreaded Wasm?

Perhaps eventually this workaround could land in the standard library itself, but there are challenges with that as well. In an even better world browsers would allow waiting on the main thread, but that will not be considered any time soon.

@notgull
Copy link
Member

notgull commented Jan 24, 2024

However it is allowed to busy-loop as a form of waiting.

I don't think this is correct? Busy looping in the browser thread locks up the entire browser and leads to unpredictable behavior. Web frameworks go to great lengths to avoid busy looping for this reason.

I'd rather pursue smol-rs/parking#20, but that's more complex.

@kettle11
Copy link
Author

smol-rs/parking#20 actually already works the way it describes. Rust's std sync primitives use Wasm's atomic.wait when compiled with the atomics flag. So parking will work correctly on a worker thread and potentially crash on the main thread.


I don't think this is correct? Busy looping in the browser thread locks up the entire browser and leads to unpredictable behavior. Web frameworks go to great lengths to avoid busy looping for this reason.

While busy-looping is bad, and is generally to be avoided, it is tolerable when the busy-loop is expected to run for a very short period of time. The alternative is offering an API to upstream users that can crash unpredictably and requires significant rearchitecting to avoid the crash.

For example the emscripten project, the standard way to use C/C++ with Wasm, actually bakes in busy looping on the main thread into its standard library primitives. I wish the Rust project at least offered that approach as it'd prevent a lot of confusion and conditional code within crates.

I'm unfamiliar with the async-executor ecosystem, so I apologize if anything I say is off-base. I was just looking into how things are setup and is parking the only library that directly uses standard library primitives that may block? If that's the case then 'fixing' this issue may be as simple as adding an opt-in feature to have parking use wasm_sync instead of std.

Now that I'm looking into it more I see there are a bunch of cfgs within async-lock, event-listener-strategy, and event-listener which assume Wasm can never wait. Those should be removed one way or another as well.

@kettle11
Copy link
Author

Coming back to this.

I no longer think that it will be necessary (at least for now) to involve any sort of 'busy loop' hack. Instead Rust / Wasm programs can architect themselves to run async-executor off the browser's main thread. This is less nice than a 'it just works' solution but it allows the async-executor ecosystem to avoid messy hacks and lets developers take advantage of it on Wasm.

This will require going through all of the async-executor libraries and removing #[cfg(not(target_family = "wasm"))] in a number of places. Use of Instant and Duration will also be removed on Wasm targets as there's not a standard way to make those calls in the Rust / Wasm ecosystem.

I can work on pull requests for those changes.

@notgull
Copy link
Member

notgull commented Feb 16, 2024

Apologies, I thought I typed up a response here, but I guess I forgot to hit "comment"?

The main issue is that our current usage of async_lock::OnceCell in this way is essentially a stopgap until our MSRV is bumped high enough so that we can use std::sync::OnceLock. As far as I know the standard library's OnceLock suffers from this issue as well. So, this would force us to keep using async_lock::OnceCell lest we break multithreaded web, so we cannot migrate to std::sync::OnceLock.

A better solution would be to use a different way of initializing the Executor, rather than relying on OnceCell. Perhaps we could use something similar to event_listener::Event, where we have an AtomicPtr that represents the underlying allocation, and we use a racy strategy to slide that allocation into the pointer.

I will accept a PR for this.

notgull pushed a commit that referenced this issue Apr 9, 2024
Fixes #89. Uses @notgull's suggestion of using a `AtomicPtr` with a racy initialization instead of a `OnceCell`.

For the addition of more `unsafe`, I added the `clippy::undocumented_unsafe_blocks` lint at a warn, and fixed a few of the remaining open clippy issues (i.e. `Waker::clone_from` already handling the case where they're equal).

Removing `async_lock` as a dependency shouldn't be a SemVer breaking change.
@9SMTM6
Copy link

9SMTM6 commented Sep 7, 2024

The main issue is that our current usage of async_lock::OnceCell in this way is essentially a stopgap until our MSRV is bumped high enough so that we can use std::sync::OnceLock

Is OnceLock unavailable in all situations? Even when building with build-std etc? When you have a setup such as with wasm_thread (a crate), then most synchronization primitives from std just work, all that wasm_thread specifically implements is thread spawning etc.

Not sure if that would actually be acceptable in any shape or form, it likely would lead to a number of special cases, since the kind of setup that wasm_thread has isn't without difficulties for deployment (e.g. using github pages to host a wasm application with that isn't going to work) and probably also browser compatibility issues.

@9SMTM6
Copy link

9SMTM6 commented Sep 7, 2024

Use of Instant and Duration will also be removed on Wasm targets as there's not a standard way to make those calls in the Rust / Wasm ecosystem.

Instant still doesn't work, but I think Duration always worked, and for Instant theres a bunch of crates. web-time works just fine in webworkers, one synchronization hazard aside, that would be solved though if one has the wasm-thread setup.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

3 participants