# Guarantee synchronous creation of Workers under limited conditions? #10228
I don't think this works. Blobs cannot be read synchronously. I could maybe see this if we have a new API where you construct a dedicated worker from a string or a …

The specific syntax of the API could be different. The general wish here is to have something that would not require allowing the unsafe-eval CSP policy.
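For concreteness, the kind of construction being discussed can be sketched as follows. This is a minimal sketch, and the worker source string and variable names are hypothetical, not taken from the thread:

```javascript
// A minimal sketch: a Worker whose source is an in-memory Blob, so no
// network fetch is needed. The worker source string is a made-up example.
const src = `onmessage = (e) => postMessage(e.data * 2);`;
const blob = new Blob([src], { type: "text/javascript" });
const url = URL.createObjectURL(blob);
// Guarded so the snippet also loads outside a browser; in a browser this
// line constructs the Worker from the blob: URL.
const worker = (typeof Worker === "function") ? new Worker(url) : null;
console.log(url.startsWith("blob:")); // everything needed is already in RAM
```

Because the `blob:` URL refers to memory the page already holds, no CSP `unsafe-eval` allowance and no network round-trip would be required.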
To motivate this use case better, here is another example code snippet (b.html):

```html
<html><body><script>
fetch('b.js').then(response => response.blob()).then(blob => {
  let worker = new Worker(URL.createObjectURL(blob));
  worker.postMessage('init');
  worker.onmessage = () => {
    let sab = new Uint8Array(new SharedArrayBuffer(16));
    worker.postMessage(sab);
    console.log('Waiting for Worker to finish');
    while(sab[0] != 1) /* wait to join with the result */;
    console.log(`Worker finished. SAB: ${sab[0]}`);
  };
});
</script></body></html>
```
b.js:

```javascript
onmessage = (e) => {
  if (e.data == 'init') {
    console.log('Worker received init');
    postMessage(0);
  } else {
    console.log('Received SAB');
    e.data[0] = 1;
  }
}
```

The above code example does not hang, and works as expected in all browsers. The workaround scheme shown by b.html is what WebAssembly/SharedArrayBuffer users employ today, since a.html does not work; this code pattern currently ships in all multithreaded Emscripten WebAssembly programs in the wild. The difference between b.html and a.html is that in b.html, the main thread first yields back to the event loop to wait for the Worker's init reply, so the Worker is already up and running before the busy-wait begins.

The trouble with the workaround presented by b.html is that one must preallocate the Worker up front, while still computing in an asynchronous context. After the synchronous part of the multithreaded program begins, the needed Workers must already be synchronously available.

Continuing example 2 of the multithreaded GC: in that GC, I would like to perform the GC marking step more quickly by using a pool of background Workers. But I would also like to spawn that GC marking Worker pool only on demand when necessary, instead of requiring the whole WebAssembly site to delay its page startup until I first manage to spin up all the GC Workers (which may never even fire, depending on what the user does on the site!). If the code example in a.html worked, then I would be able to spawn the GC Worker pool at the first occasion that I need to GC, which would enable a kind of "only-pay-if-you-use-it" allocation of site resources. So ideally, if …
I think there are some confusing parts in here. Currently the specs for …
Thanks for helping to clarify. Yeah, that wording is completely accurate.
Hmm yeah, on reflection both examples should probably already work. So I think the next steps here are: …
…llel with main page executing JS code. Resolves whatwg/html#10228.
Had a stab at adding tests at web-platform-tests/wpt#45502. LMK how that looks.
…stMessage() to happen in parallel, a=testonly Automatic update from web-platform-tests Add tests for new Worker() and worker.postMessage() to happen in parallel Resolves whatwg/html#10228. -- wpt-commits: 2060611f666a08629a55d5d594a0188c49c9ef5e wpt-pr: 45502
### What problem are you trying to solve?
Today, the web has a limitation that spawning a `new Worker(url)` does not actually progress until the main JS thread yields back to the event loop. Historically, the rationale given was that the constructor of a Worker performs a fetch to a potentially remote network location.

We are requesting a guarantee that whenever the URL passed to `new Worker(url)` represents an in-memory Blob (i.e. information already synchronously available in RAM), the construction of the new Worker completes without needing to yield back to the event loop.

### Rationale
In the multithreaded WebAssembly and SharedArrayBuffer (SAB) user space, the delayed behavior of `new Worker` creates major challenges for correctly planning the resource utilization of a web page, across a wide variety of use cases that follow best practices for multithreading. This challenge forces sites to balance between gratuitously over-utilizing page resources and risking a deadlock in their programs.

It is our request that when the URL passed to a Worker represents an in-memory Blob (so no remote network request is needed), the `new Worker()` constructor should be guaranteed to complete without needing to yield back to the browser event loop. That is, today the following code will deadlock in a browser, but we wish the standard to guarantee that it does not.
a.html

a.js

Today, browsers will deadlock on the `while()` loop, because the line `new Worker()` will not proceed to actually create the Worker until the main-thread JS code yields back to the browser's event loop. We wish that this were not the case: the launch of the Worker should progress even while the main-thread JS code is busy executing the `while()` loop.

Note that the presence of the `while()` loop in the above test case is contrived, to illustrate the problem in minimal terms. In this example it may seem nonsensical to want to wait for `new Worker()` to finish, but in real-world use cases there is a strong motivation to do so, explained further below.

### Why is this a problem (worth solving)?
There are several algorithms and problem spaces where utilizing a fork-join pattern is the most efficient and best-practice way of structuring program code.
This may seem counterintuitive to developers who have learned the "sync=bad, async=good" mantra, but in multithreaded programming the opposite can be true. To see why, let's briefly look at a couple of examples.
#### Example 1: multithreaded rendering
In realtime interactive rendering applications, a scene graph is traversed to update and collect items to render in the 3D view at each requestAnimationFrame() call.
If the simulation contents are complex, then instead of performing the update-and-collect step on the single main thread, performance and responsiveness can be greatly improved by splitting the scene traversal across a pool of worker threads. This represents a typical fork-join model of computation, where the main thread forks off as many threads as there are logical cores to quickly iterate through the scene graph, parallelizing extremely well for both throughput and latency. In the join stage, the main JS thread waits until the worker thread(s) complete.

Ideally, when the program code needs to fork off computations for the first time, it would be able to synchronously create the work pool it needs. Later rAF() callbacks would then reuse this already-populated pool.
#### Example 2: multithreaded garbage collection
A second example can be found in multithreaded mark-and-sweep garbage collection. In https://github.com/juj/emgc you can find an example of a multithreaded GC, to be used for example when compiling a C#, Java or Python VM into WebAssembly.

In such a scenario, if the WebAssembly heap gets unlucky and runs out of memory on a malloc, the VM may need to trigger an on-demand GC to reclaim memory. To improve the overall performance of the GC and the responsiveness of the main thread, it is desirable to perform the GC marking phase with the help of multiple Workers, instead of on the main thread alone.

But in order to achieve such on-the-spot GC marking, the main thread would need to synchronously fork off the marking phase to background Workers, and then join when the Workers finish.
#### Example 3: parallel for
In some compiled languages there is support for a parallel-for construct, in which a for loop is sliced into segments that are each processed on a background thread. Parallel-for loops are desirable for their programming simplicity, and their performance improvements can be tremendous.
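The slicing bookkeeping at the heart of such a parallel for can be sketched as follows. `sliceRange` is a hypothetical helper, not part of any runtime:

```javascript
// Sketch: split the iteration range [0, n) into one contiguous slice per
// worker. Each [begin, end) slice would then be posted to a pooled Worker.
function sliceRange(n, numWorkers) {
  const chunk = Math.ceil(n / numWorkers);
  const slices = [];
  for (let begin = 0; begin < n; begin += chunk) {
    slices.push([begin, Math.min(begin + chunk, n)]);
  }
  return slices;
}

console.log(JSON.stringify(sliceRange(10, 4)));
// [[0,3],[3,6],[6,9],[9,10]] -- the slices cover [0, 10) with no overlap
```

The join step then waits for every slice to report completion, which is exactly where the synchronous availability of Workers matters.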
The above examples are valid, best-practices use cases of synchronously handing off computations to background Workers, which results in greatly improved performance and responsiveness of the main JS thread (as opposed to having the main JS thread undertake such computations alone).

However, since these kinds of fork-join computations are synchronous in nature, today the developer must ensure ahead of time that the needed pool of `new Worker()` instances has been "pre-cached" or "pre-pooled", since the Worker() constructor cannot complete synchronously. This pre-pooling is exactly what WebAssembly/SharedArrayBuffer users have been doing so far: they warm up a pool of Workers in advance, so that when the time comes in their programs to perform any synchronous fork-join operations, all the Workers are guaranteed to be ready to synchronously receive commands.
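The pre-pooling workaround can be sketched as below. `WorkerPool` and `createWorker` are made-up names; `createWorker` here returns a plain object so the sketch is self-contained, whereas in a browser it would call `new Worker(url)` well ahead of any synchronous fork-join code:

```javascript
// Sketch of the pre-pooling workaround: Workers are created in advance so
// that later fork-join code can hand work off synchronously.
class WorkerPool {
  constructor(size, createWorker) {
    this.idle = [];
    for (let i = 0; i < size; i++) this.idle.push(createWorker(i));
  }
  // Synchronous acquisition: returns null when the up-front estimate was
  // too low -- exactly the deadlock risk described below.
  acquire() { return this.idle.pop() ?? null; }
  release(worker) { this.idle.push(worker); }
}

const pool = new WorkerPool(2, (i) => ({ id: i })); // fake workers for the sketch
const a = pool.acquire();
const b = pool.acquire();
console.log(pool.acquire()); // null: pool exhausted; nothing can be created synchronously
pool.release(a);
pool.release(b);
```

The pool size must be chosen before the program knows its real needs, which is the root of the problems listed next.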
Now, with the experience of the years that have passed, this kind of pre-pooling workaround is coming to be seen as considerably harmful, for several reasons.

#### Pre-pooling Workers leads to worse web sites in the wild

On simple web sites the pre-pooling workaround is typically simple enough to implement; but as sites and WebAssembly programs scale, the technique does not, and it leads to several problems. Pre-pooling is considered harmful for at least the following reasons:
- **Pre-pooling pessimises site startup times.** Because all the synchronous Workers typically need to be pre-pooled before WebAssembly Module instantiation, the site must wait until all of the Workers in the pool have successfully launched.
- **Pre-pooling risks web site deadlock.** Developers need to manually estimate the maximum number of pre-pooled threads they will ever need. If they underestimate, then when their program runs into a situation where it needs a new thread for synchronous computation, none can be created, and the site's computation will halt.
- **Pre-pooling wastes site resources.** Because of the above, developers generally err on the side of caution, choosing pre-pooled thread counts that are overly conservative, which leads to allocating Workers that the site may never practically use.
- **Unused Workers are hard to free up.** Since a multithreaded WebAssembly application must ensure that it has all the Workers it may ever need, it can be very difficult to reason about when unused Workers could be reclaimed. This results in monolithic, "grow-only" pooling of Workers on these pages.
In summary, developers who produce multithreaded WebAssembly sites today face the challenge of estimating how many simultaneous `pthread_create()`s their whole codebase will ever perform. As codebases grow larger, or when composing software from multiple authors, this may become impossible to do.

### What would be gained by solving this?
If the example code in a.html and a.js above were guaranteed to work, then multithreaded WebAssembly applications would be able to spin up any needed Workers (or pools of Workers) on demand, rather than orchestrating their creation well in advance with oracle knowledge. This would greatly ease the development of such applications and reduce the surface area of novel "web-only" bugs. It would remove a source of resource over-utilization on all multithreaded WebAssembly pages, and it would help sites shrink down such Worker pools when likely not in use, without risking a page crash if that assumption did not hold 100% of the time.
### Sidenote: getting an exception if a Worker cannot be created?
Today, `new Worker(url)` has a further undesirable behavior: if the site has already used up all the Workers it is allowed to spawn, the new Worker operation will pause, and only progress once a previous Worker has been GCd. It would be desirable to have an API that throws a catchable exception when no Workers are currently available. That way, program code could at least make some other decision (perhaps performing the operation on the main thread instead). The current behavior risks sites hanging while waiting for a computation that might never arrive.
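The desired fail-fast shape could look something like the sketch below. `tryCreateWorker` and its `available` callback are entirely hypothetical; no such API exists today:

```javascript
// Hypothetical sketch: fail synchronously when no Worker slot is available,
// so callers can fall back instead of hanging. Not a real API.
function tryCreateWorker(url, available) {
  if (!available()) {
    throw new Error("NoWorkerAvailable"); // hypothetical error condition
  }
  // Stubbed outside browsers so the sketch stays self-contained.
  return typeof Worker === "function" ? new Worker(url) : { url };
}

let ranOnMainThread = false;
try {
  tryCreateWorker("blob:example", () => false); // pretend the quota is used up
} catch (e) {
  ranOnMainThread = true; // e.g. perform the computation on the main thread
}
console.log(ranOnMainThread); // true
```

With a catchable failure, the program degrades to a slower single-threaded path instead of deadlocking.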