Yielding to the event loop (in some way) between module evaluations #4400
Overall, this idea sounds interesting to me, just wanted to verify some things about practicality. Performance-wise, will it be scalable to yield in these cases? If we may eventually have pages with hundreds of small modules (made practical by low networking costs due to improvements in HTTP and/or bundled exchange), would it cause significant overhead on startup to perform extra microtask checkpoints or event loop turns? (See also: tc39/ecma262#1250, where reducing the number of microtasks improved throughput significantly across engines, but that was many more microtasks than we are talking about here.) How high do we expect the compatibility risk to be for this change? It's observable across browsers that no one yields in these cases today, but I don't know if people are depending on this property.
If we assume that most modules are fast to evaluate (just a few function definitions and exports), I'd want to be careful about introducing "we must yield" steps, since the task runner / microtask machinery in Chrome has non-trivial scheduler overhead. Casual Q: would it be feasible to let the UA choose when to yield? Or do you think this is a bad idea due to non-determinism?
Note that top-level await (where this discussion originated) effectively provides exactly that: a way for the user to choose when to yield. This does seem more flexible to me than doing it for all modules.
I must admit I am not sure I understand all aspects of it, but at first glance this looks to me like a really subtle way of breaking existing code bases (if not the web itself, then the many projects using bundlers today that will hide these issues) that rely on a predictable, synchronous execution order between modules. Example:
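To make the risk concrete, here is a minimal sketch (not the original example, and with hypothetical module names) of code that relies on back-to-back synchronous evaluation; the two module bodies are simulated as plain functions so the ordering is runnable anywhere:

```javascript
// Simulation of two modules evaluated back-to-back, as the spec
// requires today (module names are hypothetical).
const shared = { initialised: false };
const observed = [];

// a.js: schedules a microtask during its own evaluation
function evaluateA() {
  Promise.resolve().then(() => observed.push(shared.initialised));
}
// b.js: synchronously finishes initialising the shared state
function evaluateB() {
  shared.initialised = true;
}

// Today no microtask checkpoint runs between the two evaluations,
// so the microtask always sees the fully initialised state (true).
// If evaluation yielded in between, it could observe `false`.
evaluateA();
evaluateB();
```

Today the microtask is guaranteed to observe the fully initialised state; any checkpoint between the two evaluations changes what it can observe.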
As I understand this proposal, it will break this example and similar ones where either circular dependencies are involved or a module depends on something happening in the synchronous initialisation code of its dependent modules. Of course, as the current maintainer of RollupJS, the bigger issue I see is either breaking parity between bundled and unbundled ES modules or breaking core optimisations such as scope-hoisting in interesting and subtle ways. IMO these issues should rather be solved explicitly by using things like dynamic imports.
Or of course top-level await.
@nyaxt If we want to let the UA choose whether/when to yield, I think we'd probably want to adopt some form of tc39/proposal-top-level-await#49 for top-level await, and not depend on yielding as a mechanism to enforce ordering as previously proposed.
Perhaps unsurprisingly, since I'm coming from the same tooling perspective, I share @guybedford and @lukastaegert's concerns. Unless bundlers change their behaviour in ways that I think users would bristle at (and find highly surprising), we risk creating a situation where the behaviour of your app differs between development and production in unexpected ways. I think I'd also push back a little on the claims about developer expectations; I suspect that if you asked devs whether microtasks run between module evaluations, you would not get a consistent answer. There is widespread awareness nowadays that we should avoid parsing and evaluating large chunks of JS, but between TLA and dynamic import there are already ways to solve this problem that are explicit, easily understood and tooling-friendly.
@littledan Would it be feasible (and useful) to adopt the ASAP policy for WebAssembly modules?
I think people who want to use tools to get large-blocks-of-script can continue to do so; they actually free us up to make progress with different and better strategies on the web, and we should not be tied down to their large-block-of-script semantics.
@bergus Given that WebAssembly module instantiation is highly observable (through the start function), I think it's important to give it the same kinds of guarantees as JavaScript execution when modules start. It's possible that the start function might not be used very much in practice, with a convention to call exported functions instead.
I wouldn't want to make a recommendation until we have a sense of the performance impact. Indeed, I think what's really needed to move this discussion forward is finding some real-world websites using modules, prototyping a quick change that at least runs a microtask checkpoint, and benchmarking before/after. My hope is that this doesn't become a performance issue, so that this only becomes a semantic choice. People who want the semantics of large blocks of script (i.e., runs top to bottom, no yielding, potentially janking the page) can use tools to create those semantics, whereas people who want the semantics of small modules can leave them un-compiled.
In addition, we (as a whole) can't make a recommendation either if the introduction of this feature degrades the performance of bundled JavaScript. I have no concern with attempts at making native modules faster than bundling is today. But the introduction of a feature that makes modules faster by making bundled code slower is, in my opinion, unacceptable. Therefore I expect before/after assessments of performance for bundled code (I'll be happy to provide you with multiple examples in the wild) to ensure that bundled code sees zero performance regressions before this normative change can be remotely acceptable.
This fix is unrelated to bundled code.
Bundlers aim to implement at compile time what the runtime semantics would have been without the bundling. Changing runtime semantics (should) change bundler semantics. As you mentioned, tooling can still choose to "sync-ify" things by inlining them. I don't understand how the microtask checkpoint would help with interactivity, though. The control still isn't yielded from the JavaScript host back to the renderer.
Sure, bundlers can decide to follow JS semantics or follow bundled semantics; that's up to them and their users. But fixing this bug in the spec will not change how bundled code behaves. We can easily deploy this feature, and improve (? pending perf tests) websites which use native ES modules, without impacting websites that use bundled code.
This was my last option, or I guess maybe you're proposing a variant of that which makes the microtask checkpoint optional too? I think it's not great, and I'd especially like to get microtask checkpoints fixed. But in the end we do need to spec something implementable, and we have some precedent with the HTML parser, as something which is allowed to yield at arbitrary times in an optional fashion. So I am open to this. Do you have a sense of how hard it would be to prototype (a) just a microtask checkpoint, (b) a full yield to the event loop? Then we could try it out on some sites.
Empirically, the right answer doesn't seem to be either large-block-of-script or many-tiny-modules, but somewhere in between (which is what modern bundlers generally strive for). Getting the best performance is always going to involve tooling, so a proposal centered on performance ought to consider feedback from that constituency. Using the term 'bugfix' implies that the current behaviour is a bug; I don't think that's been established.
Note that code-splitting as a bundling feature will introduce arbitrary transitions between the two semantics.
I mean, as the person who wrote the spec, I can definitely say it's a mistake and a bug that module scripts are not consistent with classic scripts. At least the microtask checkpoint work is a bugfix.
Yep, I think that is exactly @Rich-Harris's point: that tooling will be useful to allow an app to choose between the two semantics as appropriate for their scenario.
That's not my point. I think it's unrealistic to expect developers to understand the distinction; we should strive for a situation in which browsers-doing-the-right-thing and bundlers-doing-the-right-thing leads to the same observable behaviour, so that developers don't face surprising bugs in production. For clarity, I'm not saying that tooling should drive these considerations, just that a proposal designed to maximise performance should be mindful of the fact that performance maximisation is achieved via tooling. If bundlers change to match the proposed semantics, it will cause performance regressions in some apps.
@domenic Your benchmarking idea sounds like it'd help work through this. I heard @kinu had some pretty exciting benchmark results for bundled exchange, which might give a more forward-looking complement. In general, I don't think new features should be limited to what can be implemented in tools, but I was looking forward to the future where small modules could be shipped directly to the web with webpackage. (I am confused about why this is considered a bug when we discussed this possibility in 2016 before it shipped in browsers; in particular, I suggested in a comment in the document you linked considering a microtask checkpoint in this exact situation. I thought the choice was deliberate.) Edit: To clarify, I agree with what's said above that, when possible, it's optimal if we can work towards alignment between tools and browsers. Tools can help lead to faster adoption of the native implementation of new features in browsers, bridging the gap when some browsers take longer. I'm worried that bundlers and browsers will face similar performance constraints when it comes to working with small modules (and if browsers have a faster path to queueing a task, well, maybe that's a new web API to expose).
Sorry, the "casual Q" was without any data, so +1 on judging based on measurement. The changes needed to measure this should be trivial.
We should absolutely measure this, but my guess is that a microtask queue flush may be doable, while yielding to the event loop is challenging (it will likely require years of Blink optimization effort to enable).
tl;dr: Would it make sense to yield to either the event loop or the microtask queue after each JavaScript module load? Yielding to the event loop could help keep things running smoothly while code is setting itself up, but on the other hand, it could be expensive and break (possibly foolish) atomicity expectations.
From the UA point of view, microtasks are synchronous, so microtasks wouldn't help with the jank issue at all. "Allow the browser to choose between just-microtask-checkpoint, and queuing a task" is definitely a no-no. Microtasks and tasks are just totally different things.
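A small runnable illustration of that difference, using nothing beyond timers and promises:

```javascript
// Microtasks drain before control ever returns to the event loop, so
// they run ahead of any queued task. A microtask checkpoint alone never
// gives the renderer a chance to paint or handle input.
const order = [];
setTimeout(() => order.push('task'), 0);                // task queue
Promise.resolve().then(() => order.push('microtask'));  // microtask queue
order.push('sync');
// Resulting order: 'sync', then 'microtask', then 'task'.
```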
Right, and so is the completion of a module script execution.
Yeah, that behavior would be ideal from a loading performance perspective.
I agree that blocking the event loop while scripts are loading doesn't sound like an improvement (and could easily become a regression) from a performance perspective.
That would be when the entire module graph completes execution, not between each module execution.
From Blink/V8, maybe @tzik, @GeorgNeis and/or @MayaLekova would have relevant thoughts about microtask checkpoints and modules.
@yoavweiss This loading proposal sounds very interesting to me. It would be a significant change from current JavaScript module semantics, which are based on an "Instantiate" pass over all modules before any of them are executed. It's not clear to me how, e.g., circular dependencies would work, but probably this could be worked through. I imagine you would also want something early in the bundle that indicates that the module will be imported at startup, since the bundle may include things that are not immediately imported. Is there a good place for developing more details on this idea?
@littledan - Thanks! :)
In your view, would that shift be web compatible? If not, can developers opt in to such a mode?
Yeah, for circular dependencies, we probably need to delay execution until all the relevant modules are loaded. It will require some more thought, but it could mean that when creating the bundles, we'd need to potentially bump the priority/position of modules to which there are back-edges in the post-order DFS.
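For reference, the evaluation order being discussed is a post-order depth-first traversal of the import graph; a sketch (hypothetical graph, with cycles handled by simple visited-marking rather than the spec's full cyclic-module machinery):

```javascript
// Post-order DFS: a module is evaluated only after its dependencies.
function evaluationOrder(graph, root) {
  const visited = new Set();
  const order = [];
  (function visit(mod) {
    if (visited.has(mod)) return; // back-edges (cycles) are skipped
    visited.add(mod);
    for (const dep of graph[mod] ?? []) visit(dep);
    order.push(mod);
  })(root);
  return order;
}

// hypothetical graph: main -> [a, b], a -> [shared], b -> [shared]
const graph = { main: ['a', 'b'], a: ['shared'], b: ['shared'] };
console.log(evaluationOrder(graph, 'main')); // evaluation order: shared, a, b, main
```

Bumping the priority of back-edge targets, as suggested above, would amount to delivering a module like `shared` before the modules that import it.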
I'm not sure that's necessary, but I may be missing something. I'd imagine the bundlers would calculate the dependency graph, DFS it, and only include the immediately imported modules. But I may be discounting dynamic module imports and the will to bundle them as well. Maybe they should be included at the end of the bundle, to reduce their upfront "cost"? How do today's bundlers deal with dynamic imports?
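As background on the dynamic-import question: a dynamic `import()` is a lazy boundary, and bundlers generally turn it into a separately loaded chunk. A self-contained sketch using a `data:` URL module standing in for a split chunk (supported in browsers and recent Node ESM; names are illustrative):

```javascript
// Nothing is fetched or evaluated until the import() call actually
// runs, which is what lets bundlers emit the lazily imported work as
// a separate chunk loaded on demand.
const lazyChunk = 'data:text/javascript,export const render = () => "rendered";';

async function onDemand() {
  const { render } = await import(lazyChunk); // chunk loads only now
  return render();
}

console.log(await onDemand()); // logs: rendered
```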
I generally thought of this in the context of bundled exchanges and in that context, https://github.com/wicg/webpackage/issues would be the best place, even if this idea doesn't necessarily impact the format itself. Would that work? Or do you feel we need to start a new repo dedicated to this?
@domenic I don't understand why the stack is empty. Isn't record.Evaluate() on the stack per "run a module script"?
I think always running a microtask checkpoint between evaluations shouldn't really violate developer expectations, given that a module could've already been imported by a different graph, in which case we might've had arbitrary event loop work between two imports. Is it even desirable to have a system where simply including a module in another graph could break your code? It seems to me that any code that depends on no microtasks between evaluations is probably fragile. If it's really sufficiently important for certain modules to run within the same microtask, then they probably need a way to explicitly say that loading this module from a separate graph is not allowed either.
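The caching behaviour described here is easy to observe directly; a self-contained check using a `data:` URL module (works in browsers and recent Node ESM):

```javascript
// The module records a random value when evaluated. A later import of
// the same specifier, even after arbitrary event-loop work, returns
// the cached instance rather than re-evaluating the module body.
const url = 'data:text/javascript,export const stamp = Math.random();';

const first = await import(url);
await new Promise(resolve => setTimeout(resolve, 10)); // intervening work
const second = await import(url);

console.log(first.stamp === second.stamp); // true: evaluated only once
```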
@littledan, does WICG/webpackage#411 capture the idea you want to explore more?
Oh, I now understand what I was missing... Even if we deliver the modules in order, the browser doesn't have the full graph until it downloads all of them, so it wouldn't know it needs to start executing them. Thanks @jyasskin for starting that issue!! :)
Sorry for the delay. My current vote as a Chromium implementer is "flush microtask queue after each module eval" to accommodate TLA. Yes, I've always been dreaming of a way to embed resource dependency graph info in bundled exchanges. (@KenjiBaheux sold the idea to me.) Thanks for starting the issue.
@nyaxt There are some other proposed ways to make top-level await work which don't require microtask checkpoints, such as tc39/proposal-top-level-await#61. If top-level await adopts these semantics (which are currently still under discussion), do you think there are other reasons for a microtask checkpoint after each module?
@littledan Thanks for the pointer. TLA was the only reason I'm aware of for flushing the microtask queue after each module eval.
Above, we did not reach agreement on any proposed change, partly because:
Top-level await has solved its problems about coordinating module loading in other ways, but at the same time, there is still ongoing investigation into the potential performance benefit of enabling some kind of "streaming module execution", e.g., this document by @LeszekSwirski and @yoavweiss. I'm not sure if the performance issues that streaming solves are currently the bottleneck for module loading, but they may become the bottleneck in the presence of browser optimizations for module graph loading (Chrome efforts) and improved mechanisms for module delivery (some ideas). Various JS tooling authors (e.g., @guybedford) have been investigating the technique of including multiple modules in a single response. Given all of these factors, I wonder if what we want is an explicit opt-in to a mode where the UA is permitted (but not required) to execute some modules in a streaming fashion. The opt-in could look like this:
It would be valid to implement the opt-in as a no-op, keeping today's all-at-once evaluation.
The semantics of incrementally linking and executing different overlapping module subgraphs are already defined in the JavaScript standard; this is the same path that dynamic `import()` uses. What would people think here of an opt-in approach to allowing streaming module graph execution?
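No concrete syntax is settled above; purely as an illustration, the opt-in could be an attribute on the root module script (the attribute name below is invented for this sketch, not proposed anywhere):

```html
<!-- hypothetical: "streamingeval" is an invented attribute name -->
<script type="module" src="app.js" streamingeval></script>
```

Since the mode only permits, and never requires, streaming execution, a UA could ignore the attribute entirely and keep today's all-at-once semantics.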
How easy is it to write code that depends on the current behavior?
@annevk If you're trying to write a test, it's fairly easy to observe timings with a combination of top-level statements and microtasks. I suspect that normal application code typically won't notice, but I expect that the bundler authors who commented above on this thread have a stronger understanding of compatibility in practice than I do.
Earlier this year, I created a prototype that evaluates module scripts as early as possible, for experimenting on local Chromium builds and exploring example workloads that can be optimized. Some Chromium folks around me tried a couple of workloads on this prototype but haven't seen performance benefits on those workloads so far (e.g. because the root module is the heaviest module).
Note that we wouldn't expect a prototype implementation to bring particularly large benefits without additionally implementing concurrency between the parsing/compiling and evaluation of module scripts. We should retry that prototype in Chromium once the streaming module compilation is implemented.
@hiroshige-g Great to see this written out. My comment above was ambiguous between the two possibilities you have implemented there; it will be great to test both modes once the optimizations that @LeszekSwirski refers to are implemented.
Any particular reason for closing this? |
Nobody seems to be pursuing this and there was significant opposition last time. If someone has a new concrete proposal that they are prototyping, then filing a new issue makes sense to me.
The streaming module compilation I mentioned above landed this year; we haven't had time yet to re-measure a fully concurrent evaluation prototype, but it's not the case that we're not pursuing it at all. I can open a new issue, but it would cover the same ground we've discussed above, just with a real prototype.
Yeah, I think a new issue with a concrete proposal (or better yet, a spec PR) would be ideal at this point. |
Can anyone help answer how the following behaves? I tried:

```html
<script type="module" onload="console.log('script loaded')">
console.log('script start execution');
const something = await import('somewhere');
console.log('script end execution');
</script>
```

and found […]
This is an issue to discuss a proposed, but still vague, normative change to module evaluation. This would be done in collaboration with TC39 (although it isn't strictly dependent on any changes to ES262).
If you want extra background, please read JS Modules: Determinism vs. ASAP. In terms of that document, the status quo is the super-deterministic strategy. This thread is to explore moving to (some form of) a deterministic-with-yielding strategy. We're not considering the ASAP strategy.
See also some discussion related to top-level await: tc39/proposal-top-level-await#47 (comment)
If you don't want to read those background documents, here's a summary. Currently module evaluation for a single module graph (i.e. a single `import()` or `<script type=module>`) happens as one big block of synchronous code, bound by prepare to run script and clean up after running script. The event loop does not intervene in between, but instead resumes after the entire graph has evaluated. This includes immediately running a microtask checkpoint, if the stack is empty.

Side note: for module scripts, the stack is always empty when evaluating them, because we don't allow sync module scripts like we do classic scripts.
This sync-chunk-of-script approach is simple to spec, but it has two problems:
Arguably, violating developer expectations. If you have two classic scripts, `<script></script><script></script>`, a microtask checkpoint will run between them, if the stack is empty. (And maybe sometimes more event loop tasks, since the parser can pause any time?) Example. But if you have two module scripts, `import "./1"; import "./2";`, right now we do not run a microtask checkpoint between them.

Stated another way, right now we try to sell developers the intuitive story "microtask checkpoints will run when the stack is empty." But, between evaluating two module scripts, the stack is definitely empty---and we don't run any microtask checkpoints. That's confusing.
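The classic-script half of this is easy to verify in any browser; the logged order below is what the current spec requires:

```html
<script>
  // Queued during the first script's evaluation...
  Promise.resolve().then(() => console.log('microtask'));
</script>
<script>
  // ...and the checkpoint between the two scripts runs it first,
  // so the console shows "microtask" before "second script".
  console.log('second script');
</script>
```

Within a single module graph (`import "./1"; import "./2"`), the analogous microtask today runs only after both modules have evaluated, which is exactly the inconsistency described.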
It increases the likelihood of jank during page loading, as there is a large block of script which per spec you cannot break up the evaluation of. If it takes more than 16 ms, you're missing a frame and blocking user interaction.
The proposal is to allow yielding to the event loop between individual module scripts in the graph. This will help developer expectations, and will potentially decrease jank during page loading---although it will not help total-time-to-execute-script, it just smears it across multiple event loop turns.
What form of yielding does this look like, exactly? A few options.
Note that with top-level await, we'll likely end up yielding to the full event loop anyway at some point, e.g. if you do `await fetch(x)` at top level, a certain subgraph of the overall graph will not execute until the networking task comes back and fulfills the promise. (This also means a module can always cause a yield to the event loop with `await new Promise(r => setTimeout(r))` or similar.)

I think the biggest open questions are:
/cc @whatwg/modules, @yoavweiss