[js-api] JS API exposes function identity #1351
Comments
The spec mentions this. It is unfortunate, but generally seems unavoidable, given the interop constraints. What alternative would you suggest? |
Ah, thanks for the pointer!
I would not have used an Exported Function cache. Of course, in theory this would mean executing all the steps of "create a new Exported Function" every time, but in practice all those steps could be cached, conceptually as an object's prototype. So the only practical difference is that the change would mean allocating a new object with the given (hidden) prototype each time. |
Hm, "every time" would mean an allocation on every JS-side access to a table, or a function global, or a higher-order call. Besides the cost, wouldn't it be semantically dubious if you were to get a different value each time you access an immutable global? |
Not really; JS code just wouldn't rely on object identity for these values, which is what we want (and which I suspect is pretty common anyways). Plus, if it's an immutable global, you can always cache the value on the JS side anyways if you really want to maintain identity. That cache could be built into the standard JS API, or it could be left to the user, but either way that doesn't affect the semantics of WebAssembly. This problem will become more and more prominent. So while I see how the tradeoffs led to this design decision for functions, I am dubious that those tradeoffs will extend to other values in the same way, at which point we have to address these issues anyways. And then we'd be very close to having one universal semantics for wasm. |
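To make the two options in this exchange concrete, here is a minimal sketch in plain JS. It is only a model, not engine code: `makeWrapper`, `toJsCached`, `toJsFresh`, `once`, and the integer "function addresses" are all hypothetical names invented for illustration.

```javascript
// Hypothetical stand-in for the engine handing a wasm function to JS.
// A function address is modeled as a plain integer.
function makeWrapper(funcaddr) {
  // "Calling" the wrapper just reports which function it stands for.
  return () => funcaddr;
}

// Model of the current design: one wrapper per address, remembered in a cache.
const exportedFunctionCache = new Map();
function toJsCached(funcaddr) {
  if (!exportedFunctionCache.has(funcaddr)) {
    exportedFunctionCache.set(funcaddr, makeWrapper(funcaddr));
  }
  return exportedFunctionCache.get(funcaddr);
}

// Model of the suggested alternative: allocate a new wrapper on every crossing.
function toJsFresh(funcaddr) {
  return makeWrapper(funcaddr);
}

// User-side cache for a value the caller knows to be immutable:
// read it once and reuse the result afterwards.
function once(read) {
  let filled = false;
  let cached;
  return () => {
    if (!filled) {
      cached = read();
      filled = true;
    }
    return cached;
  };
}
const readImmutableGlobal = once(() => toJsFresh(7));

const cachedIdentityStable = toJsCached(7) === toJsCached(7);
const freshIdentityStable = toJsFresh(7) === toJsFresh(7);
const userCachedIdentityStable = readImmutableGlobal() === readImmutableGlobal();
const sameBehavior = toJsFresh(7)() === toJsCached(7)();
```

In this model the two strategies differ only in what `===` reports; calling the wrappers gives the same result either way, and a caller who wants stable identity can get it back with something like `once`.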
It seems unfortunate for JS/wasm interop perf if each time wasm passes a |
I wouldn't be surprised if the performance cost of allocation would be negligible in the vast majority of cases (likely none of which exist currently). Plus, there are costs to not concealing function identities; they're much more indirect costs, but they might be more significant. For the cases where the performance cost is notable, I suspect the better way to address them is to provide some stronger type. |
Is the idea here to require a new object, or to permit a new object? The latter basically says "don't count on function identity in the embedding", but permits the embedder to reuse the object. |
In other words, if ever the JS API gets a |
I imagine "permit" would lead to observably different behavior between browsers, which is extremely undesirable. Separately, any proposed changes to the JS API that are not backwards compatible need very strong supporting evidence that no code will be broken and that the significant effort to make the change is justified. |
Yep. What I'm trying to gauge here is whether there's interest and, even if there isn't in doing so for |
For next meeting's agenda (i.e. not tomorrow), I'm planning on suggesting a discussion on what to do about the disparity between wasm and JS's notions of identity. Dunno if it will come to any decision, or even if that'd be a goal for that specific day, but it seems worth getting a broader understanding of the issue. Of course, more offline discussion ahead of time will help make the online discussion more fruitful. To that end, here are two thoughts on the costs of exposing identity.
Hoping to hear more thoughts! (Also, given that there are limited ways in which to get a |
The plan of record is to introduce an
This problem doesn't come from subtyping specifically. For performance reasons, an engine may often want to represent certain reference types differently on both sides (a wrapper object in JS, a direct handle to the thing in Wasm), so for some reference types it has to perform a mapping at the boundaries. Consequently, it may be unavoidable in general that the conversions ToJSValue and ToWAValue have to dispatch on the type of the object. For anything other than anyref the engine has to inspect the value's type anyway (at least in the JS->Wasm direction), since it has to check that it matches the static type. So in general, I believe this is an unavoidable cost. We can merely tweak the potential for optimisation.
I agree this is unfortunate. The simplest (and lamest) solution would be to say that the identity of JS objects returned by ToWAValue is not specified for certain types. Or we do what you suggested elsewhere and specify that it's always fresh. However, on second thought, such a semantics would probably only be applicable to a minority of reference types. For functions, we can't afford to break backwards compatibility. For GC objects, you usually want to preserve identity (at least when they are mutable). For externref it naturally has to be preserved. For exotic types like exnref there hardly ever is a reason to pass them out anyway, so the overhead would be irrelevant. The one relevant case I can think of is immutable structs, possibly.
Based on my experience with JS usage in the wild, I would bet quite a fortune that this assumption is false and that we'd break the web if we changed the semantics. :( The JS ecosystem and some of its APIs rely on function identity, cf. examples like removeEventListener. |
Even if it's a minority, not having identity for that minority can be hugely useful.
When they are mutable, identity must be preserved. When they're immutable, whether or not they have identity varies a bunch across languages. Whether "usually" is the right word just depends on which languages are more prominent. If you bake decisions into the design based on expectations of which languages will be more prominent, then you help make those languages more prominent because they're better served by the design.
There is no reasonable comparison that can be made between JS usage of functions and of funcrefs in the wild. I don't know how big the difference is between the number of occurrences of functions and the number of occurrences of funcrefs in JS, but I imagine it is many orders of magnitude. Also, the suggestion is not that JS values representing funcrefs should lack identity, since every JS value has identity; the suggestion is that repeated fetches of a funcref from wasm should not produce the same JS identity. So you can still use removeEventListener with the proposed change to JS funcrefs. What you can't do is rely on table.get returning the same JS funcref in order to remove the event listener you added from table.get. So it's not even occurrences of funcrefs in JS that we need to worry about, which are already likely very rare; it's just code relying on repeated accesses to table.get returning the same value, which is likely much rarer still.
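A rough sketch of that distinction, using the standard EventTarget API. `tableGetFresh` is a hypothetical stand-in for a table.get that returns a new JS function object on each call; it models the proposed change, not what engines do today.

```javascript
// Counts how often any "wasm function" wrapper was invoked.
let calls = 0;

// Hypothetical model of table.get under the proposed semantics:
// every access produces a new JS function object for the same slot.
function tableGetFresh(index) {
  return function wasmFunc() {
    calls += 1;
  };
}

// Pattern that would break: fetch once to add, fetch again to remove.
const broken = new EventTarget();
broken.addEventListener("ping", tableGetFresh(0));
broken.removeEventListener("ping", tableGetFresh(0)); // different object, removes nothing
broken.dispatchEvent(new Event("ping"));
const callsAfterBrokenPattern = calls;

// Pattern that keeps working: hold on to the JS value that was registered.
const working = new EventTarget();
const listener = tableGetFresh(0);
working.addEventListener("ping", listener);
working.removeEventListener("ping", listener);
working.dispatchEvent(new Event("ping"));
const callsFromWorkingPattern = calls - callsAfterBrokenPattern;
```

Only the first pattern depends on two separate fetches yielding the same JS object; the second relies on nothing beyond ordinary JS identity of a value the caller already holds.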
According to this study, WebAssembly usage is still pretty low (2000 out of the top million sites). Plus websites that are using WebAssembly are likely very actively maintained (as it's only a few years old) or not critical to anything (i.e. experiments to set up). So even if one of those 2000 sites happened to be sensitive to specifically table.get returning the same or distinct values each time (which on its own I'm doubtful of), that sensitivity would either be quickly patched by its active maintainer or not noticed. I think stating we can't afford to break backwards compatibility is an overstatement both of the current observability of the change and of the current prevalence of WebAssembly. |
I think there is far too much uncertainty to make that guess. As a counter-example I am familiar with, people ship games on the web using wasm which are not constantly maintained. They are published and left alone. (I'm not claiming that's the common use case, of course.) |
I think you’re right that that is a use case. More generally, there are likely sites that are essentially WebAssembly programs wrapped by some JS code to integrate it into the ecosystem (e.g. providing imports for sound and graphics). But these sites are unlikely to be relying upon repeated calls to table.get returning the exact same JS object each time. |
Agreed in the abstract, yet such a hypothetical would need somewhat more concrete evidence to apply to specific cases. I think we are worrying about premature optimisation way too often lately.
TC39's experience with hoping to get away with a seemingly innocent change and then running into years of trouble could probably fill a little book by now. Sometimes even a single breaking line in a sufficiently relevant web page (or a library, which is the bigger problem) can practically kill a language change. In this particular case it's not difficult to imagine a plausible usage pattern that would break. We would need to have sufficient confidence that such a pattern does not exist, or if it does, that it (and all downstream dependencies) are still actively maintained, and the CG can identify and convince all responsible devs to change their code beforehand. TC39 has done that successfully on a couple of limited occasions. The way TC39 has sometimes tried to figure out the likelihood of breakage beforehand was by browsers implementing telemetry usage counters to survey the wild on a specific pattern. But that approach produces an increasingly weak signal as the amount of off-web usage of JS (and Wasm) grows. |
Leaving the opportunity to optimize open is different from prematurely optimizing.
Yes, and I'm trying to have us discuss that topic now, while there's still a reasonable possibility, rather than just have us give up prematurely.
Then demonstrate it. It's much easier to find a program that would break than to prove that no programs would break. At the moment, after filtering out all the testing code I'm having a hard time finding programs on GitHub that even use |
I'm working on assembling a corpus of wasm found "in the wild", so to speak. I'll let you know what I find. |
@taralx, that's a very cool undertaking! Do you think you'll have some data in time for Tuesday's meeting? If not, would it make sense to push this back to the following meeting? |
Mmm, I do not want to commit anything yet, but I am also not sure anyone should wait for me because I don't know what I'll find. I'll do my best, but regardless you probably should go ahead on Tuesday. |
Thanks for the assessment! I'll stick with the timeline then. |
In many discussions it has been suggested that funcref values intentionally do not have a notion of identity (i.e. you cannot ask if two funcref values are equal) in order to enable a number of common optimizations with functions. However, the JS API exposes the identity of funcref values, meaning none of those optimizations would be valid in browsers. As two trivial examples just to help clarify the issue, if you have two identical function definitions, you cannot merge them, or if you have a function that just calls an imported function, you cannot simply use the imported function in place of that function.

Was this intentional?
(More generally speaking, any time the JS API can do more with a wasm value than wasm itself can, then that will typically mean that observationally equivalent wasm programs will not be observationally equivalent in browsers, giving wasm effectively two semantics.)
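As a small illustration of the observation above, the following sketch instantiates a tiny hand-assembled module and compares the JS values it hands out. The module bytes and the export names "a", "b", and "t" are made up for this example; the comparisons reflect my understanding of the Exported Function cache in the JS API spec.

```javascript
// Hand-assembled module, roughly equivalent to this text format:
//   (module
//     (func $a) (func $b)                  ;; two identical, empty functions
//     (table 1 funcref) (elem (i32.const 0) $a)
//     (export "a" (func $a)) (export "b" (func $b)) (export "t" (table 0)))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic and version
  0x01, 0x04, 0x01, 0x60, 0x00, 0x00,                   // type section: () -> ()
  0x03, 0x03, 0x02, 0x00, 0x00,                         // function section: two functions
  0x04, 0x04, 0x01, 0x70, 0x00, 0x01,                   // table section: one funcref slot
  0x07, 0x0d, 0x03,                                     // export section: three exports
  0x01, 0x61, 0x00, 0x00,                               //   "a" = func 0
  0x01, 0x62, 0x00, 0x01,                               //   "b" = func 1
  0x01, 0x74, 0x01, 0x00,                               //   "t" = table 0
  0x09, 0x07, 0x01, 0x00, 0x41, 0x00, 0x0b, 0x01, 0x00, // element section: t[0] = func 0
  0x0a, 0x07, 0x02, 0x02, 0x00, 0x0b, 0x02, 0x00, 0x0b, // code section: two empty bodies
]);

const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes));
const { a, b, t } = instance.exports;

// JS can tell the two identical definitions apart, so merging them is observable.
const identicalBodiesDistinguishable = a !== b;
// Repeated table accesses hand back the same JS function object.
const tableGetIsStable = t.get(0) === t.get(0);
// The table entry and the export refer to the same wasm function.
const tableAndExportAgree = t.get(0) === a;
```

If an engine merged the two empty functions into one, `a !== b` would flip, which is how the JS API makes that optimization observable.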