-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sketch: Linear memory GC with IT #78
Comments
I think this is very interesting! I like how it leaves cross-language stuff to Interface Types, and how it might help port VMs. To make sure I fully understand: is the idea that the compiled VM, when it allocates GC objects, allocates them in these special GC pages. Then when the wasm VM does a GC, it traces from the roots through objects in those pages, using a layout that is agreed upon (I assume that's what you mean in "We'd define what a GC struct and an array must look like in the GC pages of linear memory"). Then it would somehow inform the compiled VM of which objects can be freed, but not touch memory itself? If that's generally correct, it sounds good to me! Some questions:
|
This is definitely an interesting idea, but after playing with it for a while it needs some fixing up. First of all, it's not enough to just know the roots. You need to be able to walk the gc-allocated objects and determine what is transitively reachable from the roots. But to do that you need to be able to distinguish references from integers. So the first change is that you need all of this gc memory to be Of course, there's a big downside, which is that Given that, an alternative design is to, at each allocation, specify which indices in the array are references and which are In both these designs, we're stilling using integers to index into things like the How do those patched designs sound to you @aardappel? |
Couldn't all this (local GC scope, API stability, leakage prevention) be replicated in current proposal by just not sharing references across modules? I feel like you can have all the benefits but without field access bound checks, and with ability to not use this restricted mode and share objects away.
On the flip side, there are a lot of language implementations that don't compile to machine code ahead of time. They use built-in mechanisms of virtual machines like JS or JVM to allocate memory for them. They would have to start maintaining C/C++ runtime if you require them to deal with linear memory. |
I've been thinking about this problem too and I'm glad to see I'm not alone :-) Speaking narrowly about the GC implementation bits, some attempts to summarize and comment:
(The use of the wording "weak ref" is also unfortunate; "untraced ref" might be better.) Given that we store i32r as i32 there will need to be some kind of downcast functionality for i32 -> i32r (whether a cast instruction or something embedded in a load instruction; it amounts to the same thing). Thus it is possible for some references to live as non-references in locals / on the eval stack for stretches of the program; this ties into the need for a robust safepoint mechanism. |
To bolster @skuzmich's point above, it is unrealistic to expect this code to work in wasm. Much of the complexities of run-time systems have to do with features that wasm either doesn't support or takes over, such as JITing or dynamic loading of code. Even your lower-level proposal here doesn't support the functionality I've used for my toy research languages, so I'd have to make a separate wasm implementation for them anyways rather than our current implementation using LLVM.
From what I understand, y'all want wasm modules to be able to have references to JS/DOM values. Once this is possible then you now need a multi-module GC. Similarly, if there is any way to import/export function pointers that are closures, then again you need multi-module GC.
This seems to be more of an argument against
Yes, but we should not throw out sharing of mutable data structures or large immutable data structures altogether. For example, with GC it would be easy for WASI to provide an Note that none of these comments are about the proposed design sketch; I'm just trying to clarify the motivation here. |
@kripken Yes, that is mostly correct, except that the "free-ing" is something that the Wasm GC can do itself, since it is entirely in control of what gets stored where in the GC pages. Thus the linear program doesn't need to be informed. Your Qs:
|
@RossTate I guess I wasn't clear in that I was assuming we'd have typed definitions of structs and arrays similar to the current proposals, such that once the GC knows it is looking at a particular type of struct in the GC pages, it also knows how to distinguish between refs and scalars inside of it (see also my reply to @kripken above). What you are talking about would be a design where we don't have to rely on such definitions, but the GC would work generically on blocks of N refs (and M scalars, possibly always in that order) which could be even more flexible, especially for more dynamic languages that may decide how to layout an object at runtime. This may well be superior given that we can't use static type information entirely statically anyway. Minimum header information would be storing N and M, and it wouldn't have to distinguish between structs and arrays? I generally like this direction, yes, but I'll need to think about the consequences some more. |
@skuzmich Yup, if from the above discussion we conclude that "local GC has many benefits over global GC", and we enable that optionally, by default, or even always in the existing proposal, that would be a great outcome. Re: not compiling to machine code ahead of time.. Wasm requires that currently (we have no known viable JIT path yet). Every Wasm program will always have some linear memory, and I don't think "allocate and manage an object for me" is much different in the existing GC proposal or what I am sketching here. |
(I'll edit the original post to use "untraced ref") |
Essentially you are saying languages will need a rewrite of their runtime anyway to target Wasm? And what would you write that new runtime in? (since it needs some combination of minimum overhead while still being GC instruction aware). If you're right, then indeed the argument for direct linear program <-> GC interop goes away. It is my impression that it will not be so easy to replace existing runtimes, however.
It is all about defaults. If even the smallest objects in each module, and each language, and the host can all arbitrarily refer to eachother, then yes, you need a global GC for cycles to not become a huge pain. I feel this is the wrong granularity of sharing though, and thus the wrong default. If the default is (semantic) copying, and the vast majority of cross module exchanges happen this way, then the opportunity for cycles we're not aware of is vastly reduced (if cross module refs are something you need to explicitly opt-in to). We still need such refs, especially for big expensive objects like textures etc, but these would be much more explicitly managed. Giving even the smallest struct this power is lifting the pains of sharing and leaks to the level of a many-module soup, where it is hard to know who is responsible for the problems.
Agreed (see above). |
From feedback I've gotten so far this idea really contains 2 parts, which could really be considered:
It seems so far that (1) may have more merit, and that (2) has some tricky issue to solve, and potentially some bigger downsides as a consequence. It may be worth discussing these separately, but for now maybe it is enough to be aware that (1) does not need to imply (2). |
@aardappel there are different types of WebAssembly ports with vastly different needs. From what I see, there are two sides of the spectrum of approaches that can pursued by language implementers when targeting Wasm:
In case of Kotlin, we have a partial implementation of (1), but we are moving towards (2). In the new prototype, compiler backend is rewritten to target Wasm specifically. Small runtime library is implemented in Kotlin itself with some Wasm-specific intrinsics. |
This is something I have been wondering about. See below for one thought on this.
I wouldn't say it goes away. My sense is that it is moved. There's no reason linear memory and GC cannot interop in either of the GC proposals. Wasm can have both an What would be great is for someone to make a C/C++ library of macros or primitive types/operations that get translated down to new LLVM types/operations and then to GC types and operations. Then I could write C/C++ code that actually operates on GC references. If I'm careful to only use variables on the stack, I could even make the compiled code not need linear memory. So my sense is that your point (2) is really a tooling problem and not a gc-wasm problem. |
This is something we've been thinking about. At the very least we will want to make it possible to write .s assembly files that use GC types and instructions in functions that can be passed to clang and linked with normal C/C++ code. If we can solve the problem of modeling GC types in LLVM IR and other layers of LLVM, I would also like to expose as much of the proposal as possible as source-level intrinsics and primitive types. But I would rather not make changes to LLVM IR itself, for example by adding new types, so this is going to be an interesting problem. Whether interop requires hand-written assembly or calling intrinsic functions, there will still be significant porting effort required. |
If you're just loading a the value of a field, and that code has to be hand-crafted anyway since there is no way to do it in unmodified C/C++, then whether it is done with an existing But you can do a lot more when things are linear memory addressable, including passing on (parts of) that memory to existing C/C++ code. New field loading instructions would have to create/allocate copies in linear memory to allow that, and while copies at the granularity of module boundaries is probably mostly fine, a lot of runtime code is extremely performance sensitive. Besides efficiency, copying also loses identity/mutability which may matter in some cases. To me the biggest difference is thus not in tooling, but in the runtime. It may also make tooling easier, for example in the existing proposal the tools will have to guarantee that under no circumstances an |
I put together a primer on low-level stack and control-flow primitives. One of the first examples is how to support marking of GC roots, in case y'all are interested. WebAssembly/exception-handling#105 |
To be clear, what @RossTate is proposing in WebAssembly/exception-handling#105 would allow more efficient linear memory GC with a GC written in Wasm itself, whereas the current issue allows more efficient linear memory GC with a GC provided by the engine. Both ideas share reasons why having GC objects in linear memory might be desirable (access by the language runtime, etc). The diversity of current GC implementations and language implementations is enormous, so I personally think both these options are worth pursuing, and possibly providing in parallel. Offering more options to language implementers is what will make the Wasm eco-system thrive. |
One thought on forgeability of GC roots: The current issue proposes that roots on the stack/local would be exact, but since GC linear memory is accessible by regular stores, a GC would need to do bounds/type checks to perform an actual GC. This favors the linear memory runtime having unfettered access and thus simplicity of implementation. A different tradeoff is however possible: make the GC linear memory pages readable by the Wasm code, but writeable only by special new instructions. This would allow the GC to operate at full speed without checks. The Wasm code can perform stores on arguments directly from stack/locals (which are exact), or there may be a special cast instruction |
In the interest of tomorrow's GC meeting, for those who do not want to read all of the above, here's a shorter summary of the salient points this issue is making. First, as apparent by the discussion, there are 2 parts to this idea of which the first part can be evaluated independently: Idea A: Having many modules written in many different languages communicate over interface types may well be superior to holding on to each-others references.
Idea B: IF we're going to have per module GC, there is the opportunity to make GC objects linear memory addressable.
If you want to discuss the above points however, please make sure to having read the entire issue before doing so :) |
Since we're now at phase 2 with the current MVP, I'll go ahead and close this. |
This sketches an alternative way in which GC could in theory work in Wasm. It is probably crazy, but bear with me, it may have some merit.
NOTE: this is NOT meant to be a "proposal", it is merely a sketch of an idea with most details left open, that puts emphasis on different advantages compared to the existing proposal(s), intended for discussion. If some of these aspects are deemed important, then maybe they can inform future GC features.
I will focus on these aspects of what would make for a good GC provided out of the box by Wasm:
This proposal keeps (1), arrives at (2) thru different means, and improves upon (3) & (4) compared to the existing GC proposals.
The basics of the idea are simple: A linear memory program/runtime assigns certain 64KB pages to the Wasm GC. The Wasm GC manages this memory as it sees fit. The linear program can address this memory, but only has minimal guarantees on what layout to expect. Inter module communication uses Interface Types, not
anyref
. A special kind ofi32
provides exact roots.Now, the details. This trivially retains (1), so let's tackle (2) & (3) first:
Inter-module data exchange.
The existing GC proposal(s) assume it is desirable that different modules can share references to GC objects freely, without incurring the cost of copying, and with the benefit of being able to collect inter-module cycles trivially. There are however issues with this approach:
So instead, we'll use Interface Types. The potential for Interface Types (with Shared-Nothing-Linking) to revolutionize how we build software out of parts, and how we can work with different languages is huge. If most interfacing with GC language modules is going to go over IT, we might as well go all the way, and get some of the benefits this would bring:
anyref
that affects every possible object, and can't be opted-out of.Now for (4):
Runtime interaction.
The vast majority of current language implementation come with some form of "runtime", usually written in C or C++. Besides the GC algorithm (which we're trying to save them from having to ship), this can include other support functions that implement the runtime semantics of the language, implementations of the built-in functionality, and general standard library implementations. Typically all that code will access (and create) GC objects directly. Under the existing GC proposals all this code would need an almost full rewrite from C/C++ into a language that is Wasm-GC aware (i.e. can emit Wasm-GC object access/creation opcodes directly from the language, to be efficient). There may be ways for this to retrofitted on to C/C++ (using intrinsics?) but how this would flow all the way thru the LLVM pipeline is an open question. Even if we had this functionality, this would still require a significant rewrite of a language runtime.
Will language implementers go thru such a rewrite? Or will they throw in the towel, and port their existing GC to linear memory, not using a standard Wasm GC at all?
There is generally the question of how well a standard generic Wasm GC will cater for all languages. It is well known that many languages have intricate GC designs, whose functionality intertwines with the language's promised semantics. Arguably, for the runtime to be able to interact with a Wasm GC directly in linear memory would expand the possible cases where such a GC can be used, in favor of porting an existing GC. Given that the GC is per-module, it can possibly be parametrized to fit the language, rather than collecting across languages generically, which means it cannot take anything about the language into account.
How would this GC/runtime interaction go? First, the linear memory program needs to assign pages to the GC, which it could do either manually, or we could also imagine a more managed system where the upper part of memory is always for the GC (such that the GC could grow it independently of the linear program, though possibly making
memory.grow
more expensive if the GC memory needs to be moved). I am sure there are many possible ways to do this, for now assume there exist linear memory addressable GC pages.The GC would need to know the roots outside of its GC space, which I imagine would come in two forms: references on the Wasm stack, and optionally from a table managed by the linear program of additional roots. These roots can be used by the linear program as naked pointers (
i32
), so wouldn't be ananyref
. To help the GC be exact (non-conservative), we'd introduce a new (sub)type ofi32
that, when present on the Wasm stack or a table, would be a root (let's sayi32r
). This type would degrade toi32
when stored in (unmanaged) linear memory, and would then not guarantee the pointee to be retained (it is a "untraced ref" that can only be used if the linear program guarantees there is at least one activei32r
on the stack/tables ori32
in the GC pages).Existing load/store ops can be used to access these objects, using either
i32
from linear memory ori32r
from the stack. We'd define what a GC struct and an array must look like in the GC pages of linear memory, which given their low level nature, doesn't sound too much of a restriction to GC implementation (any additional data a GC wishes to store can be at negative offsets).The GC makes NO promises as to the state of memory outside of well defined parts of structs and arrays, much like it is already undefined behavior to even access memory outside of a
malloc
returned block in C/C++.i32r
values can possibly be rewritten by the GC to point elsewhere, invalidating any "untraced ref" copies in linear memory. For this to function there needs to be a minimal level of cooperation between the linear memory program and the GC as to when these changes may happen. How, TBD.There is of course the ability for the linear program to access GC memory, and even corrupt it. This does not seem like a problem from a language point of view: almost all programming languages "trust" their runtime code already, and all have the potential for bugs in said code to disrupt the otherwise safe execution of the language. Given that this is now isolated to a single module, the impact of this is smaller than if there was a bug in a Wasm runtime, affecting all languages within.
It does have the downside of requiring a slightly more conservative Wasm GC implementation, i.e.
i32
s loaded from linear memory will need to be bounds-checked, whereas the existing GC proposals can directly treatanyref
as a pointer. I think this is a small cost compared to the potential for optimisation this new system otherwise brings (see above). It generally goes along with the existing Wasm philosophy of intra-module trusted, inter-module untrusted which IT builds on, and which existing GC proposals complicate.A final problem may be non-determinism (but only for "buggy" programs): access to GC memory outside of designated objects could result in different data under different Wasm implementations. We'd have to specify that this is undefined behavior?
The text was updated successfully, but these errors were encountered: