Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Frozen proposal #1493

Open
zapashcanon opened this issue Nov 10, 2023 · 9 comments
Open

Frozen proposal #1493

zapashcanon opened this issue Nov 10, 2023 · 9 comments

Comments

@zapashcanon
Copy link

zapashcanon commented Nov 10, 2023

Hi,

@chambart and myself have a proposal for introducing freezable GC/global values. As asked by @dtig, we present the idea here and hopefully there will be some time available at one of the CG meeting to discuss it further.

Building recursive values

Let's say you have a recursive type $t, currently you can build a value of type $t like this:

(module
  (rec
    (type $t (struct (field $f (mut (ref null $t))))))

  (func $loop (result (ref $t))
    (local $l (ref $t))
    (local.set $l (struct.new $t (ref.null $t)))
    (struct.set $t $f (local.get $l) (local.get $l))
    (local.get $l)
  )
)

Building immutable recursive values ?

Now, let us define a type $t' which is the same as $t but with an immutable field $f:

(module
  (rec
    (type $t  (struct (field $f (mut (ref null $t))))))

  (rec
    (type $t' (struct (field $f      (ref      $t')))))

  (func $loop (result (ref $t)) ... )
)

Currently, there's no way to build a value of type $t'.

Freezing

Our proposal allows to build immutable recursive values in the following way:

(module
  (rec
    (type $t         freezable  (struct (field $f (mut (ref null $t))))))

  (rec
    (type $t_freeze (freeze $t) (struct (field $f      (ref      $t_freeze)))))

  (func $loop_tmp (result (ref $t)) ... )

  (func $loop (result (ref $t_freeze))
    (ref.freeze $t_freeze $t (call $loop_tmp))
  )
)

The point ?

The same as immutable values in general:

  • cleaner API / preserve code invariants ;
  • avoid read barriers ;
  • being more explicit about the use.

Could also avoid null checks and allow creating immutable arrays.

The idea

The freezability check can be similar to the sub-typing rules:

  • there should be the same (or less ?) fields ;
  • can remove mut annotations ;
  • can remove null annotations ;
  • fields should be (frozen versions of) subtypes (maybe also upcasts ?).

After the freeze, the freezable values should not be accessed:

  • dynamic checks on freezable types accesses ;
  • the freeze operation is expected to walk the value and flip a frozen bit ;
  • trap if fields are still null at freeze time.

Heuristic: unfrozen values are seldom accessed, frozen ones can be accessed a lot

Freezing is not 'fixed number of hardware instruction', but the combined time of building then freezing is kind of an amortized version of it.

Globals

Similarly, a global needs to be nullable in some cases:

(module
  (rec (type $t (freeze $t) (struct (field $f (ref $t))))))
  (global $g (mut (ref null $t)) (ref.null $t))

  (func $f (result (ref $t))
    (ref.as_non_null (global.get $g))
  )
  (func $loop (result (ref $t)) ...)

  (func $st
    (global.set $g (call $loop))
    (drop (call $f))
  )

  (start $st))

Freezing globals

Here is what our proposal would allow:

(module
  (global $g (mut (ref null $t)) (ref.null $t))
  (global $g_frozen (ref $t) (freeze $g))
  (func $f (result (ref $t))
    (global.get $g_frozen) ;; <-- no test
  )

  (func $st
    (global.set $g (call $loop))
    (drop (call $f))
  )
  (start $st))

But when are the global actually frozen ? Well...

Phases

We introduce phases:

(module
  (global $g (mut (ref null $t)) freezable (ref.null $t))
  (global $g_frozen (ref $t) (freeze $g) (phase 1))

  (func $f (phase 1) (result (ref $t))
    (global.get $g_frozen)
  )
  (func $loop (phase 0) (result (ref $t)) ...)

  (func $st_0 (phase 0) (global.set $g (call $loop)))
  (func $st_1 (phase 1) (drop (call $f)))
  (start $st_0 (phase 0))
  (start $st_1 (phase 1))
)

The idea:

  • invariant: cannot call a function of phase $n$ before the end of the start of phase $n-1$ ;
  • a function can only refer to values of phase less or equal than its own phase (ref and calls) ;
  • each phase can have one start ;
  • starts are run in phase order ;
  • global are frozen at the end of the previous phase ;
  • failure to freeze is a panic ;
  • accessing an unfrozen global from a previous phase is a panic ;
  • cannot export freezable global.

Also:

  • if multiple start functions seems distasteful, we could have a call_and_freeze instruction moving to the next phase ;
  • default phase is 0 ;
  • global freeze can change the type of its contents ;
  • encoding: not really thought, could be compact.

Who might be interested ?

It would be useful in our OCaml compiler Wasocaml. Maybe @vouillon would also use it in its other OCaml compiler.

Maybe @wingo would use it in its Guile compiler ? At least that what we thought after the presentation he gave at the in-person CG meeting.

Questions

Would you see other use cases for this ?

Should there be a phase 1 proposal to explore that kind of needs ?

@dtig
Copy link
Member

dtig commented Nov 10, 2023

Thanks for filing, added to the december 5th meeting (WebAssembly/meetings#1428).

@rossberg
Copy link
Member

Interesting idea, and thanks for proposing it. But although I see many potential use cases for complex initialisation of immutable types, I have to admit that I am pretty sceptical.

I primarily worry about the performance overhead and complexity of the freeze operation. That would have to walk an arbitrarily large object graph, so it's cost is unbounded, and only determined by the state of the store. And it would have to be able to deal with cycles in that graph (which, fwiw, the semantics currently forbids for immutable types).

And presumably, this operation would need to work transitively, i.e., through multiple types at once, since, e.g., you might have mutually recursive types that can only be frozen simultaneously. That means that it isn't even directly obvious from the types how deep the traversal is going to be (that information is kind of hidden in the freeze-subtype relation), or how exactly the boundary of the traversal would be detected at runtime.

Finally, thinking ahead to threaded GC objects, can we allow a concurrent thread to mutate part of the object graph while it is being frozen, possibly changing the very graph structure that freeze is currently traversing? Or do we need a global write lock for freezable objects to make freezing atomic?

Also, the freeze bit potentially requires an extra word in the representation of all freezable or frozen objects, because an implementation might not have a free bit to spare anywhere else. In general, every mechanism that requires per-object runtime meta data is a bit of a problem.

Compared to the much simpler and runtime-free notion of a readonly attribute as a supertype of mutable (which allows for the same subtyping as freezing and most invariants), this implies a lot of cost and complexity. I suspect it will be very tough to demonstrate performance benefits for reads that are substantial enough to justify this cost.

FWIW, the phases idea for globals could be replaced by just allowing ordered module-level declarations through repeated sections, an idea that keeps coming up.

@wingo
Copy link

wingo commented Nov 13, 2023

I think Hoot would be happy to experiment with frozen globals, or indeed with readonly globals: we do have a pattern where we allocate a number of globals at compile-time, but because they need run-time relocations, we have to make more things mutable and thus nullable as well. I can't be sure what the the performance and size impact of freezing might be: there are arguments in both directions.

With regards to recursive data structures, really we have just one: because of Scheme's lack of a type system, we just end up breaking many cycles via the ref eq unitype, and that's OK. There is one major exception, which is in record types: we have a CLOS-like object system where the class of an object is itself an object, and so on, up to a root which is the class of itself. That self-link needs knot-tying and its mutability and the lack of a default value for that type means that the field has to be nullable: https://gitlab.com/spritely/guile-hoot/-/blob/main/module/hoot/stdlib.scm?ref_type=heads#L296-301. Fixing that would be quite welcome, though again, I don't know what the concrete impact would be.

@ospencer
Copy link

ospencer commented Dec 5, 2023

Hey folks, I saw this proposal today during the CG meeting and had an idea for an alternative implementation that might achieve the desired outcomes without needing a concept of freezing.

Similar to the way we define recursive types:

(rec
  (type $t1 (struct (field (ref $t2))))
  (type $t2 (struct (field (ref $t1))))
)

We could allow recursive construction of multiple values (some work on the exact syntax would be needed):

(rec
  (value $s1 (struct.new $t1 (ref $s2)))
  (value $s2 (struct.new $t2 (ref $s1)))
)

The engine would allocate and perform the back-patching necessary to create the values and the results would be placed on the stack.

From the module's perspective, this occurs as a single instruction and thus there's no need to go back and fill in the references (as they've already been placed by the engine). You can then be intentional about your types and have them match the source code with GC implementations able to do optimizations on the immutable structures.

I know this would work for Grain (with the caveat that we haven't begun our wasm-GC implementation yet) because the code we generate to do this essentially does the allocations and then immediately does the back-patching, but it might not work for compilers that might want to delay back-patching for some reason, but I can't think of any.

Would this sort of approach work for Wasocaml?

@rossberg
Copy link
Member

rossberg commented Dec 6, 2023

@ospencer, incidentally, OCaml itself has this feature (on the source level). Its usefulness is somewhat disputed even there, but for Wasm, it would have many problems.

For one, this construct is not compositional: it has to work with pre-allocation and backpatching. For that, it essentially needs to modify the semantics (and compilation) of all allocation instructions occurring in the recursion. That is very strange from an instruction set perspective.

Furthermore, for that to work, all the respective allocation instructions and their operands must be directly visible to the construct. Hence it's got to be very restrictive: in particular, no function call taking or returning any of the recursively initialised variables can be allowed inside the recursion, which rules out most non-trivial use cases for initialising recursive values, because it is incompatible with factoring initialisation logic into runtime functions.

Validation would also have to ensure that the allocated references do not escape in other ways before the recursion is completed. That is not entirely straightforward, since they are implicit on the operand stack. Just imagine we were to add a dup instruction to Wasm, and what validation rules we'd need in that case.

Besides, it also hides an arbitrary number of implicit execution steps (all the backpatching, linear in the total sum of allocated object fields) in one monolithic construct, which in a low-level language like Wasm, we'd rather expect to be programmed explicitly.

@zapashcanon
Copy link
Author

zapashcanon commented Dec 6, 2023

I believe @lthls probably has things to say about this.

@eqrion
Copy link

eqrion commented Dec 6, 2023

Thanks for this proposal, and thinking about this!

I don't see a proposal repo for this yet, so I'll comment this here. We can move discussion once that's up.

From the original post, there was a list of benefits:

    cleaner API / preserve code invariants ;
    avoid read barriers ;
    being more explicit about the use.

Could also avoid null checks and allow creating immutable arrays.

I think it would be really useful to have an idea of the concrete runtime performance improvements this proposal could have. In SpiderMonkey today, we use a signal handling approach to catch null pointer accesses in most cases (some edge cases haven't been implemented yet). So there is generally little performance impact in accessing a nullable object. We also don't have read barriers at this time.

I also asked about supporting this with shared GC objects in the meeting, and think that may be worth some more discussion. My understanding is that writes to a freezable object need to check the 'frozen' bit to see if they must trap. If a racing freeze operation is happening, this access needs to be synchronized. I'm not sure if a per-object mutex is sufficient for this, as the freeze operation is recursive and you should trap if an object that points to you is being frozen (IIUC). So it seems like a global mutex would be required on all freezable accesses.

It's not a dealbreaker if this doesn't work with shared GC objects, we could just prevent them from being marked as shared. But it's good to know if that's the case.

@sunfishcode
Copy link
Member

Concerning the motivation of "cleaner API / preserve code invariants" (and I acknowledge that that is not the only motivation): In other areas in WebAssembly, we've said that extending Wasm's type system to allow source languages to model more of their invariants is not a goal:

The vast majority of a typical module's contract is not expressible within Wasm's type system and will never be. The same is still true for languages with way more sophisticated type systems. But unlike those user-facing languages, Wasm's type system is not even intended for verifying correctness. Why would we make an exception for this one property?

I'm curious how people interpret this here.

@zapashcanon
Copy link
Author

The repository for the proposal is now available at WebAssembly/frozen-values. Should we move the discussion there?

While doing some research on related work, I found the Freedom Before Commitment paper which seems quite similar to our proposal, with the difference that it does not require runtime support while maintaining the performance benefits for field reads. The only limitation I foresee is regarding arrays (as explained in their Extensions section), but I still have to think about it, it probably won't be an issue for the main use-cases we have been describing so far. I still have to look in details at the formalization of their type system and how it could be integrated to Wasm but it looks quite interesting so far.

@rossberg, I would be curious to hear you thoughts about this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants