Test the soundness of `vm::Value` #40

rpjohnst · 2020-04-04T17:50:25Z

Edit: This has been addressed; this issue now tracks testing to ensure this doesn't get broken.

The VM stack contains three kinds of raw pointers, with relatively unclear rules around when and how they can be used soundly. This makes it hard to determine whether (modifications to) the interpreter are correct. Worse, it exposes the entirely-safe engine APIs to unsoundness- for example, this is the main blocker for implementing instance_create and instance_destroy.

These are the three kinds of pointers:

vm::Value can encode a vm::Array, which is itself an Rc<UnsafeCell>. However, because vm::Value is Copy, the Rc is not helpful at this level- it's there to implement copy-on-write arrays at the GML level.
vm::Row is an intermediate ptr::NonNull into a vm::Array. It is intended to be "short-lived" (i.e. not stored in instances) but we may still want Dejavu to be allowed to produce optimized bytecode such that it lives across calls to other scripts or even engine APIs.
A with iterator is a ptr::NonNull to some array of vm::Entitys owned by the vm::Thread or vm::World. It is "short-lived" in the same way as vm::Row, but unlike vm::Row is required by GML semantics to live across calls in the body of the with loop.

There are two separate-but-related problems we need to avoid:

The most obvious problem is that these pointers are not tracked by the borrow checker, making it easy to use one after its target is gone, on accident. This risk is mitigated somewhat for vm::Row and with iterators because they are confined to the VM stack, but it's still there.
The second problem is scarier and more subtle- the interpreter or the engine might violate Rust's aliasing rules and hit undefined behavior. This risk is also mitigated somewhat for raw pointers, but the code was not written with very much attention to this detail.

What I'd like to do to build confidence that this is sound:

Figure out a way to resolve the fact that Value contains an Rc but is also Copy- maybe this is okay if we can enforce that there is always a corresponding live Rc on the VM stack, or maybe we need to drop the Copy impl. (Make vm::Value non-Copy #45)
Fix the way with and instances work so it doesn't immediately explode if you e.g. instance_create in a with body. @Zekka suggested making the arrays copy-on-write, which solves the use-after-free issue; we just need to work out the specifics. (Implement instance_destroy based on copy-on-write entity lists #46)
Encode the above in the types exposed to the engine. We might need to expose the concept of a frame lifetime, add some reference counting, etc.
Audit the whole thing and fix any remaining shenanigans with references and other kinds of aliasing.
Expand the test suite with this in mind and run it through Miri, ideally on CI. The current test suite passes, but it doesn't exercise much of this stuff.

The text was updated successfully, but these errors were encountered:

Includes some `Symbol` clean-up from #42. Closes #43 by treating NaN as a tagged value. Future work on #40 may take another approach.

rpjohnst · 2020-04-28T00:58:12Z

Regarding the first checkbox (Value contains an Rc but is also Copy) I believe my original intent was that Values outside of the VM stack were really more like RcBorrows- they don't increment the refcount, but they don't decrement it either. But to expand on what the checkbox says: this isn't quite enough, because they don't actually borrow from anything.

This suggests a way to drop the Copy impl without drastically increasing refcount traffic- most of what the interpreter and engine APIs do can be done without a fully owned Value, so they can deal in &Values and use clone when they need to keep it around longer.

On its own, this just replaces all that refcount traffic with a bunch of extra references instead. It may be better to introduce something like a ValueRef<'v> type with the same representation as the Value it borrows from, but Copy like the original implementation of Value.

This moves us closer to a sound and safe API as discussed in #40. Because a `vm::Value` may point to an array, users must check its type and perhaps increment its refcount anywhere they duplicate the value. As part of overhauling `vm::Value`, this commit also slightly tweaks the NaN-boxing scheme. The quiet bit is now always set, moving tagged values up to the top of the available bit-pattern range. This simplifies decoding such that all real values are contiguous. (See #43) To avoid introducing new, redundant refcount traffic, there are now also `vm::ValueRef` and `vm::ArrayRef` types that share a representation with their owned counterparts, but can only exist as a borrow. The interpreter relies entirely on codegen to provide it with sound bytecode that does not violate host language rules. This means that each opcode handler must carefully choose whether to borrow, move, or clone its operands, so that codegen can rely on that behavior.

Entity lists are now copy-on-write. This lets the interpreter keep iterators pointing into them while (potentially) mutating them, and preserves GML `with` semantics in the presence of such mutation. In particular, this makes `instance_create` and `instance_destroy` memory safe when called from GML, as described by #40.

rpjohnst · 2020-08-08T21:58:19Z

I've addressed the known issues here, so the remaining work is to make sure we have good test coverage. This might include running the test suite under Miri and other sanitizers, as well as fuzzing.

rpjohnst changed the title ~~Pointer aliasing soundness for vm::Thread and vm::World is unclear~~ Soundness of pointers in vm::Thread and vm::World is unclear Apr 4, 2020

rpjohnst added a commit that referenced this issue Apr 13, 2020

Make Symbol non-zero to distinguish it from infinity in vm::Value

0003554

Includes some `Symbol` clean-up from #42. Closes #43 by treating NaN as a tagged value. Future work on #40 may take another approach.

rpjohnst mentioned this issue May 4, 2020

vm::Thread leaks arrays on the VM stack on errors #44

Closed

rpjohnst mentioned this issue May 16, 2020

Make vm::Value non-Copy #45

Merged

rpjohnst mentioned this issue May 23, 2020

Implement instance_destroy based on copy-on-write entity lists #46

Merged

rpjohnst changed the title ~~Soundness of pointers in vm::Thread and vm::World is unclear~~ Test the soundness of vm::Value Aug 8, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Test the soundness of `vm::Value` #40

Test the soundness of `vm::Value` #40

rpjohnst commented Apr 4, 2020 •

edited

Loading

rpjohnst commented Apr 28, 2020

rpjohnst commented Aug 8, 2020

Test the soundness of vm::Value #40

Test the soundness of vm::Value #40

Comments

rpjohnst commented Apr 4, 2020 • edited Loading

rpjohnst commented Apr 28, 2020

rpjohnst commented Aug 8, 2020

Test the soundness of `vm::Value` #40

Test the soundness of `vm::Value` #40

rpjohnst commented Apr 4, 2020 •

edited

Loading