Improve types for null accesses and remove hacks #6954

tlively · 2024-09-17T23:53:22Z

When a struct.get or array.get is optimized to have a null reference
operand, its return type loses meaning since the operation will always
trap. Previously when refinalizing such expressions, we just left their
return type unchanged since there was no longer an associated struct or
array type to calculate it from. However, this could lead to a strange
setup where the stale return type was the last remaining use of some
heap type in the module. That heap type would never be emitted in the
binary, but it was still used in the IR, so type optimizations would
have to keep updating it. Our type collecting logic went out of its way
to include the return types of struct.get and array.get expressions to
account for this strange possibility, even though it otherwise collected
only types that would appear in binaries.

In principle, all of this should have applied to call_ref as well, but
the type collection logic did not have the necessary special case, so
there was probably a latent bug there.

Get rid of these special cases in the type collection logic and make it
impossible for the IR to use a stale type that no longer appears in the
binary by updating such stale types during finalization. One possibility
would have been to make the return types of null accessors unreachable,
but this violates the usual invariant that unreachable instructions must
either have unreachable children or be branches or (unreachable).
Instead, refine the return types to be uninhabitable non-nullable
references to bottom, which is nearly as good as refining them directly
to unreachable.
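
As a rough illustration of that refinement, here is a minimal, self-contained sketch. It is a toy model rather than the actual Binaryen code: `ToyType`, `ToyStructGet`, `kBottom`, and `finalizeNullAccessor` are hypothetical names, and the single shared bottom type `none` is a simplifying assumption.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for an IR value type. Hypothetical; not Binaryen's Type class.
struct ToyType {
  bool isRef;            // false models a non-reference type such as i32
  std::string heapType;  // meaningful only when isRef is true
  bool nullable;         // meaningful only when isRef is true
};

// Assumption: every heap type in this toy model shares one bottom type.
const std::string kBottom = "none";

// Toy stand-in for a struct.get: the operand's type plus the result type.
struct ToyStructGet {
  ToyType ref;   // type of the reference operand
  ToyType type;  // result type of the expression
};

// Sketch of refinalizing an accessor whose operand may have become null.
void finalizeNullAccessor(ToyStructGet& get) {
  assert(get.ref.isRef);
  if (get.ref.heapType != kBottom) {
    // Ordinary case: the result type would be recomputed from the struct
    // type's field, which this sketch does not model.
    return;
  }
  // The operand is a null reference, so the access always traps. Rather
  // than keep a possibly stale heap type alive, refine a reference result
  // to an uninhabitable non-nullable reference to bottom.
  if (get.type.isRef) {
    get.type.heapType = kBottom;
    get.type.nullable = false;
  }
  // Non-reference results (e.g. i32) mention no heap type and stay as-is.
}
```

In this sketch, a result type that referred to some otherwise unused heap type no longer mentions it after finalization, so a type collector would have no reason to special-case the expression.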

We could consider refining them to unreachable in the future, but doing
so would currently allow the parsers to admit more invalid modules with
arbitrary junk after null accessor instructions.

kripken · 2024-09-18T00:42:38Z

src/wasm/wasm.cpp

@@ -1008,7 +1008,18 @@ void CallRef::finalize() {
    return;
  }
  assert(target->type.isRef());
-  if (target->type.getHeapType().isBottom()) {
+  if (target->type.isNull()) {
+    // See StructRef for explanation.


Suggested change

-    // See StructRef for explanation.
+    // See StructGet for explanation.

kripken · 2024-09-18T00:43:01Z

src/wasm/wasm.cpp

-  if (target->type.getHeapType().isBottom()) {
+  if (target->type.isNull()) {
+    // See StructRef for explanation.
+    if (type.isRef()) {


Also, StructGet is below - maybe put the comment here and refer to it from below?

tlively requested a review from kripken September 17, 2024 23:53

kripken approved these changes Sep 18, 2024

update comments

f994e9c

tlively enabled auto-merge (squash) September 18, 2024 02:28

tlively merged commit b381a8c into main Sep 18, 2024
13 checks passed

tlively deleted the improve-null-accessor-types branch September 18, 2024 02:59
