Final opcodes #372

rossberg · 2023-05-09T09:20:52Z

Here is my proposal for the final opcode reordering (fixes #337 and #370).

Trying to group related instructions together and leave space for future extensions in respective opcode ranges.

Also reordering type opcodes to group nullary constructors together and before non-nullary ones (cf. #337).

jakobkummerow · 2023-05-09T14:25:20Z

Looks OK.

The bulk array instructions (from #363) appear to be missing.

Personally, I would have compacted the opcode encoding space a lot more, essentially to assign a contiguous range. My reasoning is that whatever gaps we choose to leave now have an exceedingly small likelihood to turn out to have appropriate size, so as we add instructions in future proposals we'll likely end up with a wildly fragmented mix of remaining gaps that were too large and others that were too small where we then had to simply append new opcodes at the end of the range or squeeze them into "non-fitting" gaps. OTOH, it's not a big issue because (1) at the end of the day we're just choosing between different kinds of ugly for the encoding space a couple of proposals down the line, and (2) prettiness or ugliness of the binary encoding doesn't actually matter.

tlively · 2023-05-09T14:59:54Z

FWIW, we tried to leave useful gaps in the SIMD proposal opcode space, and they did not turn out to be useful at all. If we had instead used a compact opcode space, we might not have needed to use an extra LEB byte for the relaxed SIMD opcodes.

I don't feel strongly either way, though.

titzer · 2023-05-09T16:38:14Z

+1 to just compacting the opcode space without gaps. The 1-byte opcode space is important, but IMO there is marginal benefit, if any, in leaving holes in the prefixed opcode spaces.

askeksa-google · 2023-05-12T12:01:20Z

Nit: I'd suggest keeping different encodings for the same instruction together, i.e. put ref.test (ref null ht) at 0xfb41 and ref.cast (ref ht) at 0xfb42.

rossberg · 2023-05-14T12:21:38Z

I agree that we should fill up gaps before going to two bytes, but currently we are far away from hitting that boundary, and may never hit it. Is there any particular advantage in compacting before that? I thought that leaving some gaps actually worked reasonably well for some of the Wasm 1.0 code space.

@askeksa-google, good point, fixed.

jakobkummerow · 2023-05-14T14:51:26Z

Is there any particular advantage to leaving gaps?

FWIW, here's the summary:
This PR currently lists 27 0xfb-prefixed instructions (still missing 4 array bulk instructions), and pads them with 53 gaps. These gaps are located in the following ranges of opcodes:

6 in "struct allocators"
4 in "struct accessors"
11 in "array allocators" (4 of which might be used for the bulk instructions)
11 in "array accessors" (4 of which might be used for the bulk instructions)
5 in "i31 instructions"
6 in "extern conversions"
4 in "cast instructions", non-branching group
6 in "cast instructions", branching group (assuming 0x4f is seen as the end of the range)
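As a quick arithmetic check of the summary above, the per-range gap counts can be tallied. This is only a sketch: the dictionary labels are informal names copied from the list, and `span` assumes the bulk instructions would be placed into existing gaps rather than appended.

```python
# Gap counts per opcode range, as listed in the summary above.
gaps = {
    "struct allocators": 6,
    "struct accessors": 4,
    "array allocators": 11,
    "array accessors": 11,
    "i31 instructions": 5,
    "extern conversions": 6,
    "cast instructions (non-branching)": 4,
    "cast instructions (branching)": 6,
}

listed_instructions = 27       # 0xfb-prefixed instructions currently in the PR
missing_bulk_instructions = 4  # array bulk instructions not yet listed

total_gaps = sum(gaps.values())
total_instructions = listed_instructions + missing_bulk_instructions
# Number of opcode slots covered by the padded layout (0x00 through 0x4f).
span = listed_instructions + total_gaps

print(total_gaps, total_instructions, span)
```

The tally agrees with the 53 gaps quoted above, and the padded layout covers 80 slots, which matches a range ending at 0x4f.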

One data point to illustrate the low likelihood of gap sizes being lucky guesses: Aske has previously suggested a set of 8 branching i31 instructions, which (if we end up deciding to adopt them in a future proposal) wouldn't have an obvious fit in any of these gaps.

titzer · 2023-06-01T17:37:55Z

I don't think we should leave gaps in the prefixed opcode spaces. First, as mentioned above, I think it's unlikely we'll predict the size of gaps properly, meaning there will likely always be vestigial gaps. Second, leaving gaps reduces the number of two-byte encodings being utilized; in the future, will new instructions start filling in gaps if they'd otherwise overflow into the 3-byte space? Third, engines may want to use some of the unused opcode space for internal things, like quickened versions of bytecodes, which benefits interpreters.

rossberg · 2023-06-26T14:18:36Z

Coming back to this:

We left gaps in the MVP opcodes, and that proved relatively useful, both for new types and for new control instructions.
The proposed opcodes do not just leave gaps, they also provide some additional prefixing hints. For example,
- 0xfb0X struct instructions
- 0xfb1X/0xfb2X array instructions
- 0xfb3X other reference instructions
- 0xfb4X cast instructions
  I occasionally found such a thing useful in the past when I had to stare at binary code.
I happily believe this approach failed for SIMD, but that's because SIMD is extremely large and irregular.
I'm not sure I see the disadvantages. Note that we still have plenty of room in the single-byte space. We can still start filling up gaps or flow over into other prefixes once the need arises.
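The "prefixing hints" idea can be illustrated with a small lookup on the high nibble of the byte following the 0xfb prefix. This is a hypothetical helper, not code from the interpreter, and it reflects only the gapped layout proposed at this point in the thread, not necessarily the encoding that was eventually merged.

```python
# Hypothetical mapping from the high nibble of the sub-opcode (the byte after
# the 0xfb prefix) to the instruction group named in the list above.
GROUPS = {
    0x0: "struct",
    0x1: "array",
    0x2: "array",
    0x3: "other reference",
    0x4: "cast",
}


def group_of(sub_opcode: int) -> str:
    """Return the group hinted at by the high nibble, or "unassigned"."""
    return GROUPS.get(sub_opcode >> 4, "unassigned")


# 0xfb41 was suggested earlier in the thread for ref.test (ref null ht).
print(group_of(0x41))
```

With such a layout, someone reading a hex dump can guess the instruction family from a single digit, which is the benefit being argued for here.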

@titzer, I didn't follow your point about custom opcodes. Isn't that orthogonal?

titzer · 2023-06-26T15:15:56Z

> @titzer, I didn't follow your point about custom opcodes. Isn't that orthogonal?

It's a minor point, but it's often useful to have a few opcodes left over for the VM, in each of the opcode spaces. This is easier to do if the opcode space fills up from the lowest numbers first. It doesn't work very well to try to use an opcode that is currently a hole and later filled up.

> We left gaps in the MVP opcodes, and that proved relatively useful, both for new types and for new control instructions.

The 1-byte opcode space is a much more important space, but in retrospect, I think we would have been fine to have not left holes in it.

rossberg · 2023-06-26T16:14:41Z

> This is easier to do if the opcode space fills up from the lowest numbers first.

I see, but this is mostly a question of leaving sufficient space at the end, right? Which I would think we have even now.

tlively · 2023-06-27T00:31:30Z

Let's discuss this at the subgroup meeting tomorrow. If we don't come to an agreement through discussion, I propose that we take a popularity vote to settle the issue. It doesn't seem important enough to be worth spending much additional energy or time on.

titzer · 2023-06-27T01:52:59Z

I won't be able to attend tomorrow because of a conflict, but I would generally prefer that we pack the prefixed opcode spaces rather than leaving holes. Is a wider CG discussion warranted, as this could be a potentially precedent-setting decision?

askeksa-google · 2023-06-29T09:34:08Z

If the purpose is to have some nice groupings, we could still do that while leaving significantly smaller holes, e.g.:

0x00+: Struct instructions
0x10+: Array instructions (including bulk)
0x20+: Conversions (extern conversions from 0x24)
0x28+: Casts

tlively · 2023-06-29T12:45:49Z

> Is a wider CG discussion warranted, as this could be a potentially precedent-setting decision?

Maybe, but it seems extremely low priority and CG time is hard to come by these days.

We didn't have a critical mass of people to resolve this at the meeting yesterday, but I'd like to take the temperature here. We are only adding 31 prefixed opcodes, so even if we leave generous holes, we will not be close to overflowing to the second LEB byte, which would require 128 opcodes. Given that, do you:

🚀 : prefer leaving holes
❤️ : prefer not leaving holes
👀 : not care at all
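For readers unfamiliar with why 128 is the threshold: the sub-opcode after a prefix byte is encoded as an unsigned LEB128 integer, which carries 7 payload bits per byte. The following is a minimal sketch of that encoding; the function names `uleb128` and `encode_prefixed` are made up for illustration and are not taken from any engine or from the reference interpreter.

```python
def uleb128(n: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if n < 0:
        raise ValueError("unsigned LEB128 requires a non-negative integer")
    out = bytearray()
    while True:
        byte = n & 0x7F          # low 7 bits of the remaining value
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_prefixed(prefix: int, sub_opcode: int) -> bytes:
    """Prefix byte followed by the LEB128-encoded sub-opcode."""
    return bytes([prefix]) + uleb128(sub_opcode)


# Sub-opcodes 0..127 need one LEB byte; 128 is the first to need two.
print(len(encode_prefixed(0xFB, 127)), len(encode_prefixed(0xFB, 128)))
```

With 31 opcodes, even a layout with generous holes stays well inside the one-byte range, which is the premise of the poll above.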

askeksa-google · 2023-07-06T15:58:33Z

Whatever direction this goes in, I think it will have significant aesthetic value, and probably some practical value as well, if encodings that are obvious pairs (last 8 in the currently proposed ordering in particular) are aligned to only differ in the lsb.

tlively · 2023-07-18T14:58:25Z

As discussed at the last subgroup meeting, I've pushed a commit that simply removes all of the holes in the instruction opcode space but preserves the order of instructions.

Here's a sheet to help visualize the difference: https://docs.google.com/spreadsheets/d/1eaMdpL-MwLOwWgj-mWgv_tNq5crP_nemQ-PW54ss-Q8/edit?usp=sharing

Unfortunately, when you just close the holes like that, the obvious pairs of instructions all start on odd numbers, so they differ in their low two bits rather than just their lowest bit. I added a third column to the spreadsheet skipping a single opcode to "fix" that, but I don't think it's worth it.
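To make the alignment point concrete, here is a small sketch that checks whether two opcodes differ only in their least significant bit. The values 0x1b and 0x1c are the `ref.cast (ref ht)` / `ref.cast (ref null ht)` pair as it appears in the diff quoted in the review comments on this PR; the shifted pair 0x1c/0x1d is hypothetical and only illustrates what skipping one opcode would achieve.

```python
def differs_only_in_lsb(a: int, b: int) -> bool:
    """True if a and b are identical except for their lowest bit."""
    return a ^ b == 1


def differing_bits(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


# Pair starting on an odd opcode (from the compacted layout in this PR).
print(differs_only_in_lsb(0x1B, 0x1C), differing_bits(0x1B, 0x1C))

# Hypothetical pair after skipping one opcode so that it starts on an even one.
print(differs_only_in_lsb(0x1C, 0x1D), differing_bits(0x1C, 0x1D))
```

A pair that starts on an even opcode always differs only in the lowest bit, so that bit can be read as a flag distinguishing the two variants.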

rossberg · 2023-07-19T08:06:39Z

proposals/gc/MVP.md

-| 0xfb43 | `ref.cast (ref null ht)` | `ht : heaptype` |
-| 0xfb48 | `br_on_cast $l (ref null1? ht1) (ref null2? ht2)` | `flags : u8`, $l : labelidx`, `ht1 : heaptype`, `ht2 : heaptype` |
-| 0xfb49 | `br_on_cast_fail $l (ref null1? ht1) (ref null2? ht2)` | `flags : u8`, $l : labelidx`, `ht1 : heaptype`, `ht2 : heaptype` |
+| 0xfb02 | `struct.get $t i` | `$t : typeidx`, `i : fieldidx` | struct accessors (0x08+) |


The notes about group prefixes are no longer accurate.

rossberg · 2023-07-19T08:24:49Z

proposals/gc/MVP.md

+| 0xfb1a | `ref.test (ref null ht)` | `ht : heaptype` |
+| 0xfb1b | `ref.cast (ref ht)` | `ht : heaptype` |
+| 0xfb1c | `ref.cast (ref null ht)` | `ht : heaptype` |
+| 0xfb1d | `br_on_cast $l (ref null1? ht1) (ref null2? ht2)` | `flags : u8`, $l : labelidx`, `ht1 : heaptype`, `ht2 : heaptype` |


rossberg · 2023-07-19T08:24:59Z

proposals/gc/MVP.md

+| 0xfb1b | `ref.cast (ref ht)` | `ht : heaptype` |
+| 0xfb1c | `ref.cast (ref null ht)` | `ht : heaptype` |
+| 0xfb1d | `br_on_cast $l (ref null1? ht1) (ref null2? ht2)` | `flags : u8`, $l : labelidx`, `ht1 : heaptype`, `ht2 : heaptype` |
+| 0xfb1e | `br_on_cast_fail $l (ref null1? ht1) (ref null2? ht2)` | `flags : u8`, $l : labelidx`, `ht1 : heaptype`, `ht2 : heaptype` |


bashor · 2023-07-19T14:30:06Z

proposals/gc/MVP.md

-| -0x31  | `rec dt*`       | `dt* : vec(subtype)` | |
-| -0x32  | `sub final $t* st` | `$t* : vec(typeidx)`, `st : strtype` | shorthand |
+| -0x31  | `sub final $t* st` | `$t* : vec(typeidx)`, `st : strtype` | shorthand |
+| -0x34  | `rec dt*`       | `dt* : vec(subtype)` | |


It looks like the interpreter wasn't changed accordingly

rossberg · 2023-07-19

Oops, thanks. Fixed. Also changed the rec opcode to -0x32, not sure why I picked something else before.

askeksa-google · 2023-07-19T18:27:23Z

I do think it's worth skipping an opcode to align the pairs, especially since for ref.test and ref.cast, the lsb then becomes a null flag with the same meaning as the null flags in br_on_cast[_fail].

We still stay within the first 32 opcodes.

rossberg · 2023-07-22T08:29:32Z

interpreter/binary/decode.ml

FWIW, if we care for bits that way, shouldn't we also make it so that any _u/_s pairs of instructions only differ by LSB?

tlively · 2023-07-22T19:26:23Z

Perhaps, although I would say that struct.get{,_u,_s} and array.get{,_u,_s} are triples, so it doesn't matter so much there, leaving only i31.get_u and i31.get_s. @askeksa-google, do you have an opinion here?

rossberg · 2023-07-25T08:52:32Z

I notice that this is already not the case for some MVP instructions, so I think we can leave it as is.

tlively · 2023-07-25T14:28:37Z

Cool, let's put this into something of a "final comment period." If you are ok with the currently proposed encoding, please give this comment a 🚀 react. Otherwise, please give it an 👀 react and comment with a specific change you would like to see. After a week or so, if we have consensus, we can go ahead and merge this.

tlively · 2023-07-31T15:58:29Z

Thanks, everyone. I'll go ahead and merge this now since there have been no objections.


Reorder opcodes

8a9a77d

Swap ref.test-null and ref.cast

eb80608

jakobkummerow mentioned this pull request Jun 1, 2023

Spec binary format #383

Merged

rossberg mentioned this pull request Jun 22, 2023

Spec authoring tracking issue #376

Open

53 tasks

rossberg mentioned this pull request Jun 27, 2023

Binary encoding of value type codes with heap type indices #337

Closed

close holes in opcode space

41fff69

tlively and others added 4 commits July 18, 2023 21:37

add bulk array ops

a2c20ba

add immediates to bulk array ops

dda1060

Merge branch 'main' into opcodes

67e880d

Adjust interpreter

8812bd0

rossberg commented Jul 19, 2023


bashor reviewed Jul 19, 2023


Fix deftype opcodes

020df09

rearrange opcodes to align pairs

c967b25

rossberg commented Jul 22, 2023


Reorder cases

edb6340

tlively mentioned this pull request Jul 23, 2023

Call for agenda for July 25 GC subgroup meeting #410

Closed

rossberg added 3 commits July 24, 2023 06:31

Merge branch 'main' into opcodes

158ee0e

Sync spec

5cb71e5

Merge branch 'main' into opcodes

60e893d

bashor mentioned this pull request Jul 25, 2023

Update V8 after they switch to final Wasm GC opcodes and turn it on by default (~ 12 Sep 2023) nodejs/node#48924

Closed

tlively merged commit 9acb6e2 into main Jul 31, 2023
5 checks passed

tlively deleted the opcodes branch July 31, 2023 15:59

jakobkummerow mentioned this pull request Aug 4, 2023

Binary encoding out of sync with GC proposal WebAssembly/function-references#103

Closed

tlively added a commit to WebAssembly/function-references that referenced this pull request Aug 6, 2023

Update binary encodings

125ee61

Update the encodings for ref.as_non_null, br_on_null, (ref ht), and (ref null ht) for consistency with the final encodings chosen in WebAssembly/gc#372.

This was referenced Aug 6, 2023

Update binary encodings WebAssembly/function-references#104

Merged

Fix encoding inconsistencies #414

Merged

tlively added a commit that referenced this pull request Aug 6, 2023

Fix encoding inconsistencies

02e3c54

Fix various places in the spec text and interpreter where binary encodings were inconsistent with those we decided on in #372.

tlively mentioned this pull request Aug 7, 2023

Final WasmGC opcodes #370

Closed

jakobkummerow mentioned this pull request Aug 28, 2023

Issues with JavaScript spec tests #396

Closed

tlively added a commit that referenced this pull request Aug 28, 2023

Fix encoding inconsistencies (#414)

95ae953

Fix various places in the spec text and interpreter where binary encodings were inconsistent with those we decided on in #372.

tlively mentioned this pull request Aug 28, 2023

Updated packed type encodings in the spec and interpreter #417

Merged

tlively added a commit that referenced this pull request Aug 28, 2023

Updated packed type encodings in the spec and interpreter

d7942c8

This was missed in #372.

tlively added a commit that referenced this pull request Aug 28, 2023

Updated packed type encodings in the spec and interpreter (#417)

e80576e

This was missed in #372.

yurydelendik mentioned this pull request Sep 15, 2023

Switch GC binary enconding bytecodealliance/wasm-tools#1204

Merged

jakobkummerow mentioned this pull request Sep 18, 2023

Recgroup binary encoding out of date #431

Closed

dhil mentioned this pull request Sep 19, 2023

Update binary encoding of function references opcodes wasmfx/specfx#5

Closed

-| 0xfb1d | `br_on_cast $l (ref null1? ht1) (ref null2? ht2)` | `flags : u8`, $l : labelidx`, `ht1 : heaptype`, `ht2 : heaptype` |
+| 0xfb1d | `br_on_cast $l (ref null1? ht1) (ref null2? ht2)` | `flags : u8`,` $l : labelidx`, `ht1 : heaptype`, `ht2 : heaptype` |

-| 0xfb1e | `br_on_cast_fail $l (ref null1? ht1) (ref null2? ht2)` | `flags : u8`, $l : labelidx`, `ht1 : heaptype`, `ht2 : heaptype` |
+| 0xfb1e | `br_on_cast_fail $l (ref null1? ht1) (ref null2? ht2)` | `flags : u8`, `$l : labelidx`, `ht1 : heaptype`, `ht2 : heaptype` |
