Fix tag import parsing in WasmOffsetConverter #18691

aheejin · 2023-02-09T00:59:42Z

To parse a tag_type, we need to skip an uint8 for an attribute and a varuint32 for a type:
https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md#tag_type

Currently we only skip a byte, which I'm not sure was intended for the attribute or the type index.

Fixes #18592.

To parse a `tag_type`, we need to skip an `uint8` for an `attribute` and a `varuint32` for a `type`: https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md#tag_type Currently we only skip a byte, which I'm not sure was intended for the attribute or the type index. Fixes emscripten-core#18592.

aheejin · 2023-02-09T01:01:52Z

This PR aside, I'm not sure why these two lines are marked as FIXME: should be SLEB128:

emscripten/src/wasm_offset_converter.js

Line 91 in d91e127

++offset; // FIXME: should be SLEB128

emscripten/src/wasm_offset_converter.js

Line 101 in d91e127

++offset; // // FIXME: should be SLEB128

(The latter line is removed by this PR though)

Can't we use unsignedLEB128()? (If so, I can fix one remaining use here too) (Edit: I think it is actually right to be uint8. Fixed the comment there: #18691 (comment))

sbc100 · 2023-02-09T01:39:41Z

Seems reasonable. I don't really know this code at all I'm afraid. Is there test we we can write that exercises this?

aheejin · 2023-02-09T04:08:01Z

src/wasm_offset_converter.js

              break;
            case 1: // table import
-              ++offset; // FIXME: should be SLEB128
+              ++offset; // skip elem type


Drive-by fix: I think this should be uint8, because this is an elem_type: https://github.com/WebAssembly/design/blob/main/BinaryEncoding.md#elem_type
(I know this is an old doc, but couldn't find the type in the current official spec here.. Maybe I missed it)

I think in the MVP's design it was indeed one byte, but only because it had a single possible value. With typed function references and GC we could have an arbitrary user type here, so I think it does need to be a LEB.

Will change to unsignedLEB128() then.

aheejin · 2023-02-09T05:39:26Z

Seems reasonable. I don't really know this code at all I'm afraid. Is there test we we can write that exercises this?

Not sure how to add one. I can come up with a tiny binary that includes a tag import, but not sure how to hook up a js file to run with it within the test framework. Or either I can add a test that includes all source files like the reproducer (#18592 (comment)) which has the specific settings combination (dynamic linking + Wasm EH + null sanitizer). Would you like that?

sbc100 · 2023-02-09T06:41:46Z

Seems reasonable. I don't really know this code at all I'm afraid. Is there test we we can write that exercises this?

Not sure how to add one. I can come up with a tiny binary that includes a tag import, but not sure how to hook up a js file to run with it within the test framework. Or either I can add a test that includes all source files like the reproducer (#18592 (comment)) which has the specific settings combination (dynamic linking + Wasm EH + null sanitizer). Would you like that?

It looks like we have some tests in test_core and test_other that use USE_OFFSET_CONVERTER. How about creating test_dylink_offset_coverter alongside test_pthread_offset_converter... I guess it would have to be test_dylink_offset_coverter_wasm_eh which a lot of different things to test, but maybe worth it?

aheejin · 2023-02-10T02:02:11Z

It looks like we have some tests in test_core and test_other that use USE_OFFSET_CONVERTER. How about creating test_dylink_offset_coverter alongside test_pthread_offset_converter... I guess it would have to be test_dylink_offset_coverter_wasm_eh which a lot of different things to test, but maybe worth it?

I tried to add a test, but it is kind of tricky to make it fail on the current codebase and succeed on this PR. What I did was something like this:

  @needs_dylink
  def test_dylink_offset_converter_wasm_eh(self):
    self.set_setting('USE_OFFSET_CONVERTER')
    self.emcc_args.append('-fwasm-exceptions')
    self.dylink_test(main=r'''
      #include <stdio.h>
      extern void side_throw_int();
      int main() {
        try {
          side_throw_int();
        } catch (int n) {
          printf("main: caught %d\n", n);
        }
        return 0;
      }
      void main_throw_float() {
        throw 5.3f;
      }
    ''', side=r'''
      #include <stdio.h>
      extern void main_throw_float();
      void side_throw_int() {
        try {
          main_throw_float();
        } catch (float f) {
          printf("side: caught %.1f\n", f);
        }
        throw 3;
      }
      ''', expected=['side: caught 5.3\nmain: caught 3\n'])

(The sources are copied from one of the dylink+EH tests and it doesn't use the offset converter directly within the source code, but the purpose of this test is not crashing.)

But this passes even with the current codebase due to a fluke:
So this wat line

  (import "env" "__cpp_exception" (tag (;0;) (type 4) (param i32)))
  (import "env" "memory" (memory (;0;) 256 256))

in binary corresponds to

03 65 6e 76 0f 5f 5f 63 70 70 5f 65 78 63 65 70 74 69 6f 6e 04 00 04 03 ...
1* |- 2* -| 3* |-                   4*                   -| 5* 6* 7* 8*

Tag env __cpp_exception
1*) Module string env's length (0x3)
2*) Module string env
3*) Name string __cpp_exception's length (0xf)
4*) Name string __cpp_exception
5*) Import kind: tag (0x4)
6*) Tag attribute (0 for exception)
7*) Tag type ((func (param i32))))'s index (0x4)
Memory env memory
8*) Module string env's length (0x3)
...

Without this PR, the offset converter only parse the tag attribute and not the tag type. But the tag type's index happens to be 4, even if you incorrectly parse it as a next item(memory)'s module string length, you end up incrementing the right amount of offset. In detail:

Current (incorrect) codebase:

offset += 1 // skip attribute
offset += 4 // Next item (`memory`)'s module name string length. This is actually the previous tag's type index, but incorrectly parsed

This PR:

offset += 1 // skip attribute
offset += 1 // skip tag type index (which happens to be 4)
offset += 3 // Next item (`memory`)'s module name string length (correctly parsed)

So both of them end up incrementing the same amount of offset, finishing the program without an error.

I might be able to tweak some other factors to make the type index something other than 4, but I'm not sure if the effort is worth it at this point, given that what this PR fixes is very trivial.

I also tried with other sources like core/pthread/test_pthread_dylink_exceptions.cpp but it also yielded the same type index '4'.

sbc100 · 2023-02-10T02:19:30Z

I see. Sounds like its probably not worth it to invest more effort in writing a test.

aheejin requested review from kripken and sbc100 February 9, 2023 00:59

sbc100 approved these changes Feb 9, 2023

View reviewed changes

aheejin added 3 commits February 8, 2023 19:38

Merge branch 'main' into offset_converter_tag_fix

d1888f1

Fix comments

96d8494

fix

e1b1a41

aheejin commented Feb 9, 2023

View reviewed changes

Elem type should be an LEB128

ce1075e

kripken approved these changes Feb 9, 2023

View reviewed changes

aheejin merged commit 67bb08a into emscripten-core:main Feb 10, 2023

aheejin deleted the offset_converter_tag_fix branch February 10, 2023 02:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix tag import parsing in WasmOffsetConverter #18691

Fix tag import parsing in WasmOffsetConverter #18691

Uh oh!

aheejin commented Feb 9, 2023

Uh oh!

aheejin commented Feb 9, 2023 •

edited

Loading

Uh oh!

sbc100 commented Feb 9, 2023

Uh oh!

aheejin Feb 9, 2023 •

edited

Loading

Uh oh!

kripken Feb 9, 2023

Uh oh!

aheejin Feb 9, 2023

Uh oh!

aheejin commented Feb 9, 2023

Uh oh!

sbc100 commented Feb 9, 2023

Uh oh!

aheejin commented Feb 10, 2023 •

edited

Loading

Uh oh!

sbc100 commented Feb 10, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Fix tag import parsing in WasmOffsetConverter #18691

Fix tag import parsing in WasmOffsetConverter #18691

Uh oh!

Conversation

aheejin commented Feb 9, 2023

Uh oh!

aheejin commented Feb 9, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sbc100 commented Feb 9, 2023

Uh oh!

aheejin Feb 9, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kripken Feb 9, 2023

Choose a reason for hiding this comment

Uh oh!

aheejin Feb 9, 2023

Choose a reason for hiding this comment

Uh oh!

aheejin commented Feb 9, 2023

Uh oh!

sbc100 commented Feb 9, 2023

Uh oh!

aheejin commented Feb 10, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sbc100 commented Feb 10, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

aheejin commented Feb 9, 2023 •

edited

Loading

aheejin Feb 9, 2023 •

edited

Loading

aheejin commented Feb 10, 2023 •

edited

Loading