Skip to content

Conversation

@aheejin
Copy link
Member

@aheejin aheejin commented Feb 9, 2023

To parse a tag_type, we need to skip an uint8 for an attribute and a varuint32 for a type:
https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md#tag_type

Currently we only skip a byte, which I'm not sure was intended for the attribute or the type index.

Fixes #18592.

To parse a `tag_type`, we need to skip an `uint8` for an `attribute`
and a `varuint32` for a `type`:
https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md#tag_type

Currently we only skip a byte, which I'm not sure was intended for the
attribute or the type index.

Fixes emscripten-core#18592.
@aheejin aheejin requested review from kripken and sbc100 February 9, 2023 00:59
@aheejin
Copy link
Member Author

aheejin commented Feb 9, 2023

This PR aside, I'm not sure why these two lines are marked as FIXME: should be SLEB128:

++offset; // FIXME: should be SLEB128

++offset; // // FIXME: should be SLEB128

(The latter line is removed by this PR though)

Can't we use unsignedLEB128()? (If so, I can fix one remaining use here too) (Edit: I think it is actually right to be uint8. Fixed the comment there: #18691 (comment))

@sbc100
Copy link
Collaborator

sbc100 commented Feb 9, 2023

Seems reasonable. I don't really know this code at all I'm afraid. Is there test we we can write that exercises this?

break;
case 1: // table import
++offset; // FIXME: should be SLEB128
++offset; // skip elem type
Copy link
Member Author

@aheejin aheejin Feb 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive-by fix: I think this should be uint8, because this is an elem_type: https://github.com/WebAssembly/design/blob/main/BinaryEncoding.md#elem_type
(I know this is an old doc, but couldn't find the type in the current official spec here.. Maybe I missed it)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in the MVP's design it was indeed one byte, but only because it had a single possible value. With typed function references and GC we could have an arbitrary user type here, so I think it does need to be a LEB.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will change to unsignedLEB128() then.

@aheejin
Copy link
Member Author

aheejin commented Feb 9, 2023

Seems reasonable. I don't really know this code at all I'm afraid. Is there test we we can write that exercises this?

Not sure how to add one. I can come up with a tiny binary that includes a tag import, but not sure how to hook up a js file to run with it within the test framework. Or either I can add a test that includes all source files like the reproducer (#18592 (comment)) which has the specific settings combination (dynamic linking + Wasm EH + null sanitizer). Would you like that?

@sbc100
Copy link
Collaborator

sbc100 commented Feb 9, 2023

Seems reasonable. I don't really know this code at all I'm afraid. Is there test we we can write that exercises this?

Not sure how to add one. I can come up with a tiny binary that includes a tag import, but not sure how to hook up a js file to run with it within the test framework. Or either I can add a test that includes all source files like the reproducer (#18592 (comment)) which has the specific settings combination (dynamic linking + Wasm EH + null sanitizer). Would you like that?

It looks like we have some tests in test_core and test_other that use USE_OFFSET_CONVERTER. How about creating test_dylink_offset_coverter alongside test_pthread_offset_converter... I guess it would have to be test_dylink_offset_coverter_wasm_eh which a lot of different things to test, but maybe worth it?

@aheejin
Copy link
Member Author

aheejin commented Feb 10, 2023

It looks like we have some tests in test_core and test_other that use USE_OFFSET_CONVERTER. How about creating test_dylink_offset_coverter alongside test_pthread_offset_converter... I guess it would have to be test_dylink_offset_coverter_wasm_eh which a lot of different things to test, but maybe worth it?

I tried to add a test, but it is kind of tricky to make it fail on the current codebase and succeed on this PR. What I did was something like this:

  @needs_dylink
  def test_dylink_offset_converter_wasm_eh(self):
    self.set_setting('USE_OFFSET_CONVERTER')
    self.emcc_args.append('-fwasm-exceptions')
    self.dylink_test(main=r'''
      #include <stdio.h>
      extern void side_throw_int();
      int main() {
        try {
          side_throw_int();
        } catch (int n) {
          printf("main: caught %d\n", n);
        }
        return 0;
      }
      void main_throw_float() {
        throw 5.3f;
      }
    ''', side=r'''
      #include <stdio.h>
      extern void main_throw_float();
      void side_throw_int() {
        try {
          main_throw_float();
        } catch (float f) {
          printf("side: caught %.1f\n", f);
        }
        throw 3;
      }
      ''', expected=['side: caught 5.3\nmain: caught 3\n'])

(The sources are copied from one of the dylink+EH tests and it doesn't use the offset converter directly within the source code, but the purpose of this test is not crashing.)

But this passes even with the current codebase due to a fluke:
So this wat line

  (import "env" "__cpp_exception" (tag (;0;) (type 4) (param i32)))
  (import "env" "memory" (memory (;0;) 256 256))

in binary corresponds to

03 65 6e 76 0f 5f 5f 63 70 70 5f 65 78 63 65 70 74 69 6f 6e 04 00 04 03 ...
1* |- 2* -| 3* |-                   4*                   -| 5* 6* 7* 8*
  • Tag env __cpp_exception
    1*) Module string env's length (0x3)
    2*) Module string env
    3*) Name string __cpp_exception's length (0xf)
    4*) Name string __cpp_exception
    5*) Import kind: tag (0x4)
    6*) Tag attribute (0 for exception)
    7*) Tag type ((func (param i32))))'s index (0x4)
  • Memory env memory
    8*) Module string env's length (0x3)
    ...

Without this PR, the offset converter only parse the tag attribute and not the tag type. But the tag type's index happens to be 4, even if you incorrectly parse it as a next item(memory)'s module string length, you end up incrementing the right amount of offset. In detail:

  • Current (incorrect) codebase:
offset += 1 // skip attribute
offset += 4 // Next item (`memory`)'s module name string length. This is actually the previous tag's type index, but incorrectly parsed
  • This PR:
offset += 1 // skip attribute
offset += 1 // skip tag type index (which happens to be 4)
offset += 3 // Next item (`memory`)'s module name string length (correctly parsed)

So both of them end up incrementing the same amount of offset, finishing the program without an error.

I might be able to tweak some other factors to make the type index something other than 4, but I'm not sure if the effort is worth it at this point, given that what this PR fixes is very trivial.

I also tried with other sources like core/pthread/test_pthread_dylink_exceptions.cpp but it also yielded the same type index '4'.

@sbc100
Copy link
Collaborator

sbc100 commented Feb 10, 2023

I see. Sounds like its probably not worth it to invest more effort in writing a test.

@aheejin aheejin merged commit 67bb08a into emscripten-core:main Feb 10, 2023
@aheejin aheejin deleted the offset_converter_tag_fix branch February 10, 2023 02:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[WASM Exception]: WasmOffsetConverter fails with: bad import kind: 12

3 participants