
Compiling to WASM with emscripten, parse exception: attempted pop from empty stack / beyond block start boundary #91628

Closed
zackradisic opened this issue Dec 7, 2021 · 16 comments · Fixed by #106779
Labels
C-bug Category: This is a bug. O-wasm Target: WASM (WebAssembly), http://webassembly.org/ T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@zackradisic

TLDR: Auto-dereferencing seems to break compiling to WASM with emscripten when threads are enabled?

When compiling to WASM with threads enabled with emscripten I get the following error:

note: [parse exception: attempted pop from empty stack / beyond block start boundary at 5012790 (at 0:5012790)]
          Fatal: error in parsing input
          emcc: error: '/Users/zackradisic/Desktop/Code/emsdk/upstream/bin/wasm-emscripten-finalize --minimize-wasm-changes --dyncalls-i64 /Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.wasm -o /Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.wasm --detect-features' failed (returned 1)
Full error from rustc
error: linking with `emcc` failed: exit status: 1
  |
  = note: "emcc" "-s" "EXPORTED_FUNCTIONS=[\"_alloc\",\"_dealloc\",\"_init\",\"_main\",\"_print_codecs\",\"_set_opt\",\"_rust_eh_personality\"]" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.0.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.1.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.10.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.11.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.12.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.13.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.14.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.15.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.2.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.3.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.4.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.5.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.6.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.7.rcgu.o" 
"/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.8.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.9.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.3m9l8ll8ayas3g7v.rcgu.o" "-L" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps" "-L" "/Users/zackradisic/Desktop/Code/modfy/framex/target/release/deps" "-L" "./wasm-libs/lib" "-L" "/Users/zackradisic/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/wasm32-unknown-emscripten/lib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libframex-7860c6c7abb794c7.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libanyhow-612bcb44f12e1d42.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libffmpeg_next-54ea0b38a1519793.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libffmpeg_sys_next-8de54784e88d927f.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/liblibc-ab293fa92d812261.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libbitflags-f0b7a29eeb4093a5.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libstd-35b925955873f6bf.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libpanic_unwind-194edf489af57a97.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libstd_detect-d58dd6fb52ab4b2d.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/librustc_demangle-a1931f836e2321a3.rlib" 
"/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libhashbrown-93cb9f33d545cf77.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/librustc_std_workspace_alloc-90ae5f2c3dc1e297.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libunwind-888b2c34d620fb42.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libcfg_if-e5bd1b9f540b98e1.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/liblibc-f003f163c4e20282.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/liballoc-b31681533b19d24b.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/librustc_std_workspace_core-e54aa39c126b63d9.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libcore-7c8d732fc023986c.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libcompiler_builtins-70ec426a1130b25e.rlib" "-l" "avcodec" "-l" "avformat" "-l" "avfilter" "-l" "avdevice" "-l" "swresample" "-l" "postproc" "-l" "swscale" "-l" "avutil" "-l" "m" "-l" "mp3lame" "-l" "x264" "-l" "workerfs.js" "-l" "c" "-s" "DISABLE_EXCEPTION_CATCHING=0" "-L" "/Users/zackradisic/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/wasm32-unknown-emscripten/lib" "-L" "/Users/zackradisic/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/wasm32-unknown-emscripten/lib/self-contained" "-o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.js" "-O3" "--memory-init-file" "0" "-g0" "-s" "DEFAULT_LIBRARY_FUNCS_TO_INCLUDE=[]" "-O3" "-pthread" "-s" "ALLOW_MEMORY_GROWTH=0" "-s" "INVOKE_RUN=0" "-s" "INITIAL_MEMORY=2146435072" "-s" "MODULARIZE=1" "-s" "EXPORT_NAME=framex" "-s" "USE_PTHREADS=1" 
"-s" "PROXY_TO_PTHREAD=1" "-s" "ENVIRONMENT=worker" "-s" "PTHREAD_POOL_SIZE=navigator.hardwareConcurrency" "-s" "EXPORTED_RUNTIME_METHODS=[FS,intArrayFromString,writeArrayToMemory,_malloc]" "gxx_personality_v0_stub.o" "-s" "ERROR_ON_UNDEFINED_SYMBOLS=1" "-s" "ASSERTIONS=1" "-s" "ABORTING_MALLOC=0" "-Wl,--fatal-warnings"
  = note: [parse exception: attempted pop from empty stack / beyond block start boundary at 5012790 (at 0:5012790)]
          Fatal: error in parsing input
          emcc: error: '/Users/zackradisic/Desktop/Code/emsdk/upstream/bin/wasm-emscripten-finalize --minimize-wasm-changes --dyncalls-i64 /Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.wasm -o /Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.wasm --detect-features' failed (returned 1)

I narrowed down the code causing this and it seems related to auto-dereferencing. This is the section of code, and you can see the type of parameter s in the closure is multiple nested references (&&&str or &&str). This code does not compile.

[Screenshot of the failing closure code; image not preserved]

However, when I use a reference pattern to dereference the closure parameter, the code compiles and works as expected:

[Screenshot of the closure code using a reference pattern; image not preserved]

I also tested this theory by trying to compile the following code and it fails with the same error:

let test: Vec<&str> = vec![
    "hi",
    "hello",
    "mitochondria is the powerhouse of the cell",
];

let test = test
    .iter()
    .filter(|s| s.contains("h")) // `s` has the type &&&str
    .map(|s| s.to_string())  // `s` has the type &&str
    .collect::<Vec<String>>();

println!("Values: {:?}", test);

Note that this error only occurs when compiling Rust with threads.
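
Since the screenshots did not survive, here is a rough reconstruction of what the reference-pattern variant of the snippet above might look like. The function name `filter_with_ref_patterns` is made up for illustration, and this is only a sketch of the workaround as described, not the original code from the screenshot.

```rust
/// Keeps the strings containing 'h', destructuring the nested references
/// in the closure parameters instead of relying on auto-dereferencing.
/// (Hypothetical helper; the name is not from the original report.)
fn filter_with_ref_patterns(items: &[&str]) -> Vec<String> {
    items
        .iter()
        .filter(|&&s| s.contains('h')) // `&&s` peels `&&&str` down to `s: &str`
        .map(|&s| s.to_string()) // `&s` peels `&&str` down to `s: &str`
        .collect()
}

fn main() {
    let test = ["hi", "hello", "mitochondria is the powerhouse of the cell"];
    println!("Values: {:?}", filter_with_ref_patterns(&test));
}
```

Both versions compute the same result; the only difference is whether the references are removed by a pattern or by auto-deref at the method call.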

You can view the repo with the full reproduction here.

@bjorn3 bjorn3 added C-bug Category: This is a bug. O-wasm Target: WASM (WebAssembly), http://webassembly.org/ T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 7, 2021
@zackradisic
Author

There is more discussion happening here. Apparently it is related to thread-local storage, but it is unknown whether the problem is in rustc or emscripten. I also don't know why it appears for me specifically with auto-dereferencing.

@gregbuchholz

> Apparently it is related to thread local storage

Sorry for the confusion. I don't think the parse issue you identified is related to thread local storage. It was just that when tlively was trying to reproduce your error, he had the same linker message that I get. And when I try your example program, I don't reproduce the parse exception, but instead get the thread local storage issue.

@gregbuchholz

With the latest emscripten/llvm from git, the linking error with thread local storage is fixed. See emscripten-core/emscripten#15891

@gregbuchholz

gregbuchholz commented Jan 31, 2022

Now I can confirm that I get the parse exception error as well after upgrading my rust and emscripten:

  = note: emcc: warning: `EMMAKEN_CFLAGS` is deprecated, please use `EMCC_CFLAGS` instead.  See https://github.com/emscripten-core/emscripten/issues/15684 [-Wdeprecated]
          [parse exception: attempted pop from empty stack / beyond block start boundary at 23280 (at 0:23280)]
          Fatal: error in parsing input
          emcc: error: '/home/greg/Extras/temp/emsdk/binaryen/main_64bit_binaryen/bin/wasm-emscripten-finalize --minimize-wasm-changes --dyncalls-i64 /home/greg/rust-examples/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.wasm -o /home/greg/rust-examples/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.wasm --detect-features' failed (returned 1)

...where I previously didn't get that error with zackradisic's repository. I'm getting that with other simple examples as well, with no references in the Rust code. Versions in question:

$ emcc --version
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.3-git (a1a755948a6e25c0fa62fc8fdcb89dc372618a63)
Copyright (C) 2014 the Emscripten authors (see AUTHORS.txt)
This is free and open source software under the MIT license.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

...and...

$ rustc +nightly --version
rustc 1.60.0-nightly (a00e130da 2022-01-29)

Interestingly enough, the error does not occur on my example when using the same version of emscripten as above, but with an older v1.59 rustc that I had compiled previously on my machine:

$ rustc +stage1 --version
rustc 1.59.0-dev

...but the error does occur with v1.60.

@gregbuchholz

Seems like the error message comes from here?

@gregbuchholz

wasm-validate doesn't seem to like the file that gets fed to wasm-emscripten-finalize:

$ wasm-validate target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.wasm --enable-threads 
0000d0e: error: invalid section code: 12

Running wasm-emscripten-finalize with --debug

    $ wasm-emscripten-finalize --minimize-wasm-changes --dyncalls-i64 target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.wasm -o target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.wasm  --detect-features --debug

<snip>

readExpression seeing 16
zz node: Call
<==
getInt8: 128 (at 23275)
getInt8: 128 (at 23276)
getInt8: 128 (at 23277)
getInt8: 128 (at 23278)
getInt8: 0 (at 23279)
getU32LEB: 0 ==>
== popExpression
== popExpression
== popExpression
== popExpression
== popExpression
== popExpression
== popExpression
== popExpression
[parse exception: attempted pop from empty stack / beyond block start boundary at 23280 (at 0:23280)]
Fatal: error in parsing input
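
For anyone reading the debug log: `readExpression seeing 16` is opcode 0x10 (`call`), and the five `getInt8` reads (128, 128, 128, 128, 0) followed by `getU32LEB: 0` are consistent with a five-byte padded LEB128 encoding of function index 0. The sketch below is a generic unsigned LEB128 decoder to illustrate that reading; it is not binaryen's implementation, and `read_u32_leb` is a name chosen here for illustration.

```rust
/// Decodes an unsigned LEB128 value from the start of `bytes`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// encoding does not terminate within 5 bytes (the maximum for a u32).
/// Illustrative only: it does not reject overlong or out-of-range encodings.
fn read_u32_leb(bytes: &[u8]) -> Option<(u32, usize)> {
    let mut result: u32 = 0;
    for (i, &byte) in bytes.iter().enumerate().take(5) {
        // The low 7 bits carry payload; the high bit says "more bytes follow".
        result |= u32::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((result, i + 1));
        }
    }
    None
}

fn main() {
    // The byte sequence from the log above: 128, 128, 128, 128, 0.
    println!("{:?}", read_u32_leb(&[128, 128, 128, 128, 0]));
}
```

So the parser appears to have decoded a `call` to function 0 and then failed while popping that function's arguments off the operand stack.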

@gregbuchholz

I've put a few more details over at this repository, which is a simple program that uses threads. It includes the working wasm file from the 1.59.0 compiler and the broken wasm file from 1.60.0.

@gregbuchholz

Running cargo-bisect-rustc results in the following regression report for this issue:

searched nightlies: from nightly-2021-12-06 to nightly-2022-02-03
regressed nightly: nightly-2022-01-25
searched commit range: https://github.com/rust-lang/rust/compare/84322efad553c7a79c80189f2d1b9197c1aa005f...51126be1b260216b41143469086e6e6ee567647e
regressed commit: https://github.com/rust-lang/rust/commit/42313dd29b3edb0ab453a0d43d12876ec7e48ce0

<details>
<summary>bisected with <a href='https://github.com/rust-lang/cargo-bisect-rustc'>cargo-bisect-rustc</a> v0.6.1</summary>


Host triple: x86_64-unknown-linux-gnu
Reproduce with:
```bash
cargo bisect-rustc -vv --start=2021-12-06 --with-src -- build --release 
```

@gregbuchholz

gregbuchholz commented Feb 5, 2022

Anyone have thoughts on the most likely patch from that merge to be the regression in question? Issue 92555 at least mentions threads in the description.

@RReverser
Contributor

RReverser commented Nov 3, 2022

I'm running into the same issue when compiling with threads on the wasm32-unknown-emscripten target. I get the generic "pop from empty stack" error above when using various binaryen tools:

[parse exception: attempted pop from empty stack / beyond block start boundary at 3966808 (at 0:3966808)]

but a more actionable error message near the same offset when compiling Wasm in Node.js:

Aborted(CompileError: WebAssembly.instantiate(): Compiling function #6558:"std::thread::local::fast::Key$LT$T$GT$::try_ini..." failed: not enough arguments on the stack for call (need 4, got 3) @+3966802)

Looks like some potential ABI mismatch or miscompilation issue in some monomorphisation of this function.
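
To make the two error messages easier to relate: both describe a `call` whose callee expects more operands than the validator has on its stack. Below is a toy model of that check, assuming only operand counts matter (real validators also check types). The names `StackUnderflow` and `validate_call` are invented for this sketch and do not come from V8 or binaryen.

```rust
/// Error reported when a call needs more operands than are available.
#[derive(Debug, PartialEq)]
struct StackUnderflow {
    need: usize,
    got: usize,
}

/// Models how a validator treats `call`: pop one operand per parameter,
/// then push one per result. Returns the new stack height on success.
fn validate_call(
    stack_height: usize,
    params: usize,
    results: usize,
) -> Result<usize, StackUnderflow> {
    if stack_height < params {
        return Err(StackUnderflow { need: params, got: stack_height });
    }
    Ok(stack_height - params + results)
}

fn main() {
    // Mirrors "need 4, got 3" from the Node.js message above.
    println!("{:?}", validate_call(3, 4, 1));
}
```

In this model, a call site that pushes three arguments for a callee declared with four parameters fails in exactly the way both tools report.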

@RReverser
Contributor

@alexcrichton

> Looks like some potential ABI mismatch or miscompilation issue in some monomorphisation of this function.

@alexcrichton just as a reality check: this couldn't be related to the Wasm ABI feature, because that one only affects wasm32-unknown-unknown and not wasm32-unknown-emscripten, right?

@RReverser
Contributor

@gregbuchholz I'm not sure the bisect start is right, because I'm getting the same issue when built with nightly-2021-12-06 too.

@RReverser
Contributor

> Looks like some potential ABI mismatch or miscompilation issue in some monomorphisation of this function.

Tried to extract the full name of the function by index, but it doesn't help much:

<std::thread::local::fast::Key$LT$T$GT$::try_initialize::h739de623137ad5a2 (.llvm.15420069509463605308)>

@RReverser
Contributor

Oh interesting, looks like this only happens after optimizations. Wasm binary built in debug mode is fine. So it's definitely some miscompilation or, at least, wasm-opt issue.

@RReverser
Contributor

RReverser commented Nov 3, 2022

By playing with different optimisation options, I narrowed it down a little further: it's the thread-local init callback in std::sys_common::thread_info::current_thread that ends up with an invalid number of arguments after optimisation. I'm not sure yet whether it's the .clone() itself or the wrapper callback in ThreadInfo::with (I guess the latter).

That said, it might just happen to be the first thread-local that parsing is failing on.
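
For context, here is a small user-level sketch of the kind of thread-local involved: one that is lazily initialised on first access and whose type has a destructor. As far as I understand, this is the shape that goes through a `try_initialize`-style path in std, but that link is an assumption; the names `THREAD_NAME`, `INIT_CALLS` and `init_calls_after_two_accesses` are invented for the example.

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Counts how many times the initializer below ran on this thread.
    static INIT_CALLS: Cell<usize> = Cell::new(0);

    // `String` owns heap memory, so this thread-local needs a destructor
    // to run at thread exit, not just lazy initialisation.
    static THREAD_NAME: String = {
        INIT_CALLS.with(|c| c.set(c.get() + 1));
        String::from("worker")
    };
}

/// Spawns a thread, touches the thread-local twice, and reports how often
/// the initializer ran on that thread.
fn init_calls_after_two_accesses() -> usize {
    thread::spawn(|| {
        let first = THREAD_NAME.with(|n| n.len());
        let second = THREAD_NAME.with(|n| n.len());
        assert_eq!(first, second);
        INIT_CALLS.with(|c| c.get())
    })
    .join()
    .expect("worker thread panicked")
}

fn main() {
    println!("initializer ran {} time(s)", init_calls_after_two_accesses());
}
```

The initializer runs once per thread on first access; every later access on that thread reuses the stored value.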

@RReverser
Contributor

This increasingly looks like an issue at the linking stage, because the same code works just fine when compiled into an individual cdylib instead of being linked with other C/C++ code.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 14, 2023
Avoid __cxa_thread_atexit_impl on Emscripten

 - Fixes rust-lang#91628.
 - Fixes emscripten-core/emscripten#15722.

See discussion in both issues.

The TL;DR is that weak linkage causes LLVM to produce broken Wasm, presumably due to a pointer mismatch. The code is casting a void pointer to a function pointer with a specific signature, but Wasm is very strict about function pointer compatibility, so the resulting code is invalid.

Ideally LLVM should catch this earlier in the process rather than emit invalid Wasm, but it currently doesn't and this is an easy and valid fix, given that Emscripten doesn't have `__cxa_thread_atexit_impl` these days anyway.

Unfortunately, I can't add a regression test as even after looking into this issue for a long time, I couldn't reproduce it with any minimal Rust example, only with extracted LLVM IR or on a large project involving Rust + C++.
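
A rough illustration of the cast pattern that commit message describes: an untyped address is reinterpreted as a function pointer with an assumed signature. The real code obtains the address through weak linkage, which needs unstable features, so this sketch substitutes an ordinary function (`fake_register`, hypothetical) to stay runnable on stable Rust. The `RegisterFn` signature is modelled on `__cxa_thread_atexit_impl`, and its exact shape should be treated as an assumption.

```rust
use std::ffi::c_void;
use std::mem;
use std::ptr;

/// Destructor callback type, as used for thread-local cleanup.
type Dtor = unsafe extern "C" fn(*mut u8);

/// Signature the caller assumes the symbol has (an assumption, see above).
type RegisterFn = unsafe extern "C" fn(Dtor, *mut u8, *mut u8) -> i32;

/// Hypothetical stand-in for the symbol that weak linkage would resolve.
unsafe extern "C" fn fake_register(_dtor: Dtor, _arg: *mut u8, _dso: *mut u8) -> i32 {
    0
}

unsafe extern "C" fn noop_dtor(_obj: *mut u8) {}

/// Treats `sym` as an untyped address and calls it with the assumed signature.
fn register_via_untyped_pointer(sym: *const c_void) -> Option<i32> {
    if sym.is_null() {
        return None; // a weak symbol that was not linked in resolves to null
    }
    // SAFETY: only sound if `sym` really points to a function of type `RegisterFn`.
    let f = unsafe { mem::transmute::<*const c_void, RegisterFn>(sym) };
    Some(unsafe { f(noop_dtor, ptr::null_mut(), ptr::null_mut()) })
}

fn main() {
    let sym = fake_register as RegisterFn as *const c_void;
    println!("{:?}", register_via_untyped_pointer(sym));
    println!("{:?}", register_via_untyped_pointer(ptr::null()));
}
```

On native targets such a cast goes through as long as the signature happens to match at run time. Wasm instead requires every call to agree with the callee's declared type when the module is validated, which is why a mismatch shows up as an invalid module rather than a crash.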
@bors bors closed this as completed in 6155b9a Jan 26, 2023
thomcc pushed a commit to tcdi/postgrestd that referenced this issue May 31, 2023
r? @alexcrichton