When compiling rust with threads, parse exception: attempted pop from empty stack / beyond block start boundary #15722

Closed
zackradisic opened this issue Dec 7, 2021 · 42 comments · Fixed by rust-lang/rust#106779

Comments

@zackradisic

Sorry if this is too specific to Rust, but I thought someone might have a clue as to what is going on. When compiling Rust with threads enabled with emscripten I get the following error:

note: [parse exception: attempted pop from empty stack / beyond block start boundary at 5012790 (at 0:5012790)]
          Fatal: error in parsing input
          emcc: error: '/Users/zackradisic/Desktop/Code/emsdk/upstream/bin/wasm-emscripten-finalize --minimize-wasm-changes --dyncalls-i64 /Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.wasm -o /Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.wasm --detect-features' failed (returned 1)
Full error from rustc
error: linking with `emcc` failed: exit status: 1
  |
  = note: "emcc" "-s" "EXPORTED_FUNCTIONS=[\"_alloc\",\"_dealloc\",\"_init\",\"_main\",\"_print_codecs\",\"_set_opt\",\"_rust_eh_personality\"]" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.0.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.1.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.10.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.11.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.12.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.13.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.14.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.15.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.2.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.3.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.4.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.5.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.6.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.7.rcgu.o" 
"/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.8.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.framex.f5802fb7-cgu.9.rcgu.o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.3m9l8ll8ayas3g7v.rcgu.o" "-L" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps" "-L" "/Users/zackradisic/Desktop/Code/modfy/framex/target/release/deps" "-L" "./wasm-libs/lib" "-L" "/Users/zackradisic/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/wasm32-unknown-emscripten/lib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libframex-7860c6c7abb794c7.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libanyhow-612bcb44f12e1d42.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libffmpeg_next-54ea0b38a1519793.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libffmpeg_sys_next-8de54784e88d927f.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/liblibc-ab293fa92d812261.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libbitflags-f0b7a29eeb4093a5.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libstd-35b925955873f6bf.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libpanic_unwind-194edf489af57a97.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libstd_detect-d58dd6fb52ab4b2d.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/librustc_demangle-a1931f836e2321a3.rlib" 
"/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libhashbrown-93cb9f33d545cf77.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/librustc_std_workspace_alloc-90ae5f2c3dc1e297.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libunwind-888b2c34d620fb42.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libcfg_if-e5bd1b9f540b98e1.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/liblibc-f003f163c4e20282.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/liballoc-b31681533b19d24b.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/librustc_std_workspace_core-e54aa39c126b63d9.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libcore-7c8d732fc023986c.rlib" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/libcompiler_builtins-70ec426a1130b25e.rlib" "-l" "avcodec" "-l" "avformat" "-l" "avfilter" "-l" "avdevice" "-l" "swresample" "-l" "postproc" "-l" "swscale" "-l" "avutil" "-l" "m" "-l" "mp3lame" "-l" "x264" "-l" "workerfs.js" "-l" "c" "-s" "DISABLE_EXCEPTION_CATCHING=0" "-L" "/Users/zackradisic/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/wasm32-unknown-emscripten/lib" "-L" "/Users/zackradisic/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/wasm32-unknown-emscripten/lib/self-contained" "-o" "/Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.js" "-O3" "--memory-init-file" "0" "-g0" "-s" "DEFAULT_LIBRARY_FUNCS_TO_INCLUDE=[]" "-O3" "-pthread" "-s" "ALLOW_MEMORY_GROWTH=0" "-s" "INVOKE_RUN=0" "-s" "INITIAL_MEMORY=2146435072" "-s" "MODULARIZE=1" "-s" "EXPORT_NAME=framex" "-s" "USE_PTHREADS=1" 
"-s" "PROXY_TO_PTHREAD=1" "-s" "ENVIRONMENT=worker" "-s" "PTHREAD_POOL_SIZE=navigator.hardwareConcurrency" "-s" "EXPORTED_RUNTIME_METHODS=[FS,intArrayFromString,writeArrayToMemory,_malloc]" "gxx_personality_v0_stub.o" "-s" "ERROR_ON_UNDEFINED_SYMBOLS=1" "-s" "ASSERTIONS=1" "-s" "ABORTING_MALLOC=0" "-Wl,--fatal-warnings"
  = note: [parse exception: attempted pop from empty stack / beyond block start boundary at 5012790 (at 0:5012790)]
          Fatal: error in parsing input
          emcc: error: '/Users/zackradisic/Desktop/Code/emsdk/upstream/bin/wasm-emscripten-finalize --minimize-wasm-changes --dyncalls-i64 /Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.wasm -o /Users/zackradisic/Desktop/Code/modfy/framex/target/wasm32-unknown-emscripten/release/deps/framex.wasm --detect-features' failed (returned 1)

I narrowed down the code causing this, and it seems related to Rust's auto-dereferencing. In the section of code below, the closure parameter s has a type made of multiple nested references (&&&str or &&str), which Rust's compiler auto-dereferences to the appropriate type, &str. This code does not compile.

[Screenshot: the closure code that fails to build]

However, when I use a reference pattern to dereference the closure parameter, the code compiles and works as expected:

[Screenshot: the same code using a reference pattern in the closure parameter]

I also tested this theory by trying to compile the following code and it fails with the same error:

let test: Vec<&str> = vec![
    "hi",
    "hello",
    "mitochondria is the powerhouse of the cell",
];

let test = test
    .iter()
    .filter(|s| s.contains("h")) // `s` has the type &&&str
    .map(|s| s.to_string())  // `s` has the type &&str
    .collect::<Vec<String>>();

println!("Values: {:?}", test);

Note that this error only occurs when compiling Rust with threads.
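For reference, the workaround described above (destructuring the closure parameter with a reference pattern) might look like the sketch below. The original screenshots are not reproduced in this thread, so the exact patterns, the helper name filter_with_ref_patterns, and the use of a char pattern in contains are assumptions rather than the code that was actually used.

```rust
/// Hypothetical helper: same pipeline as the failing snippet, but the closure
/// parameters are destructured with reference patterns so `s` is a plain `&str`.
fn filter_with_ref_patterns(words: &[&str]) -> Vec<String> {
    words
        .iter()
        // `filter` receives `&&&str`; the pattern `&&s` binds `s: &str`.
        .filter(|&&s| s.contains('h'))
        // `map` receives `&&str`; the pattern `&s` binds `s: &str`.
        .map(|&s| s.to_string())
        .collect()
}

fn main() {
    let test = vec![
        "hi",
        "hello",
        "mitochondria is the powerhouse of the cell",
    ];
    println!("Values: {:?}", filter_with_ref_patterns(&test));
}
```

On a native target both forms behave identically; the difference reported here only showed up when building for wasm32-unknown-emscripten with threads.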

You can view the repo with the full reproduction here.

@tlively
Member

tlively commented Dec 7, 2021

Thanks for the bug report! What version of emscripten are you using? I tried reproducing this problem with my local development version but I got a bunch of unrelated wasm-ld errors.

@tlively
Member

tlively commented Dec 7, 2021

@sbc100 these are the linker errors I'm seeing:

Errors wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.13.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::io::stdio::OUTPUT_CAPTURE::__getit::__KEY::h1012c9102a13ce23` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.13.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::io::stdio::OUTPUT_CAPTURE::__getit::__KEY::h1012c9102a13ce23` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.13.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::io::stdio::OUTPUT_CAPTURE::__getit::__KEY::h1012c9102a13ce23` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.13.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::io::stdio::OUTPUT_CAPTURE::__getit::__KEY::h1012c9102a13ce23` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.2.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::sys_common::thread_info::THREAD_INFO::__getit::STATE::h169d526b77f4357a (.0.0.llvm.11106815098784871829)` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.2.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB 
cannot be used against non-TLS symbol `std::sys_common::thread_info::THREAD_INFO::__getit::VAL::ha754faf8838943fe (.llvm.11106815098784871829)` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.2.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::sys_common::thread_info::THREAD_INFO::__getit::VAL::ha754faf8838943fe (.llvm.11106815098784871829)` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.2.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::sys_common::thread_info::THREAD_INFO::__getit::STATE::h169d526b77f4357a (.0.0.llvm.11106815098784871829)` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.2.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::sys_common::thread_info::THREAD_INFO::__getit::VAL::ha754faf8838943fe (.llvm.11106815098784871829)` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.2.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::sys_common::thread_info::THREAD_INFO::__getit::VAL::ha754faf8838943fe (.llvm.11106815098784871829)` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.2.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol 
`std::sys_common::thread_info::THREAD_INFO::__getit::VAL::ha754faf8838943fe (.llvm.11106815098784871829)` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.2.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::sys_common::thread_info::THREAD_INFO::__getit::VAL::ha754faf8838943fe (.llvm.11106815098784871829)` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.2.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::sys_common::thread_info::THREAD_INFO::__getit::VAL::ha754faf8838943fe (.llvm.11106815098784871829)` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.2.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::sys_common::thread_info::THREAD_INFO::__getit::VAL::ha754faf8838943fe (.llvm.11106815098784871829)` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.2.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::sys_common::thread_info::THREAD_INFO::__getit::STATE::h169d526b77f4357a (.0.0.llvm.11106815098784871829)` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.2.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::sys_common::thread_info::THREAD_INFO::__getit::STATE::h169d526b77f4357a 
(.0.0.llvm.11106815098784871829)` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.2.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::sys_common::thread_info::THREAD_INFO::__getit::VAL::ha754faf8838943fe (.llvm.11106815098784871829)` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.2.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::sys_common::thread_info::THREAD_INFO::__getit::VAL::ha754faf8838943fe (.llvm.11106815098784871829)` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.2.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::sys_common::thread_info::THREAD_INFO::__getit::STATE::h169d526b77f4357a (.0.0.llvm.11106815098784871829)` wasm-ld: error: /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib(std-bb9bd8e6be0c4a41.std.27211dd0-cgu.2.rcgu.o): relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::sys_common::thread_info::THREAD_INFO::__getit::VAL::ha754faf8838943fe (.llvm.11106815098784871829)` wasm-ld: error: too many errors emitted, stopping now (use -error-limit=0 to see all errors) emcc: error: '/usr/local/google/home/tlively/code/llvm-local/bin/wasm-ld -o /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.wasm 
/usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.rust_emscripten_bug.1a649d5b-cgu.0.rcgu.o /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.rust_emscripten_bug.1a649d5b-cgu.1.rcgu.o /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.rust_emscripten_bug.1a649d5b-cgu.10.rcgu.o /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.rust_emscripten_bug.1a649d5b-cgu.11.rcgu.o /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.rust_emscripten_bug.1a649d5b-cgu.12.rcgu.o /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.rust_emscripten_bug.1a649d5b-cgu.13.rcgu.o /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.rust_emscripten_bug.1a649d5b-cgu.14.rcgu.o /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.rust_emscripten_bug.1a649d5b-cgu.15.rcgu.o /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.rust_emscripten_bug.1a649d5b-cgu.2.rcgu.o /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.rust_emscripten_bug.1a649d5b-cgu.3.rcgu.o /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.rust_emscripten_bug.1a649d5b-cgu.4.rcgu.o /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.rust_emscripten_bug.1a649d5b-cgu.5.rcgu.o 
/usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.rust_emscripten_bug.1a649d5b-cgu.6.rcgu.o /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.rust_emscripten_bug.1a649d5b-cgu.7.rcgu.o /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.rust_emscripten_bug.1a649d5b-cgu.8.rcgu.o /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.rust_emscripten_bug.1a649d5b-cgu.9.rcgu.o /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/rust_emscripten_bug.45h9o81x6uy5xph2.rcgu.o -L/usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps -L/usr/local/google/home/tlively/code/rust-emscripten-bug/target/release/deps -L/usr/local/google/home/tlively/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/wasm32-unknown-emscripten/lib /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd-bb9bd8e6be0c4a41.rlib /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libpanic_unwind-25d50f5331cee422.rlib /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libstd_detect-d25150c87dd99d00.rlib /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/librustc_demangle-a30115b140842a8e.rlib /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libhashbrown-2559fafd0d42a7e1.rlib /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/librustc_std_workspace_alloc-843536d29cb089db.rlib 
/usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libunwind-4284a5761490eb82.rlib /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libcfg_if-023e08fc339a59ec.rlib /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/liblibc-24e7f8b4a46aac29.rlib /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/liballoc-e6221b06d7026b33.rlib /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/librustc_std_workspace_core-554b671000871935.rlib /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libcore-80f108c08196e677.rlib /usr/local/google/home/tlively/code/rust-emscripten-bug/target/wasm32-unknown-emscripten/release/deps/libcompiler_builtins-6da11e675034d1a4.rlib -lc-mt-debug -L/usr/local/google/home/tlively/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/wasm32-unknown-emscripten/lib -L/usr/local/google/home/tlively/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/wasm32-unknown-emscripten/lib/self-contained --fatal-warnings -L/usr/local/google/home/tlively/code/emscripten/cache/sysroot/lib/wasm32-emscripten /usr/local/google/home/tlively/code/emscripten/cache/sysroot/lib/wasm32-emscripten/crtbegin.o -lGL-mt -lal -lhtml5 -lstubs-debug -lc-mt-debug -lcompiler_rt-mt -lc++-mt -lc++abi-mt -ldlmalloc-mt -lc_rt-mt -lsockets-mt -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-cxx-exceptions -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr --import-undefined --import-memory --shared-memory --strip-debug --export-if-defined=main --export-if-defined=rust_eh_personality --export-if-defined=emscripten_stack_get_end --export-if-defined=emscripten_stack_get_free --export-if-defined=emscripten_stack_init --export-if-defined=stackSave 
--export-if-defined=stackRestore --export-if-defined=stackAlloc --export-if-defined=__wasm_call_ctors --export-if-defined=fflush --export-if-defined=__errno_location --export-if-defined=__emscripten_init_main_thread --export-if-defined=emscripten_dispatch_to_thread_ --export-if-defined=_emscripten_main_thread_futex --export-if-defined=_emscripten_thread_init --export-if-defined=_emscripten_thread_exit --export-if-defined=_emscripten_thread_free_data --export-if-defined=emscripten_current_thread_process_queued_calls --export-if-defined=_emscripten_allow_main_runtime_queued_calls --export-if-defined=emscripten_futex_wake --export-if-defined=emscripten_get_global_libc --export-if-defined=emscripten_main_browser_thread_id --export-if-defined=emscripten_main_thread_process_queued_calls --export-if-defined=emscripten_run_in_main_runtime_thread_js --export-if-defined=emscripten_stack_set_limits --export-if-defined=emscripten_sync_run_in_main_thread_2 --export-if-defined=emscripten_sync_run_in_main_thread_4 --export-if-defined=emscripten_tls_init --export-if-defined=pthread_self --export-if-defined=pthread_testcancel --export-if-defined=exit --export-if-defined=memalign --export-if-defined=emscripten_proxy_main --export-if-defined=malloc --export-if-defined=free --export-if-defined=__cxa_is_pointer_type --export-if-defined=__cxa_can_catch --export-if-defined=setThrew --export-if-defined=ntohs --export-if-defined=htons --export-if-defined=htonl --export-if-defined=__start_em_asm --export-if-defined=__stop_em_asm --export-table -z stack-size=5242880 --initial-memory=2146435072 --no-entry --max-memory=2146435072 --global-base=1024' failed (returned 1)

Does that look like a known issue?

@sbc100
Collaborator

sbc100 commented Dec 7, 2021

It's hard to tell if this is a linker error or an actual code gen issue. Relocations of type R_WASM_MEMORY_ADDR_TLS_SLEB should only be made against TLS symbols. So if the symbols in question are supposed to be TLS, then the relocations are correct but the symbols are not being marked correctly. If the converse is true and the symbols in question are not supposed to be TLS, then the codegen should not be creating relocations of type R_WASM_MEMORY_ADDR_TLS_SLEB.

I suppose it could be a bug in lld's ability to divine which symbols are TLS. Support for an explicit TLS symbol flag was added just a few months ago in https://reviews.llvm.org/D109426. Prior to that we would decide if a symbol was TLS or not purely based on the name of the data segment in which it was defined.

Perhaps you could attach one of the object files generating these errors (e.g. std-bb9bd8e6be0c4a41.std.27211dd0-cgu.13.rcgu.o) and tell me whether the symbol in question (e.g. std::io::stdio::OUTPUT_CAPTURE::__getit::__KEY::h1012c9102a13ce23) should be TLS or not.
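To make the two ways of classifying a symbol concrete, here is a rough sketch of the difference between a name-based check and an explicit flag. It is not lld's code: the function names are hypothetical, and the ".tdata" / ".tbss" prefixes and the fallback to the segment name are assumptions based on the usual TLS section naming convention.

```rust
/// Hypothetical stand-in for the older name-based check: a data segment is
/// treated as thread-local if its name uses the conventional TLS prefixes.
fn segment_name_looks_tls(segment_name: &str) -> bool {
    segment_name.starts_with(".tdata") || segment_name.starts_with(".tbss")
}

/// Hypothetical stand-in for a flag-based check: trust an explicit flag on
/// the symbol, and (as an assumption) fall back to the segment name otherwise.
fn symbol_is_tls(explicit_tls_flag: bool, segment_name: &str) -> bool {
    explicit_tls_flag || segment_name_looks_tls(segment_name)
}

fn main() {
    let cases = [
        (false, ".tdata.VAR1"),
        (false, ".data.VAR1"),
        (true, ".data.VAR1"),
    ];
    for (flag, segment) in cases {
        println!(
            "flag={} segment={} -> tls={}",
            flag,
            segment,
            symbol_is_tls(flag, segment)
        );
    }
}
```

Under the name-based scheme, a thread-local symbol that ends up in a segment with an unexpected name would be classified as non-TLS, which is one way the "cannot be used against non-TLS symbol" error above could arise.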

@zackradisic
Author

What version of emscripten are you using?

I was using 2.0.27. I got the same errors you got when using the latest version.

@gregbuchholz

gregbuchholz commented Dec 17, 2021

It's hard to tell if this is a linker error or an actual code gen issue. Relocations of type R_WASM_MEMORY_ADDR_TLS_SLEB should only be made against TLS symbols. So if the symbols in question are supposed to be TLS, then the relocations are correct but the symbols are not being marked correctly. If the converse is true and the symbols in question are not supposed to be TLS, then the codegen should not be creating relocations of type R_WASM_MEMORY_ADDR_TLS_SLEB.

@sbc100 Can I hijack this thread a bit? I've come across the same linker error message about thread local storage when trying to see if threads might work on Rust with its wasm32-unknown-emscripten target. I've put a very small example program here, which demonstrates the error message:

tls.tls.d9f5019e-cgu.0.rcgu.o: relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `tls::VAR1::__getit::__KEY::hf0a0b0a039c53e14

...and I've also included the object file it references (tls.tls.d9f5019e-cgu.0.rcgu.o) along with the assembly file ("tls.s"). Those symbols are supposed to be part of the thread local storage, so it sounds like the symbols aren't getting marked correctly somewhere along the line.
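The symbol name tls::VAR1::__getit::__KEY suggests a thread_local! static named VAR1 in a crate called tls. The linked example program is not reproduced in this thread, so the following is only a guess at the kind of code that produces such a symbol; the Cell<u32> type, the values, and the function name bump_in_new_thread are all assumptions.

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Each thread gets its own copy of VAR1, initialized to 1.
    static VAR1: Cell<u32> = Cell::new(1);
}

/// Sets VAR1 in the calling thread, then increments a fresh copy in a child
/// thread. Returns (value seen by the caller, value seen by the child).
fn bump_in_new_thread() -> (u32, u32) {
    VAR1.with(|v| v.set(10));
    let child = thread::spawn(|| {
        VAR1.with(|v| {
            v.set(v.get() + 1);
            v.get()
        })
    })
    .join()
    .unwrap();
    let parent = VAR1.with(|v| v.get());
    (parent, child)
}

fn main() {
    let (parent, child) = bump_in_new_thread();
    println!("parent sees {}, child sees {}", parent, child);
}
```

If the symbols are marked as TLS correctly, the child thread starts from the initial value rather than the value the parent stored.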

There is some more background that I've been able to gather over at: Thread Local Storage, wasm-ld, thread_local!, and Emscripten, and here.

This is using the following version of Emscripten:

$ emcc -v
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.0.0 (3fd52e107187b8a169bb04a02b9f982c8a075205)
clang version 14.0.0 (https://github.com/llvm/llvm-project 4348cd42c385e71b63e5da7e492172cff6a79d7b)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /home/greg/emsdk/upstream/bin

Let me know if there is a more appropriate place to post this question.

Thanks!

@RReverser
Collaborator

It seems like it's just a binaryen issue at this point; looks like even changing linker opts to -O1 instead of -O2 or higher is enough to work around the parsing error mentioned in the beginning.

@kripken
Member

kripken commented Jan 11, 2023

@RReverser Does that wasm file parse ok in wabt? (similar errors in the past have been wasm-ld bugs in my experience, and not parsed ok in wabt)

If it does parse then it is likely a binaryen error, and I can take a look at that if you attach the wasm file.

@RReverser
Collaborator

Heh it doesn't:

lib/vips-es6.wasm:001372a: error: type mismatch in drop, expected [any] but got []
lib/vips-es6.wasm:0013d5e: error: type mismatch at end of function, expected [] but got [i32]
lib/vips-es6.wasm:0014096: error: type mismatch at end of function, expected [] but got [i32]
lib/vips-es6.wasm:00142f5: error: type mismatch at end of function, expected [] but got [i32]
lib/vips-es6.wasm:0014554: error: type mismatch at end of function, expected [] but got [i32]
lib/vips-es6.wasm:0014b22: error: type mismatch at end of function, expected [] but got [i32, i32]
lib/vips-es6.wasm:0014ff3: error: type mismatch at end of function, expected [] but got [i32, i32]
lib/vips-es6.wasm:001531c: error: type mismatch at end of function, expected [] but got [i32]
lib/vips-es6.wasm:0015639: error: type mismatch at end of function, expected [] but got [i32]
...

I guess it is wasm-ld then :(

@tlively
Member

tlively commented Jan 12, 2023

Actually this is probably a bug in the LLVM backend, since wasm-ld doesn't change function contents except for relocations.

@gregbuchholz

I haven't followed this closely at all, but is the following patch applied?

#15891 (comment)

@RReverser
Collaborator

RReverser commented Jan 12, 2023

I haven't followed this closely at all, but is the following patch applied?

Yeah it was long merged, and it was about a linking error which I no longer see.

looks like even changing linker opts to -O1 instead of -O2 or higher is enough to work around the parsing error mentioned in the beginning

FWIW this was wrong. -O1 at link time simply suppresses the error (because Emscripten doesn't need to parse Wasm), but the Wasm is still corrupted. The only thing that does help for now is if I build the Rust module in --debug mode, or if I build Rust in --release separately with wasm-bindgen and don't link with any C++.

In November I also left some further comments with my investigation on the rust-lang/rust variant of this issue, which pointed to an invalid number of arguments among other things: rust-lang/rust#91628

I have no idea where to go from here and can't reproduce this with any minimal repros so far either (the one linked in the beginning of the post doesn't seem to exhibit this issue anymore).

@tlively
Member

tlively commented Jan 12, 2023

It may be sufficient to have any reproducer, even if it's not minimal. If you can get LLVM IR out of rustc that exhibits the bad code gen, that should be enough to go on.

@RReverser
Collaborator

So as I mentioned earlier, if I build the Rust crate alone to a bin (essentially the equivalent of cdylib in Wasm, since this is a library), then everything works fine, so I was sure it was a linking issue between Rust built to staticlib + C++. However, I have now tried unpacking the generated .a archive and running wasm-validate over all the individual object files.

This looks promising: at least it narrowed the issue down to somewhere in stdlib, so it's indeed not a linker issue:

...
siphasher-12578d3836bafc0c.siphasher.c8c35879-cgu.4.rcgu.o 0
siphasher-12578d3836bafc0c.siphasher.c8c35879-cgu.5.rcgu.o 0
std-8348dc5e440aab06.std.3bd46ef2-cgu.0.rcgu.o 0
std-8348dc5e440aab06.std.3bd46ef2-cgu.1.rcgu.o 0
std-8348dc5e440aab06.std.3bd46ef2-cgu.2.rcgu.o 0
std-8348dc5e440aab06.std.3bd46ef2-cgu.3.rcgu.o 0
std-8348dc5e440aab06.std.3bd46ef2-cgu.4.rcgu.o:0004e3a: error: type mismatch at end of block, expected [] but got [i32]
std-8348dc5e440aab06.std.3bd46ef2-cgu.4.rcgu.o:0006173: error: type mismatch at end of function, expected [] but got [i32]
std-8348dc5e440aab06.std.3bd46ef2-cgu.4.rcgu.o:0006b96: error: type mismatch at end of function, expected [] but got [i32]
std-8348dc5e440aab06.std.3bd46ef2-cgu.4.rcgu.o:0006daf: error: type mismatch at end of function, expected [] but got [i32]
std-8348dc5e440aab06.std.3bd46ef2-cgu.4.rcgu.o:00078d1: error: type mismatch at end of block, expected [] but got [i32]
std-8348dc5e440aab06.std.3bd46ef2-cgu.4.rcgu.o:000a191: error: type mismatch at end of function, expected [] but got [i32, i32]
std-8348dc5e440aab06.std.3bd46ef2-cgu.4.rcgu.o:000a332: error: type mismatch at end of function, expected [] but got [i32, i64]
std-8348dc5e440aab06.std.3bd46ef2-cgu.4.rcgu.o:000a57a: error: type mismatch at end of block, expected [] but got [i32]
std-8348dc5e440aab06.std.3bd46ef2-cgu.4.rcgu.o 1
std-8348dc5e440aab06.std.3bd46ef2-cgu.5.rcgu.o 0
std-8348dc5e440aab06.std.3bd46ef2-cgu.6.rcgu.o 0
std-8348dc5e440aab06.std.3bd46ef2-cgu.7.rcgu.o 0
std-8348dc5e440aab06.std.3bd46ef2-cgu.8.rcgu.o 0
std-8348dc5e440aab06.std.3bd46ef2-cgu.9.rcgu.o 0
std-8348dc5e440aab06.std.3bd46ef2-cgu.10.rcgu.o 0
std-8348dc5e440aab06.std.3bd46ef2-cgu.11.rcgu.o 0
std-8348dc5e440aab06.std.3bd46ef2-cgu.12.rcgu.o 0
std-8348dc5e440aab06.std.3bd46ef2-cgu.13.rcgu.o 0
std-8348dc5e440aab06.std.3bd46ef2-cgu.14.rcgu.o 0
std-8348dc5e440aab06.std.3bd46ef2-cgu.15.rcgu.o:000cc5f: error: type mismatch at end of function, expected [] but got [i32]
std-8348dc5e440aab06.std.3bd46ef2-cgu.15.rcgu.o 1
std_detect-d55062222593b70a.std_detect.0445155b-cgu.0.rcgu.o 0
std_detect-d55062222593b70a.std_detect.0445155b-cgu.1.rcgu.o 0
...

I'll try to get the LLVM IR now.

@RReverser
Collaborator

RReverser commented Jan 12, 2023

Got the LLVM IR, confirmed that it produces invalid Wasm object via llc -filetype=obj + wasm-validate.

Then, I decided to try reducing it via llvm-reduce + test above, and it ended up with this as an example:

target datalayout = "e-m:e-p:32:32-p10:8:8-p20:8:8-i64:64-f128:64-n32:64-S128-ni:1:10:20"
target triple = "wasm32-unknown-emscripten"

@__cxa_thread_atexit_impl = extern_weak global i8

define void @_ZN3std3sys4unix17thread_local_dtor13register_dtor17hb4d1ef11a2635878E() personality ptr null {
bb1:
  %_7 = tail call i32 @__cxa_thread_atexit_impl(ptr null, ptr null, ptr null)
  ret void
}

; uselistorder directives
uselistorder ptr null, { 1, 2, 3, 4, 5, 0 }

That __cxa_thread_atexit_impl declaration looks odd (shouldn't it have a function signature even for an external symbol?) and, when it is compiled and validated, I get a similar error:

reduced.ll.wasm:000004e: error: type mismatch at end of function, expected [] but got [i32, i32]

This is what wasm-objdump (one of few tools that handle this broken Wasm) shows inside:

reduced.wasm:   file format wasm 0x1

Section Details:

Type[1]:
 - type[0] () -> nil
Import[1]:
 - memory[0] pages: initial=0 <- env.__linear_memory
Function[1]:
 - func[0] sig=0 <_ZN3std3sys4unix17thread_local_dtor13register_dtor17hb4d1ef11a2635878E>
Code[1]:
 - func[0] size=15 <_ZN3std3sys4unix17thread_local_dtor13register_dtor17hb4d1ef11a2635878E>
Custom:
 - name: "linking"
  - symbol table [count=2]
   - 0: F <_ZN3std3sys4unix17thread_local_dtor13register_dtor17hb4d1ef11a2635878E> func=0 [ binding=global vis=default ]  
   - 1: D <__cxa_thread_atexit_impl> [ undefined binding=weak vis=default ]
Custom:
 - name: "reloc.CODE"
  - relocations for section: 3 (Code) [1]
   - R_WASM_MEMORY_ADDR_LEB offset=0x00000a(file=0x000048) symbol=1 <__cxa_thread_atexit_impl>

Code Disassembly:

000040 func[0] <_ZN3std3sys4unix17thread_local_dtor13register_dtor17hb4d1ef11a2635878E>:
 000041: 41 00                      | i32.const 0
 000043: 41 00                      | i32.const 0
 000045: 41 00                      | i32.const 0
 000047: 10 80 80 80 80 00          | call 0 <_ZN3std3sys4unix17thread_local_dtor13register_dtor17hb4d1ef11a2635878E>     
           000048: R_WASM_MEMORY_ADDR_LEB 1 <__cxa_thread_atexit_impl>
 00004d: 1a                         | drop
 00004e: 0b                         | end

And I can see that the same __cxa_thread_atexit_impl declaration among others is present in the full LLVM IR too. Attaching the full one if you want to dig deeper:
std-8348dc5e440aab06.zip

@RReverser
Collaborator

RReverser commented Jan 12, 2023

The __cxa_thread_atexit_impl seems to be coming from here: https://github.com/rust-lang/rust/blob/222d1ff68d5bfe1dc2d7f3f0c42811fe12964af9/library/std/src/sys/unix/thread_local_dtor.rs#L29

I guess this is a classic problem of mismatching function pointers in Wasm? Although it's not clear to me why it doesn't cause issues under other circumstances / for other projects, and that code has been there since 2016 so surely it would've been caught by now...

Or is the problem elsewhere?

@kleisauke
Collaborator

Prior to commit 9b98d42, __cxa_thread_atexit_impl would end up calling atexit for non-pthread builds.

Perhaps that should be re-added as a stub for all builds? Similar to what is done for __gxx_personality_v0:

#if !defined(__USING_WASM_EXCEPTIONS__)
// Until recently, Rust's `rust_eh_personality` for emscripten referred to this
// symbol. If Emscripten doesn't provide it, there will be errors when linking
// rust. The rust personality function is never called so we can just abort.
// We need this to support old versions of Rust.
// https://github.com/rust-lang/rust/pull/97888
// TODO: Remove this when Rust doesn't need it anymore.
extern "C" _LIBCXXABI_FUNC_VIS _Unwind_Reason_Code
__gxx_personality_v0(int version,
_Unwind_Action actions,
uint64_t exceptionClass,
_Unwind_Exception* unwind_exception,
_Unwind_Context* context) {
abort();
}
#endif // !defined(__USING_WASM_EXCEPTIONS__)

@RReverser
Collaborator

Ok so it's indeed __cxa_thread_atexit_impl. I removed it in my local Rust checkout so the linked code became just

#[cfg_attr(target_family = "wasm", allow(unused))] // might remain unused depending on target details (e.g. wasm32-unknown-emscripten)
pub unsafe fn register_dtor(t: *mut u8, dtor: unsafe extern "C" fn(*mut u8)) {
    use crate::mem;
    use crate::sys_common::thread_local_dtor::register_dtor_fallback;

    extern "C" {
        #[linkage = "extern_weak"]
        static __dso_handle: *mut u8;
    }
    register_dtor_fallback(t, dtor);
}

and reran with the same -Zbuild-std=... build, which rebuilds the standard library in addition to the actual code, and now it produces valid Wasm.

I'm a bit surprised because the original log has a lot of functions with mismatching signatures, but perhaps parser / validator is just in a broken state after first error and those other messages should be ignored.

@sbc100
Collaborator

sbc100 commented Jan 12, 2023

Nice, so you have a rust-side fix?

@RReverser
Collaborator

No, it's just a hotfix to see if removing the function would help. I'm not sure if it's the right thing to do in general though - I assume we do want to call __cxa_thread_atexit_impl when available?

The actual problem seems to lie in either the function pointer conversion or the weak linkage declarator producing the problematic LLVM IR above. I think it's still something that LLVM should take care of, or at least report as an error earlier in the process...

@RReverser
Collaborator

What I'm still most confused about is

why it doesn't cause issues under other circumstances / for other projects, and that code has been there since 2016 so surely it would've been caught by now...

I even tried minimal repros that use all the same feature flags, link with threads etc., but they don't exhibit this problem. Not sure what's different in this project :/

@RReverser
Collaborator

RReverser commented Jan 12, 2023

If we ignore the Rust source for now, is this LLVM IR

target datalayout = "e-m:e-p:32:32-p10:8:8-p20:8:8-i64:64-f128:64-n32:64-S128-ni:1:10:20"
target triple = "wasm32-unknown-emscripten"

@__cxa_thread_atexit_impl = extern_weak global i8

define void @_ZN3std3sys4unix17thread_local_dtor13register_dtor17hb4d1ef11a2635878E() personality ptr null {
bb1:
  %_7 = tail call i32 @__cxa_thread_atexit_impl(ptr null, ptr null, ptr null)
  ret void
}

; uselistorder directives
uselistorder ptr null, { 1, 2, 3, 4, 5, 0 }

expected to produce a broken Wasm? Is there anything LLVM can do here - like figure out that __cxa_thread_atexit_impl should actually be a function pointer with the given signature and generate correct Wasm for it?

kleisauke added a commit to kleisauke/emscripten that referenced this issue Jan 12, 2023
@RReverser
Collaborator

Given that Emscripten doesn't have __cxa_thread_atexit_impl these days, I'll make a PR to Rust to fix this particular issue by simply not calling this function.

I'd love to understand the underlying issue with weak linkage though, as it seems the same could've happened on any other weak function declaration.

RReverser added a commit to RReverser/rust that referenced this issue Jan 12, 2023
 - Fixes rust-lang#91628.
 - Fixes emscripten-core/emscripten#15722.

See discussion in both issues.

The TL;DR is that weak linkage causes LLVM to produce broken Wasm, presumably due to pointer mismatch. The code is casting a void pointer to a function pointer with specific signature, but Wasm is very strict about function pointer compatibility, so the resulting code is invalid.

Ideally LLVM should catch this earlier in the process rather than emit invalid Wasm, but it currently doesn't and this is an easy and valid fix, given that Emscripten doesn't have `__cxa_thread_atexit_impl` these days anyway.

Unfortunately, I can't add a regression test as even after looking into this issue for a long time, I couldn't reproduce it with any minimal Rust example, only with extracted LLVM IR or on a large project involving Rust + C++.

r? @alexcrichton
@RReverser
Collaborator

Made a PR here: rust-lang/rust#106779

@sbc100
Collaborator

sbc100 commented Jan 12, 2023

Given that Emscripten doesn't have __cxa_thread_atexit_impl these days, I'll make a PR to Rust to fix this particular issue by simply not calling this function.

IIUC we do include __cxa_thread_atexit_impl from libc++, but this doesn't get included for C programs.

@sbc100
Collaborator

sbc100 commented Jan 12, 2023

What you could do instead perhaps is provide a weak definition of __cxa_thread_atexit_impl that does what you want to happen when the real one is missing.

@tlively
Member

tlively commented Jan 12, 2023

Is there anything LLVM can do here - like figure out that __cxa_thread_atexit_impl should actually be a function pointer with the given signature and generate correct Wasm for it?

We have this pass to fix up a similar issue that can arise with bitcasts between function types: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/WebAssembly/WebAssemblyFixFunctionBitcasts.cpp. I wonder if the weak linkage makes the i8 global callable as a function without a cast? I would be surprised if that were allowed normally. My guess is that we could fix this on the LLVM side by adding a similar pass or adding functionality to that pass.

@kleisauke
Collaborator

Perhaps that should be re-added as a stub for all builds?

On further thought, this is probably not a good idea, since it would cause TLS destructors to abort() or to become no-ops.

- 1: D <__cxa_thread_atexit_impl> [ undefined binding=weak vis=default ]

Curious, why does this symbol appear in the data section (D / WASM_SYMBOL_TYPE_DATA)? IIUC, that should be U if it's trying to weak reference __cxa_thread_atexit_impl.

What you could do instead perhaps is provide a weak definition of __cxa_thread_atexit_impl that does what you want to happen when the real one is missing.

With the provided std-8348dc5e440aab06.ll mentioned in comment #15722 (comment), I see:

$ cat thread_atexit.c
int ___cxa_thread_atexit_impl(void (*func)(void *), void *arg, void *dso)
{
	return 0;
}

#define weak_alias(old, new) \
	extern __typeof(old) new __attribute__((__weak__, __alias__(#old)))

weak_alias(___cxa_thread_atexit_impl, __cxa_thread_atexit_impl);

$ clang --target=wasm32 -emit-llvm -c -S -o thread_atexit.ll thread_atexit.c
$ llc -march=wasm32 -filetype=obj -o thread_atexit.a thread_atexit.ll
$ llc -march=wasm32 -filetype=obj -o std-8348dc5e440aab06.a ~/Downloads/std-8348dc5e440aab06.ll
$ wasm-ld --no-entry --export-all -o test.wasm std-8348dc5e440aab06.a thread_atexit.a
wasm-ld: error: symbol type mismatch: __cxa_thread_atexit_impl
>>> defined as WASM_SYMBOL_TYPE_DATA in std-8348dc5e440aab06.a
>>> defined as WASM_SYMBOL_TYPE_FUNCTION in thread_atexit.a

@sbc100
Collaborator

sbc100 commented Jan 12, 2023

The first answer to that question is that it's because the IR specifies it as a global: @__cxa_thread_atexit_impl = extern_weak global i8. So LLVM thinks it's a global i8... but why is it this way in the IR? That I don't know.

@RReverser
Collaborator

Curious, why does this symbol appear in the data section (D / WASM_SYMBOL_TYPE_DATA)? IIUC, that should be U if it's trying to weak reference __cxa_thread_atexit_impl.

Pretty sure that's the result of the same bitcast issue - because it's declared as a void/data pointer and not a function pointer, which are treated differently in Wasm, it ends up as a data reference and not function reference.

@RReverser
Collaborator

@tlively That solution sounds most promising long-term.

@RReverser
Collaborator

IIUC we do include __cxa_thread_atexit_impl from libc++, but this doesn't get included for C programs.

@sbc100 Okay I'm confused - here you say we include it, but in #18501 (comment) you said

How about updating the comment to say We know emscripten doesn't implement __cxa_thread_atexit_impl so we can simply avoid this check.

So... does Emscripten implement it or not?

@sbc100
Collaborator

sbc100 commented Jan 13, 2023

Sorry, I was confused. Emscripten implements __cxa_thread_atexit in libc++.

However, nowhere does it implement __cxa_thread_atexit_impl

@RReverser
Collaborator

Ah, okay, thanks for the clarification. So the Rust PR is still a valid fix (if somewhat limited compared to the LLVM bitcode one) for this then.

@RReverser
Collaborator

RReverser commented Jan 13, 2023

I tried to craft a minimal repro in C that replicates what Rust is doing / what that LLVM IR is demonstrating, and I think this is it (well, you can also remove the if if you want):

__attribute__((weak))
extern void maybe_func;

int main() {
    int(*func)() = &maybe_func;
    if (func) {
      func();
    }
}

Compiling with emcc results in the same issue:

> emcc temp.c -O
[parse exception: attempted pop from empty stack / beyond block start boundary at 262 (at 0:262)]
Fatal: error in parsing input
emcc: error: '/home/rreverser/emsdk/upstream/bin/wasm-emscripten-finalize --dyncalls-i64 --pass-arg=legalize-js-interface-exported-helpers a.out.wasm -o a.out.wasm --detect-features' failed (returned 1)

Godbolt link for LLVM IR: https://clang.godbolt.org/z/odzKMsrPx

@tlively
Member

tlively commented Jan 13, 2023

Great work narrowing down the issue. Could you file an LLVM bug at https://github.com/llvm/llvm-project for this?

@sbc100
Collaborator

sbc100 commented Jan 13, 2023

I tried to craft a minimal repro in C that replicates what Rust is doing / what that LLVM IR is demonstrating, and I think this is it (well, you can also remove the if if you want):

__attribute__((weak))
extern void maybe_func;

This is rather unusual C code though. Normally one would write

extern int maybe_func();

So unusual, in fact, that I don't think we have seen that in all the millions of lines of C/C++ that emscripten is fed.

I don't think I've ever seen a non-function declared as void (just void, without the *).

I assume this doesn't work in C++?

@sbc100
Collaborator

sbc100 commented Jan 13, 2023

BTW, you don't need the weak attribute here:

$ cat test.c
extern void maybe_func;

int main() {
    int(*func)() = &maybe_func;
    func();
}
$ ./emcc test.c -O -Wl,--allow-undefined
[parse exception: attempted pop from empty stack / beyond block start boundary at 251 (at 0:251)]
Fatal: error in parsing input
emcc: error: '/usr/local/google/home/sbc/dev/wasm/binaryen-out/bin/wasm-emscripten-finalize --dyncalls-i64 --pass-arg=legalize-js-interface-exported-helpers a.out.wasm -o a.out.wasm --detect-features' failed (returned 1)

Also, FWIW C++ rejects this:

$ ./em++ test.c -O -Wl,--allow-undefined
clang-16: warning: treating 'c' input as 'c++' when in C++ mode, this behavior is deprecated [-Wdeprecated]
test.c:1:13: error: variable has incomplete type 'void'
extern void maybe_func;
            ^
1 error generated.

@RReverser
Collaborator

So unusual, in fact, that I don't think we have seen that in all the millions of lines of C/C++ that emscripten is fed.

Indeed, that's what I assumed but wanted to replicate anyway after our DM chat to understand whether this code would be simply unusual enough to never occur or actually impossible to represent in C.

@RReverser
Collaborator

RReverser commented Jan 13, 2023

BTW, you don't need the weak attribute here:

Ah yeah, but then you need to allow undefined symbols via command line instead.

Also, FWIW C++ rejects this:

Interesting. Void, like the weak attribute, is not important though, it was just closer to the repro LLVM IR.

E.g. this also fails similarly in both C and C++:

extern char maybe_func;

int main() {
    int(*func)() = (int(*)())&maybe_func;
    if (func) {
      func();
    }
}

@RReverser
Collaborator

Could you file an LLVM bug at llvm/llvm-project for this?

Sure, will do later today.

@RReverser
Collaborator

Created here: llvm/llvm-project#60003

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 14, 2023
Avoid __cxa_thread_atexit_impl on Emscripten

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 26, 2023
Avoid __cxa_thread_atexit_impl on Emscripten

@RReverser
Collaborator

Closing as this is fixed on the Rust side now (will be available on stable from Rust 1.69), and the LLVM issue is tracked separately.

thomcc pushed a commit to tcdi/postgrestd that referenced this issue May 31, 2023
Avoid __cxa_thread_atexit_impl on Emscripten
