export_name with unusual utf8 breaks new version script based linker #38238

Open
m4b opened this issue Dec 8, 2016 · 15 comments
Labels
A-linkage Area: linking into static, shared libraries and binaries A-unicode Area: Unicode C-feature-request Category: A feature request, i.e: not implemented / a PR.

Comments

@m4b
Contributor

m4b commented Dec 8, 2016

Unfortunately it looks like the awesome changes in #38117 caused breakage while linking when a weird export name is used (probably due to the version script requiring ascii, or some other esoterica):

        #[export_name="bad_∢"]
        pub extern fn bad(i: usize) {}

NOTE: I haven't checked this particular version, but certain combinations of values cause a linker error complaining about invalid characters in the version script.

@michaelwoerister
Member

I guess we should just restrict export_name to ASCII.

cc @rust-lang/compiler

@m4b
Contributor Author

m4b commented Dec 8, 2016

So this is an artificial restriction, with no technical merit. I'd really like utf8 support for symbol names, just on principle.

Nevertheless, it doesn't work as is now (though it used to), albeit entirely because of a broken linker toolchain, so ASCII might be required... IIRC, Swift re-encodes UTF symbols in its name mangler, but that's not the same as real utf8 symbol names in the binary, which is just literally the coolest. I don't know of another language that allows that and can run (and the dynamic linker doesn't have any problem with it, because it just sees null-terminated bytes, which utf8 preserves).
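To illustrate the point about null-terminated bytes, here is a minimal Rust sketch. The helper name `symbol_bytes` is hypothetical and not part of rustc or any linker; it only shows that the UTF-8 encoding of a name contains no zero byte unless the name itself contains U+0000, so appending a single NUL terminator is unambiguous.

```rust
use std::ffi::CString;

/// Hypothetical helper: returns the bytes a NUL-terminated string table
/// would hold for `name`, i.e. its UTF-8 encoding followed by one 0 byte.
/// Returns `None` only when `name` itself contains U+0000.
fn symbol_bytes(name: &str) -> Option<Vec<u8>> {
    // `CString::new` rejects any input with an interior 0 byte.
    CString::new(name).ok().map(|c| c.into_bytes_with_nul())
}
```

Calling `symbol_bytes("bad_∢")` yields the UTF-8 bytes of the name plus a trailing 0, with no other zero byte anywhere in the result.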

@michaelwoerister
Member

With which linkers do you run into the problem?

@m4b
Contributor Author

m4b commented Dec 21, 2016

Both gold and ld (-fuse-ld=bfd) are broken for me:

#[export_name="󠆷∀🢫"]
#[no_mangle]
pub extern fn whatever() {
    println!("nothing");
}

I guess unicode is too hard for C ppls.

rustc --crate-type=cdylib src/lib.rs 
error: linking with `cc` failed: exit code: 1
  |
  = note: "cc" "-Wl,--as-needed" "-Wl,-z,noexecstack" "-m64" "-L" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "lib.0.o" "-o" "liblib.so" "-Wl,--version-script=/tmp/rustc.AxbnbIEKM7Z3/list" "-Wl,--gc-sections" "-nodefaultlibs" "-L" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-Wl,-Bstatic" "-Wl,-Bdynamic" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-b4054fae3db32020.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librand-1c6ed188684e7d33.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcollections-63f7707126c5a809.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd_unicode-a9711770523833d4.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libpanic_unwind-d2ecc8049920bea8.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libunwind-5837d7d3490e00c5.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-0720511b45a7223a.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc_system-34e7f110f175a258.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liblibc-ab203041f1ec5313.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore-93f19628b61beb76.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcompiler_builtins-35d2bc471c7ce467.rlib" "-l" "dl" "-l" "rt" "-l" "pthread" "-l" 
"gcc_s" "-l" "pthread" "-l" "c" "-l" "m" "-l" "rt" "-l" "util" "-shared"
  = note: /usr/bin/ld:/tmp/rustc.AxbnbIEKM7Z3/list:3: ignoring invalid character `\363' in script
/usr/bin/ld:/tmp/rustc.AxbnbIEKM7Z3/list:3: ignoring invalid character `\240' in script
/usr/bin/ld:/tmp/rustc.AxbnbIEKM7Z3/list:3: ignoring invalid character `\206' in script
/usr/bin/ld:/tmp/rustc.AxbnbIEKM7Z3/list:3: ignoring invalid character `\267' in script
/usr/bin/ld:/tmp/rustc.AxbnbIEKM7Z3/list:3: ignoring invalid character `\342' in script
/usr/bin/ld:/tmp/rustc.AxbnbIEKM7Z3/list:3: ignoring invalid character `\210' in script
/usr/bin/ld:/tmp/rustc.AxbnbIEKM7Z3/list:3: ignoring invalid character `\200' in script
/usr/bin/ld:/tmp/rustc.AxbnbIEKM7Z3/list:3: ignoring invalid character `\360' in script
/usr/bin/ld:/tmp/rustc.AxbnbIEKM7Z3/list:3: ignoring invalid character `\237' in script
/usr/bin/ld:/tmp/rustc.AxbnbIEKM7Z3/list:3: ignoring invalid character `\242' in script
/usr/bin/ld:/tmp/rustc.AxbnbIEKM7Z3/list:3: ignoring invalid character `\253' in script
/usr/bin/ld:/tmp/rustc.AxbnbIEKM7Z3/list:3: syntax error in VERSION script
collect2: error: ld returned 1 exit status
rustc --crate-type=cdylib -C link-args="-fuse-ld=gold" src/lib.rs 
error: linking with `cc` failed: exit code: 1
  |
  = note: "cc" "-Wl,--as-needed" "-Wl,-z,noexecstack" "-m64" "-L" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "lib.0.o" "-o" "liblib.so" "-Wl,--version-script=/tmp/rustc.5LjCdyxrgOsb/list" "-Wl,--gc-sections" "-nodefaultlibs" "-L" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-Wl,-Bstatic" "-Wl,-Bdynamic" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-b4054fae3db32020.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librand-1c6ed188684e7d33.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcollections-63f7707126c5a809.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd_unicode-a9711770523833d4.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libpanic_unwind-d2ecc8049920bea8.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libunwind-5837d7d3490e00c5.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-0720511b45a7223a.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc_system-34e7f110f175a258.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liblibc-ab203041f1ec5313.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore-93f19628b61beb76.rlib" "/home/m4b/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcompiler_builtins-35d2bc471c7ce467.rlib" "-l" "dl" "-l" "rt" "-l" "pthread" "-l" 
"gcc_s" "-l" "pthread" "-l" "c" "-l" "m" "-l" "rt" "-l" "util" "-shared" "-fuse-ld=gold"
  = note: /usr/bin/ld.gold: error: /tmp/rustc.5LjCdyxrgOsb/list:3:5: invalid character
/usr/bin/ld.gold: error: /tmp/rustc.5LjCdyxrgOsb/list:3:5: syntax error, unexpected $end, expecting STRING or QUOTED_STRING or EXTERN
/usr/bin/ld.gold: fatal error: unable to parse version script file /tmp/rustc.5LjCdyxrgOsb/list
collect2: error: ld returned 1 exit status
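The octal escapes in the ld output above are just the UTF-8 bytes of the export name, one byte per message. A small Rust sketch (the constant and function names are made up for illustration) that decodes them back into code points:

```rust
/// The bytes ld reported as "invalid character", in order,
/// written in octal exactly as they appear in the log.
const REPORTED: [u8; 11] = [
    0o363, 0o240, 0o206, 0o267, // first character (4 bytes)
    0o342, 0o210, 0o200,        // second character (3 bytes)
    0o360, 0o237, 0o242, 0o253, // third character (4 bytes)
];

/// Decodes the reported bytes as UTF-8 and returns the code points.
fn reported_code_points() -> Vec<u32> {
    let s = std::str::from_utf8(&REPORTED).expect("bytes form valid UTF-8");
    s.chars().map(|c| c as u32).collect()
}
```

The eleven bytes decode to three code points, U+E01B7, U+2200 (∀) and U+1F8AB, which matches the three characters in the export_name attribute.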

@DemiMarie
Contributor

DemiMarie commented Mar 1, 2017

I think we should report this as a bug against GNU LD, unless there is a way to quote symbol names.

@Mark-Simulacrum Mark-Simulacrum added A-linkage Area: linking into static, shared libraries and binaries A-unicode Area: Unicode labels Jun 23, 2017
@Mark-Simulacrum Mark-Simulacrum added the C-feature-request Category: A feature request, i.e: not implemented / a PR. label Jul 26, 2017
@steveklabnik
Member

Triage: not aware of any changes

@adamspofford-dfinity

adamspofford-dfinity commented May 12, 2023

Worth noting, this is not restricted to Unicode. Any non-alphanumeric character like : or a space breaks it too. This is a problem for us, as our architecture involves WASM functions named canister_query foo or canister_update foo. When developing on Linux, you can cargo build --target wasm32-unknown-unknown without errors, but a plain cargo build produces a giant linker error; this most commonly comes up when swapping cargo check with cargo rustc.

@bjorn3
Member

bjorn3 commented May 12, 2023

Why are you using spaces in the symbol names instead of something like canister_query__foo or _ZN14canister_query3foo? Spaces are not portable across all targets even ignoring version scripts afaik. Symbol mangling is done by C++ and Rust because a lot of characters are not portable.
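For readers unfamiliar with the second form, here is a rough Rust sketch of length-prefixed encoding that reproduces the shape of the example name above. The function `length_prefixed` is hypothetical, and real C++ and Rust mangling schemes have additional rules that this omits.

```rust
/// Hypothetical helper: joins path segments as `<byte length><segment>`
/// after a `_ZN` prefix, so no separator character is needed.
fn length_prefixed(path: &[&str]) -> String {
    let mut out = String::from("_ZN");
    for seg in path {
        // `len()` is the byte length, which equals the character
        // count for ASCII segments like the ones used here.
        out.push_str(&seg.len().to_string());
        out.push_str(seg);
    }
    out
}
```

With the segments `canister_query` and `foo`, this produces `_ZN14canister_query3foo`, which contains only ASCII letters, digits and underscores.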

@adamspofford-dfinity

Because then there is absolutely no ambiguity about what's an intended export of the wasm module vs random compiler (or user-written!) garbage, and because you can easily write it or interpret it by hand, and because we don't have to be portable when the only intended target is wasm32 and wasm32 supports it. It works on Mac, it works on Linux with rustc 1.2.0, but it doesn't work on Linux with current rustc.

@bjorn3
Member

bjorn3 commented May 12, 2023

If you want to avoid ambiguity, just add a random string to it. I did expect spaces to not work with GCC and when using an external assembler with LLVM (as rustc used to do for some targets due to LLVM missing an internal assembler for them).

@DemiMarie
Contributor

This needs to be fixed in linkers.

@crlf0710
Member

It might be worth a separate lint at the same time... I don't believe linker behavior will be tweaked in the near future.
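As a rough idea of what such a lint could check, here is a deliberately conservative Rust sketch. The function name is hypothetical and this is not an existing rustc lint; the set of characters each linker's version-script parser actually accepts may be broader than what is allowed here.

```rust
/// Hypothetical check: accepts only non-empty names made of ASCII
/// letters, ASCII digits, and underscores. This is an assumed,
/// conservative rule, not the exact grammar of any linker.
fn is_conservative_symbol_name(name: &str) -> bool {
    !name.is_empty()
        && name
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'_')
}
```

Under this rule `canister_query__foo` passes, while `canister_query foo` and `bad_∢` would be flagged.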

@DemiMarie
Contributor

Why is a linker script needed?

@DemiMarie
Contributor

One can have a C function with a non-ASCII name, FYI.

@bjorn3
Member

bjorn3 commented Sep 21, 2024

We are using a version script, not a linker script. This is to tell the linker which symbols to export from the dylib/cdylib.
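For context, a version script is a small text file listing the symbols to keep global. The Rust sketch below builds one; the function name and the exact layout are assumptions (the layout is guessed from the logs earlier in this thread, where the errors point at line 3, column 5 of the list file) and are not copied from rustc's source.

```rust
/// Hypothetical generator for a GNU-style version script that exports
/// the given symbols and hides everything else. Symbols are written
/// unquoted, which is where names with unusual characters can trip
/// the linker's parser.
fn version_script(exported: &[&str]) -> String {
    let mut out = String::from("{\n  global:\n");
    for sym in exported {
        out.push_str("    ");
        out.push_str(sym);
        out.push_str(";\n");
    }
    out.push_str("\n  local:\n    *;\n};\n");
    out
}
```

With a single exported symbol, the symbol lands on line 3 with a four-space indent, which is consistent with the position reported by gold above.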


8 participants