Only emit #[link_name] attribute if necessary. #1558
This looks reasonable to me. As far as I can tell, regarding this bit:
This wouldn't be a regression, right? I agree that trying to un-apply the mangling depending on the current target would be a bit brittle. It's what we were doing before we switched to We could try to add an API to LLVM for this, if somebody ends up needing it. But none of the high-level APIs that libclang exposes offer any options for this either.
src/codegen/mod.rs
// This is something we don't recognize, stay on the safe side
// by emitting the `#[link_name]` attribute
Some(_) => {
Maybe worth logging a `debug!` message or such? Otherwise no braces.
I couldn't think of a good message, so I removed the braces.
// Check that the suffix starts with '@' and is all ASCII decimals
// after that.
if suffix[0] != b'@' || !suffix[1..].iter().all(u8::is_ascii_digit)
It'd be good to have tests for this. Do we have them? I see call-conv-field has a change for `@0`, do we have one for longer suffixes?
I'll add one with a longer suffix.
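For illustration, the suffix check discussed above and the kind of inputs a longer-suffix test would cover can be sketched in isolation. This is only a sketch under stated assumptions: `is_decorated_suffix` is a hypothetical helper name, not a function in bindgen, and unlike the quoted condition it also rejects a bare `@` with no digits after it.

```rust
/// Hypothetical helper (not part of bindgen): returns true if `suffix`
/// has the shape `@<one or more ASCII decimal digits>`, e.g. `@4` or `@128`.
///
/// Assumption: a bare `@` without digits is treated as invalid here.
fn is_decorated_suffix(suffix: &[u8]) -> bool {
    match suffix.split_first() {
        // The first byte must be '@'; everything after it must be a digit.
        Some((&b'@', digits)) => {
            !digits.is_empty() && digits.iter().all(u8::is_ascii_digit)
        }
        // Empty suffix, or a suffix that does not start with '@'.
        _ => false,
    }
}
```

A test with a longer suffix would pass something like `b"@128"` and expect it to be accepted, while inputs such as `b"@12a"` or `b"4"` should be rejected.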
Apparently (according to https://bugzilla.mozilla.org/show_bug.cgi?id=1486042#c42) things already work on macOS without this patch. That is a bit surprising to me. I need to double-check whether Clang already includes the leading underscore there in LLVM IR (instead of it being added by the backend).
LLVM has a few useful methods for this on DataLayout, e.g.:
Replicating the logic would still be complicated but at least the actual per-target settings would be defined in LLVM. Unfortunately, LLVM's C interface does not expose any of these methods.
I've added some
For reference: I just verified that the current version of this PR solves the cross-language ThinLTO problem for Firefox on x86 Windows.
Sorry, for some reason I didn't get the notification from GitHub that you had pushed.
Yeah, GH doesn't notify about that. I planned to ping you but wanted to wait until Travis went green. Can I help with publishing this on crates.io somehow?
This PR makes bindgen not emit a `#[link_name]` attribute if it detects that it is not necessary.
Why do we want to do this?
Because when ThinLTO is performed across Rust/C/C++ language boundaries, the linker plugin sees both sides of the code at the LLVM IR level (as opposed to the machine code level). The `#[link_name]` attribute causes calls from Rust to C/C++ code to reference symbol names with backend mangling applied to them (e.g. `_foo@4` for a `stdcall` function), while the definition of the function on the C/C++ side has the name before this kind of mangling is applied (i.e. just `foo` for that same function). The linker (or rather, the LLVM linker plugin) will thus not find any function named `_foo@4` and will report an `undefined symbol` error. The changes in this PR try to keep symbol names in sync at the LLVM IR level too (in more cases than before).
How does it work?
Before emitting a `#[link_name]` attribute, `bindgen` now checks whether the Rust name will end up as the intended name anyway and, in that case, does not emit the attribute. It does so by looking at the Rust name (`canonical_name`), the proposed `link_name`, and the calling convention of a function. If the Rust name, with calling-convention-specific mangling applied to it, is equal to the `link_name`, then the attribute is not needed.
What are the downsides of this approach?
This approach does not work for C++ manglings with additional backend mangling applied, for example `__ZN3bar3FOOE` (an Itanium-mangled C++ global with an additional macOS-specific leading underscore). I think this applies to anything C++ on macOS :/
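To make the check from "How does it work?" and the limitation above concrete, here is a minimal standalone sketch. It is not the code in this PR: the `CallConv` enum and the `link_name_needed` function are hypothetical names, and the prefix/suffix rules are assumptions based on the usual 32-bit x86 Windows decorations (`_foo` for C, `_foo@N` for `stdcall`, `@foo@N` for `fastcall`).

```rust
/// Hypothetical calling-convention enum for this sketch (not bindgen's type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CallConv {
    C,
    Stdcall,
    Fastcall,
    Other,
}

/// Returns true if a `#[link_name]` attribute would still be needed, i.e. if
/// `link_name` is NOT just `canonical_name` with the assumed backend
/// decoration for `call_conv` applied. `None` stands for global variables.
fn link_name_needed(
    canonical_name: &str,
    link_name: &str,
    call_conv: Option<CallConv>,
) -> bool {
    // Identical names: no mangling happened, so no attribute is needed.
    if canonical_name == link_name {
        return false;
    }

    // Assumed decoration per calling convention: (prefix, has "@N" suffix).
    let (prefix, expect_suffix) = match call_conv {
        Some(CallConv::C) | None => ('_', false),
        Some(CallConv::Stdcall) => ('_', true),
        Some(CallConv::Fastcall) => ('@', true),
        // Unknown convention: stay on the safe side and keep the attribute.
        Some(CallConv::Other) => return true,
    };

    // The link name must be: prefix + canonical name + optional suffix.
    let rest = match link_name.strip_prefix(prefix) {
        Some(rest) => rest,
        None => return true,
    };
    let suffix = match rest.strip_prefix(canonical_name) {
        Some(suffix) => suffix,
        None => return true,
    };

    if expect_suffix {
        // The suffix must be '@' followed by at least one decimal digit.
        match suffix.strip_prefix('@') {
            Some(digits) => {
                digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit())
            }
            None => true,
        }
    } else {
        // No suffix is expected, so anything left over means a mismatch.
        !suffix.is_empty()
    }
}
```

Under these assumptions, `link_name_needed("foo", "_foo@4", Some(CallConv::Stdcall))` is false, so the attribute could be dropped. For a C++ global such as `link_name_needed("FOO", "__ZN3bar3FOOE", None)` the result is true: the link name is not a simple decoration of the Rust name, so the attribute is still emitted.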
Could we do better?
Ideally, we'd add a `#[link_name]` attribute with the mangled name without the backend-specific mangling and without the leading `\0` char. Then LLVM could just do the right thing. Unfortunately, it seems that `libClang` does not provide this.
We could also try to actively remove the backend part of mangled names, but doing so robustly on all platforms might be tricky.
@emilio, what do you think?