Add support for splitting linker invocation to a second execution of rustc
#64191
Comments
Would the This would be useful for linking hybrid Rust/C++/(other) programs where the final executable is non-Rust. In other words, we could have C++ depending on Rust without needing to use staticlib/cdylib. |
I don't think it'd just be a thin wrapper around |
Firstly, we'd want the final linker doing LTO in order to get it cross-language, regardless of whatever language the final target is in and what mix of languages went into the target. Secondly, since Buck has full dependency information, including Rust dependencies on C/C++, it will arrange for all the right libraries to be on the final link line. As a result we never want to use or honor (Even if that weren't true, at least on Unix systems, the I like this proposal because it allows us to factor out the Rust-specific details from the language-independent ones. For example there's no reason for rustc to implement LTO if we're already having to solve that for other languages - especially when that solution is pretty infrastructure-specific (distributed thin LTO, for example). There's also no real reason for us to use Ultimately, Rust lives in the world of linkable object files, and a final artifact is generated by calling the linker with a certain set of inputs. Since Rust doesn't have unusual requirements that make it incompatible with C/C++ linkage (e.g. special linkage requirements or elaborate linker scripts), the final linker stage could be broadly language agnostic. |
+1 to wanting the ability to turn off / disable |
I'm generally in favor of this. Some thoughts:
|
I don't disagree that y'all's sorts of projects don't want to use the rustc-baked-in LTO, but I don't think we can remove it because many other projects do use it (and rightfully want to). Also this is still just a sort of high-level concept, but if a lot of feature requests are piled onto this it's unfortunately unlikely to happen. |
Hi, my name is Victor and I'm working with @tmandry. |
Great @0dvictor! The steps I'd recommend for doing this would probably look like:
As for the actual change itself I haven't looked too much into this, so I wouldn't know where best to start there. |
Some implementation notes:
Recommended reading
Current state of things
The main For the LLVM backend (which is the only one right now), this method is implemented here. Finally, it calls
Strategy
Obviously, we need to split apart all the code that assumes codegen and linking happen at the same time. This starts with the For the flags, we can start with unstable options ( |
I don't much like using a directory as output, since some build systems might not support this. Probably the best thing to do is to make an ar file (which I should note is what an That said, the choice of extension should be up to whoever is invoking rustc, and we should use the I think there might be details we need to pay attention to regarding the linker's default behavior when linking object files vs archive files (like symbol visibility), but not sure what those details are. cc @petrhosek |
Bundling it together in a single ar file would do unnecessary work (both IO and CPU). Object files are always written to the disk. When building an ar file, they are then copied into the ar file and a symtab is created (ranlib). Creating a symtab can't be avoided if you don't want to unpack the ar file again before linking, as the linker requires a symtab to be present. |
I suppose that in build systems that need it, we can wrap invocations with
a script that tars and untars the files, for example. I'm fine with having
an output directory in that case. Ideally the compiler would support both,
but that doesn't seem realistic for an initial implementation.
|
I finally got a prototype working for the first part: it generates a linkable object or bitcode file, as well as the linking command to invoke manually to finish linking. In addition, I also successfully ran LTO linking with a native lib. While starting on the second stage, I found Rust is moving to LLD per #39915. Can I assume that I only need to support LLD or "clang/gcc -fuse-ld=lld"? |
While moving to LLD is nice, it's unlikely to happen any time soon, so it's best to not make the assumption of LLD. |
@0dvictor As a suggestion, you may want to file a PR that includes only the rustc flags are in |
Good idea, let me polish my changes and create a PR. |
Sorry about the delay. After studying the code and running some experiments, I found that all linker arguments come from the following four sources:
At the linking stage, assume the user always passes the required CLI arguments:
Therefore, in my experiment, to compile without linking is basically: I have to make the following three changes to get it to work [PR #67195]:
To minimize the impact on existing code, all changes are guarded by I have not included (5) yet. My plan is to save it in either the rmeta file, or the bitcode/object file using LLVM’s If we want one single |
Then for the linking stage, I plan to insert code here to read the |
Finally, some thoughts on LTO: once this Issue finishes, we should be able to do LTO easily when we use LLD (either directly or via Out of curiosity, why does an |
|
Yeah, those should probably be the default. Would you mind opening an issue to track this? |
Sure, but from our point of view we want to break the build up into atomic actions with well-defined inputs and outputs and then be able to freely schedule them across a build cluster. I don't want to have to treat Rust build + link as a special case - adding a constraint that they have to execute on the same machine would make it much harder to schedule. But I think if we can use a thin ar to logically bundle them all up, it will be manageable.
OK, so I think there's too much stuff in the
That doesn't scale. There could be hundreds of C++ libraries linked in, any of which could be using some combination of Rust libraries. |
No, link args can come from |
Fair enough - they can be encoded in the |
Linking, in theory, depends on a lot more artifacts than codegen does. Codegen should only require source code and the rmeta files from any crates you depend on. Linking requires all the generated code. In our sccache-like environment, this would mean uploading many rlib files and possibly system libraries to the worker. Network bandwidth becomes a bottleneck. So it's much better to send compile steps to the workers, hitting cache when possible, and do linking locally.
Link args don't need to be stable, just the file format which contains them. I don't think the fact that the file contains references to implementation details like compiler-builtins is a problem, actually. As long as those details can change without changing the schema, a well-written tool should be able to consume them without breakage. That said, stabilizing rlink seems more ambitious than having a rustc option which spits out the final linker line, allowing you to run it yourself. |
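To make the scheduling argument above concrete, here is a minimal sketch of a build graph in which codegen and linking are separate actions with declared inputs and outputs. It is only an illustration: the `Action` struct, the `schedule` function, and the file names (`app.rlink`, `libdep.rlib`, and so on) are hypothetical, and the assumption that codegen needs only sources plus upstream rmeta files is taken from the comment above rather than from rustc itself.

```rust
use std::collections::HashSet;

/// One atomic build step with declared inputs and outputs (hypothetical model).
#[derive(Debug)]
struct Action {
    name: &'static str,
    inputs: Vec<&'static str>,
    outputs: Vec<&'static str>,
    /// Whether the step is cheap to ship to a remote worker or cache.
    remote_ok: bool,
}

/// An action can run once every one of its inputs exists.
fn is_ready(action: &Action, available: &HashSet<&'static str>) -> bool {
    action.inputs.iter().all(|i| available.contains(i))
}

/// Repeatedly run whatever is ready until nothing more can make progress.
fn schedule<'a>(actions: &'a [Action], sources: &[&'static str]) -> Vec<&'a Action> {
    let mut available: HashSet<&'static str> = sources.iter().copied().collect();
    let mut done = vec![false; actions.len()];
    let mut order = Vec::new();
    loop {
        let mut progressed = false;
        for (i, action) in actions.iter().enumerate() {
            if !done[i] && is_ready(action, &available) {
                available.extend(action.outputs.iter().copied());
                done[i] = true;
                order.push(action);
                progressed = true;
            }
        }
        if !progressed {
            return order;
        }
    }
}

/// Example graph: the link step is listed first on purpose, to show it is deferred.
fn example_actions() -> Vec<Action> {
    vec![
        Action {
            name: "link",
            inputs: vec!["app.rlink", "libdep.rlib"],
            outputs: vec!["app"],
            remote_ok: false, // needs all generated code, so keep it local
        },
        Action {
            name: "codegen-dep",
            inputs: vec!["dep.rs"],
            outputs: vec!["libdep.rmeta", "libdep.rlib"],
            remote_ok: true,
        },
        Action {
            name: "codegen-app",
            inputs: vec!["main.rs", "libdep.rmeta"], // sources plus upstream rmeta only
            outputs: vec!["app.rlink"],
            remote_ok: true,
        },
    ]
}

fn main() {
    let actions = example_actions();
    for action in schedule(&actions, &["dep.rs", "main.rs"]) {
        let place = if action.remote_ok { "remote/cached" } else { "local" };
        println!("{} ({})", action.name, place);
    }
}
```

In this model both codegen steps can be sent to workers or served from a cache, while the link step only becomes ready once every object-bearing artifact exists, which is why it is the one step worth keeping local.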
Apologies for such a late reply. After experimenting with the archiving approach (using
Similarly, simply archiving the files with Therefore, I am experimenting with a different approach: after finishing LLVM's optimizations, linking all CGUs into a combined one using
WDYT? |
That won't work for non LLVM based backends.
What about a crate with many codegen units? During optimizations only a few codegen units are in memory at any time, while during linking with |
Correct, but we can implement a similar, if not the same, way to combine the CGUs for that non-LLVM backend.
Great points. I should've been clear that this proposed feature would be guarded by an option, say Regardless of splitting out the linker invocation, Being able to generate a single or defined number of |
@0dvictor, can we compare the build time using the flag in your working branch versus |
Unfortunately,
Good idea, let me do that. |
@jsgf The project I'm working on seems to share similar properties to yours:
- Using a non-Cargo build system based on static dependency resolution (and has 20000+ targets)
- Final linking performed by an existing C++ toolchain
- A few Rust .rlibs scattered throughout a very deep dependency tree, which may eventually roll up into one or multiple binaries
We can't:
- Switch from our existing linker to rustc for final linking. C++ is the boss in our codebase; we're not ready to make the commitment to put Rust in charge of our final linking.
- Create a Rust static library for each of our .rlibs. This works if we're using Rust in only one place. For any binary containing several Rust subsystems, there would be binary bloat and often violations of the one-definition-rule
- Create a Rust static library for each of our output binaries. The build directives for the Rust .rlibs don't know what final binaries they'll end up in; and the build directives for the final binaries don't know what .rlibs they've pulled in from deep in their dependency tree. Our build system forbids that sort of global knowledge, or it would be too slow across so many targets.
- Create a single Rust static library containing all our Rust .rlibs. That monster static library would depend on many C++ symbols, so each binary would become huge (and in fact not actually link properly)
Therefore we want to link rlibs directly into our final linker invocation. This is, in fact, what we're doing, but we have to add some magic:
- The final C++ linker needs to pull in all the Rust stdlib .rlibs, which would be easy apart from the fact they contain the symbol metadata hash in their names.
- We need to remap `__rust_alloc` to `__rdl_alloc` etc.
- In future the final rustc-driven linker invocation might add extra magic.
Therefore one of the solutions you and @tmandry suggest would be awesome: either stabilizing `.rlink` (which sounds unlikely) or having some official way to build a linker command line which is not under the control of rustc (`-Zbinary-dep-depinfo` helps a bit). I think, though, that this request is a little bit orthogonal to this issue. I wonder if we should submit a new issue? I think it's quite a big request. |
That would be useful and it's also orthogonal. A new issue sounds like a good idea.
|
This issue came up in a discussion about @dtolnay's cxx which wants to be able to describe callback relationships between Rust and C++ code - eg Rust calls C++, which then calls back into Rust. If you're using linker symbols to resolve these calls (vs runtime indirect function calls via pointers), then that effectively means you have a rust.o and cxx.o with mutual references. AFAIK the only way to correctly link this is with something like |
I'm interested in the outcome of this. Is it currently in a state where cargo could potentially implement a useful subset of it? Or is there further work needed on the final link step before cargo can do anything? |
@joshtriplett Kernel? |
I apologize for the long delay in continuing this work. PR #75094 is created to allow generating a single object file for a crate. Performance was measured by compiling
Though |
Add `-Z combine_cgu` flag Introduce a compiler option to let rustc combine all regular CGUs into a single one at the end of compilation. Part of issue rust-lang#64191
I tried using the -Zno-link/-Zlink-only mechanism in earnest, and unfortunately I think it's deficient in a number of ways. I have some thoughts on a simpler-to-use mechanism. More detail: https://internals.rust-lang.org/t/alternative-approach-to-zno-link-zlink-only-split-linking/14842 |
…twco,bjorn3 Store rlink data in opaque binary format on disk This removes one of the only uses of JSON decoding (to Rust structs) from the compiler, and fixes the FIXME comment. It's not clear to me what the reason for using JSON here originally was, and from what I can tell nothing outside of rustc expects to read the emitted information, so it seems like a reasonable step to move it to the metadata-encoding format (rustc_serialize::opaque). Mostly intended as a FIXME fix, though potentially a stepping stone to dropping the support for Decodable to be used to decode JSON entirely (allowing for better/faster APIs on the Decoder trait). cc rust-lang#64191
This issue is intended to track support for splitting a `rustc` invocation that ends up invoking a system linker (e.g. `cdylib`, `proc-macro`, `bin`, `dylib`, and even `staticlib` in the sense that everything is assembled) into two different `rustc` invocations. There are a number of reasons to do this, including:

- This can improve pipelined compilation support. The initial pass of pipelined compilation explicitly did not pipeline linkable compilations because the linking step needs to wait for codegen of all previous steps. By literally splitting it out, build systems could then synchronize with previous codegen steps and only execute the link step once everything is finished.
- This makes more artifacts cacheable with caching solutions like `sccache`. Anything involving the system linker cannot be cached by `sccache` because it pulls in too many system dependencies. The output of the first half of these linkable compilations, however, is effectively an `rlib`, which can already be cached.
- This can provide build systems which desire more control over the linker step with, well, more control over the linker step. We could presumably extend the second half here with more options eventually. This is a somewhat amorphous reason to do this; the previous two are the most compelling ones so far.

This is a relatively major feature of rustc, and as such this may even require an RFC. This issue is intended to get the conversation around this feature started and see if we can drum up support and/or more use cases. To give a bit of an idea about what I'm thinking, though, a strawman for this might be:

- Add two new flags to `rustc`, `--only-link` and `--do-not-link`.
- Cargo, for example, would first compile the `bin` crate type by passing the `--do-not-link` flag, passing all the flags it normally does today.
- Cargo would then execute `rustc` again, only this time passing the `--only-link` flag.

These two flags would indicate to `rustc` what's happening, notably:

- `--do-not-link` indicates that rustc should be creating a linkable artifact, such as one of the ones mentioned above. This means that rustc should not actually perform the link phase of compilation, but rather it's skipped entirely. In lieu of this a temporary artifact is emitted in the output directory, such as `*.rlink`. Maybe this artifact is a folder of files? Unsure. (maybe it's just an rlib!)
- The converse of `--do-not-link`, `--only-link`, is then passed to indicate that the compiler's normal phases should all be entirely skipped except for the link phase. Note that for performance this is crucial in that this does not rely on incremental compilation, nor does this rely on queries, or anything like that. Instead the compiler forcibly skips all this work and goes straight to linking. Anything the compiler needs as input for linking should either be in command line flags (which are reparsed and guaranteed to be the same as the `--do-not-link` invocation) or the input would be an output of the `--do-not-link` invocation. For example maybe the `--do-not-link` invocation emits a file that indicates where to find everything to link (or something like that).

The general gist is that `--do-not-link` says "prepare to emit the final crate type, like `bin`, but only do the crate-local stuff". This step can be pipelined, doesn't require upstream objects, and can be cached. This is also the longest step for most final compilations. The gist of `--only-link` is that its execution time is 99% the linker. The compiler should do the absolute minimal amount of work to figure out how to invoke the linker, it then invokes the linker, and then exits. To reiterate, this will not rely on incremental compilation because engaging all of the incremental infrastructure takes quite some time, and additionally the "inputs" to this phase are just object files, not source code.

In any case this is just a strawman. I think it'd be best to prototype this in rustc, learn some requirements, and then perhaps open an RFC asking for feedback on the implementation. This is a big enough change that it'd want to get a good deal of buy-in! That being said, I would believe (without data at this time, but with a strong hunch) that the improvements to both pipelining and the ability to use `sccache` would be quite significant and worth pursuing.
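
As a rough illustration of the strawman, the sketch below builds the argument lists a build tool might pass for the two invocations. It is an assumption-laden sketch: `--do-not-link` and `--only-link` are the placeholder names proposed in this issue, not flags rustc is known to accept, and the `Phase` enum and `rustc_args` helper are hypothetical. The code only constructs the argument vectors and does not run the compiler.

```rust
/// The two halves of the strawman (hypothetical model, not a rustc API).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Phase {
    /// Everything up to, but not including, the linker invocation.
    Codegen,
    /// Only the linker invocation.
    Link,
}

/// Build the argument list for one phase.
/// Both phases receive the same common flags, as the strawman requires;
/// only the trailing placeholder flag differs.
fn rustc_args(phase: Phase, common: &[&str]) -> Vec<String> {
    let phase_flag = match phase {
        Phase::Codegen => "--do-not-link", // placeholder name from the strawman
        Phase::Link => "--only-link",      // placeholder name from the strawman
    };
    let mut args: Vec<String> = common.iter().map(|s| s.to_string()).collect();
    args.push(phase_flag.to_string());
    args
}

fn main() {
    // Hypothetical flags for a `bin` crate; any real build would supply its own.
    let common = ["src/main.rs", "--crate-type", "bin", "--crate-name", "app"];

    // Step 1: crate-local work only. Can be pipelined and cached.
    println!("rustc {}", rustc_args(Phase::Codegen, &common).join(" "));

    // Step 2: run once all upstream codegen has finished. Almost all linker time.
    println!("rustc {}", rustc_args(Phase::Link, &common).join(" "));
}
```

Keeping the common flags identical between the two calls mirrors the requirement that the `--only-link` step reparses the same command line instead of redoing any compilation work.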