-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dumb NRVO #72205
Dumb NRVO #72205
Conversation
Ah, that's surprising. I thought the perf run in #71003 (comment) indicated that there was a positive effect for some crates. My hypothesis is that having one less local in the MIR makes codegen a bit faster, but the quality of the generated code (at least for |
Yeah, we also thought that the improvements happen because the NRVO pass simplifies MIR enough that it speeds up codegen and metadata/incremental en/decoding of it, and it might also be faster when doing const evaluation on it. |
@bors try @rust-timer queue |
Awaiting bors try build completion |
⌛ Trying commit 20678a2347705402cc4c64b29f5d76b44077e36b with merge 23312ba3b09177c369a0e8ce37563b26993dbee2... |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
So I've verified that we generate much better assembly for the following function: fn foo(init: fn(&mut [u8; 1024])) -> [u8; 1024] {
let mut buf = [0; 1024];
init(&mut buf);
buf
} Obviously, simpler versions that call a concrete function or assign to #62446 is another example of a function that would benefit from even this dumb version of NRVO. |
☀️ Try build successful - checks-azure |
Queued 23312ba3b09177c369a0e8ce37563b26993dbee2 with parent af6d886, future comparison URL. |
Does #57077 or |
The call to If you have a specific example for |
#57077 (comment) playground link for example, but AFAIR it doesn't require any specific steps to trigger the unnecessary memcpy. |
No change in generated ASM for either of the This makes sense to me because the returned data is on the heap. It's not getting stored in the return place by this PR. Only the pointer is. |
Finished benchmarking try commit 23312ba3b09177c369a0e8ce37563b26993dbee2, comparison URL. |
So @rust-lang/wg-mir-opt Do we want to merge this sort of simplistic version of NRVO while we wait for a more robust one? I would need to update debuginfo and also validate the implicit assumption this PR makes about MIR building: that I think this might actually be worthwhile, despite the title of this PR, since it does resolve some open issues and appears to make Also, I want to note that this wasn't possible until relatively recently. #71005 was the last in a series of PRs that help to make this work. Thanks @jonas-schievink! |
Whoa, nice! |
src/librustc_mir/transform/nrvo.rs
Outdated
/// Looks at all basic blocks that are terminated with a `Return` to see what local is copied to | ||
/// the return place in that basic block. If all `Return`-terminated blocks assign the same local | ||
/// to the return place, return that local. | ||
fn single_local_assigned_to_return_place(body: &mut mir::Body<'_>) -> Option<Local> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there are any uses of _0
without it being a projectionless place on the lhs of an assignment, we should also return None
. This may not be a problem right now, but it may very well occur after a combination of applying this optimization, then inlining the function the optimization was applied to, and then applying the optimization on the outer function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is resolved via the IsReturnPlaceUsed
visitor. It also will fire for assignments of projections of _0
, although that's probably not obvious on first glance. I kinda want to change PlaceContext
to remain unchanged instead of falling back to Projection
when we see something like _1.field
. I'd want to introduce a PlaceContext::Deref
though.
@ecstatic-morse Oh, just remembered: What I measured for my PR above is not quite the same as what you measured here: I only ran the NRVO pass on rustc, not on any crates compiled by it, but you always run it. This might explain the performance difference. Your pass is cheap to run and presumably removes enough MIR to speed up further operations (codegen, const eval, (de)serialization). It might not speed up rustc in general though. |
@jonas-schievink Do we run MIR optimizations for check builds? |
At least for const eval, we do, yes |
That could partially explain it, but I'm not convinced that |
Oh it's really gross: if !tcx.crate_name(LOCAL_CRATE).as_str().starts_with("rustc_") {
// Only run this pass on the compiler.
return;
} |
Don't we miss the standard library with that? |
Yeah. I left it out because I didn't think NRVO would do too much on it. Do you think it will benefit from NRVO a lot? |
⌛ Testing commit 856cd66 with merge f52a10e5837d9b7ae995e1548ff729a31c595f51... |
💔 Test failed - checks-azure |
Hmm. Apparently I didn't get to @bors r=oli-obk |
📌 Commit f509862 has been approved by |
💡 This pull request was already approved, no need to approve it again.
|
📌 Commit f509862 has been approved by |
@ecstatic-morse assignment of |
☀️ Test successful - checks-azure |
The final perf win from the landing. Nice work! |
This change replaces the default return variable `$0` with the variable in the outer context where the return value will end up after leaving the function. This saves us an instruction when we compile the trace. More importantly however, this guards us against a future optimisation in rustc that allows SIR to assign to $0 multiple times and at the beginning of a block, which could lead to another function overwriting its value (see rust-lang/rust#72205).
This change replaces the default return variable `$0` with the variable in the outer context where the return value will end up after leaving the function. This saves us an instruction when we compile the trace. More importantly however, this guards us against a future optimisation in rustc that allows SIR to assign to $0 multiple times and at the beginning of a block, which could lead to another function overwriting its value (see rust-lang/rust#72205).
67: Replace return variable with its destination. r=vext01 a=ptersilie This change replaces the default return variable `$0` with the variable in the outer context where the return value will end up after leaving the function. This saves us an instruction when we compile the trace. More importantly however, this guards us against a future optimisation in rustc that allows SIR to assign to $0 multiple times and at the beginning of a block, which could lead to another function overwriting its value (see rust-lang/rust#72205). While we are here, also fix a bug in the variable renaming code. Currently, functions that are called consecutively (i.e. they are not nested) use the same offset to rename their variables (the offset of their outer context), which leads to them sharing the same variable names. By using an accumulator which is continuously increased to calculate the offset, we make sure consecutive functions have increasing variable names, even when after leaving a function the offset is temporarily reset to that of the outer context. Co-authored-by: Lukas Diekmann <[email protected]>
This is a very simple version of an NRVO pass, which scans backwards from the
return
terminator to see if there is an an assignment like_0 = _1
. If a basic block with two or more predecessors is encountered during this scan without first seeing an assignment to the return place, we bail out. This avoids running a full "reaching definitions" dataflow analysis.I wanted to see how much
rustc
would benefit from even a very limited version of this optimization. We should be able to use this as a point of comparison for more advanced versions that are based on live ranges.r? @ghost