Recycle storage after move #61849

tmandry · 2019-06-14T19:09:13Z

We should experiment with "re-allocating" storage for a local after it's moved from if it gets re-initialized. This would enable more optimizations, but could have some potential fallout.

EDIT: See this comment for further explanation of what kinds of optimizations this would enable.

Quoth @RalfJung, from #59123 (comment):

Currently, the following (entirely safe) code will definitely return true:
let mut x = String::new();
let addr_x = &x as *const _ as usize;
drop(x);
// later
x = String::new();
let addr_x2 = &x as *const _ as usize;
return addr_x == addr_x2;
If we want to do optimizations like yours here (and I am totally sympathetic to that), we have to explain in the "Rust Abstract Machine" (and in Miri) why this program might return false. And the answer cannot be "there is UB", because this is safe code.

This is a topic that @nikomatsakis, @eddyb and me have touched on several times already, without ever hashing out a full plan. But in the current state of affairs, the only mechanism we have to "defeat" pointer equality tests like the above is to make sure that this is not the same allocation any more.

So, one thing we might do is to do StorageDead(x); StorageLive(x); immediately after every move. This "re-allocates" x and thus definitely kills any existing pointers and also "defeats" pointer comparisons. The immediate StorageLive is to keep the liveness state in sync in both branches of a conditional (which might or might not be relevant -- unfortunately LLVM's semantics for these intrinsics is less than clear). I guess the StorageLive could be moved down in cases where there is no merging control flow, which should give you your optimization in many cases.

It is possible to do a subset of the optimization discussed in #59123 without this, but this would help cover more cases, in this optimization and others.

cc @cramertj @eddyb @nikomatsakis @RalfJung

The text was updated successfully, but these errors were encountered:

eddyb · 2019-06-18T18:12:59Z

There's further discussion in #59123 (comment), I wish we could just fork the thread.

RalfJung · 2019-06-18T19:48:18Z

@eddyb writes

No, I was referring to the case where we want the semantics of Operand::Move to be that they invalidate any outstanding borrows, similar to Storage{Dead,Live} but without bloating the MIR/impacting analyses which rely on some sort of dominance relationship.

I don't understand your position now. Are you saying you don't mind if source-level moves invalidate borrows, you just don't want it be be encoded into Operand in MIR, but rather something more like StorageDead? That would make sense, I just kept thinking you were worried about source-level moves.

Yes, I'd find it rather strange to have this encoded in the Operand. It would also be non-uniform as Operand does not have to be a local. I could live with encoding basically the same thing in something more like StorageDead. That's explicit and easy to give semantics.

Perhaps surprisingly, I have much less strong opinions about surface language semantics than about MIR semantics.^^ We'd still want to make sure of course to make that translation not too weird -- magic UB-inducing MIR instructions like StorageDead or Retag should be inserted in a predictable way.

eddyb · 2019-06-19T16:34:10Z

Right, the confusing thing was that all along I've assumed you were against unsafe code not being able to reinitialize indirectly (i.e. through a raw pointer) some local that had been moved out of.

I consider the MIR more flexible and even its "place-oriented non-SSA CFG" nature to be subject to change, whereas the surface-level semantics to be the important thing to discuss, UB-wise.

In the short term, to keep things simple, I wouldn't mind a StorageRefresh (name up for bikeshed) after each Operand::Move, that has the effects of a Storage{Dead,Live} pair but doesn't risk breaking certain kinds of analysis (which can just ignore it if they want to remain conservative).

Long-term, if we go in this direction, I would expect us to be shrinking storage ranges, although I'm not sure what to do about the simple notion that a StorageLive dominates all uses - it's almost as if "storage-live" is becoming very close to (or even replaceable with) "(partially) initialized".

RalfJung · 2019-06-21T07:30:06Z

Right, the confusing thing was that all along I've assumed you were against unsafe code not being able to reinitialize indirectly (i.e. through a raw pointer) some local that had been moved out of.

Oh I see. No I am not necessarily opposed to forbid unsafe code from doing that. I just think the way we forbid it should be as clean and principled as we can make it.

I consider the MIR more flexible and even its "place-oriented non-SSA CFG" nature to be subject to change, whereas the surface-level semantics to be the important thing to discuss, UB-wise.

That's fair. On the other hand, I think surface language semantics of a language as complex as Rust is best explained by elaboration into some simpler language. And I think an idealized version of MIR is a reasonable candidate for the target language of this elaboration.

@cramertj

…, r=matthewjasper Don't store locals that have been moved from in generators This avoids reserving storage in generators for locals that are moved out of (and not re-initialized) prior to yield points. Fixes rust-lang#59123. This adds a new dataflow analysis, `RequiresStorage`, to determine whether the storage of a local can be destroyed without being observed by the program. The rules are: 1. StorageLive(x) => mark x live 2. StorageDead(x) => mark x dead 3. If a local is moved from, _and has never had its address taken_, mark it dead 4. If (any part of) a local is initialized, mark it live' This is used to determine whether to save a local in the generator object at all, as well as which locals can be overlapped in the generator layout. Here's the size in bytes of all testcases included in the change, before and after the change: async fn test |Size before |Size after -----------------|------------|---------- single | 1028 | 1028 single_with_noop | 2056 | 1032 joined | 5132 | 3084 joined_with_noop | 8208 | 3084 generator test |Size before |Size after ----------------------------|------------|---------- move_before_yield | 1028 | 1028 move_before_yield_with_noop | 2056 | 1032 overlap_move_points | 3080 | 2056 ## Future work Note that there is a possible extension to this optimization, which modifies rule 3 to read: "If a local is moved from, _**and either has never had its address taken, or is Freeze and has never been mutably borrowed**_, mark it dead." This was discussed at length in rust-lang#59123 and then rust-lang#61849. Because this would cause some behavior to be UB which was not UB before, it's a step that needs to be taken carefully. A more immediate priority for me is inlining `std::mem::size_of_val(&x)` so it becomes apparent that the address of `x` is not taken. This way, using `size_of_val` to look at the size of your inner futures does not affect the size of your outer future. cc @cramertj @eddyb @Matthias247 @nikomatsakis @RalfJung @Zoxc

@cramertj

…jasper Don't store locals that have been moved from in generators This avoids reserving storage in generators for locals that are moved out of (and not re-initialized) prior to yield points. Fixes #59123. This adds a new dataflow analysis, `RequiresStorage`, to determine whether the storage of a local can be destroyed without being observed by the program. The rules are: 1. StorageLive(x) => mark x live 2. StorageDead(x) => mark x dead 3. If a local is moved from, _and has never had its address taken_, mark it dead 4. If (any part of) a local is initialized, mark it live' This is used to determine whether to save a local in the generator object at all, as well as which locals can be overlapped in the generator layout. Here's the size in bytes of all testcases included in the change, before and after the change: async fn test |Size before |Size after -----------------|------------|---------- single | 1028 | 1028 single_with_noop | 2056 | 1032 joined | 5132 | 3084 joined_with_noop | 8208 | 3084 generator test |Size before |Size after ----------------------------|------------|---------- move_before_yield | 1028 | 1028 move_before_yield_with_noop | 2056 | 1032 overlap_move_points | 3080 | 2056 ## Future work Note that there is a possible extension to this optimization, which modifies rule 3 to read: "If a local is moved from, _**and either has never had its address taken, or is Freeze and has never been mutably borrowed**_, mark it dead." This was discussed at length in #59123 and then #61849. Because this would cause some behavior to be UB which was not UB before, it's a step that needs to be taken carefully. A more immediate priority for me is inlining `std::mem::size_of_val(&x)` so it becomes apparent that the address of `x` is not taken. This way, using `size_of_val` to look at the size of your inner futures does not affect the size of your outer future. cc @cramertj @eddyb @Matthias247 @nikomatsakis @RalfJung @Zoxc

tmandry · 2019-07-03T01:01:24Z

To further elaborate on the kinds of optimizations we could do with this..

If a local x is moved from, but

Has only had its address taken using &x
Is Freeze (does not contain interior mutability)

Then we could optimize away its storage after it's moved from. Rust will enforce that any borrow is dead after the move, of course, and transmuting *const to *mut is already considered UB. So in order to do this type of optimization soundly, we must declare that

accessing the old address of x after it is moved is UB
the address of x may change after it is re-initialized

RalfJung · 2019-07-03T21:00:49Z

accessing the old address of x after it is moved is UB

the address of x may change after it is re-initialized

(1) follows from (2). But my concern is that we should specify (2) in an operational way -- that is, we can't just say "the address may change", we have to describe the operational behavior that causes it to change. Something like StorageDead+StorageLive. Something that can be implemented in Miri or similar checkers.

tmandry · 2019-07-03T21:57:00Z

Something like StorageDead+StorageLive. Something that can be implemented in Miri or similar checkers.

Yes, agreed.

In theory, this should also help us answer the question of "who are we breaking with this change?" (Ideally no one, but probably someone.) In practice, though, it doesn't seem feasible to run miri on crater because of slowness.

Maybe if we could produce a smaller list of crates likely to break. All crates that use unsafe, for instance. But,

that list might still be too long
I'm guessing many breakages would actually manifest in tests of downstream crates which depend on the crates we're breaking

tmandry · 2019-09-15T01:15:35Z

I'd like to propose adding something like StorageRefresh(x) which @eddyb suggested earlier inside generators, ideally before async/await stabilizes (but soon after is probably good enough). That way, if we do decide to make this optimization for generators later, we won't be breaking any code that was considered valid by miri at the time it was written.

StorageRefresh would have no effect on codegen, so LLVM couldn't rely on it. Nor will any optimization passes in rustc rely on it at first (this will probably change).

We can also go back and add optimizations for non-generator code, but that's a step we would want to be more careful about. First we would add these StorageRefresh statements to all Rust code (not just generators) and see if people report any unexpected miri validation errors.

This would create a nice "ratcheting" of semantics by helping us keep our options open for generators today, and putting out a canary for non-generator code in the future.

The only problem I see with it is, there are certain unsafe patterns (which should be rare in practice, if they exist at all) which will pass validation outside of generators but fail inside of them. We'd like to discourage any of these patterns anyway, so I don't see this as a big problem.

@rust-lang/wg-async-await

RalfJung · 2019-09-16T18:20:36Z

@tmandry the semantics would be the same as StorageDead(x); StorageLive(x);, right? It's just compressed for performance reasons? That should be clearly documented as such.

The only problem I see with it is, there are certain unsafe patterns (which should be rare in practice, if they exist at all) which will pass validation outside of generators but fail inside of them. We'd like to discourage any of these patterns anyway, so I don't see this as a big problem.

So you plan to only insert this for generators? That's not great, but I can understand why.
Can we have a -Z flag (that Miri will set) to insert them everywhere? We already have -Zmir-emit-retag, this would be similar in spirit. That way, code compiled for / run in Miri would still get the extra checking.

tmandry · 2019-09-16T21:18:11Z

@tmandry the semantics would be the same as StorageDead(x); StorageLive(x);, right? It's just compressed for performance reasons? That should be clearly documented as such.

Sorry this was unclear. The main reason to introduce a new kind of statement it is so that it can be ignored in codegen, and any MIR passes which wish to be conservative. Then over time we can start using it in more passes. Specifically I don't want LLVM relying on this yet, since it could break unsafe code in the wild.

Performance is a secondary benefit.

RalfJung · 2019-09-17T07:04:51Z

it could break unsafe code in the wild.

What kind of code would be broken by this? A concrete example would be nice, ideally with some indication of whether anything like this has been seen in the wild.
Or, to ask basically the same question, where do you want to emit those instructions?

tmandry · 2019-09-17T23:39:43Z

What kind of code would be broken by this?

Any code which does something like this:

let x = String::from("hello");
let x_ptr = &mut x as *mut String;
std::mem::drop(x);
unsafe { *x_ptr = String::from("goodbye"); }

I suppose reading the bytes would be too, but I think that's already UB.

I don't know of any code which does this in the wild, but it may well exist. I don't know of an easy way to check for it today.

Or, to ask basically the same question, where do you want to emit those instructions?

After a local is moved from. Today, just in generators (both to be conservative, and because adding it elsewhere is likely to regress performance).

This is basically your suggestion to emit StorageDead(x); StorageLive(x); after every move, but in a way that lets us be more conservative.

RalfJung · 2019-09-18T06:42:26Z

I see, makes sense.

Having Miri test for this (everywhere, not just in generators) would be a good first step towards finding at least some uses in the wild, I guess.

nikomatsakis · 2019-09-18T19:53:31Z

I do think it'd be great to know whether this pattern exists in the wild, but I also feel pretty strongly that we are going to want to call it UB in the end. It seems like this comes up fairly regularly when discussing possible optimizations.

In any case, I'm in favor of moving forward with steps towards allowing us to recycle memory, but I don't really think this has to be done before/after async-await stabilizes, as I think this kind of code falls pretty squarely into the "grey area" of unsafe code that could well become invalid.

Do we know of even one real instance of such code?

RalfJung · 2019-09-18T20:39:31Z

In case the question was directed at me, I am totally fine making that code UB. I'd just hope we can have it fail in Miri before we do optimizations based on it. ;)

RalfJung · 2020-06-19T06:46:48Z

In terms of recycling storage, I learned some strange things about LLVM:

Eli Friedman writes

If you're talking about the semantics of LLVM, according to the LangRef, two allocas can't have the same address. The rules for alloca specify that the memory is not freed until the function returns. And the rules for llvm.lifetime.start/end don't say anything about the address, only that accessing the memory is undefined outside the lifetime. Therefore, it's illegal for stack coloring to allocate two variables whose address escapes at the same address.

That would mean that even with StorageEnd, if the address if a local gets observed, LLVM would not be allowed to put another local at the same address -- which seems like a problem for us because we do want LLVM to reuse those addresses.

tmiasko · 2021-07-03T09:28:41Z

Code generation of StorageRefresh as a pair of lifetime.end and lifetime.start in LLVM makes it possible to detect use after move with AddressSanitizer and MemorySanitizer. That would be quite useful to canvas the usage in the wild and detect bugs.

AddressSanitizer detecting use after move

#[derive(Debug)]
struct Wrapper<T: ?Sized>(T);

fn main () {
    let mut w = Wrapper("A");
    let p: *const _ = &w as *const _;
    println!("{:?}", unsafe { &*p});
    drop(w);
    println!("{:?}", unsafe { &*p});
}

     Running `target/x86_64-unknown-linux-gnu/debug/use-after-move`
Wrapper("A")
=================================================================
==184053==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffe619d05c0 at pc 0x563fd541589f bp 0x7ffe619ce700 sp 0x7ffe619ce6f8
READ of size 8 at 0x7ffe619d05c0 thread T0
    #0 0x563fd541589e in <&str as core::fmt::Debug>::fmt library/core/src/fmt/mod.rs:2033:71
    ...
    #13 0x563fd4f65f9d in use_after_move::main src/main.rs:9:5
    ...

Address 0x7ffe619d05c0 is located in stack of thread T0 at offset 384 in frame
    #0 0x563fd4f65a0f in use_after_move::main src/main.rs:4

  This frame has 9 object(s):
    [32, 40) '_34' (line 9)
    [64, 72) '_32' (line 9)
    [96, 112) '_31' (line 9)
    [128, 176) '_24' (line 9)
    [208, 216) '_16' (line 7)
    [240, 248) '_14' (line 7)
    [272, 288) '_13' (line 7)
    [304, 352) '_6' (line 7)
    [384, 400) 'w' (line 5) <== Memory access at offset 384 is inside this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-scope library/core/src/fmt/mod.rs:2033:71 in <&str as core::fmt::Debug>::fmt

Potential utility for LLVM stack coloring to reduce stack usage is a different question. As far as I can see, in the current implementation of stack coloring, such a pair of intrinsics doesn't shorten a live range. In fact, it could do the opposite.

If an alloca is used with more than a single pair of lifetime intrinsics, the alloca is considered to be "degenerate" and processed in more conservative fashion. The effective live range of "degenerate" alloca starts with lifetime.start, while for non-"degenerate" allocas it is shortened towards the first use (which itself seems hard to explain operationally, but that's for another discussion).

tmandry mentioned this issue Jun 14, 2019

Generator size: unwinding and drops force extra generator state allocation #59123

Closed

tmandry mentioned this issue Jun 21, 2019

Don't store locals that have been moved from in generators #61922

Merged

RalfJung mentioned this issue Oct 9, 2019

When are things guaranteed to have unique addresses? rust-lang/unsafe-code-guidelines#206

Open

nikomatsakis mentioned this issue Oct 9, 2019

Cannot await in scope that contains call to format! #64960

Closed

tmandry mentioned this issue Mar 22, 2020

StorageLive (and even StorageDead) may be unnecessary in MIR. #68622

Open

RalfJung mentioned this issue May 11, 2020

Avoid Operand::Copy with &mut T #72093

Merged

RalfJung mentioned this issue May 31, 2020

[validate-mir] validate that all accessed locals are initialized #72803

Closed

tmandry mentioned this issue Apr 22, 2022

rebinding values in an async function can double the memory usage #96084

Open

This was referenced Jul 9, 2022

Clarify MIR semantics of storage statements #99050

Merged

Should we allow StorageLive on a live local? #99160

Closed

tmandry mentioned this issue Jul 13, 2022

Resolve the interaction between StorageDead, drops, and unwinding #61015

Closed

workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Recycle storage after move #61849

Recycle storage after move #61849

tmandry commented Jun 14, 2019 •

edited

Loading

eddyb commented Jun 18, 2019

RalfJung commented Jun 18, 2019

eddyb commented Jun 19, 2019

RalfJung commented Jun 21, 2019

tmandry commented Jul 3, 2019

RalfJung commented Jul 3, 2019

tmandry commented Jul 3, 2019

tmandry commented Sep 15, 2019 •

edited

Loading

RalfJung commented Sep 16, 2019

tmandry commented Sep 16, 2019 •

edited

Loading

RalfJung commented Sep 17, 2019 •

edited

Loading

tmandry commented Sep 17, 2019 •

edited

Loading

RalfJung commented Sep 18, 2019

nikomatsakis commented Sep 18, 2019

RalfJung commented Sep 18, 2019

RalfJung commented Jun 19, 2020

tmiasko commented Jul 3, 2021

Recycle storage after move #61849

Recycle storage after move #61849

Comments

tmandry commented Jun 14, 2019 • edited Loading

eddyb commented Jun 18, 2019

RalfJung commented Jun 18, 2019

eddyb commented Jun 19, 2019

RalfJung commented Jun 21, 2019

tmandry commented Jul 3, 2019

RalfJung commented Jul 3, 2019

tmandry commented Jul 3, 2019

tmandry commented Sep 15, 2019 • edited Loading

RalfJung commented Sep 16, 2019

tmandry commented Sep 16, 2019 • edited Loading

RalfJung commented Sep 17, 2019 • edited Loading

tmandry commented Sep 17, 2019 • edited Loading

RalfJung commented Sep 18, 2019

nikomatsakis commented Sep 18, 2019

RalfJung commented Sep 18, 2019

RalfJung commented Jun 19, 2020

tmiasko commented Jul 3, 2021

tmandry commented Jun 14, 2019 •

edited

Loading

tmandry commented Sep 15, 2019 •

edited

Loading

tmandry commented Sep 16, 2019 •

edited

Loading

RalfJung commented Sep 17, 2019 •

edited

Loading

tmandry commented Sep 17, 2019 •

edited

Loading