Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolve the interaction between StorageDead, drops, and unwinding #61015

Closed
cramertj opened this issue May 21, 2019 · 20 comments
Closed

Resolve the interaction between StorageDead, drops, and unwinding #61015

cramertj opened this issue May 21, 2019 · 20 comments
Labels
A-coroutines Area: Coroutines A-destructors Area: Destructors (`Drop`, …) A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@cramertj
Copy link
Member

#60840 introduces the expectation that once drop runs on a local, it is UB for that local to be accessed, regardless of the success or failure (panic) of the drop. Unfortunately, adding extra StorageDeads for all of these values caused a performance regression, so at the moment we don't do this. #60840 instead uses drop to indicate that a value is implicitly StorageDead. However, this interacts poorly with drop elaboration, which can turn a drop of a structure into drops of its fields.

cc @Zoxc @tmandry @RalfJung

@jonas-schievink jonas-schievink added A-destructors Area: Destructors (`Drop`, …) A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels May 21, 2019
@tmandry
Copy link
Member

tmandry commented May 21, 2019

Also, for locals which are not Drop, #60840 adds StorageDead, but only on generator bodies. Adding it for all functions would be a big performance regression.

Technically, we could also emit StorageDead for locals which are Drop in generators, but it would have been slower (unclear how much) and would have required more significant changes to the MIR generation code.

@RalfJung
Copy link
Member

To repeat what I said in the other thread:

This changes the semantics of Drop to implicitly have a StorageDead in both the return and the unwind edge. If we decide to go with this approach, please open an issue in Miri so that we can implement this in Miri's semantics. It would also be helpful if you could come up with some example code that should be flagged as UB.

I have a few questions about this:

  • Drop works on an arbitrary place, StorageDead works on a local. I suppose the new semantics only occurs when the place is exactly a local, and does not involve any projections or so?

  • In early passes of MIR construction (before drop flag elaboration), Drop is actually a conditional drop. Does the StorageDead only occur if the Drop happens, or does it always occur? Are we sure that drop flag elaboration is semantics-preserving under whatever choice we make here (and document, I hope)?

  • What about DropAndReplace?

However, this interacts poorly with drop elaboration, which can turn a drop of a structure into drops of its fields.

To elaborate, the problem here is that this removes a StorageDead, and it is not clear that that is a correct program transformation.

@RalfJung
Copy link
Member

Cc @rust-lang/wg-unsafe-code-guidelines

@RalfJung
Copy link
Member

Regarding DropAndReplace, @eddyb writes

Also, I thought DropAndReplace expanded to a Drop and an assignment, wouldn't this make that assignment UB?

@tmandry that's why I asked about DropAndReplace above. What is your analysis doing for them? Also can someone reading this show an example where DropAndReplace gets used? Miri never sees them AFAIK, so they presumably get lowered away to something else at some point.

@eddyb
Copy link
Member

eddyb commented May 23, 2019

@RalfJung Any potentially-drop-needing source-level assignment of an already-initialized place should result in a DropAndReplace - e.g. let mut s = String::new(); s = String::from("foo");.
AFAIK (but @arielb1 / @pnkfelix might know better), the DropAndReplace is rewritten in drop elaboration to a Drop (of the previous value) followed by an Assign (of the new value).

@eddyb
Copy link
Member

eddyb commented May 23, 2019

Oh, also, #61060 is open now (and @RalfJung already left a comment on it: #61060 (comment)).

@RalfJung
Copy link
Member

Any potentially-drop-needing source-level assignment of an already-initialized place should result in a DropAndReplace - e.g. let mut s = String::new(); s = String::from("foo");.
AFAIK (but @arielb1 / @pnkfelix might know better), the DropAndReplace is rewritten in drop elaboration to a Drop (of the previous value) followed by an Assign (of the new value).

I see, thanks. I guess we do guarantee that this keeps the address stable? Seems strange not to. And indeed see this example for a function where making the Drop implicitly deallocate the storage would be rather fatal!

In that case, a drop generated from a DropAndReplace should not be considered as implicitly doing StorageDead, it seems. So maybe the Drop terminator should carry a bool indicating whether the storage is doing to be reused, and @tmandry's analysis only looks for those that say it will not get reused?

@eddyb
Copy link
Member

eddyb commented May 24, 2019

I talked some more with @tmandry and it seems like this entire direction arose from the assumption that it would be easier than inserting StorageDeads on the unwind path.
But StorageDead in unwind (aka "cleanup") blocks is much easier to ignore (e.g. in LLVM codegen), if necessary, compared to ignoring the extra semantics attached to Drops when they don't apply.

So I think we should just insert StorageDead in the cleanup path as well.

@RalfJung
Copy link
Member

I'm certainly in favor of explicit StorageDead over implicit side-effects. :)

@arielb1
Copy link
Contributor

arielb1 commented Jun 2, 2019

I also think that a Drop should be just-a-Drop and you'll need a StorageDead to be able to recycle the address.

@tmandry
Copy link
Member

tmandry commented Jul 8, 2019

It seems we can emit StorageDead in all the cases we care about optimizing. However, there are some very measurable compiler performance hits to doing so (see this comparison for what happens when we enable StorageDead along unwind paths for all functions). Ignoring these StorageDead during codegen did not solve the performance issue, which is why we only emit them for generators today.

One alternative is to use LLVM's stack coloring scheme by @arielb1, merged in https://reviews.llvm.org/D31583. The basic idea is to look for a path between StorageLive(x) and StorageLive(y) (or vice versa) to decide whether x and y are ever StorageLive at the same time. This means we won't have to care about StorageDead.

Some have raised the concern that if we don't emit StorageDead, then the MIR we emit is less correct/complete. But that's already the case today (except for in generators where we emit StorageDead everywhere).

@RalfJung
Copy link
Member

Some have raised the concern that if we don't emit StorageDead, then the MIR we emit is less correct/complete. But that's already the case today (except for in generators where we emit StorageDead everywhere).

If we don't emit StorageDead, that basically means there is an implicit StorageDead when the function returns. That seems fine for me.

#42371 is a real problem IMO, but likely unrelated.

@jonas-schievink jonas-schievink added C-enhancement Category: An issue proposing an enhancement or a PR with one. T-lang Relevant to the language team, which will review and decide on the PR/issue. A-coroutines Area: Coroutines labels Mar 21, 2020
@RalfJung
Copy link
Member

RalfJung commented Jul 9, 2022

@tmandry What is the current status of this? It looks like this needs a special exception to be added to what #99050 says? Also Cc @JakobDegen

Drop is meant to call a destructor, which generally has nothing to do with actually deallocating the underlying storage, so I still think it is a bad idea to conflate those two operations. If this is just a question of MIR representation, we could investigate alternative schemes to encode these extra StorageDead more efficiently. For example, in BasicBlockData, we could encode a list of locals that are implicitly StorageDead when entering this block.

@tmandry
Copy link
Member

tmandry commented Jul 13, 2022

It looks like the approach in #60840 was generalized to consider all locals that were moved as not requiring storage. This is only done if the local is proven not to be aliased at the point it was moved. The check is done for both statements and terminators. (It's unclear to me how this works with DropAndReplace.)

Drop is meant to call a destructor, which generally has nothing to do with actually deallocating the underlying storage, so I still think it is a bad idea to conflate those two operations.

Separating concerns sounds like a good thing. If we can do it without regressing compiler performance too much we should probably do it. If only for the reason that it simplifies the code and mental models of how these MIR transforms work.

On the other hand I still think the generator transform is sound, because of the MaybeBorrowedLocals alias check I mentioned above. This is like other optimizations in that we give up if there's a chance of changing the meaning of your code. If you can possibly observe the storage "disappearing" and "reappearing" later we will fall back and handle it conservatively. There shouldn't be any exception needed for what #99050 says.

It might be nice to find ways of being more aggressive in future optimizations, but I think that's a separate discussion (#61849).

@RalfJung
Copy link
Member

RalfJung commented Jul 14, 2022

This is like other optimizations in that we give up if there's a chance of changing the meaning of your code.

Oh, I see -- then I guess I misunderstood. The original issue description says

#60840 instead uses drop to indicate that a value is implicitly StorageDead.

That sounded to me like a semantics-changing assumption, not an optimization backed by an analysis which ensures that it does not change program behavior.

Are the semantics of borrowed_locals described somewhere? Is any local that ever had & (or addr_of!) applied to it (or any part of it) considered "borrowed" perpetually, and hence exempt from this "storage dead on move" rule?

What about the issue that Drop itself "borrows" the local, i.e., makes its address visible to user-defined code? I guess we could argue that we have implicitly moved the local to some other temporary location, did the drop there, and then deallocated that location, which would keep the actual local still never-borrowed -- but this would be something that we'd have to actually describe in a precise specification of drop elaboration.

@RalfJung
Copy link
Member

What about the issue that Drop itself "borrows" the local, i.e., makes its address visible to user-defined code?

Looking at the analysis, it actually takes this into account. :)

mir::TerminatorKind::Drop { place: dropped_place, .. }
| mir::TerminatorKind::DropAndReplace { place: dropped_place, .. } => {
// Drop terminators may call custom drop glue (`Drop::drop`), which takes `&mut
// self` as a parameter. In the general case, a drop impl could launder that
// reference into the surrounding environment through a raw pointer, thus creating
// a valid `*mut` pointing to the dropped local. We are not yet willing to declare
// this particular case UB, so we must treat all dropped locals as mutably borrowed
// for now. See discussion on [#61069].
//
// [#61069]: https://github.com/rust-lang/rust/pull/61069
self.trans.gen(dropped_place.local);

but then, all locals that are dropped are also borrowed, which means that a Drop can never signal that storage is not needed again, which seems to contradict the original issue description?

@tmiasko
Copy link
Contributor

tmiasko commented Jul 14, 2022

The implementation described in the original comment was reverted in #61373 and replaced / complemented with emission of storage markers along unwind paths.

@RalfJung
Copy link
Member

Oh, okay. So... is this issue resolved then? What is still being tracked here?

Certainly the issue description is entirely outdated. It should be updated, or the issue closed.

@tmandry
Copy link
Member

tmandry commented Jul 14, 2022

Ah thanks @tmiasko, I thought I remembered changing course at one point but it was a long time ago.

Then I don't think there's anything left to track and I agree the issue should be closed.

@tmandry tmandry closed this as completed Jul 14, 2022
@RalfJung
Copy link
Member

Indeed, thanks a lot for resolving this. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-coroutines Area: Coroutines A-destructors Area: Destructors (`Drop`, …) A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

7 participants