-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use MaybeUninit to produce more correct layouts #63035
Conversation
9f71527
to
678f5b0
Compare
This is great! I think this means you can remove the following: rust/src/librustc_mir/interpret/visitor.rs Lines 251 to 257 in 0cfb2f7
|
Unfortunately I don't think we can do this yet. The original types will still be reported by Maybe there's a way of doing this without requiring those projections. Otherwise, the "right way" to implement this is to decide at MIR transform time which fields are aggregate types and should be wrapped in Note for my future self: we don't have to do any of this for the prefix fields, because the layout describes them as being members of variants, not the generator struct itself. We wrap those prefix fields in |
678f5b0
to
516f3e5
Compare
This prevents uninhabited fields from "infecting" the abi and largest_niche of the generator layout. This fixes a latent bug, where an uninhabited field could be promoted to the generator prefix and cause the entire generator to become uninhabited.
516f3e5
to
9892845
Compare
@RalfJung If we were to make #60450 a hard error, I think that would mean that aggregates don't have to be considered MaybeUninit, does that seem correct? We would also have to decide that this won't be changed in the future (see #54987 (comment)). |
I understand -- but then I don't understand what this PR even changes.
I don't think I understand the connection. Last time I checked, |
It keeps layouts from being incorrectly marked with There are more efficient ways of accomplishing this. I thought we were going to use
Okay, so a nice property that was introduced in #60187 is that we only add a generator field to the layout variants which correspond to the suspension points it is live across. Not storage-live, but actually live (i.e. it has been initialized, and that value is potentially going to be used again in a subsequent resume, before the next time it is reinitialized). If my thinking is correct, this is a strong property that gets us most of the way to not needing The only remaining problem is aggregates, which can be partially initialized across a |
I see.
Ah. Yes that sounds like it would make a difference. It would be great if you could come up with a few interesting examples for this (something you would consider corner-cases of this treatment) and add them as tests to Miri here or here. That would help check if
The way you put it, yes, this sounds reasonable. But this is subtle stuff and I have nowhere near anything like a good overview of the things that are going on here. |
src/librustc/ty/layout.rs
Outdated
// MaybeUninit above. | ||
let layout = layout?; | ||
if layout.is_aggregate() { | ||
self.layout_of(tcx.mk_maybe_uninit(layout.ty)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO this is a bad idea: this won't properly trigger for newtypes of immediates (scalars/vectors).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the correct way would be with a type visitor then.
Or it's probably best to focus efforts on making #60450 a hard error, so we don't need this hack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing this in favor of #63230
ty, | ||
&prefix_layouts, | ||
&ReprOptions::default(), | ||
StructKind::AlwaysSized, | ||
)?; | ||
// FIXME(eddyb) need `MaybeUninit` around promoted types (see above). | ||
prefix.largest_niche = None; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should add a test for some generator that captures a bool
or something, that it's the same size when wrapped in Option
.
In the future, the state/discriminant field would have the right validity range and would be used itself as the niche in many situations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm adding a test, but I'm afraid that if we move the upvars into a variant (#62958 (comment)) this test won't work anymore. =/
9892845
to
9d4ca87
Compare
Sure, though I'm not sure how to test "things are really initialized when we say they are". I suppose I could test the negative case, by writing an invalid value to a local which is supposed to be initialized and then return, expecting miri validation to catch it (once the hack is removed). Does the test harness support that? The treatment of uninhabited types would make an interesting testcase, too. Though you would get that and more if you ran miri on the rustc run-pass test suite. |
Primarily I was thinking of code that should pass. After all, the point of this PR is to make some code correctly work that didn't work correctly before, right? Something where a generator is created or suspended in a state where some of its fields are not yet initialized. That's how I originally found out about this issue.
Miri has compile-fail tests for this. However, right now we treat generators as unions and don't validate their fields (that is the code I linked above), so to catch such errors we would have to change that first. This would also be tricky to write because we have to circumvent the validation that happens inside the generator itself. |
This is one of the behaviors we no longer allow in NLL. Since it can lead to undefined behavior, I think it's definitely worth making it a hard error without waiting to turn off migration mode (rust-lang#58781). Closes rust-lang#60450. My ulterior motive here is making it impossible to leave variables partially initialized across a yield (see discussion at rust-lang#63035), so tests are included for that.
…ized, r=Centril Make use of possibly uninitialized data [E0381] a hard error This is one of the behaviors we no longer allow in NLL. Since it can lead to undefined behavior, I think it's definitely worth making it a hard error without waiting to turn off migration mode (rust-lang#58781). Closes rust-lang#60450. My ulterior motive here is making it impossible to leave variables partially initialized across a yield (see rust-lang#60889, discussion at rust-lang#63035), so tests are included for that. cc rust-lang#54987 --- I'm not sure if bypassing the buffer is a good way of doing this. We could also make a `force_errors_buffer` or similar that gets recombined with all the errors as they are emitted. But this is simpler and seems fine to me. r? @Centril cc @cramertj @nikomatsakis @pnkfelix @RalfJung
…ized, r=Centril Make use of possibly uninitialized data [E0381] a hard error This is one of the behaviors we no longer allow in NLL. Since it can lead to undefined behavior, I think it's definitely worth making it a hard error without waiting to turn off migration mode (rust-lang#58781). Closes rust-lang#60450. My ulterior motive here is making it impossible to leave variables partially initialized across a yield (see rust-lang#60889, discussion at rust-lang#63035), so tests are included for that. cc rust-lang#54987 --- I'm not sure if bypassing the buffer is a good way of doing this. We could also make a `force_errors_buffer` or similar that gets recombined with all the errors as they are emitted. But this is simpler and seems fine to me. r? @Centril cc @cramertj @nikomatsakis @pnkfelix @RalfJung
The change looks fine to me. I mean treating some value as a union in layout computation is always fine, just pessimistic. So the only "dangerous" change is removing the niche override, but that's also ok now. I'm guessing that All in all, I think it's fine, but I don't have the best overview of downstream effects. |
Yep.
It means that according to the type system the type hasn't changed. We're just "pessimizing" the layout computation so it doesn't do thinks like propagate Uninhabited abi's to the outer type. The reason for this change is that the "promoted" fields might indeed be uninitialized, so it should give us the correct behavior with respect to layouts. |
Isn't the type system based on It also means that some code looking at the field's types will see |
It's quite subtle, so sorry for not giving a more detailed explanation before. The promoted fields we are wrapping in this PR are still only part of a variant in the final layout. So the layout is correct in that each field (including the promoted ones) will be fully initialized when that variant is active. This PR wraps these promoted fields in During this phase, we want any uninhabited upvars to cause our generator to have |
I know "upvar" from closures, and I think that's just a captured variable, right? What's a "promoted local", does that have anything to do with static promotion? |
No, it's just a term within the layout computation. There are two zones of a generator: the non-overlap zone ("prefix") and the overlap zone. Saved locals default to being in the overlap zone. If two locals in the overlap zone aren't allowed to overlap with each other, but might, one of them is "promoted" to the non-overlap zone. These "promoted" locals were causing trouble, because of a bug in the layout computation that treated them as a field of the generator itself instead of a field of one variant. This wasn't correct with respect to uninhabitedness and largest niche. I'm now realizing that there needs to be some symmetric kind of change which propagates uninhabitedness of the promoted fields out to the variants which contain them. Largest niche I'm less sure about, since a promoted field can be a member of more than one variant at the same time. |
📌 Commit 9d4ca87 has been approved by |
I chatted with @eddyb about this and they said it's safer not to do it, so let's skip it for now. |
Rollup of 9 pull requests Successful merges: - #63034 (Fix generator size regressions due to optimization) - #63035 (Use MaybeUninit to produce more correct layouts) - #63163 (add a pair of whitespace after remove parentheses) - #63294 (tests for async/await drop order) - #63307 (don't ignore mir_dump folder) - #63308 (PlaceRef's base is already a reference) - #63310 (Tests around moving parts of structs and tuples across await points) - #63314 (doc: the content has since been moved to the guide) - #63333 (Remove unnecessary features from compiler error code list) Failed merges: r? @ghost
@tmandry would be great to get some of these explanations written down somewhere, probably in the rustc-guide, and/or appropriate places in the layout computation code? I am still a bit confused by the two zones, it would be worth explaining how they differ. My guess is that one zone is basically a struct and one is an enum (with one variant per "generator state" aka yield point), is that correct? And the non-overlap zone is the struct part? But then it seems odd to say that locals are in the enum part unless we figure out they cannot be there; from a safety perspective it seems they should be in the struct part unless we can prove that they can be moved to the enum part? |
@bors retry r- |
@RalfJung There is some explanation in the code, along with inline comments in the implementation below: rust/src/librustc/ty/layout.rs Lines 1285 to 1302 in 647ed20
These are implementation details I'd like to change at some point, but it's possible they may persist for some time. See also the PR description for #60187, which probably has the best summary.
Yes.
We rule out some vars initially, but the main constraint is that we have a matrix |
You mean fields that are allowed to coexist and hence must not overlap? Or am I mixing something up here? So every field is either in exactly one variant (if it is in the overlap part) or in all of them? That is a surprising restriction and might have contributed to my confusion. After all, there is a continuum between this: a field could be in 2 out of N variants, so it can still overlap with fields from the remaining variants. I suppose that's not done to keep everything simpler? Might be worth putting that somewhere in the introduction for this documentation: In general, every variant corresponds to a yield point and every variable (field) is live across some yield points, and then it will be in exactly those variants, and the task in layout would be to find the most compact way to represent this. But to keep things simpler, we add all fields either in one variant or in all of them. But then I wonder why this needs any matrix... the variables that can be overlapped are exactly those that are live for only one yield point, are they not? I must still be missing something, sorry. :/ But at least the PR makes sense now: fields in the prefix are basically in all variants, including those in which they are not actually live, and hence must be considered |
Sorry, I meant "overlap". I edited the original message to make this clearer.
No, fields are only in the variants that need them. That can indeed mean a field is in 2 or 3 variants out of many. What I am saying about the "zones" only applies to the byte layout of fields in the generator. Nothing needs to care about this except the layout code. Think of the variant as a "view" into this structure. If instead you had said, "every field has space reserved either in exactly one variant or in all of them," then that would be correct. But again, this is an implementation detail of the layout code. From the point of view of a MIR consumer, if I look at what fields are present in a given variant, I will only see the fields that are live across the corresponding yield point. The rest of the bytes will just look to me like padding between fields.
Whether or not two variables can overlap is determined by analysis on the MIR. Here's a case where two vars are each live for a single yield point, but cannot overlap: || {
let x = "hello".to_string();
yield;
let y = "world".to_string();;
drop(x);
yield;
drop(y);
} We have to do dataflow analysis to see that |
Ah, I think now it clicked, finally! Thanks. Maybe this feedback can help improve the docs for the next time you need to explain this. ;)
Awesome. So then indeed I suppose an intersting testcase for Miri would be one where there is an uninhabited type live across an unreachable yield point... fn test(b: bool, x: fn() -> !) {
yield;
if b { return; }
let val = x();
yield;
drop(val);
} Except this is a ZST so maybe that actually doesn't make it interesting? Not sure how smart this all is. |
Yes, noted :)
Yeah, I was thinking of a case just like this. FYI, we have tests like this in rust here. |
Add generator, async tests with uninhabited saved local See discussion in rust-lang/rust#63035.
cc #60889 |
This fixes a latent instance of #59972 where we would promote an uninhabited field to the generator prefix, causing the entire generator to be uninhabited.
r? @eddyb
cc @cramertj @Zoxc