Fix variant index / discriminant confusion in uninhabited enum branching #89764
@RalfJung could you check if this makes sense from MIR semantics perspective. Thanks. |
I was looking into whether it is possible to construct an example demonstrating bugs in The use of variant index instead of discriminant for single variant layout Alternatively, one could construct a case where one variant becomes uninhabited The user-written match on discriminant doesn't trigger the optimization either

    #![feature(core_intrinsics)]
    use core::intrinsics::discriminant_value;
    pub enum E { A = 1 }
    pub fn f(e: &E) -> u32 {
        match {discriminant_value(e)} {
            1 => 1,
            _ => 0,
        }
    }
    fn main() {
        assert_eq!(f(&E::A), 1);
    }

So this might not be an immediate soundness concern, but who knows ... |
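For readers without a nightly toolchain, the following is a rough stable analogue of the example above. It is only a sketch: it assumes that an `as` cast on a fieldless enum is an acceptable stand-in for the `discriminant_value` intrinsic, and the `Clone, Copy` derive is added here purely so the cast can be applied through a reference. Whether it exercises the same MIR pass is not something this sketch claims.

```rust
// Stable sketch of the nightly example; the derive is an addition for illustration.
#[derive(Clone, Copy)]
pub enum E {
    A = 1,
}

pub fn f(e: &E) -> u32 {
    // The cast yields the declared discriminant (1), not the variant index (0).
    match *e as u32 {
        1 => 1,
        _ => 0,
    }
}

fn main() {
    assert_eq!(f(&E::A), 1);
}
```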
I'm sorry, but it is rather hard to figure out from a diff of a MIR optimization what the semantic question is. Could you be more explicit? Sadly I won't have time to review this PR.
This (for the 2nd time today^^) reminds me of discussions with @eddyb about enum layouts... yeah,
The |
I think the question is: is it valid to turn

    bb1: {
        _2 = discriminant(_1); // typeof(_1) is an uninhabited enum
        switchInt(move _2) -> [0_isize: bb3, 1_isize: bb4, otherwise: bb2];
    }

into

    bb1: {
        _2 = discriminant(_1); // typeof(_1) is an uninhabited enum
        unreachable;
    }

At a glance, that seems valid to me since if the type is uninhabited, we can't have a valid value in |
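To make the shape of this rewrite concrete, here is a toy model in plain Rust. It is not rustc code: `Terminator`, `prune`, and the `allowed` slice are hypothetical names invented for this sketch, and it bakes in the very assumption under discussion, namely that switching on a discriminant no inhabited variant can produce may be treated as unreachable.

```rust
// Toy stand-in for a MIR terminator; all names here are hypothetical.
#[derive(Debug, PartialEq)]
enum Terminator {
    SwitchInt { targets: Vec<(u128, usize)>, otherwise: usize },
    Unreachable,
}

/// Drop switch targets whose value is not the discriminant of an inhabited
/// variant. With no inhabited variants at all, the whole switch is replaced
/// by `Unreachable`.
fn prune(term: Terminator, allowed: &[u128]) -> Terminator {
    match term {
        Terminator::SwitchInt { targets, otherwise } => {
            if allowed.is_empty() {
                return Terminator::Unreachable;
            }
            let targets = targets
                .into_iter()
                .filter(|(value, _)| allowed.contains(value))
                .collect();
            Terminator::SwitchInt { targets, otherwise }
        }
        other => other,
    }
}

fn main() {
    let switch = || Terminator::SwitchInt { targets: vec![(0, 3), (1, 4)], otherwise: 2 };
    // Uninhabited enum: no discriminant is allowed.
    assert_eq!(prune(switch(), &[]), Terminator::Unreachable);
    // Only the variant with discriminant 1 is inhabited.
    assert_eq!(
        prune(switch(), &[1]),
        Terminator::SwitchInt { targets: vec![(1, 4)], otherwise: 2 }
    );
}
```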
I guess the question is if we want to consider MIR like this to be UB:
Due to #89765, Miri would currently ICE on such MIR. My preferred solution in that issue would say that asking for the discriminant of But I take it from your question that you would prefer to make this UB? What about the case where |
The semantics question would be: if MIR reads an enum discriminant and switches on its value, do we require the discriminant value to be valid, and are valid discriminants those that correspond to inhabited enum variants only? (Note that this pass is currently concerned only with enums). As far as I understand, the LLVM code generation already makes this assumption. If the type is uninhabited, trying to read a discriminant produces undef, and subsequently branching on it introduces undefined behaviour. If an enum is inhabited, there is range metadata attached when loading the encoded discriminant, and it accounts for uninhabited variants as well (although it is only a single range). The main motivation for this PR was to fix a mismatch between variant index and variant discriminant use, but given the current behavior of I also wanted to make sure we are on the same page with regards to the semantics, especially given #89765. It seems perfectly fine to me to make this more conservative if need be. |
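As a surface-level illustration of an enum with an uninhabited variant, consider `Result<u32, Infallible>`: the `Err` variant can never be constructed, so a match arm for it can never run. This is a minimal sketch in stable Rust; the function name `unwrap_infallible` is made up for the example, and it makes no claim about what MIR the compiler actually emits for it.

```rust
use std::convert::Infallible;

/// `Err` holds an uninhabited type, so only `Ok` can ever be observed.
fn unwrap_infallible(r: Result<u32, Infallible>) -> u32 {
    match r {
        Ok(v) => v,
        // An empty match on an uninhabited value type-checks as any type.
        Err(never) => match never {},
    }
}

fn main() {
    assert_eq!(unwrap_infallible(Ok(5)), 5);
}
```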
It does? Is that some explicit case in our codegen backend? |
rust/compiler/rustc_codegen_ssa/src/mir/place.rs Lines 206 to 215 in 81117ff
|
Okay, that does not even check if the type is an enum. So either we need to special-case uninhabited types in Miri (and the semantics) or we need to say that the operand needs to satisfy the validity invariant. I think the former is a bad idea, the only thing that is special about uninhabited types is that their validity invariant is never satisfied. |
If the current state is that there is a basis for the code generation and this MIR pass (in the validity requirement for the operand), but we have yet to decide this more formally, I think it would make sense to land this, since it is a bug fix and does not make any new assumptions. At the same time we can independently consider whether we want to continue relying on this semantics, and whether the code generation and this MIR pass both need to be changed. |
Fair -- but then Miri should also be changed to actually check the validity invariant in |
Sorry for the delay, just getting caught up on GH notifications now.
@RalfJung do you think we should change Miri before/in conjunction with this PR or are you simply saying we should do that at some point in the future? |
I am doing the Miri change in #90895. |
Thanks @RalfJung! |
…=oli-obk

require full validity when determining the discriminant of a value

This resolves (for now) the semantic question that came up in rust-lang#89764: arguably, reading the discriminant of a value is 'using' that value, so we are in our right to demand full validity. Reading a discriminant is somewhat special in that it works for values of *arbitrary* type; all the other primitive MIR operations work on specific types (e.g. `bool` or an integer) and basically implicitly require validity as part of just "doing their job".

The alternative would be to just require that the discriminant itself is valid, if any -- but then what do we do for types that do not have a discriminant, which kind of validity do we check? [This code](https://github.com/rust-lang/rust/blob/81117ff930fbf3792b4f9504e3c6bccc87b10823/compiler/rustc_codegen_ssa/src/mir/place.rs#L206-L215) means we have to at least reject uninhabited types, but I would rather not special case that.

I don't think this can be tested in CTFE (since validity is not enforced there), I will add a compile-fail test to Miri:

```rust
#[allow(enum_intrinsics_non_enums)]
fn main() {
    let i = 2u8;
    std::mem::discriminant(unsafe { &*(&i as *const _ as *const bool) }); // UB
}
```

(I tried running the check even on the CTFE machines, but then it runs during ConstProp and that causes all sorts of problems. We could run it for ConstEval but not ConstProp, but that simply does not seem worth the effort currently.)

r? `@oli-obk`
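For contrast with the deliberately invalid use in that test, a well-defined use of `std::mem::discriminant` on valid values might look like the sketch below. The `Shape` enum and `same_variant` helper are hypothetical names chosen for illustration.

```rust
use std::mem::discriminant;

// Hypothetical enum used only for this illustration.
enum Shape {
    Circle(f64),
    Square(f64),
}

/// Compare which variant two values are in, ignoring their payloads.
fn same_variant(a: &Shape, b: &Shape) -> bool {
    discriminant(a) == discriminant(b)
}

fn main() {
    assert!(same_variant(&Shape::Circle(1.0), &Shape::Circle(2.0)));
    assert!(!same_variant(&Shape::Circle(1.0), &Shape::Square(1.0)));
}
```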
It seems like MIR building does not agree with the idea that only valid values have their discriminant read: #91029. |
Furthermore, this triggers a Stacked Borrows violation here because This seems like a pretty clear sign that reading a discriminant should do the minimal amount of work necessary to determine the discriminant value -- but then there is no justification for this part of codegen. Cc @eddyb |
Ah, after a partial move from a variant an elaborated drop still needs to read the enum discriminant to determine which variant is active.
This suggests an approach which requires only a discriminant to be valid? |
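A small stable-Rust program can show the kind of partial move being described. This is only a sketch with invented names (`Noisy`, `Log`, `take_first`): one field is moved out of an enum variant, and the remaining field still has to be dropped when the enum goes out of scope. It demonstrates the observable drop order, not the MIR that drop elaboration produces.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Records its name in a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

enum E {
    A(Noisy, Noisy),
    B,
}

/// Moves the first field out of `e`; the second field stays behind and is
/// dropped when `e` goes out of scope at the end of this function.
fn take_first(e: E) -> Option<Noisy> {
    match e {
        E::A(first, _) => Some(first),
        E::B => None,
    }
}

fn main() {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let e = E::A(
        Noisy { name: "first", log: log.clone() },
        Noisy { name: "second", log: log.clone() },
    );
    let kept = take_first(e);
    // Only the field left behind in the partially moved enum was dropped.
    assert_eq!(*log.borrow(), vec!["second"]);
    drop(kept);
    assert_eq!(*log.borrow(), vec!["second", "first"]);
    assert!(take_first(E::B).is_none());
}
```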
I did not realize we support partial moves out of enums. In principle we do not need to re-read the discriminant, since after the partial move we know which variant it is in, I think... but that is probably tricky to do.
Well, that's what we did before. But enums without variants do not have a discriminant so it is trivially valid. Thus the optimization you want to do is unsound under this approach. |
Another way to look at this would be to say that they are never valid.
Anyway, I am worried that this discussion distracted us from the actual purpose of this pull request, and delayed fixing the issues it was intended to fix, while being completely about preexisting state of the matter. |
That would be a peculiar special case. Many types do not have discriminants and asking for their discriminant is always fine. I am not a fan of special-casing zero-variant enums. The codegen backend goes even further and specializes uninhabited types. But again it seems entirely ad-hoc that If we really want to treat zero-variant enums separately we need a
Well, we learned that the preexisting semantics are weird at best, and IMO we should change the codegen backend to no longer special-case uninhabited types. |
For some time CTFE has been using a dedicated MIR which is never optimized, so the check for promoted became redundant.
The change is limited to iterating over indices instead of using `basic_blocks_mut()` directly, in case the previous implementation intentionally avoided invalidating the caches stored in the MIR body.
Previously, for enums using the `Variants::Single` layout, the variant index was being confused with its discriminant, for example in the case of `enum E { A = 1 }`. Use `discriminant_for_variant` to avoid the issue.
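The distinction can be sketched with a simplified model. Nothing below is rustc API: `Variant`, `allowed_by_index`, and `allowed_by_discriminant` are hypothetical names, and the model only assumes that a pass collects the set of values a switch on the discriminant may legitimately see.

```rust
/// Hypothetical, simplified description of one enum variant.
struct Variant {
    discr: u128,
    inhabited: bool,
}

/// Buggy approach: treats a variant's position as if it were its discriminant.
fn allowed_by_index(variants: &[Variant]) -> Vec<u128> {
    variants
        .iter()
        .enumerate()
        .filter(|(_, v)| v.inhabited)
        .map(|(index, _)| index as u128)
        .collect()
}

/// Fixed approach: looks up the declared discriminant of each inhabited variant.
fn allowed_by_discriminant(variants: &[Variant]) -> Vec<u128> {
    variants.iter().filter(|v| v.inhabited).map(|v| v.discr).collect()
}

fn main() {
    // Models `enum E { A = 1 }`: variant index 0, discriminant 1.
    let e = [Variant { discr: 1, inhabited: true }];
    assert_eq!(allowed_by_index(&e), vec![0]);
    assert_eq!(allowed_by_discriminant(&e), vec![1]);
}
```

With the index-based set, a switch arm for the value `1` would wrongly look unreachable in this model, which is the kind of confusion the commit message describes.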
f664a55 to c3e71d8
|
…value" This reverts commit 0a2b7d7, reversing changes made to 47c1bd1. This caused several unforeseen problems: - rust-lang#91029 - rust-lang#89764 (comment)
Revert "require full validity when determining the discriminant of a value" This reverts commit 0a2b7d7, reversing changes made to 47c1bd1. This caused several unforeseen problems: - rust-lang#91029 - rust-lang#89764 (comment) So I think it's best to revert for now while we keep discussing the MIR semantics of getting a discriminant. r? `@oli-obk`
I opened #91095 for the MIR semantics issue. This already exists in master so it does not have to block this PR. |
It would be good to add a test that shows the effect of this change but I don't think it needs to block merging this.
@bors r+ |
📌 Commit c3e71d8 has been approved by |
🌲 The tree is currently closed for pull requests below priority 600. This pull request will be tested once the tree is reopened. |
…askrgr

Rollup of 13 pull requests

Successful merges:

- rust-lang#89747 (Add MaybeUninit::(slice_)as_bytes(_mut))
- rust-lang#89764 (Fix variant index / discriminant confusion in uninhabited enum branching)
- rust-lang#91606 (Stabilize `-Z print-link-args` as `--print link-args`)
- rust-lang#91694 (rustdoc: decouple stability and const-stability)
- rust-lang#92183 (Point at correct argument when async fn output type lifetime disagrees with signature)
- rust-lang#92582 (improve `_` constants in item signature handling)
- rust-lang#92680 (intra-doc: Use the impl's assoc item where possible)
- rust-lang#92704 (Change lint message to be stronger for &T -> &mut T transmute)
- rust-lang#92861 (Rustdoc mobile: put out-of-band info on its own line)
- rust-lang#92992 (Help optimize out backtraces when disabled)
- rust-lang#93038 (Fix star handling in block doc comments)
- rust-lang#93108 (:arrow_up: rust-analyzer)
- rust-lang#93112 (Fix CVE-2022-21658)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
Fix confusion between variant index and variant discriminant. The pass incorrectly assumed that for `Variants::Single` the variant index is the same as the variant discriminant.
r? @wesleywiser