Add dataflow analysis of enum variants #96991

JulianKnodt · 2022-05-12T19:27:05Z

This adds a dataflow analysis in MIR. When it is known at compile time that a variable holds a specific variant of an enum, the pass converts `Discriminant` reads of that variable to constants. There seemed to be a number of issues where codegen could not remove matches even when the value was a known constant, so adding an explicit analysis in MIR should give more control.
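For illustration, here is a minimal sketch of the kind of source such a pass would target. The `Shape` enum and the `describe` function are hypothetical names and are not taken from the PR or its tests.

```rust
// Hypothetical example, not part of the PR's test suite.
#[allow(dead_code)]
enum Shape {
    Circle,
    Square,
}

#[inline(never)]
fn describe() -> u32 {
    // The variant held by `s` is known at compile time.
    let s = Shape::Square;
    // In MIR, this `match` reads the discriminant of `s` and switches on it.
    // An analysis that tracks "`s` holds `Square`" could replace that read
    // with a constant, leaving a switch that later passes can fold away.
    match s {
        Shape::Circle => 1,
        Shape::Square => 2,
    }
}

fn main() {
    println!("{}", describe());
}
```

Whether the `match` actually disappears depends on the passes that run afterwards; the sketch only shows where the known-variant fact comes from.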

I still need to add tests that it works, but it seems to apply to a number of tests already. If possible, a perf-run would be appreciated.

rust-highfive · 2022-05-12T19:27:08Z

r? @nagisa

(rust-highfive has picked a reviewer for you, use r? to override)

compiler/rustc_mir_transform/src/single_enum.rs

tschuett · 2022-05-13T17:01:03Z

ultra nit: I like line breaks around functions.

lqd · 2022-05-13T18:01:03Z

@bors try @rust-timer queue

rust-timer · 2022-05-13T18:01:04Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-05-13T18:01:13Z

⌛ Trying commit 1b69305eac7d333c321c35e810a5303c49d1dec2 with merge 20dd8008d52592f40dfc3aa396d7f10ce1dcda5c...

JulianKnodt · 2022-05-13T18:11:47Z

@lqd will a perf run work if CI failed?

lqd · 2022-05-13T19:08:57Z

Usually yes; this time, probably not :)

JulianKnodt · 2022-05-13T19:44:37Z

@lqd now mir-opt tests are failing (since I didn't bless them), would it be possible to do a perf-run now?

nagisa · 2022-06-05T12:55:18Z

I'm going to pass this PR over to

r? @oli-obk

given that this is closely related to constant propagation/evaluation. I don't really have enough capacity to really dig into the implementation here.

My observation broadly is that it isn't entirely clear if this should be a separate pass or a part of a larger constant propagation pass. Having it be separate seems like it is going to potentially require repeated runs of this pass interleaved with other passes in order to obtain meaningful IR quality improvements. That usually isn't great for compilation performance.

JakobDegen · 2022-06-05T21:37:48Z

If possible, I'd like to get a chance to review this as well, just to check soundness and interaction with unsafe code, MIR semantics, etc. I'll do my best to make sure that it's before the rest of review is completed, so that nothing is blocked on me. (Actually, is there a way that I can ask to be cc'd on all new MIR opts/significant changes to existing ones?)

JakobDegen

The direction here is definitely the right one. There are some soundness issues that need to be fixed though. More importantly, I'd also like the documentation to be fleshed out. With these types of analyses it's important that we are very explicit about exactly what conclusions we are drawing, and the line of inference we are using to get there. I've noted a couple of places where I think this could be done.

Edit: Forgot to mention, but all the below code snippets were compiled with

rustc +stage1 -C opt-level=0 -Z mir-opt-level=0 -Z mir-enable-passes=+SingleEnum -Z dump-mir=all test.rs

compiler/rustc_mir_dataflow/src/framework/lattice.rs

JakobDegen · 2022-07-01T02:00:40Z

compiler/rustc_mir_dataflow/src/impls/single_enum_variant.rs

+                    if let Some(rhs_local) = rhs.local_or_deref_local() {
+                        state.get(rhs_local).map(|f| f.1).copied()
+                    } else {
+                        rhs.ty(self.body, self.tcx).variant_index.map(|var_idx| var_idx)


Suggested change

- rhs.ty(self.body, self.tcx).variant_index.map(|var_idx| var_idx)
+ None

Afaik this is always None already anyway. I do want to investigate adding variants as types in MIR at some point (for a bunch of reasons), but we don't have that today.

Also, even if this is not None, the semantics here are different from what const eval does - const eval just ignores the variant index. That means that if we did want to support having a variant index in "top level" types like this, we need to adjust Miri to also check that the validity invariant of that variant is actually upheld.

JakobDegen · 2022-07-10T08:17:09Z

compiler/rustc_mir_dataflow/src/impls/single_enum_variant.rs

+                // Assigning a constant does not affect discriminant?
+                Operand::Constant(_c) => return,


It certainly can. Consider:

#![feature(inline_const)]

enum E {
    A,
    B,
}

fn main() {
    dbg!(foo());
}

#[inline(never)]
fn foo() -> u32 {
    let mut x = E::A;
    x = const { E::B };
    match x {
        E::A => 1,
        E::B => 2,
    }
}

Ideally you would be able to compute the discriminant of the const here so that you can introduce the correct fact - I'm not so sure how to do this though, maybe @oli-obk knows?

JulianKnodt · 2022-07-10

For now I just killed the fact. Const prop will likely handle this optimization, but I can put it back in the future.
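The "kill the fact" approach can be sketched with a toy transfer function. None of the types below are rustc's: `Rhs`, `apply_assign`, and the `Option<VariantIdx>` state are hypothetical stand-ins chosen to keep the example self-contained.

```rust
// Toy model of the assignment transfer function; not rustc's types.
type Local = usize;
type VariantIdx = usize;

/// Right-hand side of a simplified assignment `dest = rhs`.
enum Rhs {
    /// `dest = Enum::Variant(..)`: the variant is syntactically known.
    Aggregate(VariantIdx),
    /// `dest = copy/move src`: the fact about `src`, if any, carries over.
    Use(Local),
    /// `dest = const ...`: this toy analysis does not inspect the constant.
    Constant,
}

/// `state[l]` is `Some(v)` when local `l` is known to hold variant `v`.
fn apply_assign(state: &mut [Option<VariantIdx>], dest: Local, rhs: &Rhs) {
    state[dest] = match rhs {
        Rhs::Aggregate(v) => Some(*v),
        Rhs::Use(src) => state[*src],
        // Returning early here would keep a stale fact; forget it instead.
        Rhs::Constant => None,
    };
}

fn main() {
    // Mirrors `let mut x = E::A; x = const { E::B };` with `x` as local 0.
    let mut state: Vec<Option<VariantIdx>> = vec![None; 1];
    apply_assign(&mut state, 0, &Rhs::Aggregate(0));
    assert_eq!(state[0], Some(0));
    apply_assign(&mut state, 0, &Rhs::Constant);
    assert_eq!(state[0], None);
    println!("{:?}", state);
}
```

Forgetting the fact is conservative: the pass loses an optimization opportunity but can no longer substitute the wrong discriminant.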

compiler/rustc_mir_dataflow/src/impls/single_enum_variant.rs

JakobDegen · 2022-07-10T09:18:07Z

compiler/rustc_mir_dataflow/src/impls/single_enum_variant.rs

+
+            if let Some(new_fact) = new_fact {
+                let loc = Location { block: target.target, statement_index: 0 };
+                state.insert(local, loc, new_fact);


I think things are slightly mixed up here. In MIR like this:

_5 = Discrim(_8);
switchInt(_5, 0 => bb3, otherwise: bb5)

Your local will store _5, but really you want to insert a new fact about _8. Take a look at https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_dataflow/impls/fn.switch_on_enum_discriminant.html for inspiration on how to do this kind of thing. Essentially, you will just want to check if the previous statement was an assignment of the shape given above.
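A minimal sketch of that shape check, using toy stand-ins rather than rustc's MIR types: `Statement`, `SwitchInt`, `switched_enum`, and `apply_edge` are hypothetical names, and the sketch assumes the switch value has already been mapped to a variant index.

```rust
// Toy stand-ins for MIR; these are not rustc's types.
type Local = usize;
type VariantIdx = usize;

enum Statement {
    /// `dest = discriminant(src)`
    Discriminant { dest: Local, src: Local },
    /// Any other statement.
    Other,
}

/// `switchInt(discr)` as the terminator of a block.
struct SwitchInt {
    discr: Local,
}

/// If the block ends in `tmp = discriminant(e); switchInt(tmp)`, return `e`,
/// the enum local that edge facts should be recorded for.
fn switched_enum(stmts: &[Statement], switch: &SwitchInt) -> Option<Local> {
    match stmts.last()? {
        Statement::Discriminant { dest, src } if *dest == switch.discr => Some(*src),
        _ => None,
    }
}

/// Record "`enum_local` holds `variant`" on the edge taken for that variant.
fn apply_edge(state: &mut [Option<VariantIdx>], enum_local: Local, variant: VariantIdx) {
    state[enum_local] = Some(variant);
}

fn main() {
    // _5 = discriminant(_8); switchInt(_5) with an edge for value 0.
    let stmts = [Statement::Other, Statement::Discriminant { dest: 5, src: 8 }];
    let switch = SwitchInt { discr: 5 };
    let mut state: Vec<Option<VariantIdx>> = vec![None; 9];
    if let Some(e) = switched_enum(&stmts, &switch) {
        apply_edge(&mut state, e, 0);
    }
    // The fact lands on `_8`, the enum, and not on `_5`, the integer temporary.
    assert_eq!(state[8], Some(0));
    assert_eq!(state[5], None);
}
```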

src/test/mir-opt/enum_prop.rs

JakobDegen · 2022-07-10T09:29:35Z

src/test/ui/let-else/let-else-run-pass.rs

@@ -11,7 +11,7 @@ fn main() {
    }
    // ref binding to non-copy value and or-pattern
    let (MyEnum::A(ref x) | MyEnum::B { f: ref x }) = (MyEnum::B { f: String::new() }) else {
-        panic!();
+        panic!("Shouldn't have matched enum");


What prompted the changes?

JulianKnodt · 2022-07-10

I couldn't tell which panic was being hit. I forget why line numbers weren't sufficient, but this test was being miscompiled, so I had to check where the miscompilation happened.

compiler/rustc_mir_dataflow/src/impls/single_enum_variant.rs

rustbot · 2022-07-10T22:47:05Z

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

Constify `Discriminant`
Fix wrong size scalar in const_from_scalar
Kill aliasable enums
bless mir
Praise the mir
Switch to FxHashMap
Rm bots
Attempt btreemap instead of hash
Maybe there were issues with queries?
Reattempt index vec
Implement cache for dataflow analysis
Add test
Move single enum after const prop

JakobDegen

Some more thoughts about documentation

JakobDegen · 2022-07-16T01:52:37Z

compiler/rustc_mir_dataflow/src/framework/lattice.rs

+pub struct FactCache<I, L, F, const N: usize> {
+    facts: [F; N],
+    ord: [(I, L); N],
+    len: usize,
+}


Thanks for the documentation update. What is the partial order under which this is a lattice? If this is a well-known result from theory (I'm not terribly familiar), feel free to just provide a link to a reference.

JulianKnodt · 2022-07-16

There's no theory; it's just that each fact gets inserted at a specific location (block + stmt idx), so I attempt to keep more recent facts in preference to older ones, like an LRU cache.

What I mean is that we could represent the entire set of facts using a HashMap<Idx, Value>, but this will grow linearly with the size of the program (or at least whatever is being analyzed). Instead, to use constant space, we keep a set of facts that evicts older ones, which we assume are no longer relevant, and replaces them with the most recent updates. This is a strict subset of the HashMap<Idx, Value>, so it should also be a lattice.
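A rough sketch of such a constant-space cache, to make the eviction idea concrete. This is not the PR's `FactCache`: `BoundedFacts`, the integer `Loc`, and the `u8` fact type are hypothetical simplifications, and the eviction policy (drop the fact recorded at the earliest location) is only one reading of the description above.

```rust
// Hypothetical sketch; the PR's `FactCache` may differ in every detail.
type Local = usize;
type Loc = u32; // stand-in for a MIR `Location`, larger means more recent
type Fact = u8; // stand-in for a variant index

/// Holds at most `N` facts, one per local.
struct BoundedFacts<const N: usize> {
    entries: [Option<(Local, Loc, Fact)>; N],
}

impl<const N: usize> BoundedFacts<N> {
    fn new() -> Self {
        Self { entries: [None; N] }
    }

    /// A local without an entry is treated as "nothing known".
    fn get(&self, local: Local) -> Option<Fact> {
        self.entries.iter().flatten().find(|e| e.0 == local).map(|e| e.2)
    }

    fn insert(&mut self, local: Local, loc: Loc, fact: Fact) {
        // Prefer the slot already used by `local`, then a free slot,
        // then the slot whose fact was recorded at the earliest location.
        let slot = self
            .entries
            .iter()
            .position(|e| matches!(e, Some((l, _, _)) if *l == local))
            .or_else(|| self.entries.iter().position(|e| e.is_none()))
            .or_else(|| (0..N).min_by_key(|&i| self.entries[i].map(|e| e.1)));
        if let Some(i) = slot {
            self.entries[i] = Some((local, loc, fact));
        }
    }
}

fn main() {
    let mut cache = BoundedFacts::<2>::new();
    cache.insert(1, 10, 0);
    cache.insert(2, 20, 1);
    cache.insert(3, 30, 1); // cache is full: the fact for local 1 is evicted
    assert_eq!(cache.get(1), None);
    assert_eq!(cache.get(3), Some(1));
}
```

In this sketch an evicted local simply reads as "nothing known", so eviction can only lose precision.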

JakobDegen

Ok, yeah, sorry, what I said before was silly. Yes, the subset partial order is an option, but it's not clear to me that that results in a lattice. Under the subset partial order, the least upper bound of two states each storing N facts is a state that would have to store 2*N facts, which you can't represent.

I should start thinking before I say things. Yeah, ok, this does seem like a lattice - let me go and re-read some stuff then

Yeah, ok I thought about this a bunch more and I think it's ok. We can imagine this fact cache as implicitly storing

enum LocalResult {
    Top,
    Fact(F, L),
    Bot(L),
}

for each I. Locals which do not have an entry in the fact cache are implicitly Top. The partial order within each LocalResult is then Bot < Fact(F, min) < ... < Fact(F, max) < Top for each F, and the partial order for the fact cache as a whole is then the product partial order. This does indeed form a lattice. This should be documented though.
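The per-local order described above can be sketched as a join (least upper bound) function. This is a simplified model, not code from the PR: `Bot` carries no location here, locations are plain `u32` values, and `join` is a hypothetical free function.

```rust
/// Per-local lattice element, simplified from the description above:
/// `Bot` carries no location and locations are plain integers.
#[derive(Clone, Copy, Debug, PartialEq)]
enum LocalResult<F> {
    Top,
    Fact(F, u32),
    Bot,
}

/// Least upper bound under `Bot < Fact(f, min) < ... < Fact(f, max) < Top`,
/// where facts with different `f` are incomparable.
fn join<F: Copy + PartialEq>(a: LocalResult<F>, b: LocalResult<F>) -> LocalResult<F> {
    match (a, b) {
        // `Top` absorbs everything.
        (LocalResult::Top, _) | (_, LocalResult::Top) => LocalResult::Top,
        // `Bot` is the identity.
        (LocalResult::Bot, x) | (x, LocalResult::Bot) => x,
        // Same fact: move up the chain to the later location.
        (LocalResult::Fact(f, l1), LocalResult::Fact(g, l2)) if f == g => {
            LocalResult::Fact(f, l1.max(l2))
        }
        // Conflicting facts have no upper bound other than `Top`.
        _ => LocalResult::Top,
    }
}

fn main() {
    let a = LocalResult::Fact('a', 3);
    let b = LocalResult::Fact('a', 7);
    let c = LocalResult::Fact('b', 3);
    assert_eq!(join(a, b), LocalResult::Fact('a', 7));
    assert_eq!(join(a, c), LocalResult::Top);
    assert_eq!(join(LocalResult::Bot, a), a);
}
```

The state for the whole cache would then be joined pointwise, one `LocalResult` per local, which is the product order mentioned above.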

compiler/rustc_mir_dataflow/src/impls/single_enum_variant.rs

Add many of the additional lattice methods with comments, as well as clean up some issues which did not appear in the tests (i.e. SwitchIntEdgeEffect).

JakobDegen · 2022-07-16T02:38:09Z

I think I'd like to see the implementation of the FactCache and the analysis itself separated. Both of them are sufficiently subtle on their own. This would mean temporarily re-writing the analysis in terms of a domain that is much simpler and much easier to reason about. A good choice here would be something like an IndexVec<Local, VariantFact> where

enum VariantFact {
    Top,
    Variant(VariantIdx),
    Bot,
}

This should make things much easier to test. I wouldn't worry about performance for the initial version - we'll turn it off by default, so it only has to perform well enough to get through its tests. We can then hook it up with smarter domains and other performance optimizations later, once the parts are sound on their own.
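As a rough sketch of how small that simpler domain could be: the code below uses a plain `Vec` in place of `IndexVec<Local, VariantFact>` and a `u32` in place of `VariantIdx`, and `join_states` is a hypothetical helper rather than the dataflow framework's API.

```rust
/// Stand-in for rustc's `VariantIdx`.
type VariantIdx = u32;

#[derive(Clone, Copy, Debug, PartialEq)]
enum VariantFact {
    Top,
    Variant(VariantIdx),
    Bot,
}

impl VariantFact {
    /// Flat-lattice join: `Bot` is the identity, disagreement goes to `Top`.
    fn join(self, other: Self) -> Self {
        match (self, other) {
            (VariantFact::Bot, x) | (x, VariantFact::Bot) => x,
            (a, b) if a == b => a,
            _ => VariantFact::Top,
        }
    }
}

/// Pointwise join of two per-local states. Returns whether `into` changed,
/// which is what a fixpoint iteration needs in order to know when to stop.
fn join_states(into: &mut [VariantFact], from: &[VariantFact]) -> bool {
    assert_eq!(into.len(), from.len());
    let mut changed = false;
    for (a, b) in into.iter_mut().zip(from) {
        let joined = (*a).join(*b);
        changed |= joined != *a;
        *a = joined;
    }
    changed
}

fn main() {
    let mut s = vec![VariantFact::Bot, VariantFact::Variant(1), VariantFact::Variant(2)];
    let t = vec![VariantFact::Variant(0), VariantFact::Variant(1), VariantFact::Variant(3)];
    assert!(join_states(&mut s, &t));
    println!("{:?}", s);
}
```

The state grows with the number of locals, which matches the suggestion above to get the simple version sound first and look at performance afterwards.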

JulianKnodt · 2022-07-16T03:44:17Z

I can break them into separate commits if you'd like, but breaking them into separate PRs would be quite slow since it would require re-review.

JakobDegen · 2022-07-30T01:05:17Z

I do think splitting this up would be a good idea. I can review both PRs, and that will be easier for me, not harder, especially because it'll be easier to reproduce bugs if we separate out the simpler part. I would also like to see some more justification for the need to reduce the optimization power of this pass over the "fully precise" version for perf reasons. We should first try to do all the perf improvements we can without resorting to that (I'm happy to help with this)

JohnCSimon · 2022-09-11T05:39:58Z

@JulianKnodt
Ping from triage - can you post your status on this PR? Thanks.

JulianKnodt · 2022-09-11T05:43:51Z

I haven't had time to update this for quite a while. I may get back to it at some point, but I am working on other things.

JohnCSimon · 2022-09-11T05:57:08Z

@JulianKnodt
Ok - I'll close it as inactive

Please reopen when you are ready to continue with this.
Note: if you do, please reopen the PR BEFORE you push to it, else you won't be able to reopen it - this is a quirk of GitHub.
Thanks for your contribution.

@rustbot label: +S-inactive

rust-highfive assigned nagisa May 12, 2022

rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label May 12, 2022

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 12, 2022

JulianKnodt force-pushed the enum_dflow branch 3 times, most recently from 7b2fb12 to 987bd34 Compare May 12, 2022 19:48

JulianKnodt force-pushed the enum_dflow branch 2 times, most recently from e3027e7 to be51dcf Compare May 12, 2022 21:22

JulianKnodt force-pushed the enum_dflow branch from be51dcf to c1e462d Compare May 12, 2022 21:49

erikdesjardins reviewed May 12, 2022

JulianKnodt force-pushed the enum_dflow branch 3 times, most recently from a39d410 to 1b69305 Compare May 13, 2022 14:46

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 13, 2022

JulianKnodt force-pushed the enum_dflow branch from 1b69305 to 47399d6 Compare May 13, 2022 18:19

JulianKnodt force-pushed the enum_dflow branch from a5a6cc8 to 60398f2 Compare May 13, 2022 19:55

Dylan-DPC removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 31, 2022

JulianKnodt force-pushed the enum_dflow branch from c021da1 to 1bc29cd Compare May 31, 2022 09:21

JulianKnodt force-pushed the enum_dflow branch from 1bc29cd to 8be9edf Compare May 31, 2022 18:45

rust-highfive assigned oli-obk and unassigned nagisa Jun 5, 2022

oli-obk mentioned this pull request Jun 13, 2022

Make sure we keep up with mir optimizations rust-lang/highfive#407

Merged

JakobDegen requested changes Jul 10, 2022

JulianKnodt force-pushed the enum_dflow branch from ef0127d to 75f6274 Compare July 10, 2022 23:03

JulianKnodt force-pushed the enum_dflow branch from 75f6274 to c0b28ad Compare July 11, 2022 06:53

JulianKnodt force-pushed the enum_dflow branch from c0b28ad to f02a777 Compare July 11, 2022 08:48

oli-obk assigned JakobDegen and unassigned oli-obk Jul 11, 2022

JakobDegen reviewed Jul 16, 2022

Update with PR comments (+docs, +fixes)

cd37d5c

JulianKnodt force-pushed the enum_dflow branch from f02a777 to cd37d5c Compare July 16, 2022 02:28

JohnCSimon closed this Sep 11, 2022

rustbot added the S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. label Sep 11, 2022
