Convert the default-case optimisation to a MIR pass #33567

nagisa · 2016-05-11T20:31:54Z

#33566 introduces an optimisation which speeds up LLVM greatly. My suspicion is that this optimisation is also very relevant to the MIR pipeline (i.e. MIR transformations ought to be slower to analyse and transform the switches with more branches). Thus we should port this optimisation over to a MIR pass down the line.

Moreover, the code in the PR relies on the assumption that the otherwise branch of the switches is always unreachable. This assumption has potential to become not true after we get more complex optimisations on the MIR going. Having such optimisation to be a plain MIR transformation would help with this too.

cc @dotdash

The text was updated successfully, but these errors were encountered:

Aatch · 2016-05-19T23:56:38Z

The question is whether to change the Switch terminator to be non-exhaustive or make a new terminator instead. Making Switch non-exhaustive seems like the better option to me, as I'm not sure what the value of having two almost-identical instructions is.

Also, it being non-exhaustive means we can build the initial MIR a bit smarter. There are often cases where we can narrow down the candidates based on the fact that previous arms weren't taken:

enum Foo {
  A,
  B,
  C
}

match (x, y) {
  (_, A) => {},
  (A, _) => {} // We know that the `_` can't be A here because otherwise the earlier case would have been taken
}

I'm pretty sure the existing code tries to handle this already, I know old trans does.

nagisa · 2016-05-20T11:59:39Z

The question is whether to change the Switch terminator to be non-exhaustive or make a new terminator instead. Making Switch non-exhaustive seems like the better option to me, as I'm not sure what the value of having two almost-identical instructions is.

IMO we should remove switch altogether. A big part of MIR is getting rid of things tied to language (e.g. we have no for, while etc loops, only a goto), but we (due to historical accidents, already!) have both the Switch and SwitchInt, where Switch is just a glorified version of SwitchInt that works on enum discriminants only.

What I think we ought to do, is:

Add a Lvalue projection for the discriminant;
Remove Switch and replace that with SwitchInt on discr(Lvalue).

The only downside I see – we would end up duplicating the values being Switched on with ones in AdtDef, but if we want to make Switch non-exhaustive that is necessary anyway.

nagisa · 2016-05-20T12:01:26Z

Incidentally that should also decrease our reliance on trans::adt greatly.

nagisa · 2016-05-20T12:02:06Z

also cc @nikomatsakis, in case you want to discuss this.

nikomatsakis · 2016-05-20T13:13:54Z

Unsurprisingly, I do have an opinion:

I definitely agree we should make the MIR translation for switches more intelligent. The primary goal with the current version was that it should be commented and (relatively) readable -- a tradition I hope we will continue, in strong contrast to the old trans version. :) But I think it's clear it doesn't generate the most optimal code.
I am not sure about unifying Switch and SwitchInt. I see why it's appealing, and it may well make sense -- along these lines, maybe we should unify SwitchInt and If -- but I have two reservations:
- I expect we will eventually want to read MIR as input, for example from different front-ends; in these cases, we'll need something to verify the various correctness assumptions that are currently assured "by construction". Having a more "semantic" switch (that clearly indicates it is testing the enum variant, rather than some random integer) will make this mildly easier. OTOH, we could address this in various other ways.
- Does ADT translation always produce an integer to be tested? I guess probably it does; I was just imagining that sometimes (e.g., null vs non-null), this might introduce some silly transformations, though LLVM would probably optimize them away. (For example, the equivalent of let discrim = if x != null { 1 } else { 0 }; if discrim == 1 { ... } else { ... })

I'm not sure how much to weigh the first point. It seems like it's probably good to keep in mind, but bad to be overly constrained by this -- after all, if we want to change things later, we can do so, and it's clear that we don't know the full set of constraints right now.

Another way to look at unifying Switch and SwitchInt is that it's the kind of lowering that makes sense to do before trans for sure, but it's a semantic distinction I could imagine wanting to preserve during static checking.

Finally, I imagine we could bridge the gap just by making Switch kind of polymorphic over the type of discriminant:

integer, bool: the integer labels are the value itself
enum: the integer labels are variant numbers

In other words, we wouldn't generate code like:

TMP = LOAD_DISCRIMINANT(ENUM)
SWITCH_INT(TMP, ...)

but just

SWITCH_INT(ENUM, ...)

nikomatsakis · 2016-05-23T22:18:41Z

So @scottcarr is starting up an internship and I was thinking this would make a good starter problem for him (presuming nobody else is hacking on it just now?). I was thinking specifically not of modifying the MIR in any particular way, but just modifying the MIR generation to not build a basic block for every variant, but instead to be more selective. (In other words, it would directly generate the IR that we wind up optimizing to.)

We could also change the IR, but the two things seem somewhat orthogonal to me.

nikomatsakis · 2016-05-26T16:37:45Z

Here are some test cases (just for reference) that exhibit the N^2 blowup we observed (at least initially). Note that the one that uses integral constants does not: I think the simplest fix to the blowup is just to have the enum case use a similar pattern to what constants do, where it figures out which variants are "of interest" and creates an otherwise-like block to use as the target for the remainder (we might also change the IR; just seems orthogonal).

https://gist.github.com/nikomatsakis/dc3641659919d11b01a6fc14051831bd

generate fewer basic blocks for variant switches CC #33567 Adds a new field to TestKind::Switch that tracks the variants that are actually matched against. The other candidates target a common "otherwise" block.

nagisa · 2017-03-12T13:17:15Z

I think this is done now after we switched to the SwitchInt everywhere.

nagisa added the A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html label May 11, 2016

nagisa mentioned this issue May 12, 2016

MIR: Don't generate 3-armed boolean switch from match. #33583

Merged

scottcarr mentioned this issue Jun 1, 2016

generate fewer basic blocks for variant switches #33999

Merged

nagisa closed this as completed Mar 12, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Convert the default-case optimisation to a MIR pass #33567

Convert the default-case optimisation to a MIR pass #33567

nagisa commented May 11, 2016

Aatch commented May 19, 2016 •

edited

Loading

nagisa commented May 20, 2016 •

edited

Loading

nagisa commented May 20, 2016

nagisa commented May 20, 2016

nikomatsakis commented May 20, 2016

nikomatsakis commented May 23, 2016

nikomatsakis commented May 26, 2016

nagisa commented Mar 12, 2017

Convert the default-case optimisation to a MIR pass #33567

Convert the default-case optimisation to a MIR pass #33567

Comments

nagisa commented May 11, 2016

Aatch commented May 19, 2016 • edited Loading

nagisa commented May 20, 2016 • edited Loading

nagisa commented May 20, 2016

nagisa commented May 20, 2016

nikomatsakis commented May 20, 2016

nikomatsakis commented May 23, 2016

nikomatsakis commented May 26, 2016

nagisa commented Mar 12, 2017

Aatch commented May 19, 2016 •

edited

Loading

nagisa commented May 20, 2016 •

edited

Loading