Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document wf constraints on control flow in cleanup blocks #106612

Merged
merged 3 commits into from
Jan 17, 2023

Conversation

JakobDegen
Copy link
Contributor

Was recently made aware of this code, which has this potential ICE:

span_bug!(
mir.span,
"funclet {:?} has 2 parents - {:?} and {:?}",
funclet,
s,
succ
);

Roughly speaking, the code there is attempting to partition the cleanup blocks into funclets that satisfy a "unique successor" property, and the ICE is set off if that's not possible. This PR documents the well-formedness constraints that MIR must satisfy to avoid setting off that ICE.

The constraints documented are slightly stronger than the cases in which the ICE would have been set off in that code. This is necessary though, since whether or not that ICE gets set off can depend on iteration order in some graphs.

This sort of constraint is kind of ugly, but I don't know a better alternative at the moment. It's worth knowing that two important optimizations are still correct:

  • Removing edges in the cfg: Fewer edges => fewer paths => stronger dominance relations => more contractions, and more contractions can't turn a forest into not-a-forest.
  • Contracting an edge u -> v when u only has one successor and v only has one predecessor: u already dominated v, so this contraction was going to happen anyway.

There is definitely a MIR opt somewhere that can run afoul of this, but I don't know where it is. @saethlin was able to set it off though, so maybe he'll be able to shed some light on it.

r? @RalfJung I suppose, and cc @tmiasko who might have insight/opinions on this

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jan 9, 2023
@rustbot
Copy link
Collaborator

rustbot commented Jan 9, 2023

This PR changes MIR

cc @oli-obk, @RalfJung, @JakobDegen, @davidtwco, @celinval, @vakaras

@JakobDegen
Copy link
Contributor Author

Advice on how to phrase an "I don't know graph theory"-readable description of this constraint much appreciated. It's unfortunately not exactly intuitive, but I also don't know a simpler version that is still within the bounds of Mir we generate and Mir codegen ICEs on

@saethlin
Copy link
Member

saethlin commented Jan 9, 2023

For completeness, the MIR opt that can run afoul of this is #106613 (though it's definitely an optimization opportunity that this optimization opens up that is the direct problem)

@tmiasko
Copy link
Contributor

tmiasko commented Jan 9, 2023

The constraints documented are slightly stronger than the cases in which the ICE would have been set off in that code. This is necessary though, since whether or not that ICE gets set off can depend on iteration order in some graphs.

Can you give an example?

It would be nice to add a test case that fails validation once custom MIR supports cleanup blocks.

Implementations check that each vertex has at most one successor in the resulting graph. What rules out cycles?

@JakobDegen
Copy link
Contributor Author

JakobDegen commented Jan 9, 2023

Can you give an example?

Consider this cfg:

image

Everything besides a is a cleanup block.

If our reverse postorder is abed we make three funclets and everything is fine. If our reverse postorder is instead adeb, then when processing the b->e edge, we'll attempt to make a new funclet at e, which will cause an ICE when it adds a second successor to d.

Implementations check that each vertex has at most one successor in the resulting graph. What rules out cycles?

Oops. Will fix

@tmiasko
Copy link
Contributor

tmiasko commented Jan 9, 2023

Thanks for explanation, I see now what you mean regarding iteration order.

As far as I understand, with caveat that I am not really familiar with Windows specific parts of exception handling code generation, in the example resulting nesting is invalid, since it doesn't satisfy "the funclet pads’ unwind destinations cannot form a cycle" condition. The existing checks are simply incomplete.

@JakobDegen
Copy link
Contributor Author

Yeah, I don't think these were really intended to be "checks" in the first place, just "ICEing is the most reasonable thing to do in this code path."

@JakobDegen
Copy link
Contributor Author

@rustbot author as well, for the cycle fix

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 10, 2023
@RalfJung
Copy link
Member

@tmiasko can you take over the review? I'm a bit lost here...

Comment on lines 515 to 516
/// 4. The induced subgraph on cleanup blocks must look roughly like an upside down tree. This is
/// necessary to ensure that landing pad information can be correctly codegened. More precisely:
Copy link
Member

@RalfJung RalfJung Jan 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the "subgraph on cleanup blocks"? Is it the graph where you remove all non-cleanup blocks and only keep edges between cleanup blocks?

What exactly is forbidden here? Is it essentially any kind of merging control flow? As in, every cleanup block must only have one cleanup block predecessor -- is that the condition? The analysis you implemented seems a lot more complicated than that, and I am lost in all the domination talk...

Or does the "upside down" mean that every cleanup block most only have one cleanup block successor? Branching is upside down merging. Though why branching control flow would be an issue is beyond me.

But anyway the usual definition of a (directed) forest is that each node has at most one predecessor and there are no cycles, so that might be a less graph-theoretic way of saying it.

Copy link
Contributor Author

@JakobDegen JakobDegen Jan 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the "subgraph on cleanup blocks"? Is it the graph where you remove all non-cleanup blocks and only keep edges between cleanup blocks?

Yes. I'll spell this out. I also realize this should have said "subgraph on reachable cleanup blocks" so will fix that as well.

Or does the "upside down" mean that every cleanup block most only have one cleanup block successor?

This is the idea, but only after contracting dominators. One of the things that makes this constraint hard to understand is that there are no local structures that are disallowed in the graph. Indeed, if the cleanup blocks have just one entrance point, then there are no restrictions at all (since that entrance point will necessarily dominate all other reachable cleanup blocks). I don't think we can actually strengthen the restriction much either, since drop elab actually generates cycles (for dropping arrays) and branches (for drop flags) .

Instead this constraint is about how control flow can merge "between entrance points to cleanup code." One simple example of a graph that is disallowed is the one above, another example is a W shaped graph (all edges pointing down).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I was hoping I could avoid thinking about dominators. ;) I never delved deeply enough into compilers to get solid intuition for them... In this case, do you compute dominators before

So, an example of something we cannot have is a control flow split where one of the successors is also reachable from another independent cleanup path? But a "V", where two independent cleanup paths merge, is okay?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative definition that doesn't use dominators.

It must be possible to partition reachable cleanup blocks into groups such that:

  • Each group has a unique entry block. Edges from outside into the group must target the group's entry block.
  • Edges from within a particular group to outside must target the same block.
  • Edges in-between groups cannot form a cycle.
  • There are no restriction on edges within the group.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I never delved deeply enough into compilers to get solid intuition for them... In this case, do you compute dominators before

End of the sentence may have gotten lost there? Otherwise you'll have to expand a bit

So, an example of something we cannot have is a control flow split where one of the successors is also reachable from another independent cleanup path? But a "V", where two independent cleanup paths merge, is okay?

Yes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edges from outside into the group must target the group's entry block.

Obviously a nit but note that we have to restrict to edges from reachable blocks

Copy link
Member

@RalfJung RalfJung Jan 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

End of the sentence may have gotten lost there? Otherwise you'll have to expand a bit

Sorry... got interrupted and then forgot to finish the sentence.^^

I was about to ask whether dominance is computed before or after restricting the graph to cleanup blocks. The comment should clarify that.

For me personally this restriction is so arbitrary and unmotivated that the examples would also be helpful, but it is very possible that this makes more sense to others. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to ask whether dominance is computed before or after restricting the graph to cleanup blocks. The comment should clarify that.

Dominance is traditionally only defined for a control flow graph (ie a directed graph with a "start" vertex), so it doesn't really make sense to compute it after restricting to cleanup blocks, which typically do not form a control flow graph. That being said, there is a fairly natural extension of the definition (in which we consider a set of start vertices, instead of just one). In that case it doesn't matter which we pick. I'll clarify anyway.

For me personally this restriction is so arbitrary and unmotivated that the examples would also be helpful, but it is very possible that this makes more sense to others. :)

Yeah, I don't really understand it either, but I also don't feel like going and reading the Itanium ABI spec to find out...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constraints are described in https://llvm.org/docs/ExceptionHandling.html#funclet-transitions. Their relation to MIR can only be understood in the context of code generation for MSVC target (other targets don't use funclets).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I thought Itanium did as well. Interesting

@rust-log-analyzer

This comment has been minimized.

@tmiasko
Copy link
Contributor

tmiasko commented Jan 14, 2023

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 14, 2023
@bors
Copy link
Contributor

bors commented Jan 14, 2023

⌛ Trying commit 7057f835cfc7bf395bef792e116575864578123a with merge 027f53d68204819904d833b75f4a81a74686fd0a...

@bors
Copy link
Contributor

bors commented Jan 14, 2023

☀️ Try build successful - checks-actions
Build commit: 027f53d68204819904d833b75f4a81a74686fd0a (027f53d68204819904d833b75f4a81a74686fd0a)

@rust-timer

This comment has been minimized.

@RalfJung
Copy link
Member

r? @tmiasko

@rustbot rustbot assigned tmiasko and unassigned RalfJung Jan 14, 2023
@JakobDegen
Copy link
Contributor Author

So there is no concern about target dependence, and we could use crater to assess how often this is tripped. I suggest we do that at some point, regardless of if this is defaulted on or off.

Maybe regular crater runs under -Zvalidate-mir would be a good idea anyway...

@JakobDegen
Copy link
Contributor Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 16, 2023
@bors
Copy link
Contributor

bors commented Jan 16, 2023

⌛ Trying commit 4bc963e with merge 8e42a53653740a9bdf11fbf5f1c3d28e4d2e6d01...

@tmiasko
Copy link
Contributor

tmiasko commented Jan 16, 2023

The cleanup_kinds code, where existing "check" is located, is always executed. It might be an interesting experiment to run it only when necessary, especially if this validation were to be enabled by default.

@JakobDegen
Copy link
Contributor Author

The cleanup_kinds code, where existing "check" is located, is always executed. It might be an interesting experiment to run it only when necessary, especially if this validation were to be enabled by default.

From the current appearance of codegen, it seems to be unconditionally used. I don't really understand this code well enough to be sure though

@tmiasko
Copy link
Contributor

tmiasko commented Jan 17, 2023

From the current appearance of codegen, it seems to be unconditionally used.

Yes, but for !base::wants_msvc_seh(), the code in llbb_characteristics is equivalent to:

let needs_landing_pad = !fx.mir[self.bb].is_cleanup && fx.mir[target].is_cleanup;
let is_cleanupret = false;
(needs_landing_pad, is_cleanupret)

@JakobDegen
Copy link
Contributor Author

Ah, ok. Yeah, that seems reasonable, should probably be done separately from this PR though

@bors
Copy link
Contributor

bors commented Jan 17, 2023

☀️ Try build successful - checks-actions
Build commit: 8e42a53653740a9bdf11fbf5f1c3d28e4d2e6d01 (8e42a53653740a9bdf11fbf5f1c3d28e4d2e6d01)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (8e42a53653740a9bdf11fbf5f1c3d28e4d2e6d01): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-0.4% [-0.5%, -0.4%] 2
All ❌✅ (primary) - - 0

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
2.9% [2.9%, 2.9%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-4.1% [-5.1%, -3.0%] 2
All ❌✅ (primary) - - 0

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
3.8% [2.1%, 5.3%] 4
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

@rustbot rustbot removed S-waiting-on-perf Status: Waiting on a perf run to be completed. perf-regression Performance regression. labels Jan 17, 2023
@tmiasko
Copy link
Contributor

tmiasko commented Jan 17, 2023

@bors r+

@bors
Copy link
Contributor

bors commented Jan 17, 2023

📌 Commit 4bc963e has been approved by tmiasko

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 17, 2023
@bors
Copy link
Contributor

bors commented Jan 17, 2023

⌛ Testing commit 4bc963e with merge f34cc65...

@bors
Copy link
Contributor

bors commented Jan 17, 2023

☀️ Test successful - checks-actions
Approved by: tmiasko
Pushing f34cc65 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jan 17, 2023
@bors bors merged commit f34cc65 into rust-lang:master Jan 17, 2023
@rustbot rustbot added this to the 1.68.0 milestone Jan 17, 2023
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (f34cc65): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.9% [0.9%, 0.9%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-3.3% [-4.1%, -2.5%] 2
All ❌✅ (primary) 0.9% [0.9%, 0.9%] 1

Cycles

This benchmark run did not return any relevant results for this metric.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants