Skip to content

Conversation

jakobbotsch
Copy link
Member

@jakobbotsch jakobbotsch commented Nov 14, 2023

The data flow used by assertion prop and CSE utilize "all-succs" which includes handler successors. However, in reality neither analysis needs full treatment of handlers; we do not CSE variables that are live-in to handlers, and assertion prop never kills assertions that would need to be propagated to handlers.

Additionally, the data flow framework used conflates end-of-block propagation of facts with reachability of the handler. If no facts changed at the ends of the preds of the handler, then the handler is not visited. This is despite the fact that the handler in the general case is also reachable with facts from the beginning of every enclosed basic block (or more generally with facts that were invalidated in the middle of enclosed basic blocks).

This ends up working out today only because of the fake successor edges we have from preds of the try to the handler, in addition to the restrictions on the data flows described above. Since I'm removing these fake edges, we need a more explicit solution. The change here has such a solution, by making it more explicit what exactly is needed in the data flow for CSE and AP here: they only need to consider regular successors, with the added catch that they need to consider reachability of handlers once we see the corresponding 'try' being reachable.

Prerequisite to #94672.

No diffs are expected.

The data flow used by assertion prop and CSE utilize "all-succs" which
includes handler successors. However, in reality neither analysis needs
full treatment of handlers; we do not CSE variables that are live-in to
handlers, and assertion prop never kills assertions that would need to
be propagated to handlers.

Additionally, the data flow framework used conflates end-of-block
propagation of facts with reachability of the handler. If no facts
changed at the ends of the preds of the handler, then the handler is not
visited. This is despite the fact that the handler in the general case
is also reachable with facts from the beginning of every enclosed basic
block (or more generally with facts that were invalidated in the middle
of enclosed basic blocks).

This ends up working out today only because of the fake successor edges
we have from preds of the try to the handler, in addition to the
restrictions on the data flows described above. Since I'm removing these
fake edges, we need a more explicit solution. The change here has such a
solution, by making it more explicit what exactly is needed in the data
flow for CSE and AP here: they only need to consider regular successors,
with the added catch that they need to consider reachability of handlers
once we see the corresponding 'try' being reachable.
@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Nov 14, 2023
@ghost ghost assigned jakobbotsch Nov 14, 2023
@ghost
Copy link

ghost commented Nov 14, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

The data flow used by assertion prop and CSE utilize "all-succs" which includes handler successors. However, in reality neither analysis needs full treatment of handlers; we do not CSE variables that are live-in to handlers, and assertion prop never kills assertions that would need to be propagated to handlers.

Additionally, the data flow framework used conflates end-of-block propagation of facts with reachability of the handler. If no facts changed at the ends of the preds of the handler, then the handler is not visited. This is despite the fact that the handler in the general case is also reachable with facts from the beginning of every enclosed basic block (or more generally with facts that were invalidated in the middle of enclosed basic blocks).

This ends up working out today only because of the fake successor edges we have from preds of the try to the handler, in addition to the restrictions on the data flows described above. Since I'm removing these fake edges, we need a more explicit solution. The change here has such a solution, by making it more explicit what exactly is needed in the data flow for CSE and AP here: they only need to consider regular successors, with the added catch that they need to consider reachability of handlers once we see the corresponding 'try' being reachable.

Author: jakobbotsch
Assignees: jakobbotsch
Labels:

area-CodeGen-coreclr

Milestone: -

Compiler::optDumpAssertionIndices("out -> ", lastTryBlock->bbAssertionOut, "\n");
}
BitVecOps::IntersectionD(apTraits, block->bbAssertionIn, firstTryBlock->bbAssertionIn);
BitVecOps::IntersectionD(apTraits, block->bbAssertionIn, lastTryBlock->bbAssertionOut);
Copy link
Member Author

@jakobbotsch jakobbotsch Nov 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was pretty meaningless -- intersecting with the lexically last block of a try region shouldn't be sufficient to guarantee any form of correctness here.

@jakobbotsch
Copy link
Member Author

/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress, Fuzzlyn

Copy link

Azure Pipelines successfully started running 3 pipeline(s).

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Nov 14, 2023

There are some x86 diffs. I checked in detail, and they happen from the MergeHandler change I made, where we no longer intersect with the last block in the try region. That change is only no-diff under the assumption that the first try block dominates all other blocks in the try. But that's not necessarily the case due to flowgraph imprecisions around finally blocks. For example:

BB24 [0038]  1                             1       [???..???)                                     internal 
BB01 [0000]  1  2    BB24                  1       [000..00D)-> BB11 ( cond ) T2      try {       keep i 
BB02 [0001]  1  2    BB01                  1       [00D..00E)-> BB04 (always) T2                  i 
BB03 [0027]  0  2                          0       [00D..00E)        (throw ) T2                  i rare hascall gcsafe 
BB04 [0028]  1  2    BB02                  1       [00D..01C)-> BB07 ( cond ) T2                  i hascall 
BB05 [0002]  1  2    BB04                  1       [01C..029)-> BB22 (callf ) T2                  i hascall gcsafe nullcheck 
BB06 [0022]  1  2    BB22                  1       [???..???)-> BB19 (ALWAYS) T2                  i internal gcsafe KEEP 
BB07 [0004]  1  0    BB04                  1       [029..038)-> BB15 ( cond ) T0      try {       keep i hascall gcsafe 
BB08 [0005]  1  0    BB07                  1       [038..045)-> BB22 (callf ) T0                  i hascall gcsafe nullcheck 
BB09 [0020]  1  0    BB22                  1       [???..???)-> BB20 (ALWAYS) T0      }           i internal gcsafe KEEP 
BB10 [0007]  1  2  0                       0       [047..04A)-> BB15 (always) T2 H0   catch { }   keep i rare hascall 
BB11 [0009]  1  1    BB01                  1       [04A..058)-> BB15 ( cond ) T1      try {       keep i hascall gcsafe 
BB12 [0010]  1  1    BB11                  1       [058..05A)-> BB22 (callf ) T1                  i gcsafe 
BB13 [0018]  1  1    BB22                  1       [???..???)-> BB21 (ALWAYS) T1      }           i internal gcsafe KEEP 
BB14 [0012]  1  2  1                       0       [05C..05F)                 T2 H1   catch { }   keep i rare hascall 
BB15 [0013]  4  2    BB07,BB10,BB11,BB14   1       [05F..060)-> BB17 ( cond ) T2                  i 
BB16 [0034]  1  2    BB15                  0       [05F..060)        (throw ) T2                  i rare hascall gcsafe 
BB17 [0035]  1  2    BB15                  1       [05F..067)                 T2      }           i hascall 
BB18 [0037]  1       BB17                  1       [067..06E)-> BB23 (always)                     keep i hascall gcsafe cfb cfe 
BB19 [0023]  1       BB06                  1       [???..???)-> BB23 (ALWAYS)                     i internal gcsafe KEEP 
BB20 [0021]  1       BB09                  1       [???..???)-> BB23 (ALWAYS)                     i internal gcsafe KEEP 
BB21 [0019]  1       BB13                  1       [???..???)-> BB23 (ALWAYS)                     i internal gcsafe KEEP 
BB22 [0014]  4     2 BB05,BB08,BB12        1       [067..06E)-> BB06,BB09,BB13 (finret)    H2   finally { } keep i hascall gcsafe 
BB23 [0015]  4       BB18,BB19,BB20,BB21   1       [06E..06F)        (return)                     i gcsafe 

In this flowgraph, there is a way to enter BB09 without going through BB07, via BB05 -> BB22 -> BB09. So that means the assertions of the "last lexical block" of the try region is not necessarily a subset of the assertions of the "try head". However, given that finally returns are well bracketed I do not see a correctness issue here (we know if we entered BB22 via BB05, then we return to BB06)

@jakobbotsch
Copy link
Member Author

cc @dotnet/jit-contrib PTAL @AndyAyersMS

Diffs -- only a few on x86, that I investigated above. Other than that just some nice throughput improvements.

Comment on lines 5497 to 5498
// kill assertions in global AP. Note that if we did kill assertions we
// would need to be more careful about our mid-block handling when in a
Copy link
Contributor

@SingleAccretion SingleAccretion Nov 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that if we did kill assertions...

In my opinion, it is confusing to talk about the possibility of "killing assertions" in global AP. How would that work? It seems impossible almost by definition of the algorithm.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean because we use SSA/VN? I mean if facts could become false in other ways than due to control flow joins. We would need to propagate that to the handlers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean because we use SSA/VN?

Right. VNs are "immutable", so if a a fact about a VN dominates a subgraph, it will always be true on the entirety of that subgraph. I understand the comment, I would just request to reword to not mention AP along the lines of "no clients need the concept of 'kill's".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can reword it tomorrow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think the wording looks better now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thank you!

{
FlowEdge* preds = m_pCompiler->BlockPredsWithEH(block);
for (FlowEdge* pred = preds; pred; pred = pred->getNextPredEdge())
for (FlowEdge* pred = block->bbPreds; pred; pred = pred->getNextPredEdge())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: can this loop use the PredEdges iterator?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, let me simplify this tomorrow.

});
}

if (m_pCompiler->bbIsTryBeg(block))
Copy link
Contributor

@SingleAccretion SingleAccretion Nov 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we not need to consider the exceptional successors of filters here?

(Not saying we should)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean the second pass EH successors?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that because the filter and its enclosed handlers are all dominated by a same try entry there is no need to consider the filter as a pred of those handlers.

@jakobbotsch
Copy link
Member Author

Ping @AndyAyersMS... when you have time.

@jakobbotsch jakobbotsch merged commit 9469471 into dotnet:main Nov 15, 2023
@jakobbotsch jakobbotsch deleted the optimize-assertion-prop-cse-dataflow branch November 15, 2023 21:40
@github-actions github-actions bot locked and limited conversation to collaborators Dec 16, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants