Improve LSRA and other dump output #71499

Merged Jun 30, 2022 (1 commit)

@BruceForstall BruceForstall commented Jun 30, 2022

E.g.,

Update LSRA "Allocating Registers" table description. Now:

The following table has one or more rows for each RefPosition that is handled during allocation.
The columns are: (1) Loc: LSRA location, (2) RP#: RefPosition number, (3) Name, (4) Type (e.g. Def, Use,
Fixd, Parm, DDef (Dummy Def), ExpU (Exposed Use), Kill) followed by a '*' if it is a last use, and a 'D'
if it is delayRegFree, (5) Action taken during allocation. Some actions include (a) Alloc a new register,
(b) Keep an existing register, (c) Spill a register, (d) ReLod (Reload) a register. If an ALL-CAPS name
such as COVRS is displayed, it is a score name from lsra_score.h, with a trailing '(A)' indicating alloc,
'(C)' indicating copy, and '(R)' indicating re-use. See dumpLsraAllocationEvent() for details.
The subsequent columns show the Interval occupying each register, if any, followed by 'a' if it is
active, 'p' if it is a large vector that has been partially spilled, and 'i' if it is inactive.
Columns are only printed up to the last modified register, which may increase during allocation,
in which case additional columns will appear. Registers which are not marked modified have ---- in
their column.
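
As a rough illustration of how the suffixes described above compose, here is a minimal sketch. The names `formatRefTypeColumn`, `IntervalState`, and `intervalStateSuffix` are hypothetical and are not the JIT's actual code; dumpLsraAllocationEvent() remains the authoritative reference.

```cpp
#include <string>

// Hypothetical helper (not the JIT's code): builds the Type column text,
// e.g. "Use*" for a last use, or "Use*D" for a last use that is delayRegFree.
std::string formatRefTypeColumn(const std::string& type, bool lastUse, bool delayRegFree)
{
    std::string text = type;
    if (lastUse)
    {
        text += '*'; // '*' marks a last use
    }
    if (delayRegFree)
    {
        text += 'D'; // 'D' marks delayRegFree
    }
    return text;
}

// Hypothetical enum for the state shown after the Interval in each register column.
enum class IntervalState
{
    Active,
    PartiallySpilled, // large vector that has been partially spilled
    Inactive
};

// Maps the state to the single-letter suffix named in the table description.
char intervalStateSuffix(IntervalState state)
{
    switch (state)
    {
        case IntervalState::Active:
            return 'a';
        case IntervalState::PartiallySpilled:
            return 'p';
        default:
            return 'i';
    }
}
```

For example, `formatRefTypeColumn("Use", true, true)` yields `Use*D`.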

Dump nodes added during resolution, e.g.:

   BB29 bottom (BB08->BB08): move V25 from STK to rdi (Critical)
N001 (  1,  1) [001174] ----------z                 t1174 =    LCL_VAR   int    V25 cse4          rdi REG rdi

Dump more data in the LSRA block sequence data:

-BB03( 16   )
-BB04(  4   )
+BB03 ( 16   ) critical-in critical-out
+BB04 (  4   ) critical-out
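
A minimal sketch of how such a line could be assembled is shown here. The function name and its boolean parameters are hypothetical, the block label is assumed to follow a `BB%02u` pattern, and the weight is passed in as an already formatted string; none of this is the actual LSRA code.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not the JIT's code): formats one block-sequence entry,
// appending the critical-edge annotations only when they apply.
std::string formatBlockSequenceEntry(unsigned           bbNum,
                                     const std::string& weightStr,
                                     bool               criticalIn,
                                     bool               criticalOut)
{
    char buf[64];
    // Assumption: block label is "BB" plus a zero-padded number; weight is right-aligned to 6 characters.
    std::snprintf(buf, sizeof(buf), "BB%02u (%6s)", bbNum, weightStr.c_str());

    std::string line(buf);
    if (criticalIn)
    {
        line += " critical-in";
    }
    if (criticalOut)
    {
        line += " critical-out";
    }
    return line;
}
```

Keeping the annotations as trailing words leaves the block and weight columns aligned whether or not a block has critical edges.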

When dumping various flow bitvectors, annotate the bitvectors better:

-BB25 in gen out
-0000000000000000
-0000000000000003 CSE #01.c
-0000000000000003 CSE #01.c
+BB25
+ in: 0000000000000000
+gen: 0000000000000003 CSE #01.c
+out: 0000000000000003 CSE #01.c
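
The labeled layout can be sketched as one row per bitvector, each prefixed with its name. The helper below is hypothetical (it is not the JIT's dumper) and assumes the bitvector fits in 64 bits and is printed as 16 hex digits.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not the JIT's code): formats one labeled bitvector row,
// e.g. "gen: 0000000000000003 CSE #01.c". The label is right-aligned to 3
// characters so that "in", "gen", and "out" line up.
std::string formatLabeledBitVector(const char* label, std::uint64_t bits, const std::string& note = "")
{
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%3s: %016llx", label, static_cast<unsigned long long>(bits));

    std::string row(buf);
    if (!note.empty())
    {
        row += " " + note; // optional trailing annotation, such as the CSE name
    }
    return row;
}
```

Putting the label on the same line as its bits means a reader no longer has to count rows under a shared header to tell `in` from `gen` from `out`.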

Dump hoisting bitvectors using the function which sorts on local number:

-  USEDEF  (5)={V04 V00 V01 V02 V03}
+  USEDEF  (5)={V00 V01 V02 V03 V04}
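
The effect of sorting on local number can be sketched as follows. The helper is hypothetical and only illustrates the ordering; the assumption is that the unsorted output followed the set's internal iteration order rather than local number.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (not the JIT's code): renders a set of local numbers
// as "{V00 V01 ...}", sorted ascending by local number.
std::string formatSortedLocals(std::vector<unsigned> lclNums)
{
    std::sort(lclNums.begin(), lclNums.end());

    std::string out = "{";
    for (std::size_t i = 0; i < lclNums.size(); i++)
    {
        char buf[16];
        std::snprintf(buf, sizeof(buf), "V%02u", lclNums[i]);
        if (i > 0)
        {
            out += " ";
        }
        out += buf;
    }
    out += "}";
    return out;
}
```

With the input `{4, 0, 1, 2, 3}`, this produces `{V00 V01 V02 V03 V04}`, which makes two dumps of the same set easy to compare by eye.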

Also, fix various typos and formatting.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 30, 2022
@ghost ghost assigned BruceForstall Jun 30, 2022
ghost commented Jun 30, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: BruceForstall
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@BruceForstall
Member Author

@kunalspathak @dotnet/jit-contrib PTAL

Member

@kunalspathak kunalspathak left a comment

Thanks for doing this. Added some minor questions/suggestions.

@@ -1194,7 +1194,7 @@ struct GenTree
return (gtOper == GT_LCL_FLD || gtOper == GT_LCL_FLD_ADDR || gtOper == GT_STORE_LCL_FLD);
}

inline bool OperIsLocalField() const
Member

Why this change?

Member Author

inline is implicit for function definitions within the body of a class declaration, so this is just to be consistent with all other cases that don't use it.
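
To illustrate the point, here is a small standalone sketch. The struct `NodeSketch` is a hypothetical stand-in, not the real `GenTree`. Both member functions are inline functions because they are defined inside the class body; the explicit keyword on the second adds nothing.

```cpp
// Hypothetical stand-in for a class such as GenTree.
struct NodeSketch
{
    int oper;

    // Defined inside the class body, so it is implicitly inline.
    bool IsZero() const
    {
        return oper == 0;
    }

    // Also inline; the explicit keyword here is redundant.
    inline bool IsNonZero() const
    {
        return oper != 0;
    }
};
```

Dropping the redundant keyword changes nothing about linkage or code generation; it only makes the declarations consistent.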

@@ -2110,13 +2110,13 @@ void LinearScan::buildIntervals()
 printf("-----------------\n");
 for (BasicBlock* const block : compiler->Blocks())
 {
-printf(FMT_BB " use def in out\n", block->bbNum);
+printf(FMT_BB "\nuse: ", block->bbNum);
Member

Love this.

}
#elif defined(TARGET_ARM)
if (frameSize > 0x0400)
{
// We likely have a large stack frame.
//
// Thus we might need to use large displacements when loading or storing
// to CSE LclVars that are not enregistered
Member

Curious, how do you spot these? Do you have a tool or something to scan the files?

Member Author

Just reading code.

const LsraBlockInfo& bi = blockInfo[block->bbNum];

// Note that predBBNum isn't set yet.
JITDUMP(" (%6s)", refCntWtd2str(bi.weight));
Member

Remind me: block->getBBWeight() == bi.weight, so this change produces the same output as what we had until now?

Member Author

Exactly: no change to the output.

JITDUMP(" " FMT_BB " %s", block->bbNum, insertionPointString);
if (toBlock == nullptr)
{
// SharedCritical resolution has no `toBlock`.
Member

Maybe it would be helpful to print something about SharedCritical and fromBlock at least?

Member Author

I wasn't sure what to do about SharedCritical, so I kind of punted. I believe the "from" block is the one that gets printed anyway.

@BruceForstall BruceForstall merged commit 9885fbe into dotnet:main Jun 30, 2022
@BruceForstall BruceForstall deleted the DumperImprovements branch June 30, 2022 20:58
@ghost ghost locked as resolved and limited conversation to collaborators Jul 31, 2022