PGO: Instrument cold blocks with profile validators #53840

EgorBo · 2021-06-07T22:58:43Z

Profile data can't be 100% reliable, and there are cases where hot or semi-hot code ends up in a block with zero weight (aka a "cold block"):

- Static PGO that we ship is based on TE benchmarks and can be less relevant for other workloads.
- We don't support context-sensitive profiling yet. It's not a big problem for static PGO, where we don't promote methods to tier1 during profile collection, but it is a problem for dynamic PGO, where we can bake some weights during the first seconds of the app session (it's enough to call a method 30 times) and use them for all other callsites (see the example below).
- Mistakes in the logic where we scale/propagate/mix weights in the JIT.
- The workload can change over time, and blocks initially recognized as cold can become hot. The only way to fix this is the ability to deoptimize code and re-collect the profile.
- Sampling-based profiling, if we switch to that, is even less accurate.

To get a sense of the big picture, I'm introducing a sort of late instrumentation: at a late JIT phase, I insert helper calls into every cold block that has profile data. I decided to use calls instead of counters so that I can quickly compose a CSV report on app shutdown. This lets me quickly get statistics on which methods have problematic weights.
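To illustrate the helper-call idea, here is a minimal managed sketch of what such a validator could record. It is not the code from this PR: the `ProfileValidator` class, its method names, and the CSV column names are all hypothetical, and the real helper lives in the runtime rather than in C#.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Linq;

// Hypothetical stand-in for the runtime helper; names and layout are assumptions.
static class ProfileValidator
{
    // Method name -> number of times any of its cold blocks was entered.
    private static readonly ConcurrentDictionary<string, long> s_hits =
        new ConcurrentDictionary<string, long>();

    // The JIT-inserted call would land here whenever a zero-weight block executes.
    public static void OnColdBlockHit(string methodName)
    {
        s_hits.AddOrUpdate(methodName, 1, (name, count) => count + 1);
    }

    // Quote a CSV field so that commas in signatures do not split the column.
    private static string Quote(string field)
    {
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Build the report: a header row, then one row per method, most hits first.
    public static string ToCsv()
    {
        var rows = s_hits
            .OrderByDescending(kv => kv.Value)
            .ThenBy(kv => kv.Key, StringComparer.Ordinal)
            .Select(kv => Quote(kv.Key) + "," + kv.Value);
        return string.Join(Environment.NewLine,
            new[] { "Method,ColdBlockHits" }.Concat(rows));
    }

    // Write the report once, when the process shuts down.
    public static void EnableReportOnExit(string path)
    {
        AppDomain.CurrentDomain.ProcessExit +=
            (sender, e) => File.WriteAllText(path, ToCsv());
    }
}
```

Keying the table by method name rather than by block matches the report described here: it shows which methods were affected, not which blocks.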

Example:

using System.Runtime.CompilerServices;
using System.Threading;

class Program
{
    static void Main()
    {
        // Promote DoWork to tier1 and bake weights
        for (int i = 0; i < 100; i++)
        {
            DoWork(i);
            Thread.Sleep(16);
        }

        // Weights in DoWork are baked at this point.
        // Call it with arguments that are unusual from the profile's point of view.
        DoWork(-10);
        DoWork(-20);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static int DoWork(int i)
    {
        if (i >= 0)
        {
            return 42; // always taken in the first loop
        }
        return -42; // never reached while the profile is collected
    }
}
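The effect in this example can also be modelled without the JIT. The sketch below is a simplified simulation, not what the runtime does: `BlockProfile`, `Bake`, and `Enter` are hypothetical names, and "baking" here only means that counts stop being updated and are treated as final.

```csharp
using System;

// Hypothetical model of one method's block profile; all names are illustrative.
sealed class BlockProfile
{
    private readonly long[] _counts; // per-block counts gathered while instrumenting
    private bool _baked;             // true once the counts are treated as final

    // Number of times a block with a baked count of zero was entered.
    public long ColdHits { get; private set; }

    public BlockProfile(int blockCount)
    {
        _counts = new long[blockCount];
    }

    // Stop collecting; from now on the counts are the "profile".
    public void Bake()
    {
        _baked = true;
    }

    public void Enter(int block)
    {
        if (!_baked)
        {
            _counts[block]++;   // still collecting
        }
        else if (_counts[block] == 0)
        {
            ColdHits++;         // a block the profile considers cold was executed
        }
    }
}

static class StaleProfileDemo
{
    // Same shape as DoWork above, with each block reporting to the profile.
    static int DoWork(BlockProfile profile, int i)
    {
        if (i >= 0)
        {
            profile.Enter(0);
            return 42;
        }
        profile.Enter(1);
        return -42;
    }

    public static long Run()
    {
        var profile = new BlockProfile(2);
        for (int i = 0; i < 100; i++)
        {
            DoWork(profile, i); // only block 0 is ever counted
        }
        profile.Bake();
        DoWork(profile, -10);   // hits block 1, which has zero weight
        DoWork(profile, -20);
        return profile.ColdHits;
    }
}
```

In this model `StaleProfileDemo.Run()` returns 2, one for each call with a negative argument after the counts were frozen.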

And run with:

DOTNET_TieredPGO=1
DOTNET_ProfileValidationPath="C:\prj\report.csv"

On app exit, it saves a list (in CSV format) of the problematic methods whose cold blocks were hit.

I tried running a desktop app, AvaloniaILSpy, with default parameters, and here is what it printed (I closed the window by hand after 10 seconds of random clicking):

(38 methods)

If I run it with DOTNET_TieredPGO=1 and DOTNET_TC_QuickJitForLoops=1, it lists 422 methods.

PS: It doesn't tell which specific blocks have invalid weights; it's just meant to give an overall sense.
PS2: I guess I should use full method names.

/cc @AndyAyersMS @davidwrighton @dotnet/jit-contrib

AndyAyersMS · 2021-06-08T01:14:35Z

Can you give some examples where this tech helped you sort out a problem?

I like the idea, but if we're going to go down this road I think we should look at something that has more long-lasting diagnostic capabilities, like doing a late instrumentation pass on all blocks, or relying on something like PIN.

EgorBo · 2021-06-08T09:42:10Z

> Can you give some examples where this tech helped you sort out a problem?

I was just testing how much we can trust profile data (for inlining). Static PGO looks good so far across different benchmarks, though there were some issues. For example, the PowerShell benchmarks had Path::HasExtension on the hot path with different expectations.
The same goes for String::Equals and CastHelpers::ChkCast_Helper.

A small benchmark that just deserializes a complicated JSON file also had issues:

I can try to rewrite it to use counters if you give me some pointers on where to look. But I guess it's not a priority for net6.0, so I'll just leave this as a prototype.

EgorBo added 2 commits June 7, 2021 20:30

Insert profile validators

7989bb4

Instrument cold blocks to validate profile data

530a654

dotnet-issue-labeler bot added the area-VM-coreclr label Jun 7, 2021

EgorBo added 3 commits June 8, 2021 02:14

Fix CI

1aca18d

Clean up

5ec5b56

Formatting

60a5b13

EgorBo closed this Jun 8, 2021

EgorBo mentioned this pull request Jun 21, 2021

[JIT] Improve inliner: new heuristics, rely on PGO data #52708

Merged

ghost locked as resolved and limited conversation to collaborators Jul 8, 2021
