AndyAyersMS
Member

Fix #116223

Save profile data (schema, schema size, and data) in the inline context.
Always store the inline context in GT_CALL nodes. Use this to find
the correct PGO data to consult when optimizing a call (in particular,
for late cast expansions).

Also pick up the dumping improvements from #116231.
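The approach described above can be sketched roughly as follows. This is a minimal illustration, not the JIT's actual code: the `PgoInfo` fields, the `CallNode` type, and the `FindPgoForCall` helper are simplified, hypothetical stand-ins, and only the names `PgoInfo`, `InlineContext`, `SetPgoInfo`, `HasPgoInfo`, and `GetPgoInfo` come from this PR.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for the types named in this PR.
// Field names and layout are assumptions made for illustration only.
struct PgoInfo
{
    const void*          schema      = nullptr; // opaque description of the counters
    unsigned             schemaCount = 0;       // number of schema entries
    const unsigned char* data        = nullptr; // raw profile data

    bool HasData() const { return (schema != nullptr) && (data != nullptr); }
};

struct InlineContext
{
    InlineContext* parent = nullptr; // context of the method we were inlined into
    PgoInfo        pgo;              // profile data for this context's own method

    void           SetPgoInfo(const PgoInfo& info) { pgo = info; }
    bool           HasPgoInfo() const { return pgo.HasData(); }
    const PgoInfo& GetPgoInfo() const { return pgo; }
};

// Stand-in for a GT_CALL node: it always records the context it came from.
struct CallNode
{
    InlineContext* inlineContext = nullptr;
    unsigned       ilOffset      = 0; // IL offset within the originating method
};

// Return the profile data of the method the call originated in,
// or nullptr when that method has no profile data.
const PgoInfo* FindPgoForCall(const CallNode& call)
{
    assert(call.inlineContext != nullptr);
    return call.inlineContext->HasPgoInfo() ? &call.inlineContext->GetPgoInfo() : nullptr;
}
```

The sketch deliberately does not fall back to the parent context: an IL offset recorded in an inlinee only makes sense against the inlinee's own schema, which is presumably why the lookup has to go through the call's context rather than the root method.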

AndyAyersMS and others added 2 commits June 2, 2025 19:42
@Copilot Copilot AI review requested due to automatic review settings June 3, 2025 02:44
@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 3, 2025
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR enhances PGO-driven inlining by storing per-method profile data in the inline context and leveraging it during call optimization (especially for late cast expansions), and also refines dump logging.

  • Introduce PgoInfo and store PGO data inside InlineContext
  • Propagate PGO info into calls and update pickGDV to consume it with a new verboseLogging flag
  • Enable dumping of PGO-driven type guesses in gtDispTree for cast helpers
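For readers unfamiliar with profile-guided cast expansion, the general idea can be sketched as below. This is a conceptual illustration only and does not reflect the JIT's IR or helper names: `TypeDesc`, `Object`, `IsInstHelper`, and `IsInstExpanded` are all hypothetical. The profile supplies a likely class, the compiler works out ahead of time whether that class passes the cast, and the expanded code tests for it before falling back to the general helper.

```cpp
#include <cassert>

// Hypothetical object model: each object points at its type descriptor.
struct TypeDesc
{
    const TypeDesc* base; // nullptr for the root of the hierarchy
};

struct Object
{
    const TypeDesc* type;
};

// Slow, general "isinst" helper: walk the hierarchy looking for the target.
const Object* IsInstHelper(const Object* obj, const TypeDesc* target)
{
    if (obj == nullptr)
    {
        return nullptr;
    }
    for (const TypeDesc* t = obj->type; t != nullptr; t = t->base)
    {
        if (t == target)
        {
            return obj;
        }
    }
    return nullptr;
}

// Expanded form. `likely` is the class guessed from profile data, and
// `likelyPasses` records whether that class is known to satisfy the cast;
// a compiler would determine both once, at compile time.
const Object* IsInstExpanded(const Object* obj, const TypeDesc* target,
                             const TypeDesc* likely, bool likelyPasses)
{
    if (obj == nullptr)
    {
        return nullptr;
    }
    if (obj->type == likely)
    {
        // Fast path: a single pointer comparison, no hierarchy walk.
        return likelyPasses ? obj : nullptr;
    }
    // The guess missed, so fall back to the general helper.
    return IsInstHelper(obj, target);
}
```

If the guess comes from the wrong method's profile (or no profile is found), the fast path is either missing or rarely taken, which is the kind of loss the linked issue describes for casts in inlinees.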

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| src/coreclr/jit/inline.h | Define PgoInfo and add Get/Set/HasPgoInfo to InlineContext |
| src/coreclr/jit/inline.cpp | Implement PgoInfo constructors |
| src/coreclr/jit/importercalls.cpp | Use InlineContext PGO data in pickGDV and add verboseLogging |
| src/coreclr/jit/helperexpansion.cpp | Switch cast-expansion arrays from MAX_CAST_GUESSES to MAX_GDV_TYPE_CHECKS |
| src/coreclr/jit/gentree.h | Make gtInlineContext unconditionally available |
| src/coreclr/jit/gentree.cpp | Assign gtInlineContext on new/clone calls and dump PGO data for cast helpers |
| src/coreclr/jit/compiler.h | Add verboseLogging parameter (default true) to pickGDV signature |
| src/coreclr/jit/compiler.cpp | Call SetPgoInfo on compInlineContext during initialization |
Comments suppressed due to low confidence (2)

src/coreclr/jit/helperexpansion.cpp:2160

  • The call to pickGDV is missing the new verboseLogging argument, leading to a signature mismatch. Add a bool verboseLogging parameter (e.g., true or false) to match the updated signature.
comp->pickGDV(castHelper, castHelper->gtCastHelperILOffset, false, likelyClasses, nullptr, &likelyClassCount, likelyLikelihoods);

src/coreclr/jit/compiler.cpp:2557

  • Add tests to verify that SetPgoInfo correctly stores PGO schema, count, and data, and that HasPgoInfo/GetPgoInfo return the expected values in inlined contexts.
compInlineContext->SetPgoInfo(PgoInfo(this));

}
#endif

const PgoInfo& GetPgoInfo()

Copilot AI Jun 3, 2025

[nitpick] Consider adding a brief doc comment explaining what GetPgoInfo, SetPgoInfo, and HasPgoInfo do, and how PgoInfo is intended to be used within inlining.

  int* candidatesCount,
- unsigned* likelihoods);
+ unsigned* likelihoods,
+ bool verboseLogging = true);

Copilot AI Jun 3, 2025

[nitpick] The pickGDV parameter list is growing; consider grouping related flags (like verboseLogging) and PGO inputs into a struct to simplify calls and reduce maintenance risk.
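One way the grouping suggested here might look is sketched below. Nothing in this sketch is taken from the JIT: `GdvQuery`, `GdvResult`, `ClassCount`, `PickCandidates`, the `minLikelihood` threshold, and `kMaxCandidates` are hypothetical names chosen for illustration. Only the idea of a `verboseLogging` flag defaulting to true comes from the diff.

```cpp
#include <cassert>
#include <cstdio>

constexpr int kMaxCandidates = 4; // hypothetical cap on reported candidates

// One histogram entry: how often a class was observed at a call site.
struct ClassCount
{
    int      classId;
    unsigned count;
};

// Inputs and flags travel together, so adding a flag does not change
// the function signature or any existing caller.
struct GdvQuery
{
    const ClassCount* histogram      = nullptr;
    int               histogramSize  = 0;
    unsigned          minLikelihood  = 30;   // percent; assumed threshold
    bool              verboseLogging = true; // same default as in the diff
};

struct GdvResult
{
    int      candidatesCount             = 0;
    int      classIds[kMaxCandidates]    = {};
    unsigned likelihoods[kMaxCandidates] = {};
};

// Report every class whose share of observations meets the threshold.
GdvResult PickCandidates(const GdvQuery& query)
{
    GdvResult          result;
    unsigned long long total = 0;
    for (int i = 0; i < query.histogramSize; i++)
    {
        total += query.histogram[i].count;
    }
    if (total == 0)
    {
        return result; // no observations, so no guesses
    }
    for (int i = 0; (i < query.histogramSize) && (result.candidatesCount < kMaxCandidates); i++)
    {
        const unsigned likelihood = (unsigned)((query.histogram[i].count * 100ULL) / total);
        if (likelihood < query.minLikelihood)
        {
            continue;
        }
        result.classIds[result.candidatesCount]    = query.histogram[i].classId;
        result.likelihoods[result.candidatesCount] = likelihood;
        result.candidatesCount++;
        if (query.verboseLogging)
        {
            printf("candidate class %d, likelihood %u%%\n", query.histogram[i].classId, likelihood);
        }
    }
    return result;
}
```

A caller that only wants the guesses, such as a dumping routine, would set `verboseLogging = false` on the query and leave every other field at its default.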

@AndyAyersMS
Member Author

@dotnet/jit-contrib FYI
@EgorBo PTAL

Makes GT_CALL nodes bigger. Should fix a number of inlining-related perf issues, like #113913. Reasonable number of diffs.

For one benchmark there, I see the following locally:

| Method | Job | Toolchain | input | Mean | Error | StdDev | Median | Min | Max | Ratio | RatioSD | Allocated | Alloc Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LastWithPredicate_FirstElementMatches | Job-CLLPQM | No inline w/EH | IList | 5.086 ns | 0.1165 ns | 0.1090 ns | 5.024 ns | 4.987 ns | 5.269 ns | 1.00 | 0.03 | - | NA |
| LastWithPredicate_FirstElementMatches | Job-COLTNK | Inline w/EH | IList | 6.779 ns | 0.0913 ns | 0.0854 ns | 6.807 ns | 6.635 ns | 6.904 ns | 1.33 | 0.03 | - | NA |
| LastWithPredicate_FirstElementMatches | Job-BEDPPC | This PR | IList | 3.291 ns | 0.0617 ns | 0.0577 ns | 3.289 ns | 3.203 ns | 3.419 ns | 0.65 | 0.02 | - | NA |

@EgorBo
Member

EgorBo commented Jun 3, 2025

@EgorBot -amd -arm

using System.Runtime.CompilerServices;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(Benchmarks).Assembly).Run(args);

public class Benchmarks
{
    int class1 = 0;
    int class2 = 0;

    BaseClass b1 = new DerivedClass1();
    BaseClass b2 = new DerivedClass2();

    [Benchmark]
    public void Bench()
    {
        for (int i = 0; i < 10000; i++)
            Problem(i);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public void Problem(int i)
    {
        BaseClass b = i % 3 == 0 ? b1 : b2;
        if (b is DerivedClass1)
            class1++;
        else if (IsDerivedClass2(b))
            class2++;
    }

    public bool IsDerivedClass2(BaseClass b) => b is DerivedClass2;

    public class BaseClass { }
    public class DerivedClass1 : BaseClass { }
    public class DerivedClass2 : BaseClass { }
}

@EgorBo
Member

EgorBo commented Jun 3, 2025

hm.. @AndyAyersMS any idea why your initial repro snippet regresses with your PR? EgorBot/runtime-utils#376

It looks like the casts no longer expand with PGO at all there

@AndyAyersMS
Member Author

> hm.. @AndyAyersMS any idea why your initial repro snippet regresses with your PR? EgorBot/runtime-utils#376
>
> It looks like the casts no longer expand with PGO at all there

Odd. It was working ok for me locally. Let me double-check.

@AndyAyersMS
Member Author

AndyAyersMS commented Jun 3, 2025

> > hm.. @AndyAyersMS any idea why your initial repro snippet regresses with your PR? EgorBot/runtime-utils#376
> > It looks like the casts no longer expand with PGO at all there
>
> Odd. It was working ok for me locally. Let me double-check.

It works with checked builds -- there is a misplaced #endif preventing profile capture.
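The comment does not say where the misplaced `#endif` was, so the following is only a generic sketch of how this class of bug behaves. `SKETCH_DEBUG`, `Context`, `InitBuggy`, and `InitFixed` are hypothetical names; `SKETCH_DEBUG` stands in for whatever macro distinguishes checked builds and is left undefined here to model a release build.

```cpp
#include <cassert>

// Minimal stand-in for per-method state that should always capture a profile.
struct Context
{
    bool hasProfile = false;
};

// Buggy variant: the #endif sits one statement too low, so the profile
// capture is compiled only when SKETCH_DEBUG is defined.
void InitBuggy(Context& ctx)
{
    (void)ctx; // avoids an unused-parameter warning in release builds
#ifdef SKETCH_DEBUG
    // ... debug-only bookkeeping would go here ...
    ctx.hasProfile = true; // accidentally inside the debug-only region
#endif
}

// Fixed variant: only the debug bookkeeping is conditional.
void InitFixed(Context& ctx)
{
#ifdef SKETCH_DEBUG
    // ... debug-only bookkeeping would go here ...
#endif
    ctx.hasProfile = true; // runs in every build flavor
}
```

Compiled with `SKETCH_DEBUG` defined, both variants behave identically, which matches the symptom reported here: the change works locally on a checked build and only misbehaves on the release build the benchmark bot uses.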

@AndyAyersMS
Member Author

@EgorBot -amd -arm

using System.Runtime.CompilerServices;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(Benchmarks).Assembly).Run(args);

public class Benchmarks
{
    int class1 = 0;
    int class2 = 0;

    BaseClass b1 = new DerivedClass1();
    BaseClass b2 = new DerivedClass2();

    [Benchmark]
    public void Bench()
    {
        for (int i = 0; i < 10000; i++)
            Problem(i);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public void Problem(int i)
    {
        BaseClass b = i % 3 == 0 ? b1 : b2;
        if (b is DerivedClass1)
            class1++;
        else if (IsDerivedClass2(b))
            class2++;
    }

    public bool IsDerivedClass2(BaseClass b) => b is DerivedClass2;

    public class BaseClass { }
    public class DerivedClass1 : BaseClass { }
    public class DerivedClass2 : BaseClass { }
}

@AndyAyersMS
Member Author

Results look better now

BenchmarkDotNet v0.14.0, Ubuntu 24.04.2 LTS (Noble Numbat)
Unknown processor
.NET SDK 10.0.100-preview.6.25303.101
[Host] : .NET 9.0.5 (9.0.525.21509), Arm64 RyuJIT AdvSIMD
Job-GQRFWP : .NET 10.0.0 (42.42.42.42424), Arm64 RyuJIT AdvSIMD
Job-KPSXLT : .NET 10.0.0 (42.42.42.42424), Arm64 RyuJIT AdvSIMD

| Method | Toolchain | Mean | Error | Ratio |
| --- | --- | --- | --- | --- |
| Bench | Main | 36.91 us | 0.018 us | 1.00 |
| Bench | PR | 32.48 us | 0.010 us | 0.88 |

@AndyAyersMS
Member Author

/ba-g unrelated failure with no helix log

@AndyAyersMS AndyAyersMS merged commit 30082a4 into dotnet:main Jun 3, 2025
107 of 109 checks passed
Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Development

Successfully merging this pull request may close these issues.

JIT: we lose profile info for inlinee casts

2 participants