-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adjust weights of blocks with profile inside loops in non-profiled methods #71659
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsFixes #71649 (and a couple of related performance regressions in dotnet/performance). Repro: using System;
using System.Linq;
using System.Numerics;
using System.Runtime.CompilerServices;
using System.Threading;
public class Program
{
public static void Main()
{
for (int i = 0; i < 50; i++)
{
new Program().Test();
Thread.Sleep(50);
}
Console.WriteLine("Done");
//Console.ReadKey();
}
uint[] input_uint = Enumerable.Range(0, 1000).Select(i => (uint)i).ToArray();
[MethodImpl(MethodImplOptions.NoInlining |
MethodImplOptions.AggressiveOptimization)]
public int Test()
{
int sum = 0;
uint[] input = input_uint;
for (int i = 0; i < input.Length; i++)
sum += BitOperations.TrailingZeroCount(input[i]);
return sum;
}
} When we run this, Codegen diff https://www.diffchecker.com/dIVV1Iwy cc @AndyAyersMS
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about doing something like this but was concerned that it might have much wider impact than I'd like. But it might be worth a try.
We should prepare for a follow-on perf assessment to determine if this is a net improvement.
Looking at SPMI there are quite a few affected methods. Oddly no diffs from benchmarks. I wonder if that collection is broken right now or something?
@AndyAyersMS how do I run SPMI and get a sort of statistics - how many methods with PGO data, how many loops were clonned - etc? |
You would have to do something custom... we have never gotten around to implementing the "generalized metrics" work that would make this sort of thing easy. |
I hacked it locally - only +10 new loop clonning, around 200 regressions (among 1200 in libraries.pmi) are due to loop-alignment (so number are better with |
Improvements on linux-x64 dotnet/perf-autofiling-issues#6724 |
We noticed that this PR is associated with the following regression across a few configurations. Specifically looking at the x64 Windows historical data, we observed a clear regression: Strangely, we cannot find an associated issue in the Perf Auto-filler and runtime repos for this configuration. System.Collections.IndexerSet.Array(Size: 512)
|
@EgorBo is the regression above resolved or do we need an issue for it? |
Ping @EgorBo. This popped up in the 6.0 vs 7.0 RC2 report as well. System.Collections.IndexerSet.Array(Size: 512)
|
Thanks, looking at the issue, it was expected to see improvements and regressions from that change |
Fixes #71649 (and a couple of related performance regressions in dotnet/performance).
Repro:
When we run this,
Test
is not profiled, but it imports inside its loopBitOperations.TrailingZeroCount
which is profiled (static BCL profile). ThenoptSetBlockWeiths
divides its weight by 2 because it doesn't dominate all returns and the current method does not have profile data, see here.At the same time,
optScaleLoopBlocks
does not touch loop body because it has profile data so it doesn't scale it back to something "hot". My PR changes that - if the root method is not profiled - we try to adjust weights in such loops.Codegen diff https://www.diffchecker.com/dIVV1Iwy
Left: ReadyToRun=0, Right: ReadyToRun=1 - take a look at loop body (its weight is 1 in case of static pgo so loop aligner does not think it's profitable to align)
cc @AndyAyersMS