[Perf] Windows/x64: 5 Regressions on 2/3/2024 12:19:35 AM #98044

Open
performanceautofiler bot opened this issue Feb 6, 2024 · 10 comments
Labels: arch-x64; area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI); os-windows; runtime-coreclr (specific to the CoreCLR runtime); tenet-performance-benchmarks (Issue from performance benchmark)

@performanceautofiler

performanceautofiler bot commented Feb 6, 2024

Run Information

| Name | Value |
| --- | --- |
| Architecture | x64 |
| OS | Windows 10.0.18362 |
| Queue | TigerWindows |
| Baseline | 1a2f095fb212dcbf394f01122b9f317b7cc70fdb |
| Compare | 2361c00717a54a5dd9b0cf727102d64f783855b9 |
| Diff | Diff |
| Configs | CompilationMode:tiered, RunKind:micro |

Regressions in System.Globalization.Tests.StringEquality

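| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | 761.52 ns | 973.83 ns | 1.28 | 0.00 | True | | | |
| | 1.19 μs | 1.28 μs | 1.08 | 0.01 | False | | | |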


Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Globalization.Tests.StringEquality*'
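
The same invocation applies to every benchmark in this report; only the `--filter` value changes. As a minimal sketch (the helper name `repro_command` is hypothetical, and only the `-f` and `--filter` arguments shown above are used), the commands can be generated like this:

```python
def repro_command(benchmark_filter, framework="net8.0"):
    """Build the benchmarks_ci.py invocation for one benchmark filter.

    Mirrors the command line quoted in this report; it assumes the
    dotnet/performance repository was cloned into .\\performance.
    """
    return [
        "py",
        r".\performance\scripts\benchmarks_ci.py",
        "-f", framework,
        "--filter", benchmark_filter,
    ]


# Filters that appear in the Repro sections of this report.
FILTERS = [
    "System.Globalization.Tests.StringEquality*",
    "Benchmark.GetChildKeysTests*",
    "Span.Sorting*",
    "Benchstone.BenchI.EightQueens*",
]

if __name__ == "__main__":
    for benchmark_filter in FILTERS:
        # Print rather than execute, so the sketch has no side effects.
        print(" ".join(repro_command(benchmark_filter)))
```

To actually run one of them, the returned list could be passed to `subprocess.run` from the directory that contains the clone.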

Payloads

Baseline
Compare

System.Globalization.Tests.StringEquality.Compare_Same(Count: 1024, Options: (en-US, OrdinalIgnoreCase))

ETL Files

Histogram

JIT Disasms

System.Globalization.Tests.StringEquality.Compare_Same_Upper(Count: 1024, Options: (en-US, OrdinalIgnoreCase))

ETL Files

Histogram

JIT Disasms

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository


Regressions in Benchmark.GetChildKeysTests

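| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AddChainedConfigurationEmpty | 14.99 ms | 16.20 ms | 1.08 | 0.02 | False | | | |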


Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'Benchmark.GetChildKeysTests*'

Payloads

Baseline
Compare

Benchmark.GetChildKeysTests.AddChainedConfigurationEmpty

ETL Files

Histogram

JIT Disasms



Regressions in Span.Sorting

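| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| QuickSortArray(Size: 512) | 8.54 μs | 16.21 μs | 1.90 | 0.45 | True | | | |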


Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'Span.Sorting*'

Payloads

Baseline
Compare

Span.Sorting.QuickSortArray(Size: 512)

ETL Files

Histogram

JIT Disasms



Regressions in Benchstone.BenchI.EightQueens

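| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Test | 1.80 μs | 2.04 μs | 1.13 | 0.03 | False | | | |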
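
The Test/Base column in the regression tables of this report is consistent with the test time divided by the baseline time, rounded to two decimals. A small sketch that recomputes it from the reported values (the helper names are hypothetical, and the two StringEquality rows are labeled by position because the table rows carry no benchmark name):

```python
# Conversion factors to nanoseconds for the units used in the report.
UNIT_TO_NS = {"ns": 1.0, "μs": 1e3, "ms": 1e6}


def to_ns(text):
    """Parse a value such as '761.52 ns' into nanoseconds."""
    value, unit = text.split()
    return float(value) * UNIT_TO_NS[unit]


def ratio(baseline, test):
    """Test/Base ratio rounded to two decimals; above 1.0 means slower."""
    return round(to_ns(test) / to_ns(baseline), 2)


# (label, baseline, test) copied from the regression tables.
ROWS = [
    ("StringEquality, row 1", "761.52 ns", "973.83 ns"),
    ("StringEquality, row 2", "1.19 μs", "1.28 μs"),
    ("GetChildKeysTests", "14.99 ms", "16.20 ms"),
    ("Span.Sorting", "8.54 μs", "16.21 μs"),
    ("EightQueens", "1.80 μs", "2.04 μs"),
]

if __name__ == "__main__":
    for label, baseline, test in ROWS:
        print(f"{label}: {ratio(baseline, test):.2f}")
```

The recomputed values match the reported 1.28, 1.08, 1.08, 1.90 and 1.13.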


Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'Benchstone.BenchI.EightQueens*'

Payloads

Baseline
Compare

Benchstone.BenchI.EightQueens.Test

ETL Files

Histogram

JIT Disasms


@performanceautofiler (bot) added the arch-x64, os-windows, runtime-coreclr (specific to the CoreCLR runtime), and untriaged (New issue has not been triaged by the area owner) labels Feb 6, 2024
@DrewScoggins removed the untriaged label Feb 6, 2024
@DrewScoggins transferred this issue from dotnet/perf-autofiling-issues Feb 6, 2024
@dotnet-issue-labeler (bot) added the needs-area-label (An area label is needed to ensure this gets routed to the appropriate area owners) label Feb 6, 2024
@ghost added the untriaged (New issue has not been triaged by the area owner) label Feb 6, 2024
@DrewScoggins
Member

Diff here: 207e1fb...df0778d

Nothing is jumping out as the culprit, but there were a few JIT changes.

@DrewScoggins
Member

Linux related regressions: dotnet/perf-autofiling-issues#28564

@jeffschwMSFT added the area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) label Feb 7, 2024
@ghost

ghost commented Feb 7, 2024

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.


@vcsjones removed the needs-area-label label Feb 13, 2024
@BruceForstall
Member

Maybe #97722?

@BruceForstall added this to the 9.0.0 milestone Feb 13, 2024
@ghost removed the untriaged label Feb 13, 2024
@AndyAyersMS added the Priority:2 (Work that is important, but not critical for the release) label May 8, 2024
@AndyAyersMS
Member

EightQueens seems to be an Intel-only regression, and even then only in some cases, and there have been two other regressions since.

image

Almost all the time is spent in TryMe.

Comparing codegen from baseline to latest shows that RBO did one jump thread (from #97722), the layout is different, and there is an IV widening.

There are a lot of spilled CSEs here in both baseline and latest codegen, but more spill occurrences in latest. Possibly the one extra jump thread by RBO has created more critical edges and so made life more difficult for LSRA.
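
For context, a critical edge is a flow graph edge whose source block has more than one successor and whose target block has more than one predecessor. Moves that must happen only along such an edge cannot simply be placed at the end of the source or the start of the target, which is one reason they can make register allocation harder. The sketch below counts critical edges in a small, entirely hypothetical flow graph before and after a jump-thread-like rewrite; it is not the TryMe flow graph and not the JIT's implementation.

```python
def critical_edges(successors):
    """Return (src, dst) edges where src has several successors
    and dst has several predecessors."""
    pred_count = {}
    for src, dsts in successors.items():
        for dst in dsts:
            pred_count[dst] = pred_count.get(dst, 0) + 1
    return [
        (src, dst)
        for src, dsts in successors.items()
        if len(dsts) > 1
        for dst in dsts
        if pred_count[dst] > 1
    ]


# Hypothetical graph: A and B both flow into C, which branches to D or E.
before = {
    "A": ["C"], "B": ["C"],
    "C": ["D", "E"],
    "D": ["F"], "E": ["F"], "F": [],
}

# Assume the branch in C is known to go to D when coming from A,
# so A is threaded directly to D and D gains a second predecessor.
after = dict(before, A=["D"])

if __name__ == "__main__":
    print("before:", critical_edges(before))
    print("after: ", critical_edges(after))
```

In this toy graph the rewrite removes a branch on the path from A but turns C -> D into a critical edge, which is the kind of side effect speculated about above.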

Final flow graphs. You can clearly see the impact of RPO layout at least...

MAIN BASELINE

@AndyAyersMS added the tenet-performance-benchmarks (Issue from performance benchmark) label Jul 27, 2024
@AndyAyersMS
Member

Span.Sorting.QuickSortArray(Size: 512)

Regressions here were fixed by RPO layout:

image
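
As background on the term, RPO stands for reverse postorder: a depth-first traversal of the flow graph in which blocks are recorded after their successors have been visited, and the recorded list is then reversed. Ignoring loop back edges, every block then appears before its successors. The sketch below shows the general traversal on a hypothetical diamond-shaped graph; it is an illustration of the ordering only, not the JIT's layout code.

```python
def reverse_postorder(successors, entry):
    """Return the blocks reachable from entry in reverse postorder."""
    visited = set()
    postorder = []

    def visit(block):
        visited.add(block)
        for succ in successors.get(block, []):
            if succ not in visited:
                visit(succ)
        # A block is recorded only after all its successors were visited.
        postorder.append(block)

    visit(entry)
    return postorder[::-1]


# Hypothetical diamond: entry branches to a or b, both rejoin at join.
cfg = {"entry": ["a", "b"], "a": ["join"], "b": ["join"], "join": []}

if __name__ == "__main__":
    print(reverse_postorder(cfg, "entry"))
```

The exact order of `a` and `b` depends on the order in which successors are listed, but `entry` always comes first and `join` last.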

@AndyAyersMS
Member

System.Globalization.Tests.StringEquality.Compare_Same(Count: 1024, Options: (en-US, OrdinalIgnoreCase))

Ditto for this benchmark

image

@AndyAyersMS
Member

System.Globalization.Tests.StringEquality.Compare_Same_Upper(Count: 1024, Options: (en-US, OrdinalIgnoreCase))

image

Same as the two above: it recovers with later changes.

@AndyAyersMS
Member

Benchmark.GetChildKeysTests.AddChainedConfigurationEmpty

Ditto, like the above.

image

@AndyAyersMS
Member

So the only persistent regression is in EightQueens, and that one seems to be caused by an increase in resolution moves inserted by the register allocator.

Going to move this to .NET 10 as there's no simple fix available now.

@AndyAyersMS modified the milestones: 9.0.0, 10.0.0 Aug 6, 2024
@AndyAyersMS removed the Priority:2 label Aug 6, 2024
5 participants