@richlander
Member

@richlander richlander commented Feb 26, 2025

This is the space for libraries and runtime improvements. We'll aim to merge this PR by March 7th to align with the planned ship date of March 11th.

Everyone is welcome to add feature descriptions and participate in this process.

Parent PR/branch: #9769.

@hez2010

hez2010 commented Mar 4, 2025

Inlining of Late Devirtualized Virtual Methods

In .NET, the JIT compiler can optimize a virtual method call by replacing it with a non-virtual call when it can determine the exact type of the `this` object. However, sometimes this type information becomes available only after another call has been inlined. Consider the following example:

IC obj = GetObject();
obj.M();

IC GetObject() => new C();

interface IC
{
    void M();
}
class C : IC
{
    public void M() => Console.WriteLine(42);
}

If the call to GetObject is not inlined, the JIT cannot determine that obj is actually of type C rather than IC, so the subsequent call to obj.M() is not devirtualized. Late devirtualization is devirtualization that becomes possible only after an earlier inlining step has exposed the exact type. It can, in turn, create new inlining opportunities. Previously, the JIT abandoned such opportunities; with dotnet/runtime#110827, it now marks and inlines these late devirtualized calls. Inlining a late devirtualized call can expose additional devirtualization opportunities, which can then be inlined as well.
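As a rough way to see the difference, the sketch below contrasts a factory the JIT is free to inline with one that is forced out of line. The names `IShape`, `Circle`, `MakeInlinable`, and `MakeOpaque` are made up for illustration and do not come from the runtime PR; whether a given call is actually devirtualized still depends on the JIT's own heuristics.

```csharp
using System;
using System.Runtime.CompilerServices;

// Case 1: the factory may be inlined. If it is, the JIT can see that the
// result is exactly Circle, which makes the Area() call a candidate for
// late devirtualization and then for inlining.
IShape a = MakeInlinable();
Console.WriteLine(a.Area());

// Case 2: the factory is kept out of line, so the caller only knows the
// static type IShape and the Area() call stays an interface call.
IShape b = MakeOpaque();
Console.WriteLine(b.Area());

static IShape MakeInlinable() => new Circle(2);

[MethodImpl(MethodImplOptions.NoInlining)]
static IShape MakeOpaque() => new Circle(2);

// Hypothetical types, used only for this illustration.
interface IShape
{
    int Area();
}

sealed class Circle(int radius) : IShape
{
    // Integer approximation; the arithmetic is not the point here.
    public int Area() => 3 * radius * radius;
}
```

Both calls return the same value; the difference would only show up in the generated code, for example when inspecting the JIT disassembly.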

Devirtualization Based on Inlining Observations

During inlining, a temporary variable may be created to hold the return value from the callee. With dotnet/runtime#111948, the JIT analyzes and updates the type of this temporary variable accordingly. If all return sites in a callee yield the same exact type, this precise type information is leveraged to devirtualize subsequent calls.
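A minimal sketch of the shape of code this targets, assuming a callee with more than one return site where every site produces the same exact type. `ICounter`, `Fast`, and `Create` are hypothetical names chosen for this example.

```csharp
using System;

// Create has two return sites, and both yield exactly Fast. Once Create is
// inlined, the temporary holding its result can be typed as Fast, so the
// Next() call below becomes a candidate for devirtualization.
ICounter counter = Create(args.Length > 0);
Console.WriteLine(counter.Next());

static ICounter Create(bool startHigh)
{
    if (startHigh)
    {
        return new Fast(100);
    }

    return new Fast(0);
}

// Hypothetical types, used only for this illustration.
interface ICounter
{
    int Next();
}

sealed class Fast(int seed) : ICounter
{
    private int _n = seed;

    public int Next() => ++_n;
}
```

If one of the return sites produced a different implementation of ICounter, the return sites would no longer agree on an exact type, and this particular optimization would presumably not apply.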


With the above two improvements, along with recent efforts on array de-abstraction, the JIT can now devirtualize, inline, stack-allocate, and then perform struct promotion on arbitrary enumerators, so that the enumerator abstraction can be entirely eliminated, even without PGO. Consider the following example:

using System.Collections;
using System.Collections.Generic;

var r = GetRangeEnumerable(0, 10);
foreach (var i in r)
{
    Console.WriteLine(i);
}

static IEnumerable<int> GetRangeEnumerable(int start, int count) => new RangeEnumerable(start, count);

class RangeEnumerable(int start, int count) : IEnumerable<int>
{
    public class RangeEnumerator(int start, int count) : IEnumerator<int>
    {
        private int _value = start - 1;
        public int Current => _value;
        object IEnumerator.Current => Current;
        public void Dispose() { }
        public bool MoveNext()
        {
            _value++;
            return count-- != 0;
        }
        public void Reset() => _value = start - 1;
    }

    public IEnumerator<int> GetEnumerator() => new RangeEnumerator(start, count);
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

With these JIT improvements, the compiler now devirtualizes and inlines all of the virtual calls in this example and also eliminates the enumerator allocation. Escape analysis allows the enumerator to be stack-allocated, and struct promotion then splits it into individual fields, so the loop runs with zero heap allocations:

...
G_M27646_IG02:
       mov      ebx, 10
       mov      r15d, -1
       jmp      SHORT G_M27646_IG04
G_M27646_IG03:
       mov      edi, r15d
       call     [System.Console:WriteLine(int)]
       mov      ebx, r14d
G_M27646_IG04:
       inc      r15d
       lea      edi, [rbx-0x01]
       mov      r14d, edi
       test     ebx, ebx
       jne      SHORT G_M27646_IG03
...

See the full comparison of codegen between .NET 9 and .NET 10 here.

Support for Casting and Negation in NativeAOT Type Preinitializer

NativeAOT includes a type preinitializer that can execute type initializers (static constructors) without side effects at compile time using an IL interpreter. The results are then embedded directly into the binary, allowing the elimination of these initializers entirely. With dotnet/runtime#112073, support has been extended to cover all variants of conv.* and neg opcodes, enabling preinitialization of methods that include casting or negation operations.
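For illustration, the static initializers below are the kind of code that compiles to conv.* and neg IL. The `Constants` class and its members are hypothetical, and whether a particular initializer is actually preinitialized depends on everything else it does, so treat this as a sketch rather than a guarantee.

```csharp
using System;

Console.WriteLine(Constants.Offset);
Console.WriteLine(Constants.Scale);
Console.WriteLine(Constants.Ratio);

// Hypothetical type; the field initializers run in textual order inside the
// type initializer.
static class Constants
{
    private static readonly int s_base = ComputeBase();

    // Widening cast followed by negation (conv.i8, then neg).
    public static readonly long Offset = -(long)s_base;

    // Narrowing cast (conv.u1).
    public static readonly byte Scale = (byte)(s_base / 2);

    // Integer-to-floating-point cast (conv.r8).
    public static readonly double Ratio = (double)s_base;

    private static int ComputeBase() => 40 + 2;
}
```

The values are read from non-constant static fields so that the C# compiler cannot fold the casts away at compile time and the opcodes actually appear in the type initializer.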

@richlander
Member Author

Looks great @hez2010 -- Can you create a PR on the branch?

Can you define "late devirtualization" in the text? "late" can be used in multiple ways, like "late-stage capitalism". We want to make it clear to readers.

As much as possible, can you align with the format we've used in release notes previously, to match?

Also, if that's too much, I'm happy to do this myself.

@amanasifkhalid
Contributor

> As much as possible, can you align with the format we've used in release notes previously, to match?
>
> Also, if that's too much, I'm happy to do this myself.

If it's easier, I can add @hez2010's notes to my batch of JIT notes.

@richlander
Member Author

Sounds good @amanasifkhalid. Let's do that.

@hez2010

hez2010 commented Mar 5, 2025

> Can you define "late devirtualization" in the text? "late" can be used in multiple ways, like "late-stage capitalism". We want to make it clear to readers.
>
> As much as possible, can you align with the format we've used in release notes previously, to match?

I have updated the original comment to reflect these.

> If it's easier, I can add @hez2010's notes to my batch of JIT notes.

Given it is going to be included in your JIT notes, I will not send a separate PR for this.


Notice how `foreach_opaque_array_via_interface` allocates memory on the heap, suggesting the JIT failed to stack-allocate and promote the enumerator to registers. This is because the JIT relies on a technique called escape analysis to enable stack allocation. Escape analysis determines if an object's lifetime can exceed that of its creation context; if the JIT can guarantee an object will not outlive the current method, it can safely allocate it on the stack. In the above example, calling an interface method on the enumerator to control iteration causes it to escape, as the call takes a reference to the enumerator object. On the fast path of the type test, the JIT can try to devirtualize and inline these interface calls to keep the enumerator from escaping. However, escape analysis typically considers the whole method context, so the slow path's reliance on interface calls prevents the JIT from stack-allocating the enumerator at all.

[dotnet/runtime #111473](https://github.com/dotnet/runtime/pull/111473) introduces conditional escape analysis -- a flow-sensitive form of the technique -- to the JIT. Conditional escape analysis can determine if an object will escape only on certain paths through the method, and prompt the JIT to create a fast path where the object never escapes. For array enumeration scenarios, conditional escape analysis reveals the enumerator will escape only when type tests for the collection fail, enabling the JIT to create a copy of the iteration code where the enumerator is stack-allocated and promoted. [benchmark results]
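The scenario can be approximated with the sketch below. It is a hypothetical stand-in and not the source of the `foreach_opaque_array_via_interface` benchmark: `Hide` and `Sum` are invented names, and the allocation figure it prints depends on the runtime version, tiering, and PGO state, so no particular number is implied.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

int[] data = new int[512];
for (int i = 0; i < data.Length; i++)
{
    data[i] = i;
}

IEnumerable<int> opaque = Hide(data);

// Call Sum repeatedly so the runtime has a chance to tier it up.
for (int i = 0; i < 200; i++)
{
    Sum(opaque);
}

// Measure the bytes allocated on this thread by a single enumeration.
long before = GC.GetAllocatedBytesForCurrentThread();
int total = Sum(opaque);
long after = GC.GetAllocatedBytesForCurrentThread();

Console.WriteLine($"sum = {total}, allocated = {after - before} bytes");

// NoInlining keeps callers from seeing that the result is really an int[].
[MethodImpl(MethodImplOptions.NoInlining)]
static IEnumerable<int> Hide(int[] array) => array;

// The foreach calls GetEnumerator, MoveNext, and Current through interfaces;
// these are the calls that can cause the enumerator to escape.
[MethodImpl(MethodImplOptions.NoInlining)]
static int Sum(IEnumerable<int> source)
{
    int sum = 0;
    foreach (int x in source)
    {
        sum += x;
    }

    return sum;
}
```

Running the same program on .NET 9 and on a .NET 10 preview is one informal way to compare the reported allocation; the benchmark data is the more reliable source.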
Contributor

@AndyAyersMS do you have updated benchmark runs for the opaque case? Also I'm guessing the above benchmarks I took from your conditional escape analysis doc aren't up-to-date, now that we have @hez2010's changes in.

Member

These are tracked in the benchmark system, eg https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu%2022.04/ViperUbuntu/ArrayDeAbstraction.foreach_opaque_array_via_interface.html

If you want a table I can run some stuff locally and get you something.

Also the array alloc revert has made it into P2.

Member

I don't believe the more recent changes from @hez2010 impacted the lab benchmark numbers since PGO could already get those cases (neither PR above has any lab annotations).

Copy link

Yeah changes to late devirt were mainly for non-PGO scenarios, PGO benchmark shouldn't be affected (the guard created by GDV was blocking stackalloc before conditional escape analysis).

Member

Here's some locally collected data.

| Method | Toolchain | Mean | Median | Ratio | Allocated | Alloc Ratio |
|---|---|---:|---:|---:|---:|---:|
| foreach_static_readonly_array | net9.0 | 150.78 ns | 150.72 ns | 1.00 | - | NA |
| foreach_static_readonly_array_via_interface_property | net9.0 | 851.75 ns | 849.93 ns | 5.65 | 32 B | NA |
| foreach_opaque_array_via_interface | net9.0 | 874.66 ns | 877.68 ns | 5.80 | 32 B | NA |
| foreach_static_readonly_array | net10.0p2 | 151.75 ns | 151.14 ns | 1.01 | - | NA |
| foreach_static_readonly_array_via_interface_property | net10.0p2 | 280.04 ns | 278.77 ns | 1.86 | - | NA |
| foreach_opaque_array_via_interface | net10.0p2 | 277.89 ns | 277.16 ns | 1.84 | - | NA |

Contributor

Thank you both, I'll update the tables

Copy link

For non-PGO scenarios, you can check the benchmark numbers here: dotnet/runtime#111948 (comment)
It was benchmarked with tiered compilation disabled to demonstrate how the late devirt inlining improvements interact with array de-abstraction.

## Array Enumeration De-Abstraction

This is about the feature
Preview 1 introduced enhancements to the JIT compiler's devirtualization abilities for array interface methods, enabling the JIT to begin removing the abstraction overhead of array iteration via enumerators. Consider the following benchmarks:
Contributor

Should this say Preview 2? Or is this a Preview 1 feature we didn't call out before, or something being improved on further in Preview 2? This is confusing 😕

Contributor

> or something being improved on further in preview 2

It's this case. I'll add something here to transition into Preview 2-specific work.

IC obj = GetObject();
obj.M();

IC GetObject() => new C();
Member Author

@richlander richlander Mar 18, 2025

If I understand correctly, it is usually devirt that enables inlining. In this case it is the opposite: we need inlining to enable devirt, which may then allow further inlining. Fair? If so, this example is a great one to talk through.

Contributor

You're correct, the two tend to create virtuous cycles out of each other.

@richlander richlander merged commit 832e7f2 into dotnet10p2 Mar 18, 2025
2 checks passed
@richlander richlander deleted the dotnet10p2-runtime branch March 18, 2025 00:15
@richlander
Member Author

Thanks everyone. The contributions look great!

rbhanda added a commit that referenced this pull request Mar 18, 2025
* start preview 2 release notes

@AndyAyersMS
Member

Somehow this data table got modified. It should not show any allocation for .NET 10 (that's part of the point of the optimization).

image

@amanasifkhalid
Contributor

@AndyAyersMS thanks for catching that, I opened #9813 to fix this
