Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow preinitializing types with canonical forms #79384

Merged
merged 3 commits into from
Dec 23, 2022

Conversation

MichalStrehovsky
Copy link
Member

@MichalStrehovsky MichalStrehovsky commented Dec 8, 2022

The number of preinitialized types in a hello world goes from 156 to 244 (out of 431 with a cctor in total). The size of hello world goes from 2,908,160 bytes to 2,899,456 (although this is not a size optimization).

We were disallowing compile time execution of class constructors for types that have canonical forms because they could have a matching type loader template that allows creating new types at runtime.

Consider:

_ = ClassWithTemplate<C1>.Cookie;

var val = (int)typeof(ClassWithTemplate<>).MakeGenericType(typeof(C2)).GetField("Cookie").GetValue(null);

class ClassWithTemplate<T>
{
    public static int Cookie = 42;
}

If we allowed ClassWithTemplate<C1> to preinitialize, what does it mean for ClassWithTemplate<__Canon>? And what about the runtime-created ClassWithTemplate<C2> - who will run the cctor? And what if there was a ClassWithTemplate<C3> (could even be statically present) that for whatever reason requires a cctor (assume a more complicated cctor than the above example).

It was easier to just ban this.

But since we have a scanning phase in optimized builds, we can know exactly what types will end up with a template (that ultimately allows creating new canonical types at runtime) and whether we were able to preinitialize all types with the same canonical form.

If we know a template is not possible, we don't have to worry about the runtime case - we just need to ensure all canonically equivalent types either get preinitialized, or they don't.

This pull request does that:

  • we keep information about type templates from the scanning phase into compilation phase
  • we extract the policy decision about whether preinitialization for a type is allowed into a separate class
  • we add an extra policy decision about whether canonical forms of types require preinitialization
  • one implementation of this class follows the old conservative policy
  • another implementation uses whole program view to allow preinitialization of types that don't have a matching template and fully preinitialized during scanning

Cc @dotnet/ilc-contrib

@ghost
Copy link

ghost commented Dec 8, 2022

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Issue Details

The number of preinitialized types in a hello world goes from 156 to 244 (out of 431 with a cctor in total). The size of hello world goes from 2,908,160 bytes to 2,899,456 (although this is not a size optimization).

We were disallowing compile time execution of class constructors for types that have canonical forms because they could have a matching type loader template that allows creating new types at runtime.

Consider:

_ = ClassWithTemplate<C1>.Cookie;

var val = (int)typeof(ClassWithTemplate<>).MakeGenericType(typeof(C2)).GetField("Cookie").GetValue(null);

class ClassWithTemplate<T>
{
    public static int Cookie = 42;
}

If we allowed ClassWithTemplate<C1> to preinitialize, what does it mean for ClassWithTemplate<__Canon>? And what about the runtime-created ClassWithTemplate<C2> - who will run the cctor?

It was easier to just ban this.

But since we have a scanning phase in optimized builds, we can know exactly what types will end up with a template (that ultimately allows creating new canonical types at runtime).

If we know a template is not possible, we don't have to worry about the runtime case - we just need to ensure all canonically equivalent types either get preinitialized, or they don't.

This pull request does that:

  • we keep information about type templates from the scanning phase into compilation phase
  • we extract the policy decision about whether preinitialization for a type is allowed into a separate class
  • one implementation of this class follows the old conservative policy
  • another implementation uses whole program view to allow preinitialization of types that don't have a matching template
Author: MichalStrehovsky
Assignees: MichalStrehovsky
Labels:

area-NativeAOT-coreclr

Milestone: -

@MichalStrehovsky
Copy link
Member Author

Oh, interesting, I was digging into the size win (because it's funny to see a size win when I see that just the frozen segment region grew by almost 20 kB) and we're able to get rid of a bunch of stuff because this connects to @EgorBo's work on the JIT side to leverage preinitialization info knowledge. For example, we totally eliminated all the information about Array.EmptyArray<T> type - it's all gone. Array.Empty<T>() became a zero cost abstraction on NativeAOT.

// Codegen inlines information about whether the type has a cctor. Since codegen happens
// on top of canonical forms, we better agree on whether a type is preinitialized
// across all types that share the same canonical form. We should not be preinitializing
// types where there's a chance that different substitution produce different answer on this.
Copy link
Member

@jkotas jkotas Dec 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assume that the pre-initialization interpreter does not ever take different path based on the substitution. It feels like a very fragile assumption.

For example, the static constructor can have a code like if (typeof(T) == typeof(string)). We will need be always very careful about not introducing interpretation of code like this in the pre-initializer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through all the GetObject calls we currently do in the interpreter and reviewed they don't affect the paths code could be taking. Off the top of my head, I only see isinst/castclass and ldtoken in that bucket. I already added a test for the isinst path (in this PR). I can also add ldtoken. We also have this assert. Are there additional mitigations you could think of that would alleviate the concern?

@jkotas
Copy link
Member

jkotas commented Dec 13, 2022

I have played with this some more and noticed that it is very easy to de-rail the optimization for Array.Empty by something needing the template for it. For example:

using System.Runtime.CompilerServices;

// Uncomment this and Array.Empty<object> returned by f() is not going to return a pre-initialized frozen array anymore
// typeof(Array).GetMethods().ToString();

for (;;) f();

[MethodImpl(MethodImplOptions.NoInlining)]
static object f() => Array.Empty<object>();

Would it be possible to keep the concrete instantiations like the one from my example pre-initialized even when the shared generic type has a template?

@MichalStrehovsky
Copy link
Member Author

Would it be possible to keep the concrete instantiations like the one from my example pre-initialized even when the shared generic type has a template?

Maybe this would work:

We let the preinitialization happen for whatever we've seen statically (irrespective of whether there's a template), but we still say the type requires cctor check if there was a template. We burn in information that the cctor already ran for the statically preinitialized cases in the cctor context. We can't get rid of the cctor method body itself and it will run when needed for new types. Not sure of the impact on code quality with that.

@MichalStrehovsky
Copy link
Member Author

We would still need all canonically equivalent types without a template to agree on whether they're preinitialized and we can forgo the cctor check.

Or we don't look at the existence of templates at all and always say "requires cctor check" and let the cctor context decide if it was preinitialized or not at runtime.

@jkotas
Copy link
Member

jkotas commented Dec 14, 2022

We can't get rid of the cctor method body itself and it will run when needed for new types. Not sure of the impact on code quality with that.

Right, you would get the same code quality for shared code as what you get today.

We would still need all canonically equivalent types without a template to agree on whether they're preinitialized and we can forgo the cctor check.

Can we check that in the compiler? Something like: If there is no template and all canonically equivalent types preinitialized successfully, forgo the cctor check in the shared code.

MichalStrehovsky added a commit to MichalStrehovsky/runtime that referenced this pull request Dec 15, 2022
jkotas pushed a commit that referenced this pull request Dec 15, 2022
@MichalStrehovsky
Copy link
Member Author

Pushed out a new version that should match what we discussed. I'll update the top-post tomorrow.

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some test failures, otherwise looks great!

…olicy

We blanket disable canonical form preinitialization to keep things working by the policy instead.
@MichalStrehovsky
Copy link
Member Author

/azp run runtime-extra-platforms

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@MichalStrehovsky
Copy link
Member Author

The System.Linq test failure is real. Still looking into what the fix should be.

We have a type that was created statically in the compiler. The type has a GC static field and a cctor. We apparently optimized away the non-GC static base. The type loader is constructing a new type that needs to refer to that non-GC static base due to a cctor check from shared code. We cannot find it. And for whatever reason the template type loader is fine creating a null slot for this, so it crashes a lot later than it should:

return index.HasValue ? staticInfoLookup.GetIntPtrFromIndex(index.Value) : IntPtr.Zero;

I'll make a new pull request to sort out the story for when static bases need to be generated for stuff because the current state of affairs doesn't look right.

@MichalStrehovsky
Copy link
Member Author

The System.Linq test failure is real. Still looking into what the fix should be.

The fix was actually simple - we have logic to generate the bases for type loader purposes. The logic now needs to ask "does the type need cctor context" instead of "is the type lazy initialized". Those questions now have different answers.

I've sprinkled some extra asserts as well.

Still, I'm left with some question marks around how things are done right now and I'll want to look at the treatment of statics in the type loader later.

@MichalStrehovsky
Copy link
Member Author

/azp run runtime-extra-platforms

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@MichalStrehovsky MichalStrehovsky merged commit acd76bc into dotnet:main Dec 23, 2022
@MichalStrehovsky MichalStrehovsky deleted the preinitgen branch December 23, 2022 04:14
@ghost ghost locked as resolved and limited conversation to collaborators Jan 22, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants