Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mono incorrectly throws a TypeLoadException for a recursive type definition #85821

Closed
joshpeterson opened this issue May 5, 2023 · 11 comments · Fixed by #85828
Closed

Mono incorrectly throws a TypeLoadException for a recursive type definition #85821

joshpeterson opened this issue May 5, 2023 · 11 comments · Fixed by #85828
Assignees
Milestone

Comments

@joshpeterson
Copy link
Contributor

Description

Given the following code:

var test = new Node<int>();
Console.WriteLine(test);

public class Node<T>
{
  public Node<T>[][] Children;
  public T Data;
}

Mono will throw a TypeLoadException exception, which is not expected:

Unhandled Exception:
System.TypeLoadException: Recursive type definition detected .Node`1
[ERROR] FATAL UNHANDLED EXCEPTION: System.TypeLoadException: Recursive type definition detected .Node`1

CoreCLR does not throw a similar exception.

Reproduction Steps

  1. Create a new console app with the code above.
  2. In the .csproj file, add the following content to run the console app with Mono:
<SelfContained>true</SelfContained>
<UseMonoRuntime>true</UseMonoRuntime>
<RuntimeIdentifier>osx-x64</RuntimeIdentifier>

(Note that I'm using macOS x64, but the behavior is not specific to platform or architecture.)

  1. Run the built console app, the exception will occur, which is not expected.

Expected behavior

The output is:

Node`1[System.Int32]

and no exception occurs.

Actual behavior

Mono will throw a TypeLoadException exception, which is not expected:

Unhandled Exception:
System.TypeLoadException: Recursive type definition detected .Node`1
[ERROR] FATAL UNHANDLED EXCEPTION: System.TypeLoadException: Recursive type definition detected .Node`1

Regression?

This has not worked in Mono. I also see the same behavior with Mono 6.12.0 from the mono/mono repository.

Known Workarounds

I don't have any known worarounds.

Configuration

I'm using .NET version 7.0.202 on macOS x64.

Other information

This bug was originally reported to Unity here: https://issuetracker.unity3d.com/issues/typeloadexception-is-thrown-when-using-recursive-types-in-monobehaviour-inherited-classes

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label May 5, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label May 5, 2023
@ghost
Copy link

ghost commented May 5, 2023

Tagging subscribers to this area:
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

Given the following code:

var test = new Node<int>();
Console.WriteLine(test);

public class Node<T>
{
  public Node<T>[][] Children;
  public T Data;
}

Mono will throw a TypeLoadException exception, which is not expected:

Unhandled Exception:
System.TypeLoadException: Recursive type definition detected .Node`1
[ERROR] FATAL UNHANDLED EXCEPTION: System.TypeLoadException: Recursive type definition detected .Node`1

CoreCLR does not throw a similar exception.

Reproduction Steps

  1. Create a new console app with the code above.
  2. In the .csproj file, add the following content to run the console app with Mono:
<SelfContained>true</SelfContained>
<UseMonoRuntime>true</UseMonoRuntime>
<RuntimeIdentifier>osx-x64</RuntimeIdentifier>

(Note that I'm using macOS x64, but the behavior is not specific to platform or architecture.)

  1. Run the built console app, the exception will occur, which is not expected.

Expected behavior

The output is:

Node`1[System.Int32]

and no exception occurs.

Actual behavior

Mono will throw a TypeLoadException exception, which is not expected:

Unhandled Exception:
System.TypeLoadException: Recursive type definition detected .Node`1
[ERROR] FATAL UNHANDLED EXCEPTION: System.TypeLoadException: Recursive type definition detected .Node`1

Regression?

This has not worked in Mono. I also see the same behavior with Mono 6.12.0 from the mono/mono repository.

Known Workarounds

I don't have any known worarounds.

Configuration

I'm using .NET version 7.0.202 on macOS x64.

Other information

This bug was originally reported to Unity here: https://issuetracker.unity3d.com/issues/typeloadexception-is-thrown-when-using-recursive-types-in-monobehaviour-inherited-classes

Author: joshpeterson
Assignees: -
Labels:

untriaged, area-AssemblyLoader-mono, needs-area-label

Milestone: -

@akoeplinger akoeplinger removed the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label May 5, 2023
@akoeplinger
Copy link
Member

/cc @lambdageek @vargaz

@akoeplinger akoeplinger added this to the 8.0.0 milestone May 5, 2023
@akoeplinger akoeplinger removed the untriaged New issue has not been triaged by the area owner label May 5, 2023
@lambdageek
Copy link
Member

We've fixed a number of cases similar to this one. In principle this ought to work: it's not a struct, the generic parameter doesn't get used in a parent class or an interface, etc.

I'll take an initial look

@lambdageek lambdageek self-assigned this May 5, 2023
@lambdageek
Copy link
Member

Repros with .NET8 previews.

It's not because of the toplevel statements, AFAICT. Same problem if I make a normal class with a Main.

It does seem to be related to Node<T>[][]. If I change it to Node<T>[], the exception goes away.

I think probably the constructor for arrays is too aggressive in initializing the element type. if we know it's a reference type, we shouldn't need to expand it too much.

But on the other hand, initializing Node<T>[] works - so we must understand at least some types of recursion in this case.

Maybe it's something about array interface methods. I'll have to try a runtime with tracing enabled.

@lambdageek
Copy link
Member

lambdageek commented May 5, 2023

we're probably hitting something added in 2005. If the code is overly-pessimistic, it's an easy fix. If it turns out the initialization is necessary, we are probably looking at a very hard bug (probably will need loader levels ala mono/mono#7856)

if (mono_class_is_ginst (eclass))
mono_class_init_internal (eclass);
if (!eclass->size_inited)
mono_class_setup_fields (eclass);

I think we don't need the generic instance to be fully initialized at this point. so the first if is probably not needed. I think at the time this was added we didn't have fine-grained initialization for sizes, probably, so the generic instance probably had to be fully inited so we could figure out the element size.

The second if also seems overly conservative. if the type is not a valuetype, we the array will have pointer-sized elements - there's no need to init the element sizes.

It's worth comparing how pointer types work. In a pointer MonoType, the element type is a MonoType, not a MonoClass. So for pointers we never call mono_class_from_mono_type_internal during parsing in do_mono_metadata_parse_type so when we're resolving the Children field when setting up the field layout for Node<T> if the type of the field was Node<T>** we don't need to recursively initialize Node<T>. But in the array case we end calling mono_class_from_mono_type_internal on Node<T>[] (the element type of the outer array), then on Node<T> and then we endup on line 1186, above and we end up recursively initializing Node<T>. (One problem with this story is that for some reason Node<T>[] Children works while array-of-array doesn't. So somewhere in this story we avoid the recursion into mono_class_init_internal - but I'm not sure where, yet)

@joshpeterson
Copy link
Contributor Author

I wanted to follow up on this. @lambdageek: Did the draft PR provide a viable solution?

@lambdageek
Copy link
Member

I wanted to follow up on this. @lambdageek: Did the draft PR provide a viable solution?

I haven't completely convinced myself that removing those 2 lines is correct. But it didn't crash on any existing tests.

My plan is to try to find some time this week to step through the whole initialization in the debugger for Node<T>[][] Children; and public Node<T>[] Children; and to try and understand why the former gets a TLE and the latter doesn't. That will hopefully convince me that the draft PR is alright.

@joshpeterson
Copy link
Contributor Author

joshpeterson commented May 16, 2023

@lambdageek: Did you have a chance to step through the initialization code in the debugger?

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label May 19, 2023
@lambdageek
Copy link
Member

lambdageek commented May 19, 2023

@joshpeterson hey, sorry it took a while. yea I'm now pretty confident I understand what the difference was between Node[] Children and Node[][] Children. I updated the PR with some notes.

@lambdageek
Copy link
Member

lambdageek commented May 19, 2023

oops. Edited the previous comment. I meant to say "I'm now pretty confident". The PR is good.

@joshpeterson
Copy link
Contributor Author

@joshpeterson hey, sorry it took a while. yea I'm now pretty confident I understand what the difference was between Node[] Children and Node[][] Children. I updated the PR with some notes.

Thank you! I'll watch that PR and sync it back to our fork of Mono when it lands.

lambdageek added a commit that referenced this issue May 22, 2023
…he element type (#85828)

Consider code like this:

```csharp
public class Node<T>
{
    public Node<T>[][] Children;
}


Console.WriteLine (typeof(Node<int>));
```

In this case, when we're JITing ``ldtoken class Node`1<int32>`` we first have to parse the type ``Node`1<int32>`` and initialize it.
When we're initializing it, we need to resolve the types of all the fields in the class in order to to figure out if its instance size.

To resolve the field types we have to parse the types of all the fields, and we eventually end up parsing the type ``Node`1<!0>[][]`` which ends up here:

https://github.com/dotnet/runtime/blob/558345d16cf76525d0c7fdbafbfd3a2457142b39/src/mono/mono/metadata/metadata.c#L4023-L4027

When we get to line 4027 the second time (recursively), `etype` is ``Node`1<!0>`` as a `MonoType*`.  And we want to set `type->data.klass` to its corresponding `MonoClass*`.   That ends up calling `mono_class_from_mono_type_internal`, which does this:

https://github.com/dotnet/runtime/blob/558345d16cf76525d0c7fdbafbfd3a2457142b39/src/mono/mono/metadata/class.c#L2214-L2215

When we call `mono_class_create_array` we end up here:

https://github.com/dotnet/runtime/blob/558345d16cf76525d0c7fdbafbfd3a2457142b39/src/mono/mono/metadata/class-init.c#L1186-L1187

with `eclass` equal to ``Node`1<!0>` which we try to initialize and we get a TLE because we're going to try to initialize the same generic type definition ``Node`1`` twice.

Compare with this other class:

```csharp
public unsafe class Node2<T>
{
    public Node2<T>[] Children;
}
```

In this case we only end up calling `mono_class_from_mono_type_internal` on ``Node2`1<int32>`` (not an array).  And that branch does not do any initialization - it just returns `type->data.klass` (for ``Node2`1<!0>`` - which is seen as a gtd at this point) or `mono_class_create_generic_inst (type->data.generic_class)` (for ``Node2`1<int32>`` which is seen later when `ldtoken` inflates the gtd)) - without any extra initialization.


---

It seems the reason we get into trouble is because `mono_class_create_array` does more work than other `MonoClass` creation functions - it tries to initialize the `MonoClass` a little bit (for example by setting `MonoClass:has_references`) which needs at least a little bit of the element class to be initialized.

But note that the code has this:

https://github.com/dotnet/runtime/blob/558345d16cf76525d0c7fdbafbfd3a2457142b39/src/mono/mono/metadata/class-init.c#L1186-L1189

I feel fairly certain we don't need the full class initialization if we're going to immediately initialize the field size information.

Fixes #85821
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label May 22, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Jun 21, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
3 participants