
PLINQ query crashes a .NET Core 6 application due to unhandled InvalidCastException thrown from System.Threading.Tasks.Task.RunContinuations. #83520

@mhinkka

Description

I have a Web API application hosted in IIS that sorts rows of an array (actually a two-dimensional array/table of objects: object[][]) using expressions similar to this:

rows.AsParallel().AsOrdered().WithCancellation(cts.Token).OrderBy(...).ThenBy(...).ToList()

The application supports canceling a running sort at any point. Cancellation does two things: it calls Thread.Interrupt on the thread that is running the sort, and it calls CancellationTokenSource.Cancel on the associated CancellationTokenSource (cts). I use both mechanisms because I can't pass the cancellation token into every place these cancellable operations might call (including, e.g., third-party components).

We have been getting intermittent unhandled System.InvalidCastException exceptions from System.Threading.Tasks.Task.RunContinuations that crash the whole application. The affected situations have the following in common:

  1. PLINQ sorting operation is being run.
  2. The sorting operation has been interrupted with a ThreadInterruptedException.
  3. An unhandled exception was thrown from System.Threading.Tasks.Task.RunContinuations.
    This exception has the following details:
System.InvalidCastException: Unable to cast object of type 'System.Object' to type 'System.Collections.Generic.List`1[System.Object]'.
   at System.Threading.Tasks.Task.RunContinuations(Object continuationObject)
   at System.Threading.Tasks.Task.ProcessChildCompletion(Task childTask)
   at System.Threading.Tasks.Task.FinishStageTwo()
   at System.Threading.Tasks.Task.FinishSlow(Boolean userDelegateExecute)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart() 

In some cases, when the worker task happens to run on the thread that initiated the PLINQ operation, the stack trace looks different and the exception can be caught. This seems quite rare, as the parallel tasks are usually processed by threads other than the PLINQ caller. In the usual case there is no way to catch the exception thrown from System.Threading.Tasks.Task.RunContinuations, and the application terminates.
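Because the exception surfaces on a thread-pool worker, a try/catch around the PLINQ call cannot observe it. A minimal sketch, assuming the goal is only to record the failure before the process exits: AppDomain.CurrentDomain.UnhandledException allows logging the exception, but in .NET Core and later it does not prevent termination. The class name CrashLogger and the log format are hypothetical.

```csharp
using System;

public static class CrashLogger
{
    // Hypothetical helper: formats the unhandled exception object for a log line.
    public static string Describe(object exceptionObject) =>
        exceptionObject is Exception ex
            ? $"{ex.GetType().FullName}: {ex.Message}"
            : $"Non-exception object: {exceptionObject}";

    // Call once at startup. The handler can log the failure,
    // but the process still terminates after it returns.
    public static void Install()
    {
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
            Console.Error.WriteLine(
                $"Unhandled (terminating={e.IsTerminating}): {Describe(e.ExceptionObject)}");
    }
}
```

This only helps with diagnosis; it is not a way to keep the application alive.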

To me, this seems like a problem in the .NET 6 runtime as I don't see how I can directly affect the continuationObjects being passed to System.Threading.Tasks.Task.RunContinuations.

Can you confirm whether this is an issue in the .NET 6 runtime, and/or whether there is some way for me to adjust my application so that this no longer occurs?
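One adjustment that might be worth trying, offered as an untested sketch rather than a confirmed fix: cancel the PLINQ query only through the CancellationToken and catch OperationCanceledException on the calling thread, keeping Thread.Interrupt for the code paths that cannot accept a token. Whether this avoids the crash depends on the root cause, which is not established here. The names SortHelper and TrySort are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class SortHelper
{
    // Sorts rows by their first column. Returns false (with an empty list)
    // if the token was cancelled before or during the query.
    public static bool TrySort(object[][] rows, CancellationToken token, out List<object[]> sorted)
    {
        try
        {
            sorted = rows
                .AsParallel()
                .WithCancellation(token)
                .OrderBy(row => row[0])
                .ToList();
            return true;
        }
        catch (OperationCanceledException)
        {
            // Cooperative cancellation: no thread is interrupted here.
            sorted = new List<object[]>();
            return false;
        }
    }
}
```

The caller would invoke cts.Cancel() alone for a running sort and treat a false return value as "cancelled".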

Note also that with the reproduction function shown below, the stack trace is slightly different, but the outcome is just as fatal as in the production case:

Application: testhost.exe
CoreCLR Version: 6.0.1523.11507
.NET Version: 6.0.15
Description: The process was terminated due to an unhandled exception.
Exception Info: System.InvalidCastException: Unable to cast object of type 'System.Object' to type 'System.Collections.Generic.List`1[System.Object]'.
   at System.Threading.Tasks.Task.RunContinuations(Object continuationObject)
   at System.Threading.Tasks.Task.FinishContinuations()
   at System.Threading.Tasks.Task.FinishStageThree()
   at System.Threading.Tasks.Task.FinishStageTwo()
   at System.Threading.Tasks.Task.ProcessChildCompletion(Task childTask)
   at System.Threading.Tasks.Task.NotifyParentIfPotentiallyAttachedTask()
   at System.Threading.Tasks.Task.FinishStageThree()
   at System.Threading.Tasks.Task.FinishStageTwo()
   at System.Threading.Tasks.Task.FinishSlow(Boolean userDelegateExecute)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
   at System.Threading.Tasks.Task.ExecuteEntryUnsafe(Thread threadPoolThread)
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
   at System.Threading.Thread.StartCallback()

Reproduction Steps

I'm not sure whether this is minimal, or whether it reproduces in all environments, but at least in my test environment using .NET 6.0.15 (CoreCLR 6.0.1523.11507) I can reproduce the issue reliably by running the following C# function:

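		// Repeatedly starts a PLINQ sort on a dedicated thread, then interrupts
		// that thread and cancels the token after a random delay.
		public void CrashTest()
		{
			var rnd = new Random();
			// One million single-column rows with random string keys.
			var rows = Enumerable.Range(1, 1000000)
				.Select(x => new object[] { $"T{rnd.Next()}" })
				.ToArray();
			// A single CancellationTokenSource is shared by all iterations.
			var cts = new CancellationTokenSource();

			for (var i = 0; i < 10000; ++i)
			{
				var thread1 = new Thread(() =>
				{
					try
					{
						// Keep sorting until the thread is interrupted.
						while (true)
						{
							var _ = rows
								.AsParallel()
								.AsOrdered()
								.WithCancellation(cts.Token)
								.OrderBy(row => row[0])
								.ToList();
						}
					}
					catch (ThreadInterruptedException)
					{
						// Expected when Thread.Interrupt is called.
					}
				});

				thread1.Start();
				// Let the sort run for up to 100 ms before canceling.
				Thread.Sleep(rnd.Next(100));
				thread1.Interrupt();
				cts.Cancel();
				thread1.Join();
			}
		}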

Expected behavior

Application does not crash.
PLINQ operation is cancelled without threads being killed.

Actual behavior

An unhandled InvalidCastException is thrown from System.Threading.Tasks.Task.RunContinuations, crashing the whole application.

Exception written to the Windows event log:
Log Name:      Application
Source:        .NET Runtime
Date:          16/03/2023 17.27.19
Event ID:      1026
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Description:
Application: testhost.exe
CoreCLR Version: 6.0.1523.11507
.NET Version: 6.0.15
Description: The process was terminated due to an unhandled exception.
Exception Info: System.InvalidCastException: Unable to cast object of type 'System.Object' to type 'System.Collections.Generic.List`1[System.Object]'.
   at System.Threading.Tasks.Task.RunContinuations(Object continuationObject)
   at System.Threading.Tasks.Task.FinishContinuations()
   at System.Threading.Tasks.Task.FinishStageThree()
   at System.Threading.Tasks.Task.FinishStageTwo()
   at System.Threading.Tasks.Task.ProcessChildCompletion(Task childTask)
   at System.Threading.Tasks.Task.NotifyParentIfPotentiallyAttachedTask()
   at System.Threading.Tasks.Task.FinishStageThree()
   at System.Threading.Tasks.Task.FinishStageTwo()
   at System.Threading.Tasks.Task.FinishSlow(Boolean userDelegateExecute)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
   at System.Threading.Tasks.Task.ExecuteEntryUnsafe(Thread threadPoolThread)
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
   at System.Threading.Thread.StartCallback()

Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name=".NET Runtime" />
    <EventID Qualifiers="0">1026</EventID>
    <Version>0</Version>
    <Level>2</Level>
    <Task>0</Task>
    <Opcode>0</Opcode>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2023-03-16T15:27:19.4485051Z" />
    <EventRecordID>160159</EventRecordID>
    <Correlation />
    <Execution ProcessID="12224" ThreadID="0" />
    <Channel>Application</Channel>
    <Computer>NB474</Computer>
    <Security />
  </System>
  <EventData>
    <Data>Application: testhost.exe
CoreCLR Version: 6.0.1523.11507
.NET Version: 6.0.15
Description: The process was terminated due to an unhandled exception.
Exception Info: System.InvalidCastException: Unable to cast object of type 'System.Object' to type 'System.Collections.Generic.List`1[System.Object]'.
   at System.Threading.Tasks.Task.RunContinuations(Object continuationObject)
   at System.Threading.Tasks.Task.FinishContinuations()
   at System.Threading.Tasks.Task.FinishStageThree()
   at System.Threading.Tasks.Task.FinishStageTwo()
   at System.Threading.Tasks.Task.ProcessChildCompletion(Task childTask)
   at System.Threading.Tasks.Task.NotifyParentIfPotentiallyAttachedTask()
   at System.Threading.Tasks.Task.FinishStageThree()
   at System.Threading.Tasks.Task.FinishStageTwo()
   at System.Threading.Tasks.Task.FinishSlow(Boolean userDelegateExecute)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task&amp; currentTaskSlot, Thread threadPoolThread)
   at System.Threading.Tasks.Task.ExecuteEntryUnsafe(Thread threadPoolThread)
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
   at System.Threading.Thread.StartCallback()
</Data>
  </EventData>
</Event>

Regression?

We didn't have this issue with the same application in .NET Framework 4.7.1.

Known Workarounds

No response

Configuration

.NET version:       replicates with 6.0.5, 6.0.14, and 6.0.15
Operating system:   "Microsoft Windows NT 10.0.17763.0" and "Microsoft Windows NT 10.0.22621.0"
Process:            64 bit
Operating System:   64 bit
CPU:                "Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz" and "12th Gen Intel(R) Core(TM) i7-12800H Cores: 20"
Cores:              32 and 20 
Memory:             256 GB and 64 GB

Other information

To me, it seems most likely that this is thrown from this line in System.Threading.Tasks.Task.RunContinuations:

List<object?> continuations = (List<object?>)continuationObject;
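
The message in the stack traces is what such a cast produces when the value is a plain System.Object rather than a list. A minimal sketch of that failure in isolation follows; it does not reproduce the runtime's internal state, and the names CastDemo and CastFails are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class CastDemo
{
    // Performs the same kind of cast on an arbitrary object and reports
    // whether it throws InvalidCastException.
    public static bool CastFails(object continuationObject)
    {
        try
        {
            _ = (List<object>)continuationObject;
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```

Calling CastDemo.CastFails(new object()) returns true, while passing a List<object> returns false, which suggests that something other than a list reached the cast in RunContinuations.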
