Parallel.For blocks when no ThreadPool threads are available on dotnet core but works on framework #2078
Thank you for the repro.

Parallel.For works to spread itself over however many threads are available / allowed in the pool by creating a single task. That task immediately creates and queues a copy of itself and then starts running work for the loop. If another thread becomes available, that thread picks up the copy, which in turn creates another copy and starts joining in on the processing of the loop. Etc. The very first task Parallel.For creates doesn't actually get scheduled to the pool (assuming the TaskScheduler hasn't been changed to something that works differently); rather, it gets run on the current thread (after all, why block the current thread waiting for another thread to do the work when the current thread could just do it?).

In the case where all of the thread pool threads are blocked, that main thread will end up doing all of the work. However, it will also have already queued a copy of itself, and it needs to ensure that such a copy won't actually run, so that everything associated with the Parallel.For invocation will have quiesced by the time the For call returns to its caller. In .NET Framework, Parallel lives in the same assembly as Task and has access to its internals; it calls an internal method on the replica to cancel it. In .NET Core, Parallel is a standalone implementation, and that call (or an equivalent) is missing. That means the Parallel.For in your repro does all the work, but then blocks waiting for the single task it queued to complete; that task is just sitting in the thread pool's queue, waiting to be picked up by a thread from the pool, but you've blocked all threads in the pool until the Parallel.For completes and prohibited the thread pool from injecting more, thereby creating a deadlock.

The obvious workaround for you is: don't do that ;-) Parallel.For has been implemented like this since the earliest days of .NET Core (well before, actually), so the fact that we're only learning about this issue now highlights its rarity. However, it's still worth looking at fixing the implementation of Parallel.For to be able to cancel the replica tasks it creates. I'm not sure how easy that will be, though; the "easy" fix would likely lead to an exception being thrown and caught on every Parallel.For invocation, which isn't ideal. We might need https://github.com/dotnet/corefx/issues/8142 to do this correctly.
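To make that replication scheme concrete, here is a minimal sketch of the idea; it is not the actual Parallel.For source, and the names RunReplicating and ProcessIterations are invented for illustration:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class ReplicatingLoopSketch
{
    public static void RunReplicating(int fromInclusive, int toExclusive, Action<int> body)
    {
        int next = fromInclusive - 1;

        void ProcessIterations()
        {
            int i;
            while ((i = Interlocked.Increment(ref next)) < toExclusive)
            {
                body(i);
            }
        }

        // Queue one copy of the worker so a free pool thread can join in.
        // (The real implementation replicates recursively: each copy queues
        // another copy of itself when it starts running.)
        Task replica = Task.Run(ProcessIterations);

        // The calling thread participates instead of just blocking.
        ProcessIterations();

        // Everything must have quiesced before the call returns. .NET Framework
        // cancels the never-started replica at this point via Task internals;
        // without that, this Wait deadlocks once every pool thread is blocked
        // and the pool can't inject more.
        replica.Wait();
    }
}
```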
P.S. As another possible workaround to unblock you, you could try adding a class like this to your project:

```csharp
// Requires: using System.Collections.Generic; using System.Threading; using System.Threading.Tasks;
// Nest this inside one of your types (it's declared private).
private sealed class AlwaysInlineScheduler : TaskScheduler
{
    // Queued tasks still go to the pool as usual...
    protected override void QueueTask(Task task) => ThreadPool.QueueUserWorkItem(s => TryExecuteTask((Task)s), task);

    // ...but inlining requests are always honored, so Task.Wait can run the replica on the waiting thread.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued) => TryExecuteTask(task);

    protected override IEnumerable<Task> GetScheduledTasks() => null;
}
```

and then changing your call to:

```csharp
Parallel.For(0, 20, new ParallelOptions { TaskScheduler = new AlwaysInlineScheduler() }, ...);
```

I suspect that will work around the issue. When the main task finishes its work, it then waits on each created replica task using Task.Wait(). Task.Wait() tries to employ a similar "use this thread if possible" optimization as previously mentioned: if the task being waited on hasn't started yet, it will try to run that task on the current thread instead of on a pool thread. To do that, though, it asks the TaskScheduler if that's ok (since the scheduler might have certain requirements for the threads its queued tasks run on). The default scheduler will allow such "inlining" if it can find the requested task in the current thread pool thread's local queue, but in this case it's not a thread pool thread that's queueing the initial task but rather the main thread of the app, so when we do the initial Task.Wait, the scheduler will say "nope" and we'll end up blocking on the task rather than inlining it. The class pasted above overrides TryExecuteTaskInline to always try to run the task, and as such, it should avoid the blocking that's leading to the deadlock.

Another workaround would be to run the Parallel.For from a thread pool thread (though in your repro you'd need to change the number of dummy tasks you queue to block pool threads, since you'd be using one of the thread pool threads already). Then the initial replica Parallel.For creates will be in that thread pool thread's local queue, and the Task.Wait on it should be able to inline it successfully without further workarounds.
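A minimal sketch of that second workaround, assuming the 0..20 range and a trivial loop body stand in for the real ones:

```csharp
using System;
using System.Threading.Tasks;

// Start the loop from a pool thread so the first replica lands in that thread's
// local work-stealing queue, where Task.Wait is allowed to inline it.
// (As noted above, the repro would then need one fewer blocking dummy task,
// since this consumes a pool thread itself.)
Task.Run(() =>
{
    Parallel.For(0, 20, i => Console.WriteLine($"iteration {i}"));
}).Wait();
```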
What an insightful response, very much appreciated. It makes reporting bugs a very positive experience.
I know you're half joking, but I tried that. Sadly it'd be a massive amount of work, which has now been put on our backlog. I guess you know what that means :) The good news is that you were right; I'm happy to wait for dotnet/corefx#8142 to be resolved and get low-cost cancellation.
Tagging subscribers to this area: @tarekgh
Tagging subscribers to this area: @dotnet/area-system-threading-tasks
Now that we have SuppressThrowing, I gave this a quick go, but it's unfortunately not as easy as I'd hoped. We can easily create a CancellationTokenSource, create the tasks with its CancellationToken, Cancel the CTS once the initial replica task completes, and ignore any cancellation while waiting via SuppressThrowing. But in order for the Task being waited on to actually transition to the canceled state after it's been started, the scheduler's TryDequeue needs to return true, and that can only happen for the default scheduler if the task being dequeued is queued to the work-stealing queue for the thread calling Cancel. And that's frequently not going to be the case here.
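A rough sketch of the shape of that attempt, for illustration only; this is not the real Parallel.For code, `body` stands in for the loop work, and ConfigureAwaitOptions.SuppressThrowing requires .NET 8+:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class CancellationSketch
{
    public static async Task RunAsync(Action body)
    {
        using var cts = new CancellationTokenSource();

        // Queue the replica with the token so cancellation can keep it from running.
        Task replica = Task.Factory.StartNew(body, cts.Token,
            TaskCreationOptions.None, TaskScheduler.Default);

        body(); // the calling thread does the loop's work itself

        // Ask the not-yet-started replica to transition to Canceled rather than run.
        cts.Cancel();

        // Wait, but don't throw if the replica ends up Canceled.
        await replica.ConfigureAwait(ConfigureAwaitOptions.SuppressThrowing);

        // The catch noted above: once queued, the replica only transitions to
        // Canceled if the scheduler's TryDequeue returns true, which the default
        // scheduler allows only when the task sits in the canceling thread's own
        // work-stealing queue; otherwise this wait still blocks until a pool
        // thread eventually picks the replica up.
    }
}
```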
Hi,
I've been migrating a .NET Framework application to .NET Core, and during the process I've found a possible issue.

On .NET Core, if all ThreadPool threads are blocked (waiting for a computation to finish, for example), then Parallel.For doesn't return; on .NET Framework it does. In the app I'm working on, the maximum ThreadPool thread count is limited to the processor count:

```csharp
ThreadPool.SetMaxThreads(Environment.ProcessorCount, Environment.ProcessorCount);
```

I've been able to reproduce the issue with ~60 lines of code, which you can find here.

My question is: is this the expected behaviour? If so, how can I get it to work like it does on .NET Framework?
Thanks,
Peter
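The ~60-line repro linked as "here" above isn't included in this thread; the following is only a sketch of the setup it describes, with illustrative counts, sleep, and loop body:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class ReproSketch
{
    static void Main()
    {
        // Cap the pool at the processor count, as in the application described above.
        ThreadPool.SetMaxThreads(Environment.ProcessorCount, Environment.ProcessorCount);

        // Block every pool thread on work that won't finish until later.
        using var gate = new ManualResetEventSlim(false);
        for (int i = 0; i < Environment.ProcessorCount; i++)
        {
            Task.Run(() => gate.Wait());
        }

        Thread.Sleep(1000); // give the blocking tasks time to occupy all pool threads

        // Returns on .NET Framework; deadlocks on the .NET Core versions discussed
        // in this issue, because the queued replica task can never be picked up.
        Parallel.For(0, 20, i => Console.WriteLine($"iteration {i}"));

        gate.Set();
        Console.WriteLine("Parallel.For returned");
    }
}
```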