Skip to content

Conversation

@BrennanConroy
Copy link
Member

Original PR #85523

In System.Threading.RateLimiting.ConcurrencyLimiter and the other three implementations using a synchronously locked deque, there is a deadlock between an outstanding wait-for-lease operation being canceled by user and releasing an existing lease. The internal Release() method has to clean up wait-for-lease operations in some circumstances. This involves disposing the operation's CancellationTokenRegistration, which blocks if the registered callback is already running. This callback can be invoked on another thread if the external cancellation token fires. It locks the rate limiter to return the operation's permits to the rate limiter. If the external cancellation token fires after Release() runs but before it disposes of the CancellationTokenRegistration, there will be a deadlock as Release() holds the lock while waiting for the callback to complete while the callback blocks on the lock to return its permits.

This change eliminates this deadlock by moving the cleanup of any wait-for-lease operations in Release() outside the lock.
It also combines the private struct RequestRegistration with the private class CancelQueueState as they are used only together, increasing the size of the deque element for no discernible gain.

atykhyy added 5 commits April 28, 2023 16:17
In System.Threading.RateLimiting.ConcurrencyLimiter and the other three
implementations using a synchronously locked deque, there is a deadlock
between an outstanding wait-for-lease operation being canceled by user
and releasing an existing lease. The internal Release() method has to
clean up wait-for-lease operations in some circumstances. This involves
disposing the operation's CancellationTokenRegistration, which blocks
if the registered callback is running. This callback can be invoked on
another thread if the external cancellation token fires. It locks the
rate limiter to return the operation's permits to the rate limiter.
If the external cancellation token fires after Release() runs
but before it disposes of the CancellationTokenRegistration, there will
be a deadlock as Release() holds the lock while waiting for the callback
to complete while the callback blocks on the lock to return its permits.

This change eliminates this deadlock by moving the cleanup of any
wait-for-lease operations in Release() outside the lock.
@ghost
Copy link

ghost commented Aug 10, 2023

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

Issue Details

Original PR #85523

In System.Threading.RateLimiting.ConcurrencyLimiter and the other three implementations using a synchronously locked deque, there is a deadlock between an outstanding wait-for-lease operation being canceled by user and releasing an existing lease. The internal Release() method has to clean up wait-for-lease operations in some circumstances. This involves disposing the operation's CancellationTokenRegistration, which blocks if the registered callback is already running. This callback can be invoked on another thread if the external cancellation token fires. It locks the rate limiter to return the operation's permits to the rate limiter. If the external cancellation token fires after Release() runs but before it disposes of the CancellationTokenRegistration, there will be a deadlock as Release() holds the lock while waiting for the callback to complete while the callback blocks on the lock to return its permits.

This change eliminates this deadlock by moving the cleanup of any wait-for-lease operations in Release() outside the lock.
It also combines the private struct RequestRegistration with the private class CancelQueueState as they are used only together, increasing the size of the deque element for no discernible gain.

Author: BrennanConroy
Assignees: -
Labels:

area-System.Threading

Milestone: -

@adityamandaleeka
Copy link
Member

Thanks @atykhyy!

@BrennanConroy BrennanConroy force-pushed the atykhyy/ratelimit-fix branch from b3b90c2 to 537b2ba Compare August 10, 2023 03:57
@BrennanConroy BrennanConroy merged commit 219392e into dotnet:main Aug 10, 2023
@BrennanConroy BrennanConroy deleted the atykhyy/ratelimit-fix branch August 10, 2023 15:28
@BrennanConroy
Copy link
Member Author

/backport to release/7.0

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

@BrennanConroy backporting to release/7.0 failed, the patch most likely resulted in conflicts:

$ git am --3way --ignore-whitespace --keep-non-patch changes.patch

Applying: Eliminate cancellation deadlock in RateLimiter implementations
Using index info to reconstruct a base tree...
M	src/libraries/System.Threading.RateLimiting/src/System/Threading/RateLimiting/ConcurrencyLimiter.cs
M	src/libraries/System.Threading.RateLimiting/src/System/Threading/RateLimiting/FixedWindowRateLimiter.cs
M	src/libraries/System.Threading.RateLimiting/src/System/Threading/RateLimiting/SlidingWindowRateLimiter.cs
M	src/libraries/System.Threading.RateLimiting/src/System/Threading/RateLimiting/TokenBucketRateLimiter.cs
Falling back to patching base and 3-way merge...
Auto-merging src/libraries/System.Threading.RateLimiting/src/System/Threading/RateLimiting/TokenBucketRateLimiter.cs
CONFLICT (content): Merge conflict in src/libraries/System.Threading.RateLimiting/src/System/Threading/RateLimiting/TokenBucketRateLimiter.cs
Auto-merging src/libraries/System.Threading.RateLimiting/src/System/Threading/RateLimiting/SlidingWindowRateLimiter.cs
CONFLICT (content): Merge conflict in src/libraries/System.Threading.RateLimiting/src/System/Threading/RateLimiting/SlidingWindowRateLimiter.cs
Auto-merging src/libraries/System.Threading.RateLimiting/src/System/Threading/RateLimiting/FixedWindowRateLimiter.cs
CONFLICT (content): Merge conflict in src/libraries/System.Threading.RateLimiting/src/System/Threading/RateLimiting/FixedWindowRateLimiter.cs
Auto-merging src/libraries/System.Threading.RateLimiting/src/System/Threading/RateLimiting/ConcurrencyLimiter.cs
CONFLICT (content): Merge conflict in src/libraries/System.Threading.RateLimiting/src/System/Threading/RateLimiting/ConcurrencyLimiter.cs
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Eliminate cancellation deadlock in RateLimiter implementations
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
Error: The process '/usr/bin/git' failed with exit code 128

Please backport manually!

@github-actions
Copy link
Contributor

@BrennanConroy an error occurred while backporting to release/7.0, please check the run log for details!

Error: git am failed, most likely due to a merge conflict.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants