-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Async cancellation swallowed by BRPOPLPUSH #2028
Comments
This issue is marked stale. It will be closed in 30 days if it is not updated. |
bump |
We solved few bags with async canceled errors from the time this issue opened, can someone check if this is still relevant or solved by the fixes we did? |
Hi, this issue hasn’t been updated in almost one year. Since there is no confirmation that the issue is still reproducible, I’ll close it for now. Please feel free to reopen it if further assistance is needed. |
Should issues be submitted here or on aioredis package now that they've merged?
Using the BRPOPLPUSH command on an asyncio pool, if a cancellation occurs at exactly the right time it is swallowed and never bubbled up to calling code. This leads to tasks that never exit and violates the convention that
CancelledError
should never be swallowed. The same behaviour is seen with BRPOP, so I assume this is related to blocking calls. I can't recreate with non-blocking calls, but it may just be the race window is smaller there.I can't see an obvious place for this swallowing of the cancellation in the code, but you are catching
BaseException
which will catchCancelledError
(unlike catching onException
), so care needs to be taken to ensure it is always re-raised after any cleanup is done. In cases where you are catchingBaseException
it appears you are trying to disconnect, so perhaps there is a deadlock between the disconnect and the command submit?Demo
Run the following inside a loop (wrap it all in asyncio.run or just run raw in IPython cell). It will not fail every time, as it is timing dependent. You will have to adjust the 0.003 second delay up or down depending on your system speed until it begins to fail. I get about 25% failure with the code below.
The above code should never fail the assertion, as we have cancelled the task and the CancelledError should propagate out within the time we've given it (waiting longer doesn't help, the task never stops). If the 0.003 second delay is decreased, it never fails. Similarly, if the 0.003 second delay is increased, it also never fails. So there is a short window during the start of the
conn.brpoplpush
command that the cancellation is swallowed and the task never exits.On a failure, the following will be printed:
On a good run, you will get the following if the cancel was too early:
Or if the cancel was too late to trigger:
Info
Version: 4.2.0rc1 and also 2.0.1 of aioredis
Platform: Linux, Python 3.8.10 and 3.10.2 confirmed
The text was updated successfully, but these errors were encountered: