aiohttp swallows asyncio.CancelledError during connection timeout #229
Comments
Firstly, this is not the async-timeout repo (you just linked to it). |
I noticed this problem when using aiohttp so this is the reason I created the bug here. I understand your point, however how are you supposed to cancel task that is running aiohttp code when it can swallow the |
I'm not sure what you mean, nothing was swallowed? The task was cancelled by the timeout, your If you can demonstrate the confusing behaviour in aiohttp, or explain where it happens, then maybe we can help.. |
Imagine you have retry logic that retries requests when they fail. Something along these lines:

    async def worker(cl: aiohttp.ClientSession):
        while True:
            try:
                r = await cl.get(..., timeout=...)
            except (asyncio.TimeoutError, OSError):
                await asyncio.sleep(10)
            else:
                do_something(r)

If you want to cancel this task, there is a risk that it will never exit. |
Right, so this would better show the problem, right?

    import asyncio
    from async_timeout import timeout

    async def test_task():
        while True:
            print("RETRY")
            try:
                with timeout(1):
                    await asyncio.sleep(10)
            except asyncio.TimeoutError:
                pass

    async def main():
        t = asyncio.create_task(test_task())
        await asyncio.sleep(2)
        t.cancel()
        await t

    asyncio.run(main())

This will sometimes run forever without cancelling the task. Let me think about it.... |
The main problem is that these two operations are not atomic:
You can actually squeeze an explicit |
Your example would not cause the problem. It is really hard to create a good reproducer for it. It is highly dependent on timing, and I am getting the issue in about 5 cancellations in 1000. |
I get about 50/50 on that example whether the task cancels or sits in an infinite loop. |
Oh ok, I was confused because for me it terminates every time; however, when I added one print inside the loop, it ran in an infinite loop. So yeah, it is dependent on the timing, but it pretty much demonstrates what I had in mind. |
As I wrote the problem is that the timeout triggered cancellation and the actual It probably creates issues also if you sequence your explicit cancellation right before the timeout firing as well. During high contention I get errors pretty much every run even when not trying to reproduce the issue. |
As I thought, it also works the opposite way (sequencing explicit cancellation right before the timeout fires):

    import asyncio
    from async_timeout import timeout

    t = None

    async def test_task2():
        try:
            with timeout(1):
                print("Waiting for timeout")
                await asyncio.sleep(10)
        except:
            print("Done waiting for timeout")
            raise

    async def test_task1():
        print("Starting explicit cancellation task")
        await asyncio.sleep(1)
        print("Explicitly cancelling the task")
        t.cancel()

    async def main():
        asyncio.create_task(test_task1())
        global t
        t = asyncio.create_task(test_task2())
        await asyncio.sleep(3)
        try:
            await t
        except BaseException as e:
            print(f"{type(e).__name__}({e})")

    asyncio.run(main())

Produces this output for me:
Clearly explicit cancellation was done before the timeout was fired yet it still raised If you try to change the sleep in task1 to 0.9 then you get |
OK, previous examples make it clear why it's a problem. This is a minimal reproducer that can be used as a test. If you add a print to

    async def test_task(deadline, loop):
        print("TRY")
        with Timeout(deadline, loop):
            await asyncio.sleep(10)

    async def main():
        loop = asyncio.get_running_loop()
        deadline = loop.time() + 1
        t = asyncio.create_task(test_task(deadline, loop))

        def cancel():
            print("CANCEL")
            t.cancel()

        loop.call_at(deadline, cancel)
        await asyncio.sleep(2)
        await t

|
This is rather tricky to resolve as I just don't see any information being available to figure out where the cancellations came from (or how many there were). Playing with private variables, I can get the previous example to work by changing

    if task._fut_waiter and task._fut_waiter.cancelled():
        return
    task.cancel()
    self._state = _State.TIMEOUT

But, if the cancel happens between the timeout and the exit, I can't see any way to tell that the task has been cancelled a second time, which I can reproduce with this example:

    async def test_task(deadline, loop):
        print("TRY")
        with Timeout(deadline, loop):
            await asyncio.sleep(10)

    async def main():
        loop = asyncio.get_running_loop()
        deadline = loop.time() + 1
        t = asyncio.create_task(test_task(deadline, loop))

        def cancel():
            print("CANCEL")
            t.cancel()

        loop.call_at(deadline + .000001, cancel)
        await asyncio.sleep(2)
        await t

|
Since python 3.9 you could change task.cancel(msg="async_timeout._on_cancel") and then catch it in self._task._cancel_message I tried it and in case you call explicit I am not sure though that |
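For readers unfamiliar with the cancel message, here is a minimal sketch of how it reaches the coroutine, assuming Python 3.9 or newer. The names `SENTINEL`, `sleeper` and `observe_cancel` are made up for illustration and are not part of async_timeout. It only shows that the message passed to `task.cancel()` ends up in the `args` of the `CancelledError` seen inside the task; it does not address the case where two cancellations arrive.

```python
import asyncio

# Illustrative marker value; any unique string would do.
SENTINEL = "hypothetical-timeout-sentinel"


async def sleeper(seen):
    try:
        await asyncio.sleep(10)
    except asyncio.CancelledError as exc:
        # Record what the coroutine actually receives, then re-raise.
        seen.append(exc.args)
        raise


async def observe_cancel(msg):
    seen = []
    task = asyncio.create_task(sleeper(seen))
    await asyncio.sleep(0)  # let the task start and block on its sleep
    task.cancel(msg=msg)    # the msg parameter requires Python 3.9+
    try:
        await task
    except asyncio.CancelledError:
        pass
    return seen[0]


if __name__ == "__main__":
    print(asyncio.run(observe_cancel(SENTINEL)))  # args carrying the sentinel
    print(asyncio.run(observe_cancel(None)))      # plain cancel: empty args
```

A timeout context manager could compare the received message against its own sentinel to guess whether the cancellation was its own, which is the idea discussed in this comment.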
Well, if you fancy playing with it on my branch at #230, and see if you can get it working and tests passing, that would be great. Better to have a fix for 3.9+ than no fix at all. You can wrap the code with |
#230 uses hacky access to private |
Also, the fix only fixes half the problem currently. i.e. It works when the flow of execution is: But will still fail (as per the failing test) when it's like: |
The second case requires some way to recognise that 2 cancellations have happened, which I don't think is possible with the current implementation of |
@Dreamsorcerer in case you call cancel multiple times before the Task you want to cancel gets scheduled then By the way official |
In case you are interested in reproducer how

    import asyncio

    SLEEP_TIME = 1

    async def test_task():
        # Sleep with wait_for (timeout always larger than sleep time)
        try:
            t = asyncio.create_task(asyncio.sleep(SLEEP_TIME))
            try:
                await asyncio.wait_for(t, timeout=SLEEP_TIME+10)
            except asyncio.TimeoutError:
                print("Sleep 1 timeouted")
            else:
                print("Sleep 1 finished")
        except asyncio.CancelledError:
            print("Sleep 1 was cancelled")
        # Sleep again in case the cancel missed the first sleep
        try:
            await asyncio.sleep(5)
        except asyncio.CancelledError:
            print("Sleep 2 was cancelled")
        else:
            print("Sleep 2 finished")

    async def main():
        t = asyncio.create_task(test_task())
        await asyncio.sleep(SLEEP_TIME)
        # This cancel should happen before wait_for exits its waiter but after first sleep task is done
        t.cancel()
        print("test_task was cancelled")
        await t

    asyncio.run(main())

I would expect to see
So how do you cancel a task reliably? Should I check some variable after each |
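One possible workaround in the spirit of this question, sketched under assumptions: pair the `cancel()` call with an explicit stop flag that the retry loop checks on every iteration, so a lost `CancelledError` still ends the loop. The `stop` event and the function names are hypothetical, and the `except` branch below merely stands in for a library that turns the cancellation into a timeout which the retry loop then swallows.

```python
import asyncio


async def worker(stop: asyncio.Event, attempts: list):
    # The loop condition is the safety net: it is checked even when
    # the cancellation itself was swallowed.
    while not stop.is_set():
        attempts.append(1)
        try:
            await asyncio.sleep(10)
        except asyncio.CancelledError:
            # Stand-in for a CancelledError that got converted into a
            # TimeoutError and treated as a retryable failure.
            continue


async def main():
    stop = asyncio.Event()
    attempts = []
    task = asyncio.create_task(worker(stop, attempts))
    await asyncio.sleep(0)  # let the worker start its first attempt
    stop.set()              # set the flag first, then cancel
    task.cancel()
    await task              # returns normally because the loop exits
    return len(attempts), task.cancelled()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The task finishes normally here rather than in the cancelled state, so callers that rely on `task.cancelled()` would need to consult the flag instead.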
This appears to already be reported: https://bugs.python.org/issue42130 |
Nope, it stores the first one. The code shortcuts in future calls and doesn't update it: https://github.com/python/cpython/blob/main/Lib/asyncio/tasks.py#L209 |
No matter what I do, I can't seem to figure out a way to detect the latter case. Again, feel free to expand on #235 if you can figure anything out. |
In every case I see the same results as for a normal timeout.
and
Both end up with the sentinel in the exception and It seems the |
After experimenting, I have a conclusion that is very close to @Dreamsorcerer results: the solution doesn't work well. #237 provides an alternative approach: on_timeout event schedules the task cancellation on the next event loop iteration ( The approach has the obvious drawback: raising TimeoutError requires one more loop iteration. I think it's ok: happy-path has the same performance, only the timeout case works slower. |
Not sure why I'm spending all this time on it, but I've fixed |
Sorry, I'm pretty busy these days. I will return to the issue in a few days this week. |
Of course. To summarise the current state:
|
Found this issue while debugging a weird memory leak in code using aiohttp. It is not easy to reproduce but when you set |
You can check if you are leaking tasks by running @Dreamsorcerer thank you for python/cpython#28149 . In the meantime I created a workaround wait function that does not cancel the task implicitly like wait_for but rather just return after the timeout and lets you cancel the task and handle everything explicitly. Kind of makes you think if the approach that Go took, where timeouts and deadlines are the responsibility of the running coroutine and not the event loop, isn't better. Sure, it places the burden of implementation on the library side and requires you to pass |
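The commenter's actual workaround is not shown, so the following is only a sketch of what such a wait function could look like, built on `asyncio.wait`, which returns when the timeout elapses without cancelling anything. The helper name `wait_no_cancel` and its return shape are assumptions made for this example.

```python
import asyncio


async def wait_no_cancel(aw, timeout):
    """Hypothetical helper: wait up to `timeout` seconds for `aw`.

    Returns (task, finished). The task is never cancelled here; on
    timeout the caller decides what to do with it.
    """
    task = asyncio.ensure_future(aw)
    done, _pending = await asyncio.wait({task}, timeout=timeout)
    return task, task in done


async def demo():
    task, finished = await wait_no_cancel(asyncio.sleep(10), timeout=0.05)
    cancelled_by_helper = task.cancelled()  # still pending at this point
    if not finished:
        # Cancellation is explicit, so there is exactly one source of it.
        task.cancel()
        try:
            await task
        except asyncio.CancelledError:
            pass
    return finished, cancelled_by_helper, task.cancelled()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the timeout never injects a `CancelledError` of its own, any `CancelledError` raised in the waiting coroutine can only come from an explicit `cancel()`.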
Replit: Updating package configuration --> python3 -m poetry add async-timeout Updating dependencies SolverProblemError
and discord (1.7.3) depends on discord.py (>=1.7.3), discord (>=1.7.3,<2.0.0) requires discord.py (>=1.7.3).
and aiohttp (3.6.2) depends on async-timeout (>=3.0,<4.0), aiohttp (>=3.6.2,<3.6.3 || >3.6.3,<3.7.0 || >3.7.0,<3.7.1 || >3.7.1,<3.7.2 || >3.7.2,<3.7.3 || >3.7.3,<3.7.4 || >3.7.4,<3.7.4.post0 || >3.7.4.post0,<3.8.0) requires async-timeout (>=3.0,<4.0). at venv/lib/python3.8/site-packages/poetry/puzzle/solver.py:241 in _solve Replit: Package operation failed. i got this error and i am unable to make my bot online but few times my discord bot gets online but most of the time the package gives error. Please help me to make my bot online |
async_timeout does not support python 3.11 aio-libs/async-timeout#295 And have two years old annoying bugs: aio-libs/async-timeout#229 redis#2551 Since asyncio.timeout has been shipped in python 3.11, we should start using it. Partially fixes 2551
Will this be ever fixed? This bug is still present in

    import asyncio

    import async_timeout
    from .aio import cancel_tasks, wait_for_first

    class _Timeout(async_timeout.Timeout):
        # Unique cancel message used to recognise the timeout's own cancellation
        RANDOM_TOKEN = '0f0dd596-373b-42df-aa0b-682d046c5d24'

        def __exit__(self, exc_type, exc_val, exc_tb):
            self._do_exit(exc_type, exc_val)
            return None

        async def __aexit__(self, exc_type, exc_val, exc_tb):
            self._do_exit(exc_type, exc_val)
            return None

        def _do_exit(self, exc_type, exc_val):
            if exc_type is asyncio.CancelledError and str(exc_val) == _Timeout.RANDOM_TOKEN \
                    and self._state == async_timeout._State.TIMEOUT:
                self._timeout_handler = None
                raise asyncio.TimeoutError
            # timeout has not expired
            self._state = async_timeout._State.EXIT
            self._reject()

        def _on_timeout(self, task: "asyncio.Task[None]") -> None:
            task.cancel(_Timeout.RANDOM_TOKEN)
            self._state = async_timeout._State.TIMEOUT
            # drop the reference early
            self._timeout_handler = None

    async_timeout.Timeout = _Timeout

|
I need to do some more testing, but I think it may be solved with asyncio.timeout(). |
Version 5.0+ works exactly as |
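To check the `asyncio.timeout()` claim, the earlier same-deadline reproducer can be adapted as follows. This is a sketch that assumes Python 3.11 or newer, where `asyncio.timeout_at` exists; the function names are illustrative. The explicit cancel and the timeout are scheduled for the same loop time, so both fire before the task resumes.

```python
import asyncio
import sys


async def guarded_sleep(deadline):
    # asyncio.timeout_at is only available on Python 3.11+.
    async with asyncio.timeout_at(deadline):
        await asyncio.sleep(10)


async def main():
    loop = asyncio.get_running_loop()
    deadline = loop.time() + 0.1
    task = asyncio.create_task(guarded_sleep(deadline))
    # Explicit cancellation scheduled for the same instant as the timeout.
    loop.call_at(deadline, task.cancel)
    try:
        await task
    except asyncio.CancelledError:
        return "cancelled"
    except TimeoutError:
        return "timeout"
    return "finished"


if __name__ == "__main__" and sys.version_info >= (3, 11):
    print(asyncio.run(main()))
```

The standard-library context manager counts cancellation requests on the task, so when an explicit cancel coincides with the timeout the `CancelledError` propagates instead of being replaced by `TimeoutError`.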
Describe the bug
There is a race condition in the code that handles connection timeouts. If you call cancel on a task that is currently pending in create_connection and the connection timeout has already fired, then asyncio.CancelledError is not propagated and you get asyncio.TimeoutError instead.
The main problem is in how timeouts are handled in the async_timeout package. When exiting the context manager after the timeout has passed, all CancelledError exceptions are swallowed and TimeoutError is raised instead. Unfortunately, this is also true if you explicitly cancel the task yourself.
The main problem is that you cannot cancel a task that is using aiohttp, because you never know if CancelledError will be raised.
To Reproduce
EDIT: THIS REPRODUCER DOES NOT SHOW THE BEHAVIOUR CORRECTLY - PLEASE REFER TO COMMENTS BELOW!
Expected behavior
asyncio.CancelledError should never be suppressed when you cancel the task explicitly.
Logs/tracebacks
Python Version
Python 3.8.10
aiohttp Version
3.7.4.post0
multidict Version
4.7.6
yarl Version
1.6.0
OS
Linux
Related component
Client
Additional context
No response