I found this while working through a test set in #129.
There's a branch `bug_in_debug` with a commit for a test that demonstrates the issue.
I'm not entirely sure what's going on here yet, but the gist is that if a child's child enters the debugger before a cancel message is sent from the root actor, that subactor two levels down somehow isn't cancelled when it returns control to `trio`.
Right now the result is that the target child does not move on after disengaging `pdb`.
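For orientation, the actor tree implied by the names in the logs below looks roughly like this. It is only a sketch of the shape, not the actual test from the `bug_in_debug` branch; the function bodies and the exact `run_in_actor()` call signatures here are my assumptions:

```python
import tractor


async def breakpoint_me():
    # leaf actor, two levels below the root: enters the debugger,
    # which requires acquiring the root's tty lock over IPC
    await tractor.breakpoint()


async def name_error():
    # a sibling branch that errors; this is what the root ends up
    # seeing via ``Portal.result()``
    raise NameError('kaboom')


async def spawn_until(depth: int = 0):
    # each "spawner"/"spawn_until_N" level nests one more subactor
    async with tractor.open_nursery() as n:
        if depth < 1:
            await n.run_in_actor(breakpoint_me, name='spawn_until_0')
        else:
            await n.run_in_actor(
                spawn_until, depth=depth - 1, name=f'spawn_until_{depth}')


async def main():
    # the root actor ("anonymous" in the logs) holds two sibling trees
    async with tractor.open_nursery() as n:
        await n.run_in_actor(name_error, name='spawner0')
        await n.run_in_actor(spawn_until, depth=0, name='spawner1')


if __name__ == '__main__':
    tractor.run(main, debug_mode=True)
```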
The state leading up to it is as follows. The root has received an error via `Portal.result()` from `spawner0`, a now-cancelled sibling of the `spawner1` actor, and has cancelled the nursery containing both:
```
Oct 07 06:09:04 (anonymous, 139223, tractor._main)) [WARNING] tractor.tractor._trionics _trionics.py:142 Cancelling nursery in ('anonymous', '2f2d27d5-f13e-4f5c-b110-be9059d0369b')
Oct 07 06:09:04 (anonymous, 139223, tractor._portal.Portal.cancel_actor)) [WARNING] tractor _portal.py:275 Sending actor cancel request to ('spawner0', 'de22443b-7a7f-44b8-8315-bb6ad6d55730') on <Channel fd=11, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 43629), raddr=('127.0.0.1', 50468)>
Oct 07 06:09:04 (anonymous, 139223, tractor._portal.Portal.cancel_actor)) [WARNING] tractor _portal.py:275 Sending actor cancel request to ('spawner1', 'b12b4b20-f582-449f-934c-a4b8216beb43') on <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 43629), raddr=('127.0.0.1', 50472)>
Oct 07 06:09:04 (anonymous, 139223, tractor._spawn.cancel_on_completion)) [WARNING] tractor _spawn.py:139 Cancelling ('spawner0', 'de22443b-7a7f-44b8-8315-bb6ad6d55730') after error ('spawner0', 'de22443b-7a7f-44b8-8315-bb6ad6d55730')
```
But `spawn_until_0` grabs the tty lock just before each actor receives the cancel msg:
```
Oct 07 06:09:04 (anonymous, 139223, _hijack_stdin_relay_to_child)) [WARNING] tractor.tractor._debug _debug.py:166 Actor ('spawn_until_0', '5e16346a-accf-4e32-90ad-b7faec54e671') ACQUIRED stdin hijack lock
Oct 07 06:09:04 (spawner0, 139226, tractor._actor.Actor._process_messages)) [WARNING] tractor _actor.py:534 ('spawner0', 'de22443b-7a7f-44b8-8315-bb6ad6d55730') was remotely cancelled
Oct 07 06:09:04 (spawner1, 139230, tractor._actor.Actor._process_messages)) [WARNING] tractor _actor.py:534 ('spawner1', 'b12b4b20-f582-449f-934c-a4b8216beb43') was remotely cancelled
```
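For anyone not familiar with the debug-mode machinery: the "stdin hijack lock" in those lines is, conceptually, a single lock held by the root actor which a subactor must acquire over IPC before it may spawn `pdb`, so only one process owns the terminal at a time. A minimal plain-`trio` sketch of that idea, not tractor's actual `_debug.py` implementation, would be:

```python
import trio

# one lock in the "root": whichever child currently holds it owns the tty
_pdb_tty_lock = trio.Lock()


async def enter_debugger(actor_name: str) -> None:
    # in tractor the acquire request is relayed over an IPC stream
    # (the ``_hijack_stdin_relay_to_child`` handler seen in the logs)
    async with _pdb_tty_lock:
        print(f'{actor_name} ACQUIRED stdin hijack lock')
        await trio.sleep(1)  # stand-in for the user sitting at the pdb prompt
    print(f'{actor_name} released the tty lock')


async def main() -> None:
    async with trio.open_nursery() as n:
        n.start_soon(enter_debugger, 'spawn_until_0')
        n.start_soon(enter_debugger, 'spawner0')


trio.run(main)
```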
Passing `c` to `pdb` again results in the next prompt and the following output:
```
[7] > /home/goodboy/repos/tractor/tractor/_portal.py(213)_return_from_resptype()
-> raise unpack_error(msg, self.channel)
Oct 07 06:09:04 (spawner0, 139226, tractor._debug._breakpoint.<locals>.wait_for_parent_stdin_hijack)) [WARNING] tractor _portal.py:254 Cancelling all streams with ('anonymous', '2f2d27d5-f13e-4f5c-b110-be9059d0369b')
Oct 07 06:09:04 (spawner0, 139226, tractor._debug._breakpoint.<locals>.wait_for_parent_stdin_hijack)) [WARNING] tractor _portal.py:91 <tractor._portal.StreamReceiveChannel object at 0x7f47fc8a0070> is already closed
Oct 07 06:09:04 (spawner0, 139226, cancel)) [WARNING] tractor _actor.py:153 Task <bound method Actor.cancel of <tractor._actor.Actor object at 0x7f47fc6b6a90>> likely errored or cancelled before it started
Oct 07 06:09:04 (spawner0, 139226, tractor._actor.Actor._async_main)) [WARNING] tractor _actor.py:754 Closing all actor lifetime contexts
(Pdb++) Oct 07 06:09:04 (anonymous, 139223, trio._highlevel_serve_listeners._run_handler)) [ERROR] tractor.ipc _ipc.py:54 Stream connection ('127.0.0.1', 50468) broke
Oct 07 06:09:04 (anonymous, 139223, tractor._spawn.new_proc)) [WARNING] tractor _spawn.py:374 Cancelling existing result waiter task for ('spawner0', 'de22443b-7a7f-44b8-8315-bb6ad6d55730')
Oct 07 06:09:05 (anonymous, 139223, tractor._portal.Portal.cancel_actor)) [WARNING] tractor _portal.py:288 May have failed to cancel ('spawner1', 'b12b4b20-f582-449f-934c-a4b8216beb43')
Oct 07 06:09:05 (anonymous, 139223, tractor._portal.Portal.cancel_actor)) [WARNING] tractor _portal.py:288 May have failed to cancel ('spawner0', 'de22443b-7a7f-44b8-8315-bb6ad6d55730')
Oct 07 06:09:05 (anonymous, 139223, tractor._spawn.cancel_on_completion)) [WARNING] tractor _portal.py:288 May have failed to cancel ('spawner0', 'de22443b-7a7f-44b8-8315-bb6ad6d55730')
```
Continuing again results in the hang:
```
c
Oct 07 06:12:37 (spawn_until_0, 139235, tractor._debug._breakpoint.<locals>.wait_for_parent_stdin_hijack)) [WARNING] tractor _portal.py:96 Cancelling stream c3fb672d-4e0a-4d7b-99d4-9412c66c93de to ('anonymous', '2f2d27d5-f13e-4f5c-b110-be9059d0369b')
Oct 07 06:12:37 (anonymous, 139223, _hijack_stdin_relay_to_child)) [ERROR] tractor.tractor._debug _debug.py:132 TTY lock released by <Task '_hijack_stdin_relay_to_child' at 0x7faa22f29f40>:('spawn_until_0', '5e16346a-accf-4e32-90ad-b7faec54e671')
Oct 07 06:12:37 (spawn_until_0, 139235, tractor._debug._breakpoint.<locals>.wait_for_parent_stdin_hijack)) [WARNING] tractor _portal.py:254 Cancelling all streams with ('anonymous', '2f2d27d5-f13e-4f5c-b110-be9059d0369b')
Oct 07 06:12:37 (spawn_until_0, 139235, tractor._debug._breakpoint.<locals>.wait_for_parent_stdin_hijack)) [WARNING] tractor _portal.py:91 <tractor._portal.StreamReceiveChannel object at 0x7f0370c7be50> is already closed
```
Adding `trace` level logging doesn't reveal much other than that `spawn_until_0` has left the debugger...
This appears to demonstrate the same bug found in #156. It looks like cancelling a subactor that itself has a child, while that child is running sync code, can result in the child never being cancelled, due to some strange condition where the internal nurseries aren't torn down as expected when a `trio.Cancelled` is raised.
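For the "running sync code" part, here's a concrete plain-`trio` illustration of the general mechanism (nothing tractor-specific): cancellation is only delivered at checkpoints, so a task stuck in blocking sync code, e.g. an interactive `pdb` prompt, won't see the `trio.Cancelled` until it next hits an `await`:

```python
import time
import trio


async def stuck_in_sync_code() -> None:
    # a blocking sync call: no checkpoint, so a pending cancellation
    # cannot be delivered while it runs (think: waiting on pdb stdin)
    time.sleep(2)
    # only here, at the next checkpoint, is ``trio.Cancelled`` raised
    await trio.sleep(0)


async def main() -> None:
    with trio.move_on_after(0.1) as cs:
        await stuck_in_sync_code()
    # the deadline fired at 0.1s but the task couldn't unwind until the
    # sync sleep returned ~2s later
    print('cancelled:', cs.cancelled_caught)


trio.run(main)
```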