Immediate remote cancels #245
Conversation
Lul and so the bloodbath begins.
As for `Actor.cancel()` requests, do the same for `Actor._cancel_task()` but use `_invoke()` to ensure correct msg transactions with the caller. Don't cancel task cancels on a cancel-all-tasks operation, in an attempt at more determinism.
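For a concrete feel of that last rule, here's a rough, purely illustrative sketch; the `_rpc_tasks` table layout and the way handlers are recorded are assumptions, not tractor's actual internals:

```python
# Illustrative only: assume the actor keeps a ``_rpc_tasks`` dict keyed by
# (channel, caller-id) mapping to (cancel_scope, func, completion_event).
async def cancel_all_rpc_tasks(actor, only_chan=None):
    for (chan, cid), (scope, func, is_complete) in list(actor._rpc_tasks.items()):
        if only_chan is not None and chan is not only_chan:
            continue
        # don't cancel task cancels: leave any in-flight cancel handlers
        # alone so the cancel-all sweep stays deterministic
        if func in (actor.cancel, actor._cancel_task):
            continue
        scope.cancel()
        # wait for the task to actually finish before moving on
        await is_complete.wait()
```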
This is actually surprisingly easy to grok having gone through a lot of pain understanding edge cases in the zombie lord dev branch. Basically we just need to make sure actors are managed in a 2-step reap sequence. In the "soft" reap phase we wait for the process to terminate on its own, concurrently with (maybe) waiting for its portal's final result (if it's a `.run_in_actor()`). If this path is cancelled or errors, we do a "hard" reap where we time out and send a signal to the proc to terminate immediately. The last remaining trick is to tie in the root-is-debugger-aware logic to yet again avoid tty clobbers.
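Roughly, the 2-step reap could look something like this (a minimal sketch assuming a `trio.Process`-like `proc` and an optional `portal` exposing an async `.result()` for the `.run_in_actor()` case; the timeout value and names are made up):

```python
import signal

import trio

async def reap_proc(proc, portal=None):
    try:
        # "soft" reap: wait for the proc to exit on its own, concurrently
        # (maybe) waiting on the portal's final result.
        async with trio.open_nursery() as nursery:
            if portal is not None:
                nursery.start_soon(portal.result)
            await proc.wait()
    except BaseException:
        # "hard" reap: this path was cancelled or errored, so give the
        # proc a short grace period, then signal it to terminate now.
        with trio.move_on_after(3) as cs:
            cs.shield = True  # outer cancellation must not stop this wait
            await proc.wait()
        if cs.cancelled_caught:
            proc.send_signal(signal.SIGKILL)
            with trio.CancelScope(shield=True):
                await proc.wait()
        raise
```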
With the new fixes to the trio spawner we can expect that both root *and* depth > 1 nursery-owning actors will no longer clobber any children that are in debug (either via breakpoint or through crashing). The changed tests now include more checks which ensure the 2nd-level parent-ish actors also bubble up into `pdb` and don't kill any of their (crashed) children before they're themselves done debugging.
log.cancel(
    f"Actor {self.uid} was remotely cancelled; "
    "waiting on cancellation completion..")
await _invoke(self, cid, chan, func, kwargs, is_rpc=False)
Just for reference, this is the crux of the change: instead of scheduling `Actor.cancel()` and `._cancel_task()` as is done for rpc endpoints, we invoke these methods (which really are handlers for special messages that we should add to our SC protocol) immediately via `await`.
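Boiled down, the dispatch decision looks something like the sketch below; the `_invoke` coroutine and the `_service_n` service nursery mirror the names in the snippet above, but the exact signatures here are assumed rather than copied from the tree:

```python
from functools import partial

async def handle_cmd(actor, cid, chan, func, kwargs, _invoke):
    if func in (actor.cancel, actor._cancel_task):
        # special cancel msgs: run the handler inline via ``await`` so the
        # msg transaction with the caller completes immediately
        await _invoke(actor, cid, chan, func, kwargs, is_rpc=False)
    else:
        # regular rpc endpoints: schedule as a task in the service
        # nursery (the pre-existing behaviour for all endpoints)
        await actor._service_n.start(
            partial(_invoke, actor, cid, chan, func, kwargs)
        )
```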
)
await self.cancel_rpc_tasks(chan)

# end of async for, channel disconnect vis ``trio.EndOfChannel``
typo: `vis` should be `via`
):
    '''
    Connect to the root actor via a ctx and invoke a task which locks
    a root-local TTY lock.
drop `lock` from end.
This adds greedy remote cancellation to the msg loop, making both task and actor-runtime cancel requests more immediate in the sense that they are no longer scheduled to run with `.start()` in the service nursery and are instead `_invoke()`d directly and served asap.

These are pieces from an ongoing rework of the actor nursery over in https://github.com/goodboy/tractor/tree/zombie_lord_infinite, which is basically moving away entirely from the original multiplexed error handling style to a per-task-managing-spawned-process style. The idea here is to make the actual nursery implementation much simpler whilst keeping the harder stuff (like "was this actor-task cancelled, and do we care?") in the spawning back-end on a per-task level, thereby minimizing global state in supervisor strategy implementations.
Mostly putting this up independent of the spawning/nursery rework stuff to see how CI responds to the timing changes in the core msg loop.