Immediate remote cancels #245

goodboy · 2021-10-13T03:20:33Z

This adds greedy remote cancellation to the msg loop, making both task and actor-runtime cancel requests more immediate: they are no longer scheduled to run with `.start()` in the service nursery but are instead `_invoke()`d directly and served asap.

These are pieces from an ongoing rework of the actor nursery over in https://github.com/goodboy/tractor/tree/zombie_lord_infinite, which is basically moving away entirely from the original multiplexed error handling style to a per-task-managing-spawned-process style. The idea here is to make the actual nursery implementation much simpler, whilst the harder stuff (like was this actor-task cancelled, and do we care?) is done in the spawning back-end on a per-task level, thereby minimizing global state in supervisor strategy implementations.

Mostly putting this up independent of the spawning/nursery rework stuff to see how CI responds to the timing changes in the core msg loop.

goodboy · 2021-10-13T03:22:26Z

Lul and so the bloodbath begins.

As for `Actor.cancel()` requests, do the same for `Actor._cancel_task()` but use `_invoke()` to ensure correct msg transactions with the caller. Don't cancel task cancels on a cancel-all-tasks operation, in an attempt at more determinism.

This is actually surprisingly easy to grok having gone through a lot of pain understanding edge cases in the zombie lord dev branch. Basically we just need to make sure actors are managed in a 2-step reap sequence. In the "soft" reap phase we wait for the process to terminate on its own, concurrently with (maybe) waiting for its portal's final result (if it's a `.run_in_actor()`). If this path is cancelled or errors, then we do a "hard" reap where we time out and send a signal to the proc to terminate immediately. The only remaining trick is to tie in the root-is-debugger-aware logic to yet again avoid tty clobbers.
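The two-step reap described above can be sketched roughly as follows. This is only an illustration under stated assumptions: tractor itself is built on `trio`, whereas this sketch uses the stdlib `asyncio` so it runs without extra dependencies. The names `reap()`, `hard_reap()` and `grace` are hypothetical, and waiting on the portal result and the debugger-lock logic are left out.

```python
import asyncio
import sys
from typing import List


async def hard_reap(proc: asyncio.subprocess.Process, grace: float = 0.2) -> None:
    # "Hard" phase: allow a short grace period, then signal the process.
    try:
        await asyncio.wait_for(proc.wait(), grace)
    except asyncio.TimeoutError:
        proc.kill()  # assumption: the sketch simply kills the child
        await proc.wait()


async def reap(proc: asyncio.subprocess.Process, phases: List[str]) -> None:
    try:
        # "Soft" phase: wait for the process to exit on its own.
        phases.append("soft")
        await proc.wait()
    except BaseException:
        # Cancelled or errored: fall back to the hard phase, shielded so
        # that a further cancel request cannot interrupt the cleanup.
        phases.append("hard")
        await asyncio.shield(hard_reap(proc))
        raise


async def demo():
    # A child that would otherwise outlive the parent's patience.
    code = "import time; time.sleep(60)"
    proc = await asyncio.create_subprocess_exec(sys.executable, "-c", code)
    phases: List[str] = []
    try:
        # The timeout cancels the soft phase, which triggers the hard reap.
        await asyncio.wait_for(reap(proc, phases), timeout=0.2)
    except asyncio.TimeoutError:
        pass
    return phases, proc.returncode
```

Running `asyncio.run(demo())` should show both phases being entered and leave the child with a non-zero return code, since it had to be signalled rather than exiting on its own.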

With the new fixes to the trio spawner we can expect that both root *and* depth > 1 nursery-owning actors will no longer clobber any children that are in debug (either via breakpoint or through crashing). The changed tests now include more checks which ensure the 2nd-level parent-ish actors also bubble up into `pdb` and don't kill any of their (crashed) children before they are themselves done debugging.

goodboy · 2021-10-17T12:07:05Z

tractor/_actor.py

+                                log.cancel(
+                                    f"Actor {self.uid} was remotely cancelled; "
+                                    "waiting on cancellation completion..")
+                                await _invoke(self, cid, chan, func, kwargs, is_rpc=False)


Just for reference, this is the crux of the change: instead of scheduling `Actor.cancel()` and `._cancel_task()` as is done for rpc endpoints, we invoke these methods (which really are handlers for special messages that we should add to our SC protocol) immediately via `await`.
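To make the distinction concrete, here is a minimal sketch of a msg loop that schedules ordinary requests in the background but awaits cancel requests inline. It is not tractor's implementation: it uses the stdlib `asyncio` rather than `trio`, and `msg_loop()`, `rpc_endpoint()`, `cancel_all()` and the `(kind, name)` message shape are all hypothetical stand-ins.

```python
import asyncio
from typing import Dict, List


async def rpc_endpoint(seconds: float) -> None:
    # Stand-in for a long-running rpc task.
    await asyncio.sleep(seconds)


async def msg_loop(inbox: asyncio.Queue) -> List[str]:
    """Process (kind, name) messages until a cancel request arrives."""
    tasks: Dict[str, asyncio.Task] = {}
    events: List[str] = []

    async def cancel_all() -> None:
        # Hypothetical analogue of a runtime cancel handler.
        for task in tasks.values():
            task.cancel()
        await asyncio.gather(*tasks.values(), return_exceptions=True)
        events.extend(
            f"{name} cancelled" for name, t in tasks.items() if t.cancelled()
        )

    while True:
        kind, name = await inbox.get()
        if kind == "rpc":
            # Ordinary requests are scheduled and run in the background.
            tasks[name] = asyncio.create_task(rpc_endpoint(60))
            events.append(f"{name} scheduled")
        elif kind == "cancel":
            # Cancel requests are awaited inline: the loop handles no
            # further messages until cancellation has completed.
            await cancel_all()
            events.append("cancel complete")
            return events


async def demo() -> List[str]:
    inbox: asyncio.Queue = asyncio.Queue()
    for msg in [("rpc", "a"), ("rpc", "b"), ("cancel", "")]:
        inbox.put_nowait(msg)
    return await msg_loop(inbox)
```

Because the cancel handler is awaited rather than scheduled, its completion is ordered with respect to the loop itself, which is the property the change is after.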

goodboy · 2021-10-17T12:08:20Z

tractor/_actor.py

-                    )
-                    await self.cancel_rpc_tasks(chan)
+
+                # end of async for, channel disconnect vis ``trio.EndOfChannel``


goodboy · 2021-10-17T12:09:49Z

tractor/_debug.py

+):
+    '''
+    Connect to the root actor via a ctx and invoke a task which locks
+    a root-local TTY lock.


drop lock from end.

goodboy added supervision cancellation SC teardown semantics and anti-zombie semantics labels Oct 13, 2021

This was referenced Oct 14, 2021

Cancel RPC tasks and actor machinery greedily #240

Closed

Less logging, add a CANCEL log level #243

Merged

Base automatically changed from less_logging to master October 14, 2021 17:37

goodboy added 18 commits October 14, 2021 13:39

Make actor runtime cancellation immediate

7643bbf

Don't whine about ; it ain't rpc

41f0992

Do immediate remote task cancels

bb9d9c7

Unwind process opening and shield hard reap

46ff558

Lol, fix sub-actor case

2df16c1

Add a maybe-open-debugger helper

893bad7

Reduce some loglevels, stick in comment about blocking till next tick

6203507

Use debugger helper in nursery and spawn tasks

f3a6ab6

Breakout wait_for_parent_stdin_hijack(), increase root pdb checker poll time

d30ce96

Add tty lock acquire ctx mngr

4b2710b

Handle depth > 1 nursery owners which use debug mode

daa28ea

Fix missing task status type

6f5c35d

Change lock helper to take an actor uid tuple

fa317d1

Remove union type for root getter

9d83ef8

Try to handle variable windows errors

7ee121a

Pass uid not actor object

51259c4

goodboy force-pushed the immediate_remote_cancels branch from 9b3e0a5 to 51259c4 October 14, 2021 17:46

goodboy marked this pull request as ready for review October 14, 2021 20:01

goodboy added 4 commits October 15, 2021 09:16

Handle nested multierror case on windows

533457c

Add nooz

a42ec1f

Right, only worry about pdb lock when in debug mode

e4ed0fd

Use type match of expected error

4f222a5

goodboy force-pushed the immediate_remote_cancels branch from 12675f4 to 4f222a5 October 15, 2021 14:26

goodboy added 2 commits October 15, 2021 11:42

Fix pluggy readme link and typo

5d827f7

Don't pop a child entry that was never inserted

5cfac58

goodboy force-pushed the immediate_remote_cancels branch 2 times, most recently from 34ed193 to 3f4384b October 15, 2021 22:25

Grab lock if cancelled during spawn before hard kill

b3c4851

goodboy force-pushed the immediate_remote_cancels branch from 3f4384b to b3c4851 October 15, 2021 22:26

goodboy commented Oct 17, 2021

goodboy merged commit 828754d into master Oct 17, 2021

goodboy deleted the immediate_remote_cancels branch October 17, 2021 12:16

goodboy mentioned this pull request Oct 21, 2021

Patch async enter all (addresses #242) #246

Merged

overclockworked64 added a commit to overclockworked64/tractor that referenced this pull request Oct 22, 2021

Get rid of external teardown trigger because goodboy#245 resolves the problem

3020793

This was referenced Oct 23, 2021

Add macos-latest to CI's os matrix #228

Closed

Add zombie tracking to test suite #251

Open

goodboy pushed a commit that referenced this pull request Oct 23, 2021

Get rid of external teardown trigger because #245 resolves the problem

87e3d32

goodboy mentioned this pull request Nov 2, 2021

Alpha3 #259

Merged
