Cleanup internal data-structures when process has been forked (#2676)
Closes #1921
### What
The crux of the problem is the following:
> The child process is created with a single thread—the one that called
fork(). The entire virtual address space of the parent is replicated in
the child, ...
The major consequence of this is that our global `RecordingStream`
context is duplicated into the child memory space but none of the
threads (batcher, tcp-sender, dropper, etc.) are duplicated. When we go
to call `connect()` inside the forked process, we try to replace the
global recording-stream, which subsequently tries to call drop on the
forked copy of `RecordingStreamInner`. However, with no threads left to
process the flush, the call simply hangs.
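
To make the failure mode concrete, here is a minimal sketch using only the Python standard library (Unix-only, since it relies on `os.fork`). It does not use the SDK; the helper name `count_threads_across_fork` and the idle worker thread are hypothetical stand-ins for the batcher and sender threads described above.

```python
import os
import threading
import warnings


def count_threads_across_fork():
    """Return (thread count in parent, thread count in forked child)."""
    stop = threading.Event()
    # Stand-in for a background worker such as a batcher thread.
    worker = threading.Thread(target=stop.wait, daemon=True)
    worker.start()
    parent_count = threading.active_count()

    read_fd, write_fd = os.pipe()
    with warnings.catch_warnings():
        # Newer Python versions warn about fork() in a multi-threaded
        # process; that hazard is exactly what this sketch demonstrates.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()
    if pid == 0:
        try:
            # Child: memory was copied, but only the forking thread exists.
            os.close(read_fd)
            os.write(write_fd, str(threading.active_count()).encode())
        finally:
            os._exit(0)  # leave at once, skipping normal interpreter shutdown

    os.close(write_fd)
    child_count = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    stop.set()
    worker.join()
    return parent_count, child_count
```

Calling `count_threads_across_fork()` should report at least two threads in the parent and a single thread in the child: anything in the child that waits on the missing worker would block forever.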
We take a few actions to alleviate this problem:
1. Introduce a new SDK function: `cleanup_if_forked` which compares the
process-ids on existing globals and forgets them as necessary.
1. In Python, use `os.register_at_fork` to proactively call
`cleanup_if_forked` in any forked child processes.
1. Also add a call to `cleanup_if_forked` inside `init()` in case we're
forking through a more exotic mechanism.
1. Check for the forked state anywhere we potentially flush to avoid
deadlocks and produce a visible user-error.
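
The four actions above can be sketched in pure Python. This is an illustration under stated assumptions, not the SDK's implementation: `ForkAwareStream`, `current_stream`, and the module-level `_global_stream` are hypothetical stand-ins, and the `init` and `cleanup_if_forked` functions here only mirror the names used in the list. The one real API relied on is `os.register_at_fork`.

```python
import os
from typing import Optional


class ForkAwareStream:
    """Hypothetical stand-in for a global that owns background threads."""

    def __init__(self) -> None:
        # Remember which process created this object (and its threads).
        self.owner_pid = os.getpid()

    def is_forked_child(self) -> bool:
        return self.owner_pid != os.getpid()

    def flush(self) -> None:
        # Action 4: refuse to flush in a forked child instead of hanging.
        if self.is_forked_child():
            raise RuntimeError("fork detected; call cleanup_if_forked() first")
        # A real flush would wait on the worker threads here.


_global_stream: Optional[ForkAwareStream] = None


def current_stream() -> Optional[ForkAwareStream]:
    return _global_stream


def cleanup_if_forked() -> bool:
    """Action 1: forget globals that were created by another process."""
    global _global_stream
    if _global_stream is not None and _global_stream.is_forked_child():
        _global_stream = None  # drop the reference without flushing
        return True
    return False


def init() -> ForkAwareStream:
    """Action 3: check for a fork before replacing the global."""
    global _global_stream
    cleanup_if_forked()
    _global_stream = ForkAwareStream()
    return _global_stream


# Action 2: run the cleanup automatically in children created via os.fork().
if hasattr(os, "register_at_fork"):
    os.register_at_fork(after_in_child=cleanup_if_forked)
```

Comparing process ids is cheap and needs no cooperation from the code that forks, which is why the same check can guard every flush path as well as `init()`.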
Additionally, it turns out that forked processes bypass the normal
Python `atexit` handler, which means we don't get proper shutdown/flush
behavior when the forked processes terminate. To help users work around
this, we introduce a `@shutdown_at_exit` decorator which can be used to
decorate functions launched via multiprocessing.
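
As a rough sketch of the idea behind such a decorator, the wrapper below runs a shutdown callback when the decorated function returns or raises. It is not the SDK's `shutdown_at_exit`: the factory `make_shutdown_at_exit` and its `shutdown` parameter are hypothetical, introduced so the example is self-contained.

```python
import functools
from typing import Any, Callable


def make_shutdown_at_exit(shutdown: Callable[[], None]) -> Callable:
    """Build a decorator that calls `shutdown` after the wrapped function."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            try:
                return func(*args, **kwargs)
            finally:
                # Runs on normal return and on exceptions alike, so it does
                # not depend on atexit handlers firing in the child process.
                shutdown()

        return wrapper

    return decorator
```

A function passed as the `target` of a `multiprocessing.Process` could be wrapped this way so that the flush happens inside the function call itself, before the child process exits.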
### Testing
On Linux:
```
$ python examples/python/multiprocessing/main.py
```
Observe that the demo exits cleanly and all data shows up in the viewer.
### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/2676) (if
applicable)
- [PR Build Summary](https://build.rerun.io/pr/2676)
- [Docs
preview](https://rerun.io/preview/pr%3Ajleibs%2Fcleanup_if_forked/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Ajleibs%2Fcleanup_if_forked/examples)
re_log::error_once!("Fork detected while dropping RecordingStreamInner. cleanup_if_forked() should always be called after forking. This is likely a bug in the SDK.");
return;
}

// NOTE: The command channel is private, if we're here, nothing is currently capable of