Children Failed to Join #2

Open
ruidan-li opened this issue Oct 4, 2018 · 1 comment

Comments

@ruidan-li

Hello!

I am currently using the kinesis-python library (https://github.com/NerdWalletOSS/kinesis-python), which uses your offspring library to run multiple shard readers, and it turns out that terminated children sometimes fail to join. My main process does some work and calls sys.exit() when it receives SIGINT or SIGTERM. Based on the log, the child prints "Caught signal 15" (https://github.com/borgstrom/offspring/blob/master/src/offspring/process.py#L112), and then self.end() and sys.exit() are called in run(). However, the children sometimes never join, and after tracing I found that the parent is stuck at os.waitpid() (in multiprocessing/forking.py). While trying to figure out what was going on, I placed the sys.exit() call in the signal_handler, and that works. So I am wondering whether it would be possible to refactor SubprocessLoop.run and the signal_handler so that self.end() and sys.exit() are called inside the signal_handler instead. I would also be happy to hear your thoughts on this issue!
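
A minimal sketch of the proposed refactor, for illustration only. `LoopSketch` and its methods are hypothetical stand-ins and not offspring's actual code; the only point is that cleanup and sys.exit() happen inside the handler, so the child does not depend on run() noticing the signal:

```python
import signal
import sys
import time


class LoopSketch:
    """Hypothetical stand-in for a subprocess loop (not offspring's class)."""

    def __init__(self):
        self.ended = False

    def install_handlers(self):
        # Route both SIGINT and SIGTERM to the same handler.
        signal.signal(signal.SIGINT, self.signal_handler)
        signal.signal(signal.SIGTERM, self.signal_handler)

    def signal_handler(self, signum, frame):
        # Proposed change: clean up and exit right here instead of
        # leaving both steps to run().
        self.end()
        sys.exit(0)  # raises SystemExit in the interrupted main thread

    def loop(self):
        # Placeholder for the real per-iteration work.
        time.sleep(0.1)

    def end(self):
        # Placeholder for the real cleanup.
        self.ended = True

    def run(self):
        self.install_handlers()
        while True:
            self.loop()
```

Because sys.exit() raises SystemExit, calling it from the handler unwinds whatever the main thread was doing when the signal arrived, including a blocking call inside loop().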

Thanks,
Ruidan

@Sytten

Sytten commented Jul 23, 2019

Hmm, I spent 4 hours today arriving at the exact same conclusion. This library is broken.
