Race condition when multiple processes try to compile a module at once #67
Comments
Thanks for the interesting issue! Overall, I really like this and would like to implement something like it, since it seems clearly useful. The only thing that makes me hesitate is that this seems like a problem inherent to build systems: if you run the same build with the same build system concurrently, you are bound to run into trouble. So, why should cppimport fix a problem that essentially comes from the underlying tools? My response to this question would be: cppimport is attempting to make C++ files behave a little bit more like Python modules, while accepting that the correspondence is very leaky. So, any step to make that correspondence a little less leaky is a good thing. Would you be interested in putting together a pull request for this? I'm generally very light touch on code reviews since I want to encourage contributions. So, it won't be an arduous process! =)
Hi @joshlk, just wanted to check in if this is something you'd be interested in making a PR for! Are you still using your file lock solution? Have any issues with that approach cropped up over the last couple of months?
Yes it is. I'm just trying to get approval from my work, but it should be sorted soon.
Awesome!
Sorry it's taken so long - I had to jump through a bunch of internal hoops. Here is a PR: #71
Hi,
Great package by the way!
I've encountered an issue when multiple processes are spawned that all race to compile the same module. This can also occur when multiple processes are spawned on different hosts and share the same network filesystem. Such a situation is common when distributing work between multiple processes or hosts for AI or data analytics.
Here is a demonstration (in the shell):
On my system, around 4 out of 100 processes exit with an error. The shell output includes:
These errors don't appear when the binary already exists.
To mitigate this issue in our applications, we have used a file lock so that only one process attempts to compile the module at a time. A process first checks whether the binary file exists; if it does not, the process attempts to obtain the file lock. If it can't obtain the lock, it waits until the binary exists, it can obtain the lock, or it times out. Here is an example of how it can be done (app code):
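The scheme described above might look roughly like the following minimal sketch. It is not the poster's original code and does not use any cppimport API: `build_once`, the `build` callback, the `.lock` suffix, `timeout`, and `poll_interval` are all hypothetical names chosen for illustration. It assumes that creating a file with `O_CREAT | O_EXCL` is atomic on the filesystem in use, which may not hold on every network filesystem.

```python
import os
import time
from contextlib import suppress


def build_once(binary_path, build, timeout=60.0, poll_interval=0.05):
    """Call build() only if binary_path is missing, in one process at a time.

    Hypothetical helper: `build` is any callable that produces binary_path.
    """
    lock_path = binary_path + ".lock"  # assumed naming convention
    deadline = time.monotonic() + timeout

    # Step 1: nothing to do if the binary already exists.
    while not os.path.exists(binary_path):
        try:
            # Step 2: try to take the lock. O_EXCL makes creation fail
            # if another process already created the lock file.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            # Step 3: someone else holds the lock. Wait, then re-check
            # both the binary and the lock, giving up after the timeout.
            if time.monotonic() > deadline:
                raise TimeoutError(f"timed out waiting for {lock_path}")
            time.sleep(poll_interval)
            continue

        try:
            os.close(fd)
            # Another process may have finished the build between our
            # existence check and our acquiring the lock.
            if not os.path.exists(binary_path):
                build()
        finally:
            # Always release the lock, even if build() raised.
            with suppress(FileNotFoundError):
                os.remove(lock_path)
        return
```

A caller would wrap whatever triggers compilation, for example `build_once("/path/to/module.so", compile_fn)`, where both arguments are placeholders. Two caveats apply to this sketch: a process that crashes while holding the lock leaves a stale lock file that makes the others time out, and waiting processes could observe a partially written binary unless `build` writes to a temporary name and renames it into place.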
It would be great if we could upstream the above to cppimport to prevent the race condition errors. If you are happy with this solution, I could contribute the above to the appropriate place in cppimport.