Put process-creation into a thread #1109
Comments
It's easy to find benchmarks of how long it takes a Windows process to start up (i.e., from @eryksun Sorry to bother, but you seem to know all kinds of mysterious things about Windows internals. Do you happen to know at what point during process startup |
Interesting observation.
On Linux,
So that seems to make clear that the parent doesn't wait for the child at all..... though it's somewhat contradicted by these comments in the Python source claiming that Linux's macOS's
...which strongly suggests that the parent process blocks to do significant disk I/O before (Of course it's always a little unclear how much we should care about disk I/O... does anyone run macOS executables off of spinning-platters anymore?) It's also possible to force |
OK yeah looking at the current glibc sources: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/spawni.c;h=c1abf3f9608642fc530baa514b1703a9e7a1a0f2;hb=HEAD ...the linux man page is a complete pack of lies. Oh also, if subprocess.py decides to use So I guess we can consider it established that on Unixes, On Windows, it might do file-I/O, and might also do other slow stuff, we're not sure. |
@asvetlov Yeah... Twisted treats file-I/O as non-blocking, and I'm not sure if Trio should do the same or not, so I'm not sure if this carries over or not. It would be very easy to push subprocess spawns into a thread. It would add some overhead. On my Linux laptop, I get:
On a random Mac mini I happen to have access to, I get:
I'm not entirely sure I trust these numbers aren't paying some extra overhead as the GC calls |
OK yeah Linux laptop:
Mac mini (have to drop the number of loops to avoid a resource exhaustion error....):
So the qualitative conclusions don't change.
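For anyone who wants to repeat this kind of comparison, here is a rough sketch of how the overhead of pushing spawns into a thread could be measured. It is not the script used for the numbers in this thread: the child command (`sys.executable -c pass`), the loop count, and the helper names `spawn`, `time_spawns`, and `main` are all assumptions made for illustration. It times only how long the `subprocess.Popen` call takes to return, once called directly and once submitted to a single worker thread.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor

# Assumption: a trivial child that exits right away, so the timing is
# dominated by spawn cost rather than by the child's own work.
CMD = [sys.executable, "-c", "pass"]


def spawn():
    """Start one child process and return the Popen handle."""
    return subprocess.Popen(CMD)


def time_spawns(make_process, loops):
    """Return the average seconds per spawn call.

    Children are reaped after the timed region, so only the time spent
    waiting for the spawn call to return is measured.
    """
    procs = []
    start = time.perf_counter()
    for _ in range(loops):
        procs.append(make_process())
    elapsed = time.perf_counter() - start
    for proc in procs:
        proc.wait()
    return elapsed / loops


def main(loops=20):
    # Baseline: call Popen directly in the current thread.
    direct = time_spawns(spawn, loops)
    # Same call, but handed to a worker thread and waited on.
    with ThreadPoolExecutor(max_workers=1) as pool:
        threaded = time_spawns(lambda: pool.submit(spawn).result(), loops)
    print(f"direct:   {direct * 1e3:.3f} ms per spawn")
    print(f"threaded: {threaded * 1e3:.3f} ms per spawn")
    return direct, threaded


if __name__ == "__main__":
    main()
```

Keeping the loop count small avoids having too many live children at once, which is the kind of resource exhaustion mentioned for the Mac mini run.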
Some war stories about bugs in Windows process spawning here:
While these are bugs, after fixing the bugs the best-case |
Yeah, in asyncio we already start subprocesses by |
The issue with a 700 ms delay that's associated with querying extended attributes is likely due to the kernel security function The time related to the initial setup in the process itself is unrelated. The initial thread is suspended. When A special asynchronous procedure call (APC) for |
In #1113 we ended up not putting process creation into a thread, but instead just setting up the public API changes necessary to do that later. So I guess I'll re-open this as the "actually put it in a thread" issue.
So the remaining todo item here is: once |
[Original title: Is process creation actually non-blocking?]
In #1104 I wrote:
But uh... it just occurred to me that I'm actually not sure if this is true! I mean, right now we just use `subprocess.Popen`, which is indeed a synchronous interface. And on Unix, spawning a new process and getting a handle on it is generally super cheap – it's just `fork`. The `exec` is expensive, but that happens after the child has split off – the parent doesn't wait for it.
But on Windows, you call `CreateProcess`, which I think might block the caller while doing all the disk access to set up the new process? Process creation on Windows is notoriously slow, and I don't know how much of that the parent process has to sit and wait for before `CreateProcess` can return.
And even on Unix, you can use `vfork`, in which case the parent process is blocked until the `exec`. And on recent Pythons, `subprocess` uses `posix_spawn`. On Linux this might use `vfork` (I'm not actually sure?). And on macOS it uses a native `posix_spawn` syscall, so who knows what that does. Again, this might not be a big deal... maybe the parent gets to go again the instant the child calls `exec`, or sooner, without having to wait for any disk access or anything. But I'm not sure!
So... we should figure this out. Because if process creation is slow enough that we need to treat it as a blocking operation, we might need to change the process API to give it an async constructor. (Presumably by making `Process.__init__` private, and adding `await trio.open_process(...)` – similar to how we handle files.)
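To make the "async constructor" idea concrete, here is a minimal sketch of the shape such an API could take. It is not Trio's implementation: the `Process` class below is a hypothetical stand-in, and the sketch uses the standard library's `asyncio.to_thread` (Python 3.9+) so that it runs without extra dependencies. In Trio the analogous call would be `trio.to_thread.run_sync`.

```python
import asyncio
import subprocess
import sys


class Process:
    """Hypothetical wrapper for illustration; not Trio's Process class."""

    def __init__(self, popen):
        # Treated as private: callers are expected to use open_process().
        self._popen = popen

    @property
    def pid(self):
        return self._popen.pid

    async def wait(self):
        # Popen.wait() blocks, so it is also pushed to a worker thread.
        return await asyncio.to_thread(self._popen.wait)


async def open_process(command, **kwargs):
    """Async constructor: spawn the child without blocking the event loop."""
    # Popen may block in fork/vfork/posix_spawn/CreateProcess, so the
    # call itself runs in a worker thread.
    popen = await asyncio.to_thread(subprocess.Popen, command, **kwargs)
    return Process(popen)


async def main():
    proc = await open_process([sys.executable, "-c", "pass"])
    return await proc.wait()


if __name__ == "__main__":
    print("exit code:", asyncio.run(main()))
```

The point of the split is that `await open_process(...)` gives the event loop a chance to keep running while the spawn is in progress, which a synchronous `__init__` cannot do.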