This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

child_process.spawn/exec blocks main thread while spawning child process #9250

Closed
mgartner opened this issue Feb 18, 2015 · 6 comments

@mgartner

Offloading expensive computations to other processes and cores using spawn or exec blocks the main thread for significant portions of time. Obviously the spawned process is not blocking node's event loop while it is running, but the act of spawning the process within node seems to be expensive and blocks. When dealing with a high number of concurrent spawns, the major bottleneck is the node event loop blocking while trying to create child processes.

I noticed this by spawning hundreds of concurrent child processes and noting that only 1 CPU core was reaching 90+% utilization. The other cores running the child processes had less than 50% utilization.

Is there a more efficient way to spawn child processes than what is currently done in node? Or is there a way to spawn processes on a background node thread instead of blocking the main thread?
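
One rough way to observe the blocking described above is to time the `spawn()` calls themselves, since process creation happens synchronously inside each call. The sketch below is not from the original report: the helper name `measureSpawnCost` and the choice of child (the current `node` binary running an empty script) are assumptions made for illustration.

```javascript
const { spawn } = require('child_process');

// Spawn `count` short-lived children and add up the wall-clock time spent
// inside the synchronous spawn() calls, i.e. time the event loop was blocked.
function measureSpawnCost(count) {
  let blockedNs = 0n;
  const children = [];
  for (let i = 0; i < count; i++) {
    const start = process.hrtime.bigint();
    // Assumption: the child is the current node binary running an empty script.
    const child = spawn(process.execPath, ['-e', ''], { stdio: 'ignore' });
    blockedNs += process.hrtime.bigint() - start;
    // Log spawn failures (e.g. process limits) rather than crash on an
    // unhandled 'error' event.
    child.on('error', (err) => console.error('spawn failed:', err.message));
    children.push(child);
  }
  return { children, blockedMs: Number(blockedNs) / 1e6 };
}

const { blockedMs } = measureSpawnCost(20);
console.log(`20 spawn() calls blocked the event loop for ${blockedMs.toFixed(1)} ms`);
```

The absolute numbers depend heavily on the platform and on the size of the parent process, so this is only useful for comparing runs on the same machine.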

@mgartner changed the title from "child_process.spawn/exec blocks main thread" to "child_process.spawn/exec blocks main thread while spawning child process" on Feb 18, 2015
@davepacheco

If it works for your use-case, you could try: https://github.com/davepacheco/node-spawn-async

@cjihrig

cjihrig commented Feb 18, 2015

@davepacheco that's pretty cool. I'm going to close this and say that if you need to spawn a lot of processes, maybe you should use something like node-spawn-async. If I'm wrong, we can revisit this.

@cjihrig closed this as completed on Feb 18, 2015
@mgartner
Author

I'll have to try node-spawn-async. It looks like it just uses 1 additional thread, so I'm not sure that will help the big picture because that thread will just end up blocking. I think it would keep other requests that don't spawn a child process from being blocked on the main event loop, though. It hasn't been updated in 2 years, which also concerns me a bit, but it's worth a shot.

Even so, are there any significant reasons why the standard child_process module shouldn't or can't be improved to prevent blocking the main loop? It seems like spawning a child process is a common way to handle computationally intensive tasks and that it is significantly bottlenecked by the rate at which child processes can be spawned by one thread.

@davepacheco

From an API perspective, I think it'd be reasonable for Node to provide non-blocking APIs here. I don't know how challenging that is from an implementation perspective.

That said, forking and exec'ing are relatively heavyweight operations. I don't think it's a good idea to fork/exec at high rates as part of normal operation. (Besides the performance implications, it's often challenging to build robust argument passing, error handling, and error reporting for shell-like use.) spawn-async exists (and uses only a single worker process) in order to avoid latency bubbles for occasional forks (i.e., once/second or less), not to maximize throughput of forks/execs. As for its age: we've been using it in production at Joyent as part of the Manta service continuously since the module was created, and we do a few tens of thousands of spawns with it per day. It's not that it's abandoned -- it's just that it's basically done for what we wanted from it.
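
The general idea of handing spawns to a single helper process can be sketched as follows. This is not node-spawn-async's actual API or implementation: `createSpawnHelper`, `helperSource`, and the message shape are hypothetical names invented for illustration. The main process pays for one spawn at startup and afterwards only sends IPC messages, so the fork/exec cost lands on the helper's event loop instead.

```javascript
const { spawn } = require('child_process');

// Helper code, kept inline so the sketch is a single file (hypothetical).
const helperSource = `
  const { execFile } = require('child_process');
  process.on('message', ({ id, file, args }) => {
    // The blocking fork/exec happens here, in the helper process.
    execFile(file, args, (err, stdout, stderr) => {
      process.send({ id, error: err ? err.message : null, stdout, stderr });
    });
  });
`;

function createSpawnHelper() {
  const pending = new Map(); // request id -> { resolve, reject }
  let nextId = 0;

  // The only spawn performed by the main process, done once at startup.
  const helper = spawn(process.execPath, ['-e', helperSource], {
    stdio: ['ignore', 'inherit', 'inherit', 'ipc'],
  });

  helper.on('message', ({ id, error, stdout, stderr }) => {
    const { resolve, reject } = pending.get(id);
    pending.delete(id);
    if (error) reject(new Error(error));
    else resolve({ stdout, stderr });
  });

  return {
    // Ask the helper to run `file` with `args`; resolves with its output.
    exec(file, args = []) {
      return new Promise((resolve, reject) => {
        const id = nextId++;
        pending.set(id, { resolve, reject });
        helper.send({ id, file, args });
      });
    },
    close() {
      helper.kill();
    },
  };
}

const helper = createSpawnHelper();
helper.exec(process.execPath, ['--version'])
  .then(({ stdout }) => console.log('child printed:', stdout.trim()))
  .catch((err) => console.error(err.message))
  .finally(() => helper.close());
```

As noted in the thread, this does not raise spawn throughput, because the helper still creates processes one at a time. It only keeps that latency off the main event loop. The sketch also omits handling for a helper that crashes while requests are pending.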

@mgartner
Author

Thanks for the insight into spawn-async!

I agree that it's not ideal to fork very often, for example on every request. That creates a conflict, though. Node only provides access to a single thread in user-space, so to avoid blocking the main thread and to make use of multi-core machines for CPU-intensive tasks, you have only two main options: spawn a child process, or write a server (HTTP/Unix socket) in another language and offload the work there. Spawning a child process has been shown to be too expensive to do hundreds of times concurrently, so one of the two options is basically out of the question in a production environment.

Node is obviously great for applications that spend a lot of time waiting on I/O. However, I don't think "use something else" is a great long-term answer for CPU-intensive tasks, especially when JavaScript happens to be one of the faster scripting languages. I'd like to take advantage of that.

@davepacheco

Forking isn't off the table for increased parallelism. You're just much better off forking worker processes at startup and then not forking during request handling. This isn't really very different from multi-process models (like Apache prefork) or even multi-threaded models (like thread pools). In all of these cases, you're much better off amortizing the cost of creating the workers (whether they're processes or threads) across a large number of requests. The built-in cluster module, which admittedly has its flaws, provides a pattern for doing this. Outside of the cluster module, we use the pattern of forking (ncpus) worker processes for each logical Node service and then either fronting those processes with HAProxy or else having clients know about all of the workers and load-balancing on the clients.

You're right that these are important considerations, and they're not trivial. But I don't think Node is intrinsically any worse off here than anything else.
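
The prefork pattern described above might look roughly like this. It is a minimal sketch under stated assumptions, not the setup used at Joyent: `createPool`, `workerSource`, the round-robin dispatch, and the toy summing job are all invented for illustration, and the worker code is inlined only to keep the example in one file (a real service would more likely use `child_process.fork()` on a separate module, or the cluster module).

```javascript
const { spawn } = require('child_process');
const os = require('os');

// Worker code, kept inline so the sketch is a single file (hypothetical).
const workerSource = `
  process.on('message', ({ id, n }) => {
    // Stand-in for a CPU-intensive task: sum the integers 1..n.
    let sum = 0;
    for (let i = 1; i <= n; i++) sum += i;
    process.send({ id, result: sum });
  });
`;

function createPool(size = os.cpus().length || 1) {
  const pending = new Map(); // job id -> resolve callback
  let nextId = 0;
  let nextWorker = 0;

  // Pay the process-creation cost once, at startup.
  const workers = Array.from({ length: size }, () => {
    const child = spawn(process.execPath, ['-e', workerSource], {
      stdio: ['ignore', 'inherit', 'inherit', 'ipc'],
    });
    child.on('message', ({ id, result }) => {
      pending.get(id)(result);
      pending.delete(id);
    });
    return child;
  });

  return {
    // Round-robin dispatch over IPC; no process is created per job.
    run(n) {
      return new Promise((resolve) => {
        const id = nextId++;
        pending.set(id, resolve);
        workers[nextWorker].send({ id, n });
        nextWorker = (nextWorker + 1) % workers.length;
      });
    },
    close() {
      for (const w of workers) w.kill();
    },
  };
}

const pool = createPool();
Promise.all([pool.run(1e6), pool.run(2e6)])
  .then((sums) => console.log('sums:', sums))
  .finally(() => pool.close());
```

The cost of creating the workers is amortized across every job sent to the pool, which is the point made in the comment above. Error handling, worker restarts, and smarter load balancing are left out of the sketch.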

acdvorak added a commit to material-components/material-components-web that referenced this issue Nov 3, 2017
Responses to queued clients are currently blocked on the current child_process.spawn() invocation. See nodejs/node-v0.x-archive#9250