Disadvantages of using cluster api without any listening sockets #970

Closed
pimlie opened this issue Nov 14, 2017 · 4 comments

@pimlie

pimlie commented Nov 14, 2017

I was wondering whether there are any major disadvantages to using the Cluster API to parallelise CPU-bound tasks that don't use or require any listening sockets.

At the moment the Cluster API seems to be the only built-in API that can be used for general master/worker setups. E.g. it provides everything needed to create a master process that delegates jobs to workers, regardless of whether those workers use listening sockets. But the documentation only talks about using the cluster module for clustering connections.

To give a real-life example, I am currently using Cluster in the nuxt-generate-cluster package. This package adds a CLI command to nuxt that generates all dynamic pages using multiple processes. That works great, but using the Cluster API still feels wrong, because the CLI command just runs once and doesn't listen for incoming connections.
That said, I looked at implementing a master/worker solution myself using child_process directly, but I don't see any benefit in that; it would only give me extra work because of the bugs I might introduce.

So what is the general opinion about this? Should you only use the Cluster api when using listening sockets? Or is it ok to use it without any?

Follow-up questions:

  • If it is 👍
    Should the docs reflect that it is ok as well?
  • If it is 👎
    Would it be an idea to split the master/worker implementation from Cluster into a separate api which Cluster itself will extend as well?
@bnoordhuis
Member

Seems fine to me, no reason you couldn't or shouldn't use it that way.

As to whether it should be called out in the documentation, I'm leaning towards no. Documentation should be as simple and straightforward as possible. Docs that wander all over the place are terrible for getting things done.

Cluster's primary use case is networking because that's one of the biggest if not the biggest use case for node, so IMO it's proper that networking is what the documentation talks about.

Reasonable / unreasonable?

@pimlie
Author

pimlie commented Nov 14, 2017

Thanks for your quick response. I looked at the cluster implementation, but unfortunately I wasn't really sure how child.js and a worker relate to each other. I guess child.js is used somewhere deeper within node. But is it correct there is no (or no major) overhead due to the networking part as that is only introduced once you really start listening for a socket? AFAIK that overhead is only introduced once a worker has called listen() on the http/net API, which triggers the queryServer method?

I agree that documentation should be as simple as possible, but on the other hand you could say the docs are a bit ambiguous, which is also not good. E.g. all the examples in the docs cover only Cluster's primary use for networking, and even the name Cluster screams 'just for networking' more than not. I would think that if you can use Cluster without networking and without any overhead or disadvantages, a configuration option like schedulingPolicy should at least mention that it only applies when you actually use networking?

Maybe just a small note would be enough under How it works? Something like: "Note: although the primary use for the cluster module is networking, this module can also be used without additional/major(?) overhead for a generic networking-less master/worker implementation."

Just because what's trivial for one (probably you?) may be inconclusive for another (I guess me) 😉

@bnoordhuis
Member

> But is it correct there is no (or no major) overhead due to the networking part as that is only introduced once you really start listening for a socket?

Correct. It's at its core just a wrapper around child_process with some builtin smarts for sending sockets across processes.

> Maybe just a small note would be enough under How it works?

I can't promise it gets accepted but you're welcome to try a pull request.

@pimlie
Author

pimlie commented Nov 14, 2017

Great, thanks for the support!

@pimlie pimlie closed this as completed Nov 14, 2017
addaleax pushed a commit to nodejs/node that referenced this issue Nov 18, 2017
Although the primary use-case for the cluster module is networking, the
module provides a generic master/worker interface that could also be
used if you dont use networking at all. Currently the docs are a bit
ambiguous about this as only the primary use-case is ever mentioned,
this remark should clarify that the cluster module can also be used
without disadvantages if you dont use networking.

PR-URL: #17031
Refs: nodejs/help#970
Reviewed-By: Gireesh Punathil
Reviewed-By: Anna Henningsen
MylesBorins pushed a commit to nodejs/node that referenced this issue Dec 12, 2017
gibfahn pushed a commit to nodejs/node that referenced this issue Dec 19, 2017
MylesBorins pushed a commit to nodejs/node that referenced this issue Dec 19, 2017
gibfahn pushed a commit to nodejs/node that referenced this issue Dec 20, 2017