This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Cluster 2.0 #2038

Closed
wants to merge 155 commits into from


AndreasMadsen
Member

In 0.6.x the cluster module was just a small extension of the node_process.fork() function. The purpose of this pull request is to make the cluster module easier to set up, without removing the basic functionality. It also adds a lot of new functionality that makes it possible for userland plugins to interact with the cluster module.

Documentation:
The documentation can be found here.

This request fixes the following issues:

This request contains the following changes:

  • Added .eachWorker(fn) method.
  • Allow workers to be another file, using .setupMaster()
  • Allow passing arguments to workers, using .fork([env])
  • Internal messages won't be sent to the message event
  • Allow workers to commit suicide, using worker.destroy()
  • Moved process from the child_process.fork to cluster.fork().process.
  • Make the API in master and workers equal using an internal new Worker() object.
  • Easy access to worker details, inside the worker.
  • Added .disconnect() method to make a graceful shutdown
  • Added events: fork, listening, disconnect
  • Made cluster silent
  • You can use SIGTERM, SIGINT and SIGQUIT on a worker
  • Add cluster.workerOnline property to get how many workers there are online
  • Support SIGTERM, SIGINT, SIGQUIT and SIGCHLD on master.
  • Create a cluster.disconnect() method.
  • Create a cluster.destroy() method.
  • cluster.isWorker and cluster.isMaster are now protected
  • Kill all workers the moment the master exits
  • Allow echo callback from both worker and master
  • Allow echo callback to be used by userland
  • Added a workerID, a number that will be reused when the worker spawns/respawns.
  • Added a uniqueID, a unique number that will change for each spawn/respawn.
  • Throw error when conflict between .autoFork() and manual .fork().
  • Prevent and detect respawn infinite loops, caused by errors in workers.
  • Emit a criticalError event when a respawn infinite loop is detected.
  • Added startup property to worker object.
  • Added a silent option to setupMaster to prevent output from workers from being shown in the master's output.
  • Add a zero-downtime restart method.
  • When getting a SIGHUP signal the cluster will restart gracefully.
  • Added settings object to the cluster
  • Added setup to cluster, this will emit when setupMaster execute.
  • You now kill a worker using destroy, not kill
  • The worker event kill is changed to death
  • Updated and improved documentation
  • Added testcases (a lot of testcases)
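
The respawn-loop detection mentioned in the list above is not spelled out in this conversation, so the following is only a sketch of one way it could work. It assumes a sliding-window rule: respawning stops once more than `maxDeaths` workers have died within `windowMs`. The names `RespawnGuard`, `recordDeath`, `maxDeaths` and `windowMs` are hypothetical and do not come from the pull request.

```javascript
// Hypothetical helper: not part of the cluster module or of this pull request.
class RespawnGuard {
  constructor({ maxDeaths = 5, windowMs = 1000 } = {}) {
    this.maxDeaths = maxDeaths;
    this.windowMs = windowMs;
    this.deaths = []; // timestamps (ms) of recent worker deaths, oldest first
  }

  // Record one worker death. Returns true when respawning should stop,
  // i.e. when more than maxDeaths deaths fall inside the sliding window.
  recordDeath(now = Date.now()) {
    this.deaths.push(now);
    // Forget deaths that are older than the window.
    while (this.deaths.length > 0 && now - this.deaths[0] > this.windowMs) {
      this.deaths.shift();
    }
    return this.deaths.length > this.maxDeaths;
  }
}
```

A master could call `recordDeath()` from its death handler and, when it returns true, emit the critical error event instead of forking again.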

These changes were made in other modules:

  • To support the silent option, child_process.fork() has been updated to take silent as an option in the options argument. This is a very small change.

These changes will be included in the near future:

…ster, added options argument; HandleWorkerMessage: emit events, echo allwas queryID if required, added listening command; fork: give each worker a protected workerID; queryMaster: made public, renamed to worker.send; _getServer: send a listening command to master.
@AndreasMadsen
Member Author

I'm in the process of writing the documentation.

@AndreasMadsen
Member Author

I finished documenting the changes I have made.
However, in the documentation I have:

  • renamed cluster.worker.send to cluster.worker.respond
  • added a message event cluster.worker.on('message')
  • changed the behavior of cluster.fork().on('message')

These changes have not yet been made in the cluster.js native module.

@AndreasMadsen
Member Author

I have added @visionmedia's commit 58b558a to this pull request.

@AndreasMadsen
Member Author

I am considering making the Worker class useful to both master and worker. That way the API will be almost the same for the master and the worker.

In master
In the master, a Worker object could be obtained via cluster.workers[id].

  • Its message event will be emitted when it receives non-internal data from the worker.
  • Its kill method would set a suicide state, and kill the process
  • Its send method would send a message, and call a callback when an echo is received from the worker.

This is already done.

In worker
In the worker, the Worker object could be obtained via cluster.worker.

  • Its message event will be emitted when it receives non-internal data from the master.
  • Its kill method would send a suicide state to the master, and kill itself when the callback is called
  • Its send (currently respond) method would send a message to the master, and call a callback when an echo is received from the master.
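
The send-with-echo behaviour described for both sides can be sketched as follows. This is not the code from the pull request: it assumes each outgoing message carries a queryID (a name that does appear in one of the commit messages), that the receiver echoes that ID back, and that echo packets are internal and never reach the message event. `EchoEndpoint` and `link` are hypothetical names, and two in-memory objects stand in for the real IPC channel.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical endpoint: stands in for one side of the master/worker channel.
class EchoEndpoint extends EventEmitter {
  constructor() {
    super();
    this.peer = null;         // the other endpoint, set by link()
    this.nextQueryID = 0;
    this.pending = new Map(); // queryID -> callback waiting for its echo
  }

  // Send user data; `callback` runs once the peer has echoed the queryID.
  send(data, callback) {
    const queryID = this.nextQueryID++;
    if (callback) this.pending.set(queryID, callback);
    this.peer.receive({ queryID, data });
  }

  receive(packet) {
    if (packet.echo) {
      // Internal echo packet: run the stored callback, emit no 'message'.
      const callback = this.pending.get(packet.queryID);
      this.pending.delete(packet.queryID);
      if (callback) callback();
      return;
    }
    // User data: deliver it, then echo the queryID back to the sender.
    this.emit('message', packet.data);
    this.peer.receive({ queryID: packet.queryID, echo: true });
  }
}

// Connect two endpoints to each other.
function link(a, b) {
  a.peer = b;
  b.peer = a;
}
```

Because both sides are instances of the same class, the sketch also shows why sharing one Worker class between master and worker keeps the two APIs nearly identical.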

@AndreasMadsen
Member Author

I noticed that my commits don't pass make jslint. Should I fix that now, or do it in another pull request once this is pulled?

@AndreasMadsen
Member Author

I have updated the pull request so both the worker and master are using the Worker class.
This had the side effect that the code got a lot simpler.
At 5c1d481 I was worried that the code would become gibberish, since I had to make individual changes for both worker and master, but I do not think this is a problem any more.

I do not have any more plans for changes, but I would like to discuss:

  • Why do new workers need to have a new id, when they can just use the one from the dead worker?
  • Preventing evil error wheels.
  • The finalized event (when all workers are listening).
  • Additional testcases, there was not a single one before my commit
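
The id question in the first discussion point can be made concrete with a small allocator. This is a sketch under assumptions, not the pull request's implementation: it hands a dead worker's workerID to the next spawn, while uniqueID only ever increases. The class name `WorkerIds` and its methods are hypothetical.

```javascript
// Hypothetical allocator: illustrates the two kinds of ID, not the PR's code.
class WorkerIds {
  constructor() {
    this.free = [];        // workerIDs released by dead workers
    this.nextWorkerID = 1;
    this.nextUniqueID = 1;
  }

  // Called on spawn/respawn: reuse a freed workerID when one is available,
  // but always hand out a fresh uniqueID.
  allocate() {
    const workerID =
      this.free.length > 0 ? this.free.shift() : this.nextWorkerID++;
    return { workerID, uniqueID: this.nextUniqueID++ };
  }

  // Called when a worker dies so that its workerID can be reused.
  release(workerID) {
    this.free.push(workerID);
    this.free.sort((a, b) => a - b); // hand out the lowest free ID first
  }
}
```

With this scheme a respawned worker keeps a stable slot number for logging or routing, while the uniqueID still distinguishes one process lifetime from the next.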

Update
I will try to fix the issues that are reported

@AndreasMadsen
Member Author

As proposed in issue #2088, I have added a .disconnect() method in commit 25f5793. It is extremely difficult to test whether the worker exits when/if all connections stop, since there is no IPC.
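
Since a disconnected worker may never exit on its own, one common safeguard is to pair the graceful disconnect with a timeout. The sketch below is an assumption-laden illustration, not code from commit 25f5793: it expects `worker` to behave like a Worker in current Node.js releases (an EventEmitter with `disconnect()`, `kill()` and an `exit` event), whereas this pull request names the equivalents destroy and death. `disconnectWithTimeout` and `onDone` are hypothetical names.

```javascript
// Hypothetical helper. `worker` is assumed to be an EventEmitter with
// disconnect() and kill() methods that emits 'exit' when the process ends.
function disconnectWithTimeout(worker, timeoutMs, onDone) {
  let finished = false;

  // Fallback: if the worker has not exited in time, kill it.
  const timer = setTimeout(() => {
    if (finished) return;
    finished = true;
    worker.kill();
    onDone('killed');
  }, timeoutMs);

  // Graceful path: the worker exited by itself after disconnecting.
  worker.once('exit', () => {
    if (finished) return;
    finished = true;
    clearTimeout(timer);
    onDone('exited');
  });

  worker.disconnect();
}
```

Taking the worker as a parameter also helps with the testing difficulty noted above, because the helper can be exercised with a stub object instead of a real child process.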

@tomyan

tomyan commented Jan 2, 2012

Really liking the look of these changes :-) Wondering if it will be possible to set the exec option per worker that's spawned. I have a server that behaves differently depending on command line options. I'd like to be able to exercise these different behaviours from a single set of tests. I could use child_process.fork to run the server multiple times, but maybe it would be more convenient to be able to run workers that do different jobs from the same master?

Thanks

Tom

@AndreasMadsen
Member Author

@tomyan sure this is possible :)
However, I'm not sure what the purpose is. Could you write a simple example or an API change suggestion?

Please note: this module is not made for easy testing. And having different workers running does not make much sense, since there is no way to know how the OS will balance connections, so clients will be treated differently :/

Can you not use the env option in .fork(env), or use child_process.fork to spawn not a worker but a master/cluster?

I'm open to suggestions, but I do not see the purpose of this yet.
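
The env suggestion above could look roughly like this. It is a sketch, not an agreed design: the `WORKER_ROLE` variable, the role names and the helper functions are all made up for illustration, and the only cluster call used is `cluster.fork(env)`.

```javascript
// Hypothetical sketch: WORKER_ROLE and the role names are invented here.
const ROLES = new Map([
  ['api', () => 'serving API requests'],
  ['static', () => 'serving static files'],
]);

// Master side: fork one worker per role, passing the role through env.
// `cluster` is a parameter so the function can be tried without forking.
function forkRoles(cluster, roles) {
  return roles.map((role) => cluster.fork({ WORKER_ROLE: role }));
}

// Worker side: choose the behaviour from the environment the master set.
function runRole(env) {
  const role = ROLES.get(env.WORKER_ROLE);
  if (!role) throw new Error('unknown role: ' + env.WORKER_ROLE);
  return role();
}
```

In a real program the master would call `forkRoles(require('cluster'), ['api', 'static'])` and each worker would call `runRole(process.env)`. The caveat about OS balancing still applies if workers with different roles listen on the same port.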

@AndreasMadsen
Member Author

There were some errors that occurred when a worker hit a critical error while autoFork was on. When the error was detected, it tried to disconnect all workers (good) but did so multiple times on the same worker, resulting in some seemingly random errors. The critical error handling is tricky, but I think the latest patch does a good job.
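
The patch itself is not shown in this conversation. One simple way to avoid disconnecting the same worker more than once is to remember which workers have already been handled, as in the sketch below. `createDisconnectAll` is a hypothetical name, and the only thing assumed about each worker is that it has a `disconnect()` method.

```javascript
// Hypothetical guard: remembers which workers were already asked to
// disconnect, so that repeated calls never hit the same worker twice.
function createDisconnectAll() {
  const disconnecting = new WeakSet();

  return function disconnectAll(workers) {
    let count = 0;
    for (const worker of workers) {
      if (disconnecting.has(worker)) continue; // already handled, skip it
      disconnecting.add(worker);
      worker.disconnect();
      count++;
    }
    return count; // number of workers newly asked to disconnect
  };
}
```

A WeakSet is used so that the guard does not keep dead Worker objects alive after the rest of the program has dropped them.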

@AndreasMadsen
Member Author

That was a difficult merge!

@isaacs

isaacs commented Mar 19, 2012

Closing this. Everything has been split up into separate reqs.

@isaacs isaacs closed this Mar 19, 2012
@AndreasMadsen
Member Author

@isaacs true, but could you also close #2060?

5 participants