-
Notifications
You must be signed in to change notification settings - Fork 29.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cluster: ignore queryServer msgs on disconnection #4465
Conversation
/cc @Trott |
The altered test doesn't fail with current master. Is there an easy way to devise a test that will fail reliably without this change? |
This is not necessarily a bad thing, but this change greatly increases the frequency of Stress test against master (should fail here and there): https://ci.nodejs.org/job/node-stress-single-test/257/nodes=win2012r2/console Stress test with this change (fails a lot more): https://ci.nodejs.org/job/node-stress-single-test/260/nodes=win2012r2/console (By the way, that PR could use a |
@Trott I've updated the test so it always fails in my |
Finally I have not modified |
})); | ||
|
||
const cpus = os.cpus().length; | ||
const tries = cpus * 16; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Maybe move these declarations up above the worker1.on('message', ...)
because someone reading the test will probably expect tries
to be defined before it is used on line 22?
LGTM. One minor nit that you can ignore if you want. CI: https://ci.nodejs.org/job/node-test-commit/1587/ Nice work! |
The test is timing out on SmartOS (which is based on Solaris): @jbergstroem Might this have a cause that lies elsewhere? I feel like there's been an uptick in SmartOS issues lately, but I might be imagining it... |
@Trott the "only" changes made recently that would affect all tests are yesterdays land of where sockets are written and today's change of where we write the temporary directory ($HOME/node-tmp). |
So the test is probably legitimately timing out (probably hanging) on SmartOS and only SmartOS. That's kind of a bummer. But that's why we test... |
LGTM |
This still needs a tweak so the test doesn't time out on SmartOS on CI. (Just putting this note here so no one merges this without realizing that or something.) |
It avoids the creation of unnecessary handles. This issue is causing intermitent failures in `test-cluster-disconnect-race` on `FreeBSD` and `OS X`: ``` assert.js:89 throw new assert.AssertionError({ ^ AssertionError: Resource leak detected. at removeWorker (cluster.js:321:7) at ChildProcess.<anonymous> (cluster.js:356:9) at ChildProcess.g (events.js:264:16) at emitTwo (events.js:88:13) at ChildProcess.emit (events.js:173:7) at Process.ChildProcess._handle.onexit (internal/child_process.js:200:12) ``` The problem is that the `worker2.disconnect` is being called on the master before the `queryServer` is handled, causing the worker to be deleted, then the Server handle is created afterwards. Later on, when `removeWorker` is called from the `exit` handler, there are no workers left, but one handle, thus the `AssertionError`. Add a new `test/sequential/test-cluster-disconnect-leak` based on `test-cluster-disconnect-race` that creates lots of workers and fails consistently without this patch.
@Trott I've updated the PR to limit the number of workers. Let's see if now it passes in all platforms. Thanks |
CI is green! |
✅ LGTM |
It avoids the creation of unnecessary handles. This issue is causing intermitent failures in `test-cluster-disconnect-race` on `FreeBSD` and `OS X`. The problem is that the `worker2.disconnect` is being called on the master before the `queryServer` is handled, causing the worker to be deleted, then the Server handle is created afterwards. Later on, when `removeWorker` is called from the `exit` handler, there are no workers left, but one handle, thus the `AssertionError`. Add a new `test/sequential/test-cluster-disconnect-leak` based on `test-cluster-disconnect-race` that creates lots of workers and fails consistently without this patch. PR-URL: nodejs#4465 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Johan Bergström <[email protected]> Reviewed-By: Rich Trott <[email protected]>
Landed in f9f1dd9 |
It avoids the creation of unnecessary handles. This issue is causing intermitent failures in `test-cluster-disconnect-race` on `FreeBSD` and `OS X`. The problem is that the `worker2.disconnect` is being called on the master before the `queryServer` is handled, causing the worker to be deleted, then the Server handle is created afterwards. Later on, when `removeWorker` is called from the `exit` handler, there are no workers left, but one handle, thus the `AssertionError`. Add a new `test/sequential/test-cluster-disconnect-leak` based on `test-cluster-disconnect-race` that creates lots of workers and fails consistently without this patch. PR-URL: #4465 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Johan Bergström <[email protected]> Reviewed-By: Rich Trott <[email protected]>
It avoids the creation of unnecessary handles. This issue is causing intermitent failures in `test-cluster-disconnect-race` on `FreeBSD` and `OS X`. The problem is that the `worker2.disconnect` is being called on the master before the `queryServer` is handled, causing the worker to be deleted, then the Server handle is created afterwards. Later on, when `removeWorker` is called from the `exit` handler, there are no workers left, but one handle, thus the `AssertionError`. Add a new `test/sequential/test-cluster-disconnect-leak` based on `test-cluster-disconnect-race` that creates lots of workers and fails consistently without this patch. PR-URL: #4465 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Johan Bergström <[email protected]> Reviewed-By: Rich Trott <[email protected]>
It avoids the creation of unnecessary handles. This issue is causing intermitent failures in `test-cluster-disconnect-race` on `FreeBSD` and `OS X`. The problem is that the `worker2.disconnect` is being called on the master before the `queryServer` is handled, causing the worker to be deleted, then the Server handle is created afterwards. Later on, when `removeWorker` is called from the `exit` handler, there are no workers left, but one handle, thus the `AssertionError`. Add a new `test/sequential/test-cluster-disconnect-leak` based on `test-cluster-disconnect-race` that creates lots of workers and fails consistently without this patch. PR-URL: #4465 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Johan Bergström <[email protected]> Reviewed-By: Rich Trott <[email protected]>
It avoids the creation of unnecessary handles. This issue is causing intermitent failures in `test-cluster-disconnect-race` on `FreeBSD` and `OS X`. The problem is that the `worker2.disconnect` is being called on the master before the `queryServer` is handled, causing the worker to be deleted, then the Server handle is created afterwards. Later on, when `removeWorker` is called from the `exit` handler, there are no workers left, but one handle, thus the `AssertionError`. Add a new `test/sequential/test-cluster-disconnect-leak` based on `test-cluster-disconnect-race` that creates lots of workers and fails consistently without this patch. PR-URL: #4465 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Johan Bergström <[email protected]> Reviewed-By: Rich Trott <[email protected]>
It avoids the creation of unnecessary handles. This issue is causing intermitent failures in `test-cluster-disconnect-race` on `FreeBSD` and `OS X`. The problem is that the `worker2.disconnect` is being called on the master before the `queryServer` is handled, causing the worker to be deleted, then the Server handle is created afterwards. Later on, when `removeWorker` is called from the `exit` handler, there are no workers left, but one handle, thus the `AssertionError`. Add a new `test/sequential/test-cluster-disconnect-leak` based on `test-cluster-disconnect-race` that creates lots of workers and fails consistently without this patch. PR-URL: nodejs#4465 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Johan Bergström <[email protected]> Reviewed-By: Rich Trott <[email protected]>
It avoids the creation of unnecessary handles. This issue is causing intermitent failures in `test-cluster-disconnect-race` on `FreeBSD` and `OS X`. The problem is that the `worker2.disconnect` is being called on the master before the `queryServer` is handled, causing the worker to be deleted, then the Server handle is created afterwards. Later on, when `removeWorker` is called from the `exit` handler, there are no workers left, but one handle, thus the `AssertionError`. Add a new `test/sequential/test-cluster-disconnect-leak` based on `test-cluster-disconnect-race` that creates lots of workers and fails consistently without this patch. PR-URL: nodejs#4465 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Johan Bergström <[email protected]> Reviewed-By: Rich Trott <[email protected]>
It avoids the creation of unnecessary handles. This issue is causing intermitent failures in `test-cluster-disconnect-race` on `FreeBSD` and `OS X`. The problem is that the `worker2.disconnect` is being called on the master before the `queryServer` is handled, causing the worker to be deleted, then the Server handle is created afterwards. Later on, when `removeWorker` is called from the `exit` handler, there are no workers left, but one handle, thus the `AssertionError`. Add a new `test/sequential/test-cluster-disconnect-leak` based on `test-cluster-disconnect-race` that creates lots of workers and fails consistently without this patch. PR-URL: nodejs#4465 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Johan Bergström <[email protected]> Reviewed-By: Rich Trott <[email protected]>
It avoids the creation of unnecessary handles. This issue is causing
intermitent failures in
test-cluster-disconnect-race
onFreeBSD
and
OS X
:The problem is that the
worker2.disconnect
is being called on themaster before the
queryServer
is handled, causing the worker tobe deleted, then the Server handle is created afterwards. Later on,
when
removeWorker
is called from theexit
handler, there are noworkers left, but one handle, thus the
AssertionError
.Modify
test-cluster-disconnect-race
to check there are no leaks onexit.