Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate flaky test-child-process-fork-net #5122

Closed
Trott opened this issue Feb 6, 2016 · 11 comments
Closed

Investigate flaky test-child-process-fork-net #5122

Trott opened this issue Feb 6, 2016 · 11 comments
Labels
arm Issues and PRs related to the ARM platform. child_process Issues and PRs related to the child_process subsystem. test Issues and PRs related to the tests.

Comments

@Trott
Copy link
Member

Trott commented Feb 6, 2016

Example failure:

@Trott Trott added test Issues and PRs related to the tests. arm Issues and PRs related to the ARM platform. labels Feb 6, 2016
@mscdex mscdex added the child_process Issues and PRs related to the child_process subsystem. label Feb 6, 2016
@Trott
Copy link
Member Author

Trott commented Feb 28, 2016

@mscdex
Copy link
Contributor

mscdex commented Apr 9, 2016

@Trott
Copy link
Member Author

Trott commented Apr 9, 2016

Seems like the capacity for the Raspberry Pi 2 devices to connect to localhost has diminished a lot. That doesn't make any sense, but there's also no arguing with the fact that we've probably fixed a half a dozen tests or more by simply reducing their connections to localhost from modest (in the 10-200 range) to tiny (less than 20). Not sure what's up there. Looping in @nodejs/build for ideas... In the meantime, I'll experiment with reducing the number of clients/connections in this test to see if it fixes things...

@Trott
Copy link
Member Author

Trott commented Apr 9, 2016

Stress test against current master for a benchmark on failure frequency: https://ci.nodejs.org/job/node-stress-single-test/588/nodes=pi2-raspbian-wheezy/console

@Trott
Copy link
Member Author

Trott commented Apr 9, 2016

And here's one for reducing the connections from 10 to 8: https://ci.nodejs.org/job/node-stress-single-test/589/console

@Trott
Copy link
Member Author

Trott commented Apr 10, 2016

Argh! So, the test does not (perhaps can not?) clean up after itself if it fails, so once it fails once, a bunch of other things fail with EADDRINUSE. This includes this test itself if it is run again as in a stress test, so we can't even really get much of a stress test other than "Yeah, it fails once in a while." But we can't really gauge relative flakiness of different scenarios.

For what it's worth, the test has not been significantly changed since it was created in 2012.

@Trott
Copy link
Member Author

Trott commented Apr 10, 2016

Reducing to four connections and trying again with the stress test. If that doesn't fix it, then something must be really wrong... https://ci.nodejs.org/job/node-stress-single-test/590/nodes=pi2-raspbian-wheezy/console

@mscdex
Copy link
Contributor

mscdex commented Apr 10, 2016

Argh! So, the test does not (perhaps can not?) clean up after itself if it fails, so once it fails once, a bunch of other things fail with EADDRINUSE.

Yeah, I've fixed some tests in the past that had this particular cascading effect, but I'm sure there's probably a lot still out there that do not clean up properly on unexpected failure.

@Trott
Copy link
Member Author

Trott commented Apr 10, 2016

Reducing from 10 to 4 worked. Might just go with that. /cc @AndreasMadsen in case there's any additional insight to why 10 in the first place. (I've been assuming the answer is "why not 10?", but I don't actually know.)

Might also take a moment to see if it's feasible to more definitively close the server on failure...

Trott added a commit to Trott/io.js that referenced this issue Apr 10, 2016
Reduce client connections from 10 to 4 in a test that is causing issues
on Raspberry Pi 2 devices in CI.

Fixes: nodejs#5122
@Trott
Copy link
Member Author

Trott commented Apr 10, 2016

Proposed fix, or mitigation at least: #6138

@AndreasMadsen
Copy link
Member

@Trott Hmm, it is such a long time ago that I don't remember the specific reason. But it doesn't look like it is a stress test, so I think reducing it to 4 connections is reasonable.

@Trott Trott closed this as completed in aba035f Apr 12, 2016
MylesBorins pushed a commit that referenced this issue Apr 19, 2016
Reduce client connections from 10 to 4 in a test that is causing issues
on Raspberry Pi 2 devices in CI.

Fixes: #5122
PR-URL: #6138
Reviewed-By: Benjamin Gruenbaum <[email protected]>
MylesBorins pushed a commit that referenced this issue Apr 20, 2016
Reduce client connections from 10 to 4 in a test that is causing issues
on Raspberry Pi 2 devices in CI.

Fixes: #5122
PR-URL: #6138
Reviewed-By: Benjamin Gruenbaum <[email protected]>
MylesBorins pushed a commit that referenced this issue Apr 20, 2016
Reduce client connections from 10 to 4 in a test that is causing issues
on Raspberry Pi 2 devices in CI.

Fixes: #5122
PR-URL: #6138
Reviewed-By: Benjamin Gruenbaum <[email protected]>
jasnell pushed a commit that referenced this issue Apr 26, 2016
Reduce client connections from 10 to 4 in a test that is causing issues
on Raspberry Pi 2 devices in CI.

Fixes: #5122
PR-URL: #6138
Reviewed-By: Benjamin Gruenbaum <[email protected]>
MylesBorins pushed a commit that referenced this issue May 17, 2016
Reduce client connections from 10 to 4 in a test that is causing issues
on Raspberry Pi 2 devices in CI.

Fixes: #5122
PR-URL: #6138
Reviewed-By: Benjamin Gruenbaum <[email protected]>
MylesBorins pushed a commit that referenced this issue May 18, 2016
Reduce client connections from 10 to 4 in a test that is causing issues
on Raspberry Pi 2 devices in CI.

Fixes: #5122
PR-URL: #6138
Reviewed-By: Benjamin Gruenbaum <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arm Issues and PRs related to the ARM platform. child_process Issues and PRs related to the child_process subsystem. test Issues and PRs related to the tests.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants