Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix flaky test-child-process-emfile #3430

Closed
wants to merge 5 commits into from

Conversation

Trott
Copy link
Member

@Trott Trott commented Oct 19, 2015

Require the test setup to obtain an EMFILE error and not ENFILE as
ENFILE means there is a race condition with other processes that may
close files before spawn() is called by the test.

This test as is had failed 6 out of 26 times that are in the Jenkins history for freebsd101-32 at https://ci.nodejs.org/job/node-test-commit-freebsd/. With the fix, it has run successfully 12 out of 12 times. (UPDATE: 14 out of 14 now.)

Fixes: #2666

@Trott Trott added the test Issues and PRs related to the tests. label Oct 19, 2015
@Trott
Copy link
Member Author

Trott commented Oct 19, 2015

CI: https://ci.nodejs.org/job/node-test-pull-request/531/

EDIT: Pushed a commit to remove some cruft that wasn't needed. New CI: https://ci.nodejs.org/job/node-test-pull-request/532/

@mscdex mscdex added the child_process Issues and PRs related to the child_process subsystem. label Oct 19, 2015

if (common.isWindows) {
console.log('1..0 # Skipped: no RLIMIT_NOFILE on Windows');
return;
}

var openFds = [];
const ulimit = Number(child_process.execSync('ulimit -n'));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should probably exec ulimit -Hn here because node sets RLIMIT_NOFILE to the hard limit at startup. I'm not sure how portable the -H flag is though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use of -H requires super-user privileges, at least on OS X. This means make test will fail for people unless they run it as root.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even when just querying the hard limit? That seems to work on my 10.8 macbook, FWIW.

That did remind me of something: ulimit can print "unlimited". That gets coerced to NaN and will fail the ulimit > 64 check.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re portability: ulimit -Hn works for me without root privileges on our smartos and freebsd boxes (also works locally on osx 10.11).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ulimit -Hn works fine on OS X without super-user privileges. It's setting (e.g., ulimit -Hn 64) that requires super-user.

Good catch on the unlimited string. I'll fix that...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbergstroem I guess I can go and run it on CI and see what happens...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Trott breaks on setting smartos though: -bash: ulimit: open files: cannot modify limit: Invalid argument

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use ulimit -Hn for querying, ulimit -n num is fine for setting it because it sets both the hard and soft limit. Just querying for ulimit -n would be wrong in the (admittedly unlikely) case that the soft limit is <= 64 but the hard limit is > 64.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbergstroem ulimit -Hn <integer> works without super-user privs for me if and only if I first set the soft limit as low or lower than what I'm trying to set the hard limit to. (In retrospect: Duh!) And sudo doesn't fix it. I thought it did because I was testing in different shell instances and must have had the soft limit already set appropriately in one of them.

TL;DR: You were right to be surprised (skeptical?) of my results.

The @bnoordhuis plan of setting with -n and querying with -Hn makes the most sense to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P.S. The error message you got on smartos is what I was getting on OS X so maybe check that you weren't trying to set the hard limit lower than the soft limit? (Although it doesn't really matter since we can and, in this case, should set the soft limit as well so we can just use -n.)

@bnoordhuis
Copy link
Member

LGTM with a comment.

@MylesBorins
Copy link
Contributor

@jasnell LTS?

@Trott
Copy link
Member Author

Trott commented Oct 20, 2015

Added handling for unlimited being coerced to NaN.

New CI: https://ci.nodejs.org/job/node-test-pull-request/537/ Er...nope. Looks like something's up with Jenkins? Will try again later....

@jbergstroem
Copy link
Member

..that'd be my bad. Was doing some changes to the node-test-commit job around the same time you posted it. https://ci.nodejs.org/job/node-test-commit/888/

// ever happens.
const result = child_process.spawnSync(
'/bin/sh',
['-c', `ulimit -n 64 && ${process.execPath} ${__filename}`]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider putting quotes around the paths, in case there are spaces in them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah, good one. Thanks. Done. There's probably still edge cases (quotation marks in the path?) but this should deal with all the most likely ones. Thanks again.

Require the test setup to obtain an EMFILE error and not ENFILE as
ENFILE means there is a race condition with other processes that may
close files before `spawn()` is called by the test.

Fixes: nodejs#2666
@Trott Trott force-pushed the flaky-test-child-process-emfile branch from 114d317 to 1a6d2e6 Compare October 20, 2015 16:44
@Trott
Copy link
Member Author

Trott commented Oct 20, 2015

CI with latest @bnoordhuis suggestions: https://ci.nodejs.org/job/node-test-commit/892/

@bnoordhuis
Copy link
Member

LGTM

@jasnell
Copy link
Member

jasnell commented Oct 20, 2015

LGTM. @Trott can I ask you to squash the commits down please?

@Trott
Copy link
Member Author

Trott commented Oct 20, 2015

@jasnell Yup. I always squash 'em before I land.

Trott added a commit to Trott/io.js that referenced this pull request Oct 20, 2015
Require the test setup to obtain an EMFILE error and not ENFILE as
ENFILE means there is a race condition with other processes that may
close files before `spawn()` is called by the test.

Fixes: nodejs#2666
PR-URL: nodejs#3430
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
@Trott
Copy link
Member Author

Trott commented Oct 20, 2015

Landed in d89bc7b.

@Trott Trott closed this Oct 20, 2015
Trott added a commit that referenced this pull request Oct 21, 2015
Require the test setup to obtain an EMFILE error and not ENFILE as
ENFILE means there is a race condition with other processes that may
close files before `spawn()` is called by the test.

Fixes: #2666
PR-URL: #3430
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
@rvagg rvagg mentioned this pull request Oct 21, 2015
Trott added a commit that referenced this pull request Oct 26, 2015
Require the test setup to obtain an EMFILE error and not ENFILE as
ENFILE means there is a race condition with other processes that may
close files before `spawn()` is called by the test.

Fixes: #2666
PR-URL: #3430
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
@jasnell
Copy link
Member

jasnell commented Oct 26, 2015

Landed in v4.x-staging in 0b32414

Trott added a commit that referenced this pull request Oct 29, 2015
Require the test setup to obtain an EMFILE error and not ENFILE as
ENFILE means there is a race condition with other processes that may
close files before `spawn()` is called by the test.

Fixes: #2666
PR-URL: #3430
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
@Trott Trott deleted the flaky-test-child-process-emfile branch January 13, 2022 22:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
child_process Issues and PRs related to the child_process subsystem. test Issues and PRs related to the tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate flaky test test-child-process-emfile
6 participants