Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

module: cache stat() results more aggressively #4575

Closed
wants to merge 4 commits into from

Conversation

bnoordhuis
Copy link
Member

Reduce the number of stat() system calls that require() makes by caching
the results more aggressively.

To avoid unbounded growth without implementing a LRU cache, scope the
cache to the lifetime of the first call to require(). Recursive calls
(i.e. require() calls in the included code) transparently profit from
the cache.

The benchmarked application is the loopback-sample-app[0] and it sees
the number of stat calls at start-up go down by 40%, from 4736 to 2810.

[0] https://github.com/strongloop/loopback-sample-app

CI: https://ci.nodejs.org/job/node-test-pull-request/1160/

fs.statSync() creates and returns a heavy-weight fs.Stat object whereas
fs.accessSync() simply returns nothing.  The return value is ignored,
the call is for its side effect of throwing an ELOOP error in case of
cyclic symbolic links.
Reduce the number of stat() system calls that require() makes by caching
the results more aggressively.

To avoid unbounded growth without implementing a LRU cache, scope the
cache to the lifetime of the first call to require().  Recursive calls
(i.e. require() calls in the included code) transparently profit from
the cache.

The benchmarked application is the loopback-sample-app[0] and it sees
the number of stat calls at start-up go down by 40%, from 4736 to 2810.

[0] https://github.com/strongloop/loopback-sample-app
Avoid an unneeded ArgumentsAdaptorTrampoline stack frame by passing the
the right number of arguments to Module._load() in Module.require().
Shortens the following stack trace with one frame:

    LazyCompile:~Module.load module.js:345
    LazyCompile:Module._load module.js:282
    Builtin:ArgumentsAdaptorTrampoline
    LazyCompile:*Module.require module.js:361
    LazyCompile:*require internal/module.js:11
@bnoordhuis bnoordhuis added the module Issues and PRs related to the module subsystem. label Jan 7, 2016
@bnoordhuis
Copy link
Member Author

Related to nodejs/benchmarking#22 (comment), /cc @sam-github.

Use internalModuleReadFile() to read the file from disk to avoid the
fs.fstatSync() call that fs.readFileSync() makes.

It does so to know the file size in advance so it doesn't have to
allocate O(n) buffers when reading the file from disk.

internalModuleReadFile() is plenty efficient though, even more so
because we want a string and not a buffer.  This way we also don't
allocate a buffer that immediately gets thrown away again.

This commit reduces the number of fstat() system calls in a benchmark
application[0] from 549 to 29, all made by the application itself.

[0] https://github.com/strongloop/loopback-sample-app
@jasnell
Copy link
Member

jasnell commented Jan 8, 2016

LGTM

@bnoordhuis
Copy link
Member Author

Going to run the CI one more time, the last one aborted on the win-vs2015 buildbot.

https://ci.nodejs.org/job/node-test-pull-request/1173/

@jasnell
Copy link
Member

jasnell commented Jan 11, 2016

@bnoordhuis ... CI blew out on the buildbot again. Otherwise this looks fine.

@cjihrig
Copy link
Contributor

cjihrig commented Jan 11, 2016

LGTM

@sam-github
Copy link
Contributor

Haven't had time to look at the code, but this would make a meaningful difference in app startup times for apps with large (aka "real") dependency trees, I'm all for it.

@vkurchatkin
Copy link
Contributor

+1 for 5, but I don't think it's suitable for LTS

jasnell pushed a commit that referenced this pull request Jan 12, 2016
fs.statSync() creates and returns a heavy-weight fs.Stat object whereas
fs.accessSync() simply returns nothing.  The return value is ignored,
the call is for its side effect of throwing an ELOOP error in case of
cyclic symbolic links.

PR-URL: #4575
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: James M Snell <[email protected]>
jasnell pushed a commit that referenced this pull request Jan 12, 2016
Reduce the number of stat() system calls that require() makes by caching
the results more aggressively.

To avoid unbounded growth without implementing a LRU cache, scope the
cache to the lifetime of the first call to require().  Recursive calls
(i.e. require() calls in the included code) transparently profit from
the cache.

The benchmarked application is the loopback-sample-app[0] and it sees
the number of stat calls at start-up go down by 40%, from 4736 to 2810.

[0] https://github.com/strongloop/loopback-sample-app

PR-URL: #4575
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: James M Snell <[email protected]>
jasnell pushed a commit that referenced this pull request Jan 12, 2016
Avoid an unneeded ArgumentsAdaptorTrampoline stack frame by passing the
the right number of arguments to Module._load() in Module.require().
Shortens the following stack trace with one frame:

    LazyCompile:~Module.load module.js:345
    LazyCompile:Module._load module.js:282
    Builtin:ArgumentsAdaptorTrampoline
    LazyCompile:*Module.require module.js:361
    LazyCompile:*require internal/module.js:11

PR-URL: #4575
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: James M Snell <[email protected]>
jasnell pushed a commit that referenced this pull request Jan 12, 2016
Use internalModuleReadFile() to read the file from disk to avoid the
fs.fstatSync() call that fs.readFileSync() makes.

It does so to know the file size in advance so it doesn't have to
allocate O(n) buffers when reading the file from disk.

internalModuleReadFile() is plenty efficient though, even more so
because we want a string and not a buffer.  This way we also don't
allocate a buffer that immediately gets thrown away again.

This commit reduces the number of fstat() system calls in a benchmark
application[0] from 549 to 29, all made by the application itself.

[0] https://github.com/strongloop/loopback-sample-app

PR-URL: #4575
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: James M Snell <[email protected]>
@jasnell
Copy link
Member

jasnell commented Jan 12, 2016

Landed in 809bf5e, 83f8d98, 038b636 and 7c60328

@jasnell jasnell closed this Jan 12, 2016
@Trott
Copy link
Member

Trott commented Jan 13, 2016

It sure does look like this PR really did break tests.

Here's a CI I just ran for the last commit before this PR landed:
https://ci.nodejs.org/job/node-test-commit/1717/

All green! \o/

And here's a CI I just ran for the code right the four commits in this PR landed:
https://ci.nodejs.org/job/node-test-commit/1718/

Same two tests failing on Windows. :-/

@jasnell
Copy link
Member

jasnell commented Jan 13, 2016

hmm ... ok. @bnoordhuis

@Trott
Copy link
Member

Trott commented Jan 13, 2016

bnoordhuis commented in another issue that he's sick right now. I'm juggling a few things right now, but I might see if I can figure out which commit or set of commits would need to be reverted to fix the issue, then submit a PR to revert. If someone is already working on this or is tackling it from the other end (looking at the tests that are failing to see if they are flawed), let me know and I'll go do something else.

@Trott
Copy link
Member

Trott commented Jan 13, 2016

Cool. Now I have four other jobs running, each one reverting one of the commits in this PR. Hopefully, the culprit is one and only one commit. If not, more CI FUN!

@Trott
Copy link
Member

Trott commented Jan 13, 2016

So, this PR results in at least two tests failing. Reverting one commit can fix one, but you can't fix both without reverting at least two commits... So now I have a bunch more CI runs to try (unless we think the best thing to do is just revert all four commits, but I'd rather do the smallest revert possible).

@Trott
Copy link
Member

Trott commented Jan 13, 2016

I'm pretty sure this will be the one. It reverts 809bf5e and 7c60328:

https://ci.nodejs.org/job/node-test-commit-windows-fanned/919/

@cjihrig
Copy link
Contributor

cjihrig commented Jan 13, 2016

038b636 looks harmless. Can we try to keep it.

7c60328 is in the stack trace of one of the failures. We should probably revert it.

809bf5e appears to be the cause of one of the failures.

Maybe we can keep 83f8d98.

@Trott
Copy link
Member

Trott commented Jan 13, 2016

@cjihrig Your analysis is exactly in line with the revert combo I'm trying right now. We appear to be converging on the solution...

@Trott
Copy link
Member

Trott commented Jan 13, 2016

Winner!: https://ci.nodejs.org/job/node-compile-windows/869/

PR to revert incoming... #4679

rvagg pushed a commit that referenced this pull request Jan 14, 2016
Avoid an unneeded ArgumentsAdaptorTrampoline stack frame by passing the
the right number of arguments to Module._load() in Module.require().
Shortens the following stack trace with one frame:

    LazyCompile:~Module.load module.js:345
    LazyCompile:Module._load module.js:282
    Builtin:ArgumentsAdaptorTrampoline
    LazyCompile:*Module.require module.js:361
    LazyCompile:*require internal/module.js:11

PR-URL: #4575
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Trott added a commit to Trott/io.js that referenced this pull request Jan 15, 2016
This reverts commit 809bf5e
("change statSync to accessSync in realpathSync").

It was causing tests to fail on Windows CI.

Ref: nodejs#4575
PR-URL: nodejs#4679
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Trott added a commit to Trott/io.js that referenced this pull request Jan 15, 2016
This reverts commit 7c60328.

It is causing CI failures on Windows.

Ref: nodejs#4575
PR-URL: nodejs#4679
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Trott added a commit that referenced this pull request Jan 16, 2016
This reverts commit 809bf5e
("change statSync to accessSync in realpathSync").

It was causing tests to fail on Windows CI.

Ref: #4575
PR-URL: #4679
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Trott added a commit that referenced this pull request Jan 16, 2016
This reverts commit 7c60328.

It is causing CI failures on Windows.

Ref: #4575
PR-URL: #4679
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
@rvagg
Copy link
Member

rvagg commented Jan 18, 2016

FTR I've marked this as dont-land-on-v5.x along with the reversion at #4679. However, the following commits have been landed as they were not implicated in the Windows CI failures:

  • 26392ed - module: cache stat() results more aggressively
  • d8f5bd4 - module: avoid ArgumentsAdaptorTrampoline frame

evanlucas pushed a commit that referenced this pull request Jan 19, 2016
Reduce the number of stat() system calls that require() makes by caching
the results more aggressively.

To avoid unbounded growth without implementing a LRU cache, scope the
cache to the lifetime of the first call to require().  Recursive calls
(i.e. require() calls in the included code) transparently profit from
the cache.

The benchmarked application is the loopback-sample-app[0] and it sees
the number of stat calls at start-up go down by 40%, from 4736 to 2810.

[0] https://github.com/strongloop/loopback-sample-app

PR-URL: #4575
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: James M Snell <[email protected]>
evanlucas pushed a commit that referenced this pull request Jan 20, 2016
Reduce the number of stat() system calls that require() makes by caching
the results more aggressively.

To avoid unbounded growth without implementing a LRU cache, scope the
cache to the lifetime of the first call to require().  Recursive calls
(i.e. require() calls in the included code) transparently profit from
the cache.

The benchmarked application is the loopback-sample-app[0] and it sees
the number of stat calls at start-up go down by 40%, from 4736 to 2810.

[0] https://github.com/strongloop/loopback-sample-app

PR-URL: #4575
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: James M Snell <[email protected]>
evanlucas added a commit that referenced this pull request Jan 20, 2016
Notable changes:

* events: make sure console functions exist (Dave) #4479
* fs: add autoClose option to fs.createWriteStream (Saquib) #3679
* http: improves expect header handling (Daniel Sellers) #4501
* node: allow preload modules with -i (Evan Lucas) #4696
* v8,src: expose statistics about heap spaces (`v8.getHeapSpaceStatistics()`) (Ben Ripkens) #4463
* Minor performance improvements:
  - lib: Use arrow functions instead of bind where possible (Minwoo Jung) #3622
  - module: cache stat() results more aggressively (Ben Noordhuis) #4575
  - querystring: improve parse() performance (Brian White) #4675

PR-URL: #4742
evanlucas added a commit that referenced this pull request Jan 21, 2016
Notable changes:

* events: make sure console functions exist (Dave) #4479
* fs: add autoClose option to fs.createWriteStream (Saquib) #3679
* http: improves expect header handling (Daniel Sellers) #4501
* node: allow preload modules with -i (Evan Lucas) #4696
* v8,src: expose statistics about heap spaces (`v8.getHeapSpaceStatistics()`) (Ben Ripkens) #4463
* Minor performance improvements:
  - lib: Use arrow functions instead of bind where possible (Minwoo Jung) #3622
  - module: cache stat() results more aggressively (Ben Noordhuis) #4575
  - querystring: improve parse() performance (Brian White) #4675

PR-URL: #4742
@mhdawson
Copy link
Member

@bnoordhuis. Interestingly require performance for master is much better than 4.x. based on benchmark created by Michael Paulson. I'm guessing this might be part of the reason. If you are interested see charts through link in: nodejs/build#281 (comment)

scovetta pushed a commit to scovetta/node that referenced this pull request Apr 2, 2016
fs.statSync() creates and returns a heavy-weight fs.Stat object whereas
fs.accessSync() simply returns nothing.  The return value is ignored,
the call is for its side effect of throwing an ELOOP error in case of
cyclic symbolic links.

PR-URL: nodejs#4575
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: James M Snell <[email protected]>
scovetta pushed a commit to scovetta/node that referenced this pull request Apr 2, 2016
Reduce the number of stat() system calls that require() makes by caching
the results more aggressively.

To avoid unbounded growth without implementing a LRU cache, scope the
cache to the lifetime of the first call to require().  Recursive calls
(i.e. require() calls in the included code) transparently profit from
the cache.

The benchmarked application is the loopback-sample-app[0] and it sees
the number of stat calls at start-up go down by 40%, from 4736 to 2810.

[0] https://github.com/strongloop/loopback-sample-app

PR-URL: nodejs#4575
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: James M Snell <[email protected]>
scovetta pushed a commit to scovetta/node that referenced this pull request Apr 2, 2016
Avoid an unneeded ArgumentsAdaptorTrampoline stack frame by passing the
the right number of arguments to Module._load() in Module.require().
Shortens the following stack trace with one frame:

    LazyCompile:~Module.load module.js:345
    LazyCompile:Module._load module.js:282
    Builtin:ArgumentsAdaptorTrampoline
    LazyCompile:*Module.require module.js:361
    LazyCompile:*require internal/module.js:11

PR-URL: nodejs#4575
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: James M Snell <[email protected]>
scovetta pushed a commit to scovetta/node that referenced this pull request Apr 2, 2016
Use internalModuleReadFile() to read the file from disk to avoid the
fs.fstatSync() call that fs.readFileSync() makes.

It does so to know the file size in advance so it doesn't have to
allocate O(n) buffers when reading the file from disk.

internalModuleReadFile() is plenty efficient though, even more so
because we want a string and not a buffer.  This way we also don't
allocate a buffer that immediately gets thrown away again.

This commit reduces the number of fstat() system calls in a benchmark
application[0] from 549 to 29, all made by the application itself.

[0] https://github.com/strongloop/loopback-sample-app

PR-URL: nodejs#4575
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: James M Snell <[email protected]>
scovetta pushed a commit to scovetta/node that referenced this pull request Apr 2, 2016
This reverts commit 809bf5e
("change statSync to accessSync in realpathSync").

It was causing tests to fail on Windows CI.

Ref: nodejs#4575
PR-URL: nodejs#4679
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
scovetta pushed a commit to scovetta/node that referenced this pull request Apr 2, 2016
This reverts commit 7c60328.

It is causing CI failures on Windows.

Ref: nodejs#4575
PR-URL: nodejs#4679
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
scovetta pushed a commit to scovetta/node that referenced this pull request Apr 2, 2016
Notable changes:

* events: make sure console functions exist (Dave) nodejs#4479
* fs: add autoClose option to fs.createWriteStream (Saquib) nodejs#3679
* http: improves expect header handling (Daniel Sellers) nodejs#4501
* node: allow preload modules with -i (Evan Lucas) nodejs#4696
* v8,src: expose statistics about heap spaces (`v8.getHeapSpaceStatistics()`) (Ben Ripkens) nodejs#4463
* Minor performance improvements:
  - lib: Use arrow functions instead of bind where possible (Minwoo Jung) nodejs#3622
  - module: cache stat() results more aggressively (Ben Noordhuis) nodejs#4575
  - querystring: improve parse() performance (Brian White) nodejs#4675

PR-URL: nodejs#4742
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module Issues and PRs related to the module subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants