Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

process: improve cwd performance #27224

Closed
wants to merge 3 commits into from

Conversation

BridgeAR
Copy link
Member

This caches the current working directory and only updates the variable
if process.chdir() is called.

This improves the process.chdir performance by factor 10 and path.resolve by
up to 66%. That again is used a lot in cjs modules.


I guess this is currently not possible since native modules could change the working directory as well, not only JS code. Or is that not a blocker / does not apply?
Do we have a way to detect if any modules are loaded or not? If so, we could have a fallback for the moment a native module is loaded.
And since process.chdir can not be called in workers: how does that work in combination with native modules that could change the working directory? Can we cache the working directory in case we use workers?

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

@nodejs-github-bot

This comment has been minimized.

@BridgeAR BridgeAR added performance Issues and PRs related to the performance of Node.js. process Issues and PRs related to the process subsystem. labels Apr 14, 2019
@addaleax addaleax added the semver-major PRs that contain breaking changes and should be released in the next major version. label Apr 14, 2019
@addaleax
Copy link
Member

I think the native addons situation does make this semver-major, at least, although it’s unlikely that many native addons change the current working directory.

I’m not sure what the most expensive part of caling cwd() is, but we could probably at least cache the JS string instance on the native side.

Can we cache the working directory in case we use workers?

With the approach in this PR, you’d have to notify workers when the cwd changes – for low overhead, you could e.g. use SharedArrayBuffer + Atomics to increase a counter for each call to process.chdir(), for example.

@BridgeAR
Copy link
Member Author

@joyeecheung I tried to add a SharedArrayBuffer to add a counter between the main thread and workers but I am not sure how to actually reach /node/lib/internal/bootstrap/node.js from /node/lib/internal/main/worker_thread.js and /node/lib/internal/worker.js from /node/lib/internal/bootstrap/node.js.

I circumvented this issue by only improving the performance for the main thread for now but it would actually be nice to improve workers as well. process.cwd() is commonly used during bootstrap.

@BridgeAR BridgeAR marked this pull request as ready for review April 14, 2019 23:56
@nodejs-github-bot

This comment has been minimized.

@joyeecheung
Copy link
Member

@BridgeAR I think what you need is communication between worker.js and main/worker_thread.js? Or you can expose a wrapper in process/execution.js. Why do you need to go through bootstrap.js? A SharedArrayBuffer containing something like cwd is not supposed to be part of the snapshot.

@BridgeAR
Copy link
Member Author

@joyeecheung the cwd and chdir functions needs access to the shared buffer:

// Worker thread
let cachedCwd = realMethods.cwd();
let lastCounter = -1;
function cwd() {
  const currentCounter = Atomics.load(counter, 0);
  if (currentCounter === lastCounter)
    return cachedCwd;
  lastCounter = currentCounter;
  cachedCwd = realMethods.cwd();
  return cachedCwd;
}

// Main thread only.
// This buffer would have to be shared and passed through to each worker.
const counter = new Uint64Array(new SharedArrayBuffer(8));

function chdir(cwd) {  
  Atomics.add(counter, 0, 1);
  realMethods.chdir(cwd);
}

I could theoretically override the two functions in case the first worker is started and to use a shared buffer at that point but AFAIK we do not want to override the functions anymore at that point, right?

@BridgeAR
Copy link
Member Author

Local benchmark:

// Before
console.time(); for (var i = 0; i < 1e7; i++) process.cwd(); console.timeEnd()
// default: 5596.328ms

// After
console.time(); for (var i = 0; i < 1e7; i++) process.cwd(); console.timeEnd()
// default: 477.811ms

@nodejs-github-bot

This comment has been minimized.

@nodejs-github-bot

This comment has been minimized.

@joyeecheung
Copy link
Member

joyeecheung commented Apr 15, 2019

@BridgeAR You could delay the creation of process.cwd and process.chdir until pre-execution; or export the SAB in e.g. process/execution.js (and send it in worker.js whenever a new worker is spawned), then in main_thread_only.js and worker_thread_only.js, wrap the two variants of methods differently in wrapProcessMethods.

@joyeecheung
Copy link
Member

um, wait, in case of workers, I suppose the SAB has to be sent together with the LOAD_SCRIPT message, at that point you've already need to run multiple methods that need process.cwd() to do their job, so I guess the best you can do is to cache only after that (it should be fine to override process.cwd() during LOAD_SCRIPT).

@nodejs-github-bot

This comment has been minimized.

@BridgeAR
Copy link
Member Author

I just updated the PR to also work with workers.

@nodejs-github-bot

This comment has been minimized.

@nodejs-github-bot

This comment has been minimized.

lib/internal/worker.js Outdated Show resolved Hide resolved
lib/internal/worker.js Outdated Show resolved Hide resolved
test/parallel/test-worker-relative-path-double-dot.js Outdated Show resolved Hide resolved
lib/internal/worker.js Outdated Show resolved Hide resolved
lib/internal/main/worker_thread.js Show resolved Hide resolved
lib/internal/main/worker_thread.js Outdated Show resolved Hide resolved
@BridgeAR
Copy link
Member Author

@nodejs/tsc PTAL. This needs one more review from the TSC.

BridgeAR added a commit to BridgeAR/node that referenced this pull request Apr 26, 2019
This caches the current working directory and only updates the variable
if `process.chdir()` is called.

PR-URL: nodejs#27224
Reviewed-By: John-David Dalton <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Michaël Zasso <[email protected]>
@BridgeAR
Copy link
Member Author

Landed in 1d022e8 🎉

@BridgeAR BridgeAR closed this Apr 26, 2019
BridgeAR added a commit to BridgeAR/node that referenced this pull request Apr 29, 2019
This caches the current working directory and only updates the variable
if `process.chdir()` is called.

PR-URL: nodejs#27224
Reviewed-By: John-David Dalton <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Michaël Zasso <[email protected]>
BridgeAR added a commit to BridgeAR/node that referenced this pull request Apr 29, 2019
This makes sure nodejs#27224 is possible
to being backported in a semver-patch way.
targos pushed a commit that referenced this pull request May 5, 2019
This caches the current working directory and only updates the variable
if `process.chdir()` is called.

PR-URL: #27224
Reviewed-By: John-David Dalton <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Michaël Zasso <[email protected]>
Backport-PR-URL: #27483
targos pushed a commit that referenced this pull request May 5, 2019
This makes sure #27224 is possible
to being backported in a semver-patch way.

PR-URL: #27483
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: Beth Griggs <[email protected]>
Reviewed-By: Joyee Cheung <[email protected]>
Reviewed-By: Shelley Vohr <[email protected]>
@targos targos mentioned this pull request May 6, 2019
@BridgeAR BridgeAR deleted the improve-cwd-performance branch January 20, 2020 12:02
fivetanley added a commit to fivetanley/ember-template-lint that referenced this pull request Aug 13, 2020
This commit improves the lookups of `ignore` and `pending` configs by reducing the amount of
iterations required over those config properties.

tl;dr:

- cached the config lookups for `pending` and `ignore` to reduce the amount of times we have to
  re-iterate the `pending` or `ignore` arrays provided by the config.
- cache string lookups for normalized moduleId paths since

This was done in three steps:

First Step

The normalized `moduleId` was cached. Running ember-template-lint on a big codebase (668 template
files) took 12.94 seconds to run. I used this command in the app's codebase directory to profile:

```
node --prof ../ember-template-lint/bin/ember-template-lint.js .
```

Next, I generated a graph with the [flamebearer][flamebearer] npm package:

```
node --prof-process --preprocess -j isolate*.log | flamebearer
```

<img width="1188" alt="Screen_Shot_2020-08-13_at_9_25_14_AM" src="https://user-images.githubusercontent.com/1275021/90181853-f29c8200-dd65-11ea-8163-c2e3f94a5e00.png">

Profiling showed that ~19% of the CPU time was spent inside of the `statusForModule` function. Upon
further investigation, I noticed that the [normalized moduleId based on the current working
directory was being generated in a loop][3], creating a new string for potentially every item in the
`pending` or `ignore` config. Our app has no `ignore` config, but does have a lot of `pending`
entries.

Moving the `fullPathModuleId` to a variable outside the loop took the runtime from 12.94 seconds to
11.04 seconds, about ~2 seconds of savings on its own!

An important thing to keep in mind for the next two steps: `statusForModule` is called _at least
once per rule_ to determine if the rule should be ignored or allowed to fail (or removed from the
pending file if now passing). This means that any functions ran or strings created by
`statusForModule` are, in the worst case scenario, calculated
`numberOfFiles * pendingRuleConfigItems * ignoreConfigItems` times if the loop doesn't find the
config for the file early, as it has to search the entire array of `pending` or `ignore` items for
every file.

Rather than rejoining the paths, they are instead generated once and stored in a cache object. This
reduces the number of strings generated and time generating the same strings over and over. Caching
the result of `process.cwd` (since it seems unlikely to change once the program is started)
potentially has a nice performance side effect for users on Node 10 (supported until April 2021) as
[process.cwd was not cached until Node 12.2](nodejs/node#27224).

Second Step

Lookups for `ignore` and `pending` were separated out into different caches. This was done because
while [`statusForModule` will check for a function to run][1] instead of a string, in practice this
is only true for `ignore` rules. The reason `ignore` rules need the function check is that they are
always a function due to being converted from [strings to functions via the micromatch module][2].
Caching the lookup of `ignore` rules reduced the need to run the functions for every rule/file.

Third Step

Last, but not least: `pending` rules always seem to be a list of `object`s or `string`s (as
generated by `--print-pending`) and having a list that potentially has a function in it doesn't seem
likely given that the recommendation from ember-template-lint is to copy/paste the output of
`--print-pending`.

After re-profiling, the runtime went from 12.94 seconds to 9.04 seconds on my machine. The
`get-config` file no longer showed up in the profile.

<img width="1116" alt="Screen Shot 2020-08-13 at 12 07 40 PM" src="https://user-images.githubusercontent.com/1275021/90181902-0647e880-dd66-11ea-9cd4-cc2778198139.png">

[1]:
  https://github.com/ember-template-lint/ember-template-lint/blob/5937b63bed30380b4bce0f96b19061658a176840/lib/get-config.js#L376-L377
[2]:
  https://github.com/ember-template-lint/ember-template-lint/blob/5937b63bed30380b4bce0f96b19061658a176840/lib/get-config.js#L280
[3]:
  https://github.com/ember-template-lint/ember-template-lint/blob/5937b63bed30380b4bce0f96b19061658a176840/lib/get-config.js#L374
[flamebearer]: https://github.com/mapbox/flamebearer
fivetanley added a commit to fivetanley/ember-template-lint that referenced this pull request Aug 13, 2020
This commit improves the lookups of `ignore` and `pending` configs by reducing the amount of
iterations required over those config properties.

tl;dr:

- cached the config lookups for `pending` and `ignore` to reduce the amount of times we have to
  re-iterate the `pending` or `ignore` arrays provided by the config.
- cache string lookups for normalized moduleId paths since

This was done in three steps:

First Step

The normalized `moduleId` was cached. Running ember-template-lint on a big codebase (668 template
files) took 12.94 seconds to run. I used this command in the app's codebase directory to profile:

```
node --prof ../ember-template-lint/bin/ember-template-lint.js .
```

Next, I generated a graph with the [flamebearer][flamebearer] npm package:

```
node --prof-process --preprocess -j isolate*.log | flamebearer
```

<img width="1188" alt="Screen_Shot_2020-08-13_at_9_25_14_AM" src="https://user-images.githubusercontent.com/1275021/90181853-f29c8200-dd65-11ea-8163-c2e3f94a5e00.png">

Profiling showed that ~19% of the CPU time was spent inside of the `statusForModule` function. Upon
further investigation, I noticed that the [normalized moduleId based on the current working
directory was being generated in a loop][3], creating a new string for potentially every item in the
`pending` or `ignore` config. Our app has no `ignore` config, but does have a lot of `pending`
entries.

Moving the `fullPathModuleId` to a variable outside the loop took the runtime from 12.94 seconds to
11.04 seconds, about ~2 seconds of savings on its own!

An important thing to keep in mind for the next two steps: `statusForModule` is called _at least
once per rule_ to determine if the rule should be ignored or allowed to fail (or removed from the
pending file if now passing). This means that any functions ran or strings created by
`statusForModule` are, in the worst case scenario, calculated
`numberOfFiles * pendingRuleConfigItems * ignoreConfigItems` times if the loop doesn't find the
config for the file early, as it has to search the entire array of `pending` or `ignore` items for
every file.

Rather than rejoining the paths, they are instead generated once and stored in a cache object. This
reduces the number of strings generated and time generating the same strings over and over. Caching
the result of `process.cwd` (since it seems unlikely to change once the program is started)
potentially has a nice performance side effect for users on Node 10 (supported until April 2021) as
[process.cwd was not cached until Node 12.2](nodejs/node#27224).

Second Step

Lookups for `ignore` and `pending` were separated out into different caches. This was done because
while [`statusForModule` will check for a function to run][1] instead of a string, in practice this
is only true for `ignore` rules. The reason `ignore` rules need the function check is that they are
always a function due to being converted from [strings to functions via the micromatch module][2].
Caching the lookup of `ignore` rules reduced the need to run the functions for every rule/file.

Third Step

Last, but not least: `pending` rules always seem to be a list of `object`s or `string`s (as
generated by `--print-pending`) and having a list that potentially has a function in it doesn't seem
likely given that the recommendation from ember-template-lint is to copy/paste the output of
`--print-pending`.

After re-profiling, the runtime went from 12.94 seconds to 9.04 seconds on my machine. The
`get-config` file no longer showed up in the profile.

<img width="1116" alt="Screen Shot 2020-08-13 at 12 07 40 PM" src="https://user-images.githubusercontent.com/1275021/90181902-0647e880-dd66-11ea-9cd4-cc2778198139.png">

[1]:
  https://github.com/ember-template-lint/ember-template-lint/blob/5937b63bed30380b4bce0f96b19061658a176840/lib/get-config.js#L376-L377
[2]:
  https://github.com/ember-template-lint/ember-template-lint/blob/5937b63bed30380b4bce0f96b19061658a176840/lib/get-config.js#L280
[3]:
  https://github.com/ember-template-lint/ember-template-lint/blob/5937b63bed30380b4bce0f96b19061658a176840/lib/get-config.js#L374
[flamebearer]: https://github.com/mapbox/flamebearer
fivetanley added a commit to fivetanley/ember-template-lint that referenced this pull request Aug 20, 2020
This commit improves the lookups of `ignore` and `pending` configs by reducing the amount of
iterations required over those config properties.

tl;dr:

- cached the config lookups for `pending` and `ignore` to reduce the amount of times we have to
  re-iterate the `pending` or `ignore` arrays provided by the config.
- cache string lookups for normalized moduleId paths since

This was done in three steps:

First Step

The normalized `moduleId` was cached. Running ember-template-lint on a big codebase (668 template
files) took 12.94 seconds to run. I used this command in the app's codebase directory to profile:

```
node --prof ../ember-template-lint/bin/ember-template-lint.js .
```

Next, I generated a graph with the [flamebearer][flamebearer] npm package:

```
node --prof-process --preprocess -j isolate*.log | flamebearer
```

<img width="1188" alt="Screen_Shot_2020-08-13_at_9_25_14_AM" src="https://user-images.githubusercontent.com/1275021/90181853-f29c8200-dd65-11ea-8163-c2e3f94a5e00.png">

Profiling showed that ~19% of the CPU time was spent inside of the `statusForModule` function. Upon
further investigation, I noticed that the [normalized moduleId based on the current working
directory was being generated in a loop][3], creating a new string for potentially every item in the
`pending` or `ignore` config. Our app has no `ignore` config, but does have a lot of `pending`
entries.

Moving the `fullPathModuleId` to a variable outside the loop took the runtime from 12.94 seconds to
11.04 seconds, about ~2 seconds of savings on its own!

An important thing to keep in mind for the next two steps: `statusForModule` is called _at least
once per rule_ to determine if the rule should be ignored or allowed to fail (or removed from the
pending file if now passing). This means that any functions ran or strings created by
`statusForModule` are, in the worst case scenario, calculated
`numberOfFiles * pendingRuleConfigItems * ignoreConfigItems` times if the loop doesn't find the
config for the file early, as it has to search the entire array of `pending` or `ignore` items for
every file.

Rather than rejoining the paths, they are instead generated once and stored in a cache object. This
reduces the number of strings generated and time generating the same strings over and over. Caching
the result of `process.cwd` (since it seems unlikely to change once the program is started)
potentially has a nice performance side effect for users on Node 10 (supported until April 2021) as
[process.cwd was not cached until Node 12.2](nodejs/node#27224).

Second Step

Lookups for `ignore` and `pending` were separated out into different caches. This was done because
while [`statusForModule` will check for a function to run][1] instead of a string, in practice this
is only true for `ignore` rules. The reason `ignore` rules need the function check is that they are
always a function due to being converted from [strings to functions via the micromatch module][2].
Caching the lookup of `ignore` rules reduced the need to run the functions for every rule/file.

Third Step

Last, but not least: `pending` rules always seem to be a list of `object`s or `string`s (as
generated by `--print-pending`) and having a list that potentially has a function in it doesn't seem
likely given that the recommendation from ember-template-lint is to copy/paste the output of
`--print-pending`.

After re-profiling, the runtime went from 12.94 seconds to 9.04 seconds on my machine. The
`get-config` file no longer showed up in the profile.

<img width="1116" alt="Screen Shot 2020-08-13 at 12 07 40 PM" src="https://user-images.githubusercontent.com/1275021/90181902-0647e880-dd66-11ea-9cd4-cc2778198139.png">

[1]:
  https://github.com/ember-template-lint/ember-template-lint/blob/5937b63bed30380b4bce0f96b19061658a176840/lib/get-config.js#L376-L377
[2]:
  https://github.com/ember-template-lint/ember-template-lint/blob/5937b63bed30380b4bce0f96b19061658a176840/lib/get-config.js#L280
[3]:
  https://github.com/ember-template-lint/ember-template-lint/blob/5937b63bed30380b4bce0f96b19061658a176840/lib/get-config.js#L374
[flamebearer]: https://github.com/mapbox/flamebearer
fivetanley added a commit to fivetanley/ember-template-lint that referenced this pull request Sep 2, 2020
This commit improves the lookups of `ignore` and `pending` configs by reducing the amount of
iterations required over those config properties.

tl;dr:

- cached the config lookups for `pending` and `ignore` to reduce the amount of times we have to
  re-iterate the `pending` or `ignore` arrays provided by the config.
- cache string lookups for normalized moduleId paths since

This was done in three steps:

First Step

The normalized `moduleId` was cached. Running ember-template-lint on a big codebase (668 template
files) took 12.94 seconds to run. I used this command in the app's codebase directory to profile:

```
node --prof ../ember-template-lint/bin/ember-template-lint.js .
```

Next, I generated a graph with the [flamebearer][flamebearer] npm package:

```
node --prof-process --preprocess -j isolate*.log | flamebearer
```

<img width="1188" alt="Screen_Shot_2020-08-13_at_9_25_14_AM" src="https://user-images.githubusercontent.com/1275021/90181853-f29c8200-dd65-11ea-8163-c2e3f94a5e00.png">

Profiling showed that ~19% of the CPU time was spent inside of the `statusForModule` function. Upon
further investigation, I noticed that the [normalized moduleId based on the current working
directory was being generated in a loop][3], creating a new string for potentially every item in the
`pending` or `ignore` config. Our app has no `ignore` config, but does have a lot of `pending`
entries.

Moving the `fullPathModuleId` to a variable outside the loop took the runtime from 12.94 seconds to
11.04 seconds, about ~2 seconds of savings on its own!

An important thing to keep in mind for the next two steps: `statusForModule` is called _at least
once per rule_ to determine if the rule should be ignored or allowed to fail (or removed from the
pending file if now passing). This means that any functions ran or strings created by
`statusForModule` are, in the worst case scenario, calculated
`numberOfFiles * pendingRuleConfigItems * ignoreConfigItems` times if the loop doesn't find the
config for the file early, as it has to search the entire array of `pending` or `ignore` items for
every file.

Rather than rejoining the paths, they are instead generated once and stored in a cache object. This
reduces the number of strings generated and time generating the same strings over and over. Caching
the result of `process.cwd` (since it seems unlikely to change once the program is started)
potentially has a nice performance side effect for users on Node 10 (supported until April 2021) as
[process.cwd was not cached until Node 12.2](nodejs/node#27224).

Second Step

Lookups for `ignore` and `pending` were separated out into different caches. This was done because
while [`statusForModule` will check for a function to run][1] instead of a string, in practice this
is only true for `ignore` rules. The reason `ignore` rules need the function check is that they are
always a function due to being converted from [strings to functions via the micromatch module][2].
Caching the lookup of `ignore` rules reduced the need to run the functions for every rule/file.

Third Step

Last, but not least: `pending` rules always seem to be a list of `object`s or `string`s (as
generated by `--print-pending`) and having a list that potentially has a function in it doesn't seem
likely given that the recommendation from ember-template-lint is to copy/paste the output of
`--print-pending`.

After re-profiling, the runtime went from 12.94 seconds to 9.04 seconds on my machine. The
`get-config` file no longer showed up in the profile.

<img width="1116" alt="Screen Shot 2020-08-13 at 12 07 40 PM" src="https://user-images.githubusercontent.com/1275021/90181902-0647e880-dd66-11ea-9cd4-cc2778198139.png">

[1]:
  https://github.com/ember-template-lint/ember-template-lint/blob/5937b63bed30380b4bce0f96b19061658a176840/lib/get-config.js#L376-L377
[2]:
  https://github.com/ember-template-lint/ember-template-lint/blob/5937b63bed30380b4bce0f96b19061658a176840/lib/get-config.js#L280
[3]:
  https://github.com/ember-template-lint/ember-template-lint/blob/5937b63bed30380b4bce0f96b19061658a176840/lib/get-config.js#L374
[flamebearer]: https://github.com/mapbox/flamebearer
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. performance Issues and PRs related to the performance of Node.js. process Issues and PRs related to the process subsystem. semver-major PRs that contain breaking changes and should be released in the next major version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants