path: performance and stability improvements on all platforms #5123
This commit splits each path benchmark into separate posix and Windows benchmark files. This allows benchmarking (platform-)specific inputs against specific platforms (only).
This commit adds new tests, executes tests for other platforms instead of limiting platform-specific tests to those platforms, and fixes a few style/formatting inconsistencies.
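Running platform-specific tests on every platform is possible because Node's `path` module exposes both implementations as `path.posix` and `path.win32`, regardless of the host OS. A minimal sketch of that idea follows; the inputs are illustrative and are not taken from the PR's test files.

```javascript
'use strict';
const path = require('path');

// path.posix and path.win32 expose each platform's implementation
// regardless of the host OS, so both can be exercised anywhere.
// The inputs below are illustrative, not taken from the PR's tests.
const posixJoined = path.posix.join('/foo', 'bar', '..', 'baz');
const win32Joined = path.win32.join('C:\\foo', 'bar', '..', 'baz');

console.log(posixJoined); // '/foo/baz'
console.log(win32Joined); // 'C:\\foo\\baz'
```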
CI is green except for some flaky tests on ARM: https://ci.nodejs.org/job/node-test-commit/2132/
Wow. Thanks @mscdex. I wonder if this brings any performance improvements to require as well.
Is the diff not showing up for anyone else?
This commit significantly improves performance of all path functions. Optimization strategies include:
* Replacing regexps with manual parsers
* Avoiding unnecessary array creation (including split() + join())
* Returning earlier where possible to avoid unnecessary work
* Minimize unnecessary string creation and concatenations
* Combining string iterations
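To illustrate what "replacing regexps with manual parsers" and "returning earlier" can look like, here is a rough sketch. It is not the code from this PR: `lastSegment` is a hypothetical helper that finds the final segment of a POSIX-style path with a single backwards scan over character codes, without a regexp or a `split()` + `join()` round trip.

```javascript
'use strict';

// Hypothetical helper (not the PR's implementation): returns the last
// segment of a POSIX-style path using one backwards scan with
// charCodeAt(), avoiding regexps and intermediate arrays.
function lastSegment(p) {
  if (p.length === 0) return ''; // early return for the trivial case

  let end = -1;  // index just past the last non-slash character
  let start = 0; // index of the first character of the segment

  for (let i = p.length - 1; i >= 0; --i) {
    if (p.charCodeAt(i) === 47 /* '/' */) {
      // A slash seen after the segment's end marks where it starts.
      if (end !== -1) {
        start = i + 1;
        break;
      }
    } else if (end === -1) {
      // First non-slash character from the right: the segment ends here.
      end = i + 1;
    }
  }

  return end === -1 ? '' : p.slice(start, end);
}

console.log(lastSegment('/foo/bar.txt')); // 'bar.txt'
console.log(lastSegment('/foo/bar/'));    // 'bar'
```

Only one substring is created per call, which is the kind of saving the strategies above are aiming for.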
@evanlucas I just ran the
Nice. Bit difficult to review given the amount of change. I assume there are no API changes?
Marking this don't land on v4 for now... this is definitely one that we'd want to prove out for a while before backporting.
@jasnell With regard to API changes, these changes should be 100% backwards compatible. It passes all tests, including additional tests that this PR adds. citgm tests would be nice but from what I understand those are a bit cumbersome to do at the moment because npm is broken to some degree in master currently?
LGTM given that CI is green :-) great work
👍
I’d be interested in the runner as well. Is it open-source?
@mathiasbynens Not yet, it's still a work-in-progress. I'm still trying to get benchmark.js to be flexible enough to be used for the (more complex) async benchmarks node currently has. It works great for the synchronous stuff (like the path tests) though.
LGTM pending another CI run: https://ci.nodejs.org/job/node-test-pull-request/1587/
PR-URL: #5123 Reviewed-By: Roman Reiss <[email protected]> Reviewed-By: James M Snell <[email protected]>
@thealphanerd you should probably citgm this.
Cherry-pick is fine; it's just these annoying error message format updates that have messed things up. Manually working around them is not hard, though, so no need for another PR. I'll deal with it when it lands.
ah ok cool. I'm multi-tasking like mad now and was just doing sanity checks without diving too far in. Thanks for taking the time
removed semver-major label |
* buffer:
  - You can now supply an encoding argument when filling a Buffer: Buffer#fill(string[, start[, end]][, encoding]). Supplying an existing Buffer also works: Buffer#fill(buffer[, start[, end]]). See the API documentation for details on how this works. (Trevor Norris) #4935
  - Buffer#indexOf() no longer requires a byteOffset argument if you also wish to specify an encoding: Buffer#indexOf(val[, byteOffset][, encoding]). (Trevor Norris) #4803
* child_process: spawn() and spawnSync() now support a 'shell' option to allow for optional execution of the given command inside a shell. If set to true, cmd.exe will be used on Windows and /bin/sh elsewhere. A path to a custom shell can also be passed to override these defaults. On Windows, this option allows .bat and .cmd files to be executed with spawn() and spawnSync(). (Colin Ihrig) #4598
* http_parser: Update to http-parser 2.6.2 to fix an unintentionally strict limitation of allowable header characters. (James M Snell) #5237
* dgram: socket.send() now accepts an array of Buffers or Strings as the first argument. See the API docs for details on how this works. (Matteo Collina) #4374
* http: Fix a bug where handling headers will mistakenly trigger an 'upgrade' event where the server is just advertising its protocols. This bug can prevent HTTP clients from communicating with HTTP/2 enabled servers. (Fedor Indutny) #4337
* net: Added a listening Boolean property to net and http servers to indicate whether the server is listening for connections. (José Moreira) #4743
* node: The C++ node::MakeCallback() API is now reentrant and calling it from inside another MakeCallback() call no longer causes the nextTick queue or Promises microtask queue to be processed out of order. (Trevor Norris) #4507
* tls: Add a new tlsSocket.getProtocol() method to get the negotiated TLS protocol version of the current connection. (Brian White) #4995
* vm: Introduce new 'produceCachedData' and 'cachedData' options to new vm.Script() to interact with V8's code cache. When a new vm.Script object is created with 'produceCachedData' set to true, a Buffer with V8's code cache data will be produced and stored in the cachedData property of the returned object. This data in turn may be supplied back to another vm.Script() object with a 'cachedData' option if the supplied source is the same. Successfully executing a script from cached data can speed up instantiation time. See the API docs for details. (Fedor Indutny) #4777
* performance: Improvements in:
  - process.nextTick() (Ruben Bridgewater) #5092
  - path module (Brian White) #5123
  - querystring module (Brian White) #5012
  - streams module when processing small chunks (Matteo Collina) #4354
PR-URL: #5295
v5.7
:) Thanks @evanlucas
This is more or less a rewrite of the entire `path` module and brings with it significant performance benefits. Overall the improvements bring up to an 18,000% performance increase. The ridiculously high number comes from the early returns that I've added for simple cases (e.g. empty or `'/'` inputs). The non-early return improvements are more like up to 2,000%.

Here are the benchmark results with these changes (note: these results were obtained from a benchmark.js-based benchmark runner I'm working on, but the inputs and actual test code are the same):

The `|` in some of the benchmark inputs is a delimiter to be able to pass multiple arguments to the function being benchmarked.
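As a rough illustration of the `|` convention, a benchmark harness could split each input string on the delimiter and spread the pieces as separate arguments. The helper name `callWithInput` and the sample inputs below are assumptions made for this sketch; they are not taken from the benchmark runner discussed in this thread.

```javascript
'use strict';
const path = require('path');

// Hypothetical helper: split a benchmark input string on '|' and pass
// the pieces to the function under test as separate arguments.
function callWithInput(fn, input) {
  return fn.apply(null, input.split('|'));
}

// One delimited string becomes several arguments.
console.log(callWithInput(path.posix.join, '/foo|bar|baz'));        // '/foo/bar/baz'
console.log(callWithInput(path.posix.relative, '/data/a|/data/b')); // '../b'
```

Keeping each input as a single string lets every benchmark parameter be listed as one value, even for functions such as `join()` or `relative()` that take more than one argument.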