Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

querystring: improve parse() and escape() performance #5012

Closed
wants to merge 3 commits into from

Conversation

mscdex
Copy link
Contributor

@mscdex mscdex commented Jan 31, 2016

parse() performance is improved by ~20-200% with the various querystring-parse benchmarks.

Some optimization strategies used include:

  • Combining multiple searches (for '&', '=', and '+') on the same
    string into a single loop
  • Avoiding string.split()
  • Minimizing creation of temporary strings
  • Avoiding string decoding if no encoded bytes were found and the
    default string decoder is being used

escape() performance is improved a bit, up to ~15% with the various querystring-stringify benchmarks by reducing the number of string concatenations and avoiding a potential deopt if the input string ends on an incomplete multibyte character.

Also, a constant deopt in unescapeBuffer() is avoided by checking the index (to make sure it is not out of bounds) passed to charCodeAt()

@mscdex mscdex added the querystring Issues and PRs related to the built-in querystring module. label Jan 31, 2016
@jbergstroem
Copy link
Member

@infusion
Copy link

Why did you remove the str.length cache in escape()?

@mscdex
Copy link
Contributor Author

mscdex commented Jan 31, 2016

@infusion It's not necessary with modern versions of v8.

@@ -18,7 +18,7 @@ QueryString.unescapeBuffer = function(s, decodeSpaces) {
var n, m, hexchar;

for (var inIndex = 0, outIndex = 0; inIndex <= s.length; inIndex++) {
var c = s.charCodeAt(inIndex);
var c = inIndex < s.length ? s.charCodeAt(inIndex) : NaN;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, what is the benefit of using NaN here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

charCodeAt() returns NaN for out of bounds indices. I was just keeping the same behavior here but avoiding the deopt.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh that makes sense. Thanks

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would there be drawbacks to changing the <= to < in the loop test? You'd have to replicate some of the out[outIndex++] assignments after the loop but it would keep the loop body simple. I guess you can also accomplish that with a s/NaN/0/.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left it as-is to keep changes minimal. I typically don't like to duplicate code when reusing the loop logic like that is easy/simple enough.

@mscdex
Copy link
Contributor Author

mscdex commented Feb 1, 2016

@bnoordhuis I've fixed the missing post-OptimizeFunctionOnNextCall function calls. Performance is still the same FWIW.

@jasnell
Copy link
Member

jasnell commented Feb 1, 2016

LGTM

@@ -18,7 +18,7 @@ QueryString.unescapeBuffer = function(s, decodeSpaces) {
var n, m, hexchar;

for (var inIndex = 0, outIndex = 0; inIndex <= s.length; inIndex++) {
var c = s.charCodeAt(inIndex);
var c = inIndex < s.length ? s.charCodeAt(inIndex) : NaN;
switch (state) {
case 'CHAR':
switch (c) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect you can eke out some more performance if you replace the calls to charCode() with their number literal equivalents.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// slower
if (keys.indexOf(key) === -1) {
obj[key] = value;
keys.push(key);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mscdex General question – isn't keys[keys.length] = key still better than keys.push(key)?

Few months ago I've inspected compiled code using IRHydra and I remember some difference between pushing items and assigning by array length in loop in favor of second one.

Does it matter anymore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing with Chrome with v8 4.7 on jsperf shows that using arr[arr.length] = x is indeed much faster, but there wasn't as large of a performance gain in the node benchmark (including a new benchmark input I just added that has even more duplicate keys). However I've changed it anyway for the small performance increase it does provide.

@mscdex
Copy link
Contributor Author

mscdex commented Feb 11, 2016

Can I get some more LGTMs on this one?

/cc @nodejs/collaborators

@jasnell
Copy link
Member

jasnell commented Feb 11, 2016

Still LGTM

@silverwind
Copy link
Contributor

LGTM pending CI.

CI: https://ci.nodejs.org/job/node-test-pull-request/1638/

@mcollina
Copy link
Member

LGTM

This commit improves parse() performance by ~20-200% with the various
querystring-parse benchmarks.

Some optimization strategies used in this commit include:
* Combining multiple searches (for '&', '=', and '+') on the same
   string into a single loop
* Avoiding string.split()
* Minimizing creation of temporary strings
* Avoiding string decoding if no encoded bytes were found and the
   default string decoder is being used
Before this, v8 would deopt when an out of bounds `inIndex` would get
passed to charCodeAt(). charCodeAt() returns NaN in such cases, so we
directly emulate that behavior as well.

Also, calls to charCodeAt() for constant strings have been replaced
by the raw character codes and parser state is now stored as an
integer instead of a string. Both of these provide a slight
performance increase.
This commit improves escape() performance by up to 15% with the
existing querystring-stringify benchmarks by reducing the number
of string concatentations. A potential deopt is also avoided by
making sure the index passed to charCodeAt() is within bounds.
@mscdex
Copy link
Contributor Author

mscdex commented Feb 13, 2016

CI again since the last one had CI infrastructure issues: https://ci.nodejs.org/job/node-test-commit/2216/

mscdex added a commit that referenced this pull request Feb 13, 2016
This commit improves parse() performance by ~20-200% with the various
querystring-parse benchmarks.

Some optimization strategies used in this commit include:
* Combining multiple searches (for '&', '=', and '+') on the same
   string into a single loop
* Avoiding string.split()
* Minimizing creation of temporary strings
* Avoiding string decoding if no encoded bytes were found and the
   default string decoder is being used

PR-URL: #5012
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Roman Reiss <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
mscdex added a commit that referenced this pull request Feb 13, 2016
Before this, v8 would deopt when an out of bounds `inIndex` would get
passed to charCodeAt(). charCodeAt() returns NaN in such cases, so we
directly emulate that behavior as well.

Also, calls to charCodeAt() for constant strings have been replaced
by the raw character codes and parser state is now stored as an
integer instead of a string. Both of these provide a slight
performance increase.

PR-URL: #5012
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Roman Reiss <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
mscdex added a commit that referenced this pull request Feb 13, 2016
This commit improves escape() performance by up to 15% with the
existing querystring-stringify benchmarks by reducing the number
of string concatentations. A potential deopt is also avoided by
making sure the index passed to charCodeAt() is within bounds.

PR-URL: #5012
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Roman Reiss <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
@mscdex
Copy link
Contributor Author

mscdex commented Feb 13, 2016

Landed in 00638ac, c8e650d, and a2a69a2.

@mscdex mscdex closed this Feb 13, 2016
@mscdex mscdex deleted the perf-querystring branch February 13, 2016 01:33
rvagg pushed a commit that referenced this pull request Feb 15, 2016
This commit improves parse() performance by ~20-200% with the various
querystring-parse benchmarks.

Some optimization strategies used in this commit include:
* Combining multiple searches (for '&', '=', and '+') on the same
   string into a single loop
* Avoiding string.split()
* Minimizing creation of temporary strings
* Avoiding string decoding if no encoded bytes were found and the
   default string decoder is being used

PR-URL: #5012
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Roman Reiss <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
rvagg pushed a commit that referenced this pull request Feb 15, 2016
Before this, v8 would deopt when an out of bounds `inIndex` would get
passed to charCodeAt(). charCodeAt() returns NaN in such cases, so we
directly emulate that behavior as well.

Also, calls to charCodeAt() for constant strings have been replaced
by the raw character codes and parser state is now stored as an
integer instead of a string. Both of these provide a slight
performance increase.

PR-URL: #5012
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Roman Reiss <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
rvagg pushed a commit that referenced this pull request Feb 15, 2016
This commit improves escape() performance by up to 15% with the
existing querystring-stringify benchmarks by reducing the number
of string concatentations. A potential deopt is also avoided by
making sure the index passed to charCodeAt() is within bounds.

PR-URL: #5012
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Roman Reiss <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
rvagg added a commit that referenced this pull request Feb 21, 2016
* buffer:
  - You can now supply an encoding argument when filling a
    Buffer Buffer#fill(string[, start[, end]][, encoding]), supplying
    an existing Buffer will also work with
    Buffer#fill(buffer[, start[, end]]). See the API documentation for
    details on how this works. (Trevor Norris) #4935
  - Buffer#indexOf() no longer requires a byteOffset argument if you
    also wish to specify an encoding:
    Buffer#indexOf(val[, byteOffset][, encoding]).
    (Trevor Norris) #4803
* child_process: spawn() and spawnSync() now support a 'shell' option
  to allow for optional execution of the given command inside a shell.
  If set to true, cmd.exe will be used on Windows and /bin/sh
  elsewhere. A path to a custom shell can also be passed to override
  these defaults. On Windows, this option allows .bat. and .cmd files
  to be executed with spawn() and spawnSync(). (Colin Ihrig) #4598
* http_parser: Update to http-parser 2.6.2 to fix an unintentionally
  strict limitation of allowable header characters.
  (James M Snell) #5237
* dgram: socket.send() now supports accepts an array of Buffers or
  Strings as the first argument. See the API docs for details on how
  this works. (Matteo Collina) #4374
* http: Fix a bug where handling headers will mistakenly trigger an
  'upgrade' event where the server is just advertising its protocols.
  This bug can prevent HTTP clients from communicating with HTTP/2
  enabled servers. (Fedor Indutny) #4337
* net: Added a listening Boolean property to net and http servers to
  indicate whether the server is listening for connections.
  (José Moreira) #4743
* node: The C++ node::MakeCallback() API is now reentrant and calling
  it from inside another MakeCallback() call no longer causes the
  nextTick queue or Promises microtask queue to be processed out of
  order. (Trevor Norris) #4507
* tls: Add a new tlsSocket.getProtocol() method to get the negotiated
  TLS protocol version of the current connection. (Brian White) #4995
* vm: Introduce new 'produceCachedData' and 'cachedData' options to
  new vm.Script() to interact with V8's code cache. When a new
  vm.Script object is created with the 'produceCachedData' set to true
  a Buffer with V8's code cache data will be produced and stored in
  cachedData property of the returned object. This data in turn may be
  supplied back to another vm.Script() object with a 'cachedData'
  option if the supplied source is the same. Successfully executing a
  script from cached data can speed up instantiation time. See the API
  docs for details. (Fedor Indutny) #4777
* performance: Improvements in:
  - process.nextTick() (Ruben Bridgewater) #5092
  - path module (Brian White) #5123
  - querystring module (Brian White) #5012
  - streams module when processing small chunks (Matteo Collina) #4354
rvagg added a commit that referenced this pull request Feb 21, 2016
* buffer:
  - You can now supply an encoding argument when filling a
    Buffer Buffer#fill(string[, start[, end]][, encoding]), supplying
    an existing Buffer will also work with
    Buffer#fill(buffer[, start[, end]]). See the API documentation for
    details on how this works. (Trevor Norris) #4935
  - Buffer#indexOf() no longer requires a byteOffset argument if you
    also wish to specify an encoding:
    Buffer#indexOf(val[, byteOffset][, encoding]).
    (Trevor Norris) #4803
* child_process: spawn() and spawnSync() now support a 'shell' option
  to allow for optional execution of the given command inside a shell.
  If set to true, cmd.exe will be used on Windows and /bin/sh
  elsewhere. A path to a custom shell can also be passed to override
  these defaults. On Windows, this option allows .bat. and .cmd files
  to be executed with spawn() and spawnSync(). (Colin Ihrig) #4598
* http_parser: Update to http-parser 2.6.2 to fix an unintentionally
  strict limitation of allowable header characters.
  (James M Snell) #5237
* dgram: socket.send() now supports accepts an array of Buffers or
  Strings as the first argument. See the API docs for details on how
  this works. (Matteo Collina) #4374
* http: Fix a bug where handling headers will mistakenly trigger an
  'upgrade' event where the server is just advertising its protocols.
  This bug can prevent HTTP clients from communicating with HTTP/2
  enabled servers. (Fedor Indutny) #4337
* net: Added a listening Boolean property to net and http servers to
  indicate whether the server is listening for connections.
  (José Moreira) #4743
* node: The C++ node::MakeCallback() API is now reentrant and calling
  it from inside another MakeCallback() call no longer causes the
  nextTick queue or Promises microtask queue to be processed out of
  order. (Trevor Norris) #4507
* tls: Add a new tlsSocket.getProtocol() method to get the negotiated
  TLS protocol version of the current connection. (Brian White) #4995
* vm: Introduce new 'produceCachedData' and 'cachedData' options to
  new vm.Script() to interact with V8's code cache. When a new
  vm.Script object is created with the 'produceCachedData' set to true
  a Buffer with V8's code cache data will be produced and stored in
  cachedData property of the returned object. This data in turn may be
  supplied back to another vm.Script() object with a 'cachedData'
  option if the supplied source is the same. Successfully executing a
  script from cached data can speed up instantiation time. See the API
  docs for details. (Fedor Indutny) #4777
* performance: Improvements in:
  - process.nextTick() (Ruben Bridgewater) #5092
  - path module (Brian White) #5123
  - querystring module (Brian White) #5012
  - streams module when processing small chunks (Matteo Collina) #4354
rvagg added a commit that referenced this pull request Feb 23, 2016
* buffer:
  - You can now supply an encoding argument when filling a
    Buffer Buffer#fill(string[, start[, end]][, encoding]), supplying
    an existing Buffer will also work with
    Buffer#fill(buffer[, start[, end]]). See the API documentation for
    details on how this works. (Trevor Norris) #4935
  - Buffer#indexOf() no longer requires a byteOffset argument if you
    also wish to specify an encoding:
    Buffer#indexOf(val[, byteOffset][, encoding]).
    (Trevor Norris) #4803
* child_process: spawn() and spawnSync() now support a 'shell' option
  to allow for optional execution of the given command inside a shell.
  If set to true, cmd.exe will be used on Windows and /bin/sh
  elsewhere. A path to a custom shell can also be passed to override
  these defaults. On Windows, this option allows .bat. and .cmd files
  to be executed with spawn() and spawnSync(). (Colin Ihrig) #4598
* http_parser: Update to http-parser 2.6.2 to fix an unintentionally
  strict limitation of allowable header characters.
  (James M Snell) #5237
* dgram: socket.send() now supports accepts an array of Buffers or
  Strings as the first argument. See the API docs for details on how
  this works. (Matteo Collina) #4374
* http: Fix a bug where handling headers will mistakenly trigger an
  'upgrade' event where the server is just advertising its protocols.
  This bug can prevent HTTP clients from communicating with HTTP/2
  enabled servers. (Fedor Indutny) #4337
* net: Added a listening Boolean property to net and http servers to
  indicate whether the server is listening for connections.
  (José Moreira) #4743
* node: The C++ node::MakeCallback() API is now reentrant and calling
  it from inside another MakeCallback() call no longer causes the
  nextTick queue or Promises microtask queue to be processed out of
  order. (Trevor Norris) #4507
* tls: Add a new tlsSocket.getProtocol() method to get the negotiated
  TLS protocol version of the current connection. (Brian White) #4995
* vm: Introduce new 'produceCachedData' and 'cachedData' options to
  new vm.Script() to interact with V8's code cache. When a new
  vm.Script object is created with the 'produceCachedData' set to true
  a Buffer with V8's code cache data will be produced and stored in
  cachedData property of the returned object. This data in turn may be
  supplied back to another vm.Script() object with a 'cachedData'
  option if the supplied source is the same. Successfully executing a
  script from cached data can speed up instantiation time. See the API
  docs for details. (Fedor Indutny) #4777
* performance: Improvements in:
  - process.nextTick() (Ruben Bridgewater) #5092
  - path module (Brian White) #5123
  - querystring module (Brian White) #5012
  - streams module when processing small chunks (Matteo Collina) #4354

PR-URL: #5295
stefanmb pushed a commit to stefanmb/node that referenced this pull request Feb 23, 2016
This commit improves parse() performance by ~20-200% with the various
querystring-parse benchmarks.

Some optimization strategies used in this commit include:
* Combining multiple searches (for '&', '=', and '+') on the same
   string into a single loop
* Avoiding string.split()
* Minimizing creation of temporary strings
* Avoiding string decoding if no encoded bytes were found and the
   default string decoder is being used

PR-URL: nodejs#5012
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Roman Reiss <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
stefanmb pushed a commit to stefanmb/node that referenced this pull request Feb 23, 2016
Before this, v8 would deopt when an out of bounds `inIndex` would get
passed to charCodeAt(). charCodeAt() returns NaN in such cases, so we
directly emulate that behavior as well.

Also, calls to charCodeAt() for constant strings have been replaced
by the raw character codes and parser state is now stored as an
integer instead of a string. Both of these provide a slight
performance increase.

PR-URL: nodejs#5012
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Roman Reiss <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
stefanmb pushed a commit to stefanmb/node that referenced this pull request Feb 23, 2016
This commit improves escape() performance by up to 15% with the
existing querystring-stringify benchmarks by reducing the number
of string concatentations. A potential deopt is also avoided by
making sure the index passed to charCodeAt() is within bounds.

PR-URL: nodejs#5012
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Roman Reiss <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
rvagg added a commit that referenced this pull request Feb 24, 2016
* buffer:
  - You can now supply an encoding argument when filling a
    Buffer Buffer#fill(string[, start[, end]][, encoding]), supplying
    an existing Buffer will also work with
    Buffer#fill(buffer[, start[, end]]). See the API documentation for
    details on how this works. (Trevor Norris) #4935
  - Buffer#indexOf() no longer requires a byteOffset argument if you
    also wish to specify an encoding:
    Buffer#indexOf(val[, byteOffset][, encoding]).
    (Trevor Norris) #4803
* child_process: spawn() and spawnSync() now support a 'shell' option
  to allow for optional execution of the given command inside a shell.
  If set to true, cmd.exe will be used on Windows and /bin/sh
  elsewhere. A path to a custom shell can also be passed to override
  these defaults. On Windows, this option allows .bat. and .cmd files
  to be executed with spawn() and spawnSync(). (Colin Ihrig) #4598
* http_parser: Update to http-parser 2.6.2 to fix an unintentionally
  strict limitation of allowable header characters.
  (James M Snell) #5237
* dgram: socket.send() now supports accepts an array of Buffers or
  Strings as the first argument. See the API docs for details on how
  this works. (Matteo Collina) #4374
* http: Fix a bug where handling headers will mistakenly trigger an
  'upgrade' event where the server is just advertising its protocols.
  This bug can prevent HTTP clients from communicating with HTTP/2
  enabled servers. (Fedor Indutny) #4337
* net: Added a listening Boolean property to net and http servers to
  indicate whether the server is listening for connections.
  (José Moreira) #4743
* node: The C++ node::MakeCallback() API is now reentrant and calling
  it from inside another MakeCallback() call no longer causes the
  nextTick queue or Promises microtask queue to be processed out of
  order. (Trevor Norris) #4507
* tls: Add a new tlsSocket.getProtocol() method to get the negotiated
  TLS protocol version of the current connection. (Brian White) #4995
* vm: Introduce new 'produceCachedData' and 'cachedData' options to
  new vm.Script() to interact with V8's code cache. When a new
  vm.Script object is created with the 'produceCachedData' set to true
  a Buffer with V8's code cache data will be produced and stored in
  cachedData property of the returned object. This data in turn may be
  supplied back to another vm.Script() object with a 'cachedData'
  option if the supplied source is the same. Successfully executing a
  script from cached data can speed up instantiation time. See the API
  docs for details. (Fedor Indutny) #4777
* performance: Improvements in:
  - process.nextTick() (Ruben Bridgewater) #5092
  - path module (Brian White) #5123
  - querystring module (Brian White) #5012
  - streams module when processing small chunks (Matteo Collina) #4354

PR-URL: #5295
@MylesBorins
Copy link
Contributor

Adding the LTS watch flag, but I think this will have to sit for a while before we know that this is stable.

Thoughts?

@jasnell
Copy link
Member

jasnell commented Mar 11, 2016

Yeah, like the path changes, we'll want to let this sit for a good long time before backporting.

@rvagg
Copy link
Member

rvagg commented Mar 14, 2016

-1 for LTS is my vote (not being absolute, if you two want to disagree with me). I'm leaning that way for perf changes, stronger for larger changes (not just in LOC but impact). Performance profile of LTS should be relatively stable over time and we owe it to users not to screw with things too much. While we can pick up edge cases with Stable releases, there's a whole different sector of users who use LTS that may experience totally different edge cases than users of Stable might.

@MylesBorins
Copy link
Contributor

@rvagg I don't disagree.
What I do think could be interesting though is keeping track of larger changes like this and all the future regressions so that it will be easy to backport the entire lot if we need to (e.g. to many changes making the overall backporting process a nightmare)

@MylesBorins
Copy link
Contributor

marking this don't land for now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
querystring Issues and PRs related to the built-in querystring module.
Projects
None yet
Development

Successfully merging this pull request may close these issues.