Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

src,lib,buffer: improve atob / btoa performance #38433

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

XadillaX
Copy link
Contributor

@XadillaX XadillaX commented Apr 27, 2021

Extract modp_b64 dependence from Chromium to improve performance of atob() and btoa().

Refs: https://github.com/chromium/chromium/tree/92.0.4490.1/third_party/modp_b64

Benchmark

  • 12 Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
  • MemTotal: 32657380 kB

Old logic

$ ./node benchmark/run.js --filter buffer-atob.js buffers

buffers/buffer-atob.js
buffers/buffer-atob.js n=32 len=65536: 487.7280162769471

$ ./node benchmark/run.js --filter buffer-btoa.js buffers

buffers/buffer-btoa.js
buffers/buffer-btoa.js n=32 len=67108864: 2.4594857665824494

Logic of this PR

$ ./node benchmark/run.js --filter buffer-atob.js buffers

buffers/buffer-atob.js
buffers/buffer-atob.js n=32 len=65536: 10,588.320156389489

$ ./node benchmark/run.js --filter buffer-btoa.js buffers

buffers/buffer-btoa.js
buffers/buffer-btoa.js n=32 len=67108864: 3.503510145848807

Extract `modp_b64` dependence from Chromium to improve performance of
`atob()` and `btoa()`.

Refs: https://github.com/chromium/chromium/tree/92.0.4490.1/third_party/modp_b64
@github-actions github-actions bot added lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels Apr 27, 2021
@XadillaX XadillaX added the buffer Issues and PRs related to the buffer subsystem. label Apr 27, 2021
@@ -0,0 +1,44 @@
// Copyright Joyent, Inc. and other Node contributors.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New files added should never have a copyright header added to them.

Copy link
Member

@jasnell jasnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Despite the performance boost, I'm generally -1 on introducing a new dependency and a different codepath for base64 encoding/decoding only for atob/btoa. If anything, we should be looking at improving the base64/base64url encoding performance for Buffer. /cc @addaleax

Copy link
Member

@addaleax addaleax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this is far too much optimization for methods that should never be used.

@XadillaX
Copy link
Contributor Author

Despite the performance boost, I'm generally -1 on introducing a new dependency and a different codepath for base64 encoding/decoding only for atob/btoa. If anything, we should be looking at improving the base64/base64url encoding performance for Buffer. /cc @addaleax

I think I can try to use modp b64 in Buffer's Base64 if it's really faster.

This dep is extracted from Chromium.

@legendecas
Copy link
Member

According to https://github.com/nodejs/node/blob/master/doc/guides/contributing/pull-requests.md#dependencies policy, changes on deps/* should be sent to upstream first. In that case, as Node.js already have their own implementation of base64 functionality, adding a new dependency for base64 solely doesn't seem to be the best effort to reduce the burden of collaborators. While the base64 codebase isn't quite large, are there any improvements that are worth to be made on Node.js own base64 implementation?

@XadillaX
Copy link
Contributor Author

XadillaX commented Apr 28, 2021

@addaleax @jasnell I've updated, removing modp_b64 and using base64_encode. And got the new benchmark:

$ ./node benchmark/run.js --filter buffer-btoa.js buffers

buffers/buffer-btoa.js
buffers/buffer-btoa.js n=32 len=67108864: 3.3501380480392817

$ ./node benchmark/run.js --filter buffer-atob.js buffers

buffers/buffer-atob.js
buffers/buffer-atob.js n=32 len=65536: 6,966.175950365997

And I've found that modp_b64's decoding is faster then code in Node.js'.

Maybe we can merge this PR first (if possible), and then use modp_b64 to instead of Node.js' current Base64 code.

@addaleax
Copy link
Member

@XadillaX In order to merge this, I think we should:

  • Remove the C++-backed atob and btoa implementations
  • Remove their benchmarks
  • Undo the stylistic changes that are happening to unrelated code

@XadillaX
Copy link
Contributor Author

@XadillaX In order to merge this, I think we should:

  • Remove the C++-backed atob and btoa implementations
  • Remove their benchmarks
  • Undo the stylistic changes that are happening to unrelated code

The C++-backed implementations is just fasten the function to resolve:

// TODO(@jasnell): The implementation here has not been performance
// optimized in any way.

And this is not a large change, just call base64_encode directly to avoid do unnecessary Buffer call.

@addaleax
Copy link
Member

addaleax commented Apr 28, 2021

The C++-backed implementations is just fasten the function to resolve:

Then let’s remove the TODO comment, or edit it to make clear that there’s not actually anything to do there.

Making these functions faster tells people that it’s okay to use them. It’s not, therefore let’s not worry about their performance at all.

@@ -60,6 +60,7 @@ void OnFatalError(const char* location, const char* message);
V(ERR_INVALID_ARG_VALUE, TypeError) \
V(ERR_OSSL_EVP_INVALID_DIGEST, Error) \
V(ERR_INVALID_ARG_TYPE, TypeError) \
V(ERR_INVALID_CHARACTER, RangeError) \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DOMException are expected in atob/btoa per https://html.spec.whatwg.org/multipage/webappapis.html#dom-atob

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's wrapped in JavaScript side.

btw, this PR seems won't be merged according to @addaleax's suggestion.

@XadillaX
Copy link
Contributor Author

XadillaX commented May 1, 2021

@addaleax I've tried to use modp_b64 to instead of current and found the performance is almost the same (c975dff...XadillaX:base64-perf).

So the way to improve this PR is just to make atob() / btoa() using base64_encode / base64_decode directly.

But as you said before, there's no need to improve them.

That means the only thing I should do is to remove @jasnell's TODO comment?

/cc @jasnell

XadillaX added a commit to XadillaX/node that referenced this pull request May 5, 2021
aduh95 pushed a commit to XadillaX/node that referenced this pull request May 9, 2021
Refs: nodejs#38433 (comment)

PR-URL: nodejs#38548
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Zijian Liu <[email protected]>
Reviewed-By: Trivikram Kamat <[email protected]>
targos pushed a commit that referenced this pull request May 17, 2021
Refs: #38433 (comment)

PR-URL: #38548
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Zijian Liu <[email protected]>
Reviewed-By: Trivikram Kamat <[email protected]>
targos pushed a commit that referenced this pull request May 30, 2021
Refs: #38433 (comment)

PR-URL: #38548
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Zijian Liu <[email protected]>
Reviewed-By: Trivikram Kamat <[email protected]>
targos pushed a commit that referenced this pull request Jun 5, 2021
Refs: #38433 (comment)

PR-URL: #38548
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Zijian Liu <[email protected]>
Reviewed-By: Trivikram Kamat <[email protected]>
targos pushed a commit that referenced this pull request Jun 5, 2021
Refs: #38433 (comment)

PR-URL: #38548
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Zijian Liu <[email protected]>
Reviewed-By: Trivikram Kamat <[email protected]>
targos pushed a commit that referenced this pull request Jun 11, 2021
Refs: #38433 (comment)

PR-URL: #38548
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Zijian Liu <[email protected]>
Reviewed-By: Trivikram Kamat <[email protected]>
@dcousens
Copy link

dcousens commented Feb 5, 2024

We are thinking about using atob and btoa in feross/buffer#339 for web compatibility.
We have seen quite quickly that the node implementations are poorly optimised.

Although the buffer module is directed at browsers, if user code unintentionally finds it's way to using the buffer module in a node environment and subsequently ends up using atob or btoa, this is only a loss for users of node, not our library.

@lemire
Copy link
Member

lemire commented Feb 6, 2024

Note that it is planned that simdutf could provide fast routines for base64 in the near future.

See
nodejs/performance#128

@lemire lemire mentioned this pull request Mar 16, 2024
@lemire
Copy link
Member

lemire commented Mar 16, 2024

The simdutf library (already part of Node.js) will have accelerated base64 support in the next release. Decoding will follow the WHATWG forgiving-base64 specification so that ASCII white spaces are allowed in base64 content.

simdutf/simdutf#375

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants