url: treat special characters in hostnames more strictly in url.parse() #45046

Closed
wants to merge 1 commit into from

@Trott Trott commented Oct 18, 2022

Throw if ^, |, and some other special characters are in the hostname, similar to WHATWG URL.
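For reference, a minimal sketch of the WHATWG behavior this change imitates. The helper name `whatwgRejects` and the sample hostnames are illustrative assumptions, not code from this PR; the helper only reports whether `new URL()` throws for a given input.

```javascript
// Hypothetical helper (not part of this PR): reports whether the WHATWG
// parser rejects a URL string.
function whatwgRejects(input) {
  try {
    new URL(input);
    return false;
  } catch {
    return true; // new URL() throws a TypeError for invalid URLs
  }
}

// '^' and '|' are forbidden host code points in the WHATWG URL Standard.
console.log(whatwgRejects('http://exa^mple.com/'));
console.log(whatwgRejects('http://exa|mple.com/'));
// An ordinary hostname for comparison.
console.log(whatwgRejects('http://example.com/'));
```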

@nodejs-github-bot nodejs-github-bot added needs-ci PRs that need a full CI run. url Issues and PRs related to the legacy built-in url module. labels Oct 18, 2022
@Trott Trott added the request-ci Add this label to start a Jenkins CI on a PR. label Oct 18, 2022
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Oct 18, 2022
@Trott Trott requested review from lpinca and anonrig October 18, 2022 05:12
lib/url.js Outdated
this.host = rest.slice(start, nonHost);
// WHATWG URL removes tabs, newlines, and carriage returns. Let's do that too.
@lpinca lpinca Oct 18, 2022

It does it everywhere, not only on the host component. I'm not sure if url.parse() should do the same and how many breaking changes are tolerable.

Member Author

> It does it everywhere, not only on the host component. I'm not sure if url.parse() should do the same and how many breaking changes are tolerable.

I agree in principle, but I think narrowly scoping to hostname to start (and maybe even stopping there) makes a lot of sense as hostname spoofing is a bigger security concern than path or scheme spoofing. (Or at least that's my assumption. If anyone else thinks that's incorrect, I'm open to being persuaded.)
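To illustrate the point under discussion, here is a small sketch of how the WHATWG parser treats tabs, newlines, and carriage returns. The sample URL is an assumption chosen for illustration; it shows those characters being dropped from the host, the path, and the query alike.

```javascript
// The WHATWG parser strips ASCII tabs and newlines from the whole input
// before parsing, so they disappear from every component, not just the host.
const parsed = new URL('http://exa\tmple.com/pa\nth?q=\r1');

console.log(parsed.hostname);
console.log(parsed.pathname);
console.log(parsed.search);
```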

anonrig commented Oct 18, 2022

@Trott Can we run a benchmark to see the performance impact of this pull request?

Trott commented Oct 18, 2022

> @Trott Can we run a benchmark to see the performance impact of this pull request?

Sure.

But I think I'm OK if url.parse() doesn't perform as well as new URL().
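A rough local comparison of the two parsers can be sketched as follows. This is only an illustrative micro-benchmark: the `timeIt` helper, the sample URL, and the iteration count are assumptions, and it is far less rigorous than the Node.js benchmark suite run on CI.

```javascript
const legacyUrl = require('node:url');
const { performance } = require('node:perf_hooks');

// Hypothetical helper: run `fn` `iterations` times and return elapsed milliseconds.
function timeIt(fn, iterations) {
  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn();
  return performance.now() - start;
}

const sample = 'http://user:pass@example.com:8080/a/b?c=d#e';
const iterations = 1e5;

const legacyMs = timeIt(() => legacyUrl.parse(sample), iterations);
const whatwgMs = timeIt(() => new URL(sample), iterations);

console.log(`url.parse(): ${legacyMs.toFixed(1)} ms`);
console.log(`new URL():   ${whatwgMs.toFixed(1)} ms`);
```

The numbers vary between machines and Node.js versions, so they should be read as a sanity check rather than a measurement.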

@jasnell jasnell left a comment

When this goes out in a release, let's be sure to include details on the performance impact of these changes. The perf regression is a key reason why these kinds of changes weren't made before. What we're essentially saying is that we are no longer committing to maintaining the same perf on url.parse().

@anonrig anonrig left a comment

Looks good to me. Left a small/non-blocking comment.

lib/url.js Outdated
@Trott Trott added commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. request-ci Add this label to start a Jenkins CI on a PR. labels Oct 19, 2022
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Oct 19, 2022

Trott commented Oct 19, 2022

To unbreak one of the benchmark URLs, we use toAscii()/punycode when we find % in a domain, like WHATWG URL.
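As a sketch of the WHATWG behavior being matched here: the host is percent-decoded and then passed through domain-to-ASCII conversion. The sample hostnames are illustrative assumptions; `url.domainToASCII()` is the public Node.js API for the conversion and is used here in place of the internal helper named in the comment above.

```javascript
const { domainToASCII } = require('node:url');

// WHATWG host parsing percent-decodes the host, then converts it to ASCII.
const decodedHost = new URL('http://ex%61mple.com/').hostname;
console.log(decodedHost);

// domainToASCII() applies the Punycode conversion to non-ASCII labels.
const asciiHost = domainToASCII('español.com');
console.log(asciiHost);
```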

Trott commented Oct 19, 2022

Here are all 55 of the three-star/highest-confidence results from the benchmark run. 😱

04:19:50 url/legacy-vs-whatwg-url-get-prop.js e=1 method='legacy' type='dot'                                              ***     21.79 %      ±11.67% ±15.53% ±20.21%
04:19:50 url/legacy-vs-whatwg-url-get-prop.js e=1 method='legacy' type='percent'                                          ***     55.40 %      ±11.01% ±14.67% ±19.13%
04:19:50 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='auth' withBase='false'                               ***    -13.89 %       ±7.35%  ±9.79% ±12.74%
04:19:50 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='dot' withBase='false'                                ***    -26.02 %       ±5.92%  ±7.88% ±10.27%
04:19:50 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='dot' withBase='true'                                 ***    -25.38 %       ±6.08%  ±8.12% ±10.64%
04:19:50 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='idn' withBase='false'                                ***    -21.70 %       ±6.91%  ±9.23% ±12.10%
04:19:50 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='idn' withBase='true'                                 ***    -19.09 %       ±5.25%  ±7.00%  ±9.15%
04:19:50 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='percent' withBase='false'                            ***    -32.08 %       ±5.45%  ±7.25%  ±9.45%
04:19:50 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='percent' withBase='true'                             ***    -34.36 %       ±5.39%  ±7.19%  ±9.39%
04:19:50 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='short' withBase='false'                              ***    -22.96 %       ±5.92%  ±7.88% ±10.26%
04:19:50 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='short' withBase='true'                               ***    -27.76 %       ±7.29%  ±9.75% ±12.79%
04:19:50 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='wpt' withBase='false'                                ***    -12.39 %       ±6.88%  ±9.16% ±11.92%
04:19:50 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='wpt' withBase='true'                                 ***    -11.71 %       ±6.67%  ±8.88% ±11.55%
04:19:50 url/url-parse.js n=10000000 type='escaped'                                                                       ***    -42.61 %       ±2.12%  ±2.85%  ±3.74%
04:19:50 url/url-parse.js n=10000000 type='normal'                                                                        ***    -49.44 %       ±1.48%  ±1.98%  ±2.60%
04:19:50 url/url-resolve.js n=100000 path='down' href='auth'                                                              ***    -17.13 %       ±2.62%  ±3.49%  ±4.55%
04:19:50 url/url-resolve.js n=100000 path='down' href='dot'                                                               ***    -11.41 %       ±4.99%  ±6.69%  ±8.78%
04:19:50 url/url-resolve.js n=100000 path='down' href='file'                                                              ***     -4.20 %       ±1.87%  ±2.49%  ±3.24%
04:19:50 url/url-resolve.js n=100000 path='down' href='idn'                                                               ***    -13.59 %       ±3.60%  ±4.81%  ±6.31%
04:19:50 url/url-resolve.js n=100000 path='down' href='long'                                                              ***    -12.39 %       ±2.30%  ±3.06%  ±3.99%
04:19:50 url/url-resolve.js n=100000 path='down' href='percent'                                                           ***    -27.67 %       ±3.74%  ±4.99%  ±6.53%
04:19:50 url/url-resolve.js n=100000 path='down' href='short'                                                             ***    -17.52 %       ±2.95%  ±3.94%  ±5.16%
04:19:50 url/url-resolve.js n=100000 path='down' href='ws'                                                                ***    -11.66 %       ±4.96%  ±6.60%  ±8.59%
04:19:50 url/url-resolve.js n=100000 path='foo/bar' href='auth'                                                           ***    -16.37 %       ±4.48%  ±5.96%  ±7.76%
04:19:50 url/url-resolve.js n=100000 path='foo/bar' href='dot'                                                            ***    -20.02 %       ±7.06%  ±9.46% ±12.44%
04:19:50 url/url-resolve.js n=100000 path='foo/bar' href='idn'                                                            ***    -19.17 %       ±5.72%  ±7.65% ±10.03%
04:19:50 url/url-resolve.js n=100000 path='foo/bar' href='long'                                                           ***    -13.00 %       ±3.64%  ±4.85%  ±6.33%
04:19:50 url/url-resolve.js n=100000 path='foo/bar' href='percent'                                                        ***    -33.90 %       ±3.48%  ±4.67%  ±6.14%
04:19:50 url/url-resolve.js n=100000 path='foo/bar' href='short'                                                          ***    -18.60 %       ±5.24%  ±6.97%  ±9.08%
04:19:50 url/url-resolve.js n=100000 path='foo/bar' href='ws'                                                             ***    -18.65 %       ±7.02%  ±9.38% ±12.29%
04:19:50 url/url-resolve.js n=100000 path='sibling' href='auth'                                                           ***    -16.80 %       ±4.46%  ±5.98%  ±7.88%
04:19:50 url/url-resolve.js n=100000 path='sibling' href='dot'                                                            ***    -18.36 %       ±4.24%  ±5.68%  ±7.44%
04:19:50 url/url-resolve.js n=100000 path='sibling' href='idn'                                                            ***    -16.18 %       ±3.72%  ±4.96%  ±6.48%
04:19:50 url/url-resolve.js n=100000 path='sibling' href='long'                                                           ***    -14.56 %       ±4.39%  ±5.86%  ±7.67%
04:19:50 url/url-resolve.js n=100000 path='sibling' href='percent'                                                        ***    -28.08 %       ±4.71%  ±6.29%  ±8.23%
04:19:50 url/url-resolve.js n=100000 path='sibling' href='short'                                                          ***    -18.88 %       ±4.00%  ±5.35%  ±7.05%
04:19:50 url/url-resolve.js n=100000 path='sibling' href='ws'                                                             ***    -12.67 %       ±2.54%  ±3.38%  ±4.40%
04:19:50 url/url-resolve.js n=100000 path='up' href='auth'                                                                ***    -16.19 %       ±2.94%  ±3.92%  ±5.11%
04:19:50 url/url-resolve.js n=100000 path='up' href='dot'                                                                 ***    -16.40 %       ±4.00%  ±5.34%  ±7.00%
04:19:50 url/url-resolve.js n=100000 path='up' href='idn'                                                                 ***    -15.95 %       ±4.33%  ±5.79%  ±7.58%
04:19:50 url/url-resolve.js n=100000 path='up' href='long'                                                                ***    -12.24 %       ±2.60%  ±3.46%  ±4.51%
04:19:50 url/url-resolve.js n=100000 path='up' href='percent'                                                             ***    -24.43 %       ±3.97%  ±5.29%  ±6.89%
04:19:50 url/url-resolve.js n=100000 path='up' href='short'                                                               ***    -17.75 %       ±1.62%  ±2.15%  ±2.80%
04:19:50 url/url-resolve.js n=100000 path='up' href='ws'                                                                  ***    -14.06 %       ±3.80%  ±5.07%  ±6.61%
04:19:50 url/url-resolve.js n=100000 path='withscheme' href='auth'                                                        ***    -20.71 %       ±4.10%  ±5.49%  ±7.21%
04:19:50 url/url-resolve.js n=100000 path='withscheme' href='dot'                                                         ***    -28.56 %       ±5.13%  ±6.83%  ±8.89%
04:19:50 url/url-resolve.js n=100000 path='withscheme' href='file'                                                        ***    -22.89 %       ±1.31%  ±1.74%  ±2.27%
04:19:50 url/url-resolve.js n=100000 path='withscheme' href='idn'                                                         ***    -19.49 %       ±5.37%  ±7.14%  ±9.29%
04:19:50 url/url-resolve.js n=100000 path='withscheme' href='javascript'                                                  ***    -24.86 %       ±2.92%  ±3.91%  ±5.15%
04:19:50 url/url-resolve.js n=100000 path='withscheme' href='long'                                                        ***    -18.77 %       ±1.74%  ±2.32%  ±3.03%
04:19:50 url/url-resolve.js n=100000 path='withscheme' href='noscheme'                                                    ***    -22.98 %       ±1.45%  ±1.93%  ±2.53%
04:19:50 url/url-resolve.js n=100000 path='withscheme' href='percent'                                                     ***    -37.70 %       ±4.37%  ±5.84%  ±7.65%
04:19:50 url/url-resolve.js n=100000 path='withscheme' href='short'                                                       ***    -29.33 %       ±3.16%  ±4.23%  ±5.54%
04:19:50 url/url-resolve.js n=100000 path='withscheme' href='ws'                                                          ***    -21.58 %       ±4.64%  ±6.17%  ±8.05%

Trott commented Oct 19, 2022

Benchmark with replaceAll() replaced by replace(). https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/1202/

12:35:27 url/legacy-vs-whatwg-url-get-prop.js e=1 method='legacy' type='percent'                                          ***     50.11 %       ±9.92% ±13.26% ±17.40%
12:35:27 url/legacy-vs-whatwg-url-get-prop.js e=1 method='whatwg' type='ws'                                               ***      8.47 %       ±4.60%  ±6.13%  ±8.00%
12:35:27 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='dot' withBase='false'                                ***    -24.97 %       ±5.22%  ±6.96%  ±9.11%
12:35:27 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='dot' withBase='true'                                 ***    -20.63 %       ±6.86%  ±9.13% ±11.91%
12:35:27 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='idn' withBase='false'                                ***    -14.34 %       ±5.09%  ±6.78%  ±8.82%
12:35:27 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='idn' withBase='true'                                 ***    -16.03 %       ±6.07%  ±8.09% ±10.54%
12:35:27 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='percent' withBase='false'                            ***    -35.88 %       ±4.89%  ±6.54%  ±8.56%
12:35:27 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='percent' withBase='true'                             ***    -33.48 %       ±6.16%  ±8.19% ±10.67%
12:35:27 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='short' withBase='false'                              ***    -25.37 %       ±6.38%  ±8.50% ±11.07%
12:35:27 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='short' withBase='true'                               ***    -28.95 %       ±5.67%  ±7.56%  ±9.87%
12:35:27 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='wpt' withBase='false'                                ***    -15.01 %       ±5.55%  ±7.41%  ±9.68%
12:35:27 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='wpt' withBase='true'                                 ***    -15.21 %       ±5.64%  ±7.51%  ±9.79%
12:35:27 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='ws' withBase='true'                                  ***    -12.26 %       ±6.26%  ±8.34% ±10.86%
12:35:27 url/url-parse.js n=10000000 type='escaped'                                                                       ***    -40.37 %       ±1.26%  ±1.68%  ±2.19%
12:35:27 url/url-parse.js n=10000000 type='normal'                                                                        ***    -46.26 %       ±1.63%  ±2.19%  ±2.87%
12:35:27 url/url-resolve.js n=100000 path='down' href='auth'                                                              ***    -14.57 %       ±3.16%  ±4.22%  ±5.51%
12:35:27 url/url-resolve.js n=100000 path='down' href='dot'                                                               ***    -17.55 %       ±3.78%  ±5.07%  ±6.67%
12:35:27 url/url-resolve.js n=100000 path='down' href='idn'                                                               ***    -19.57 %       ±5.02%  ±6.72%  ±8.83%
12:35:27 url/url-resolve.js n=100000 path='down' href='long'                                                              ***    -11.82 %       ±2.53%  ±3.39%  ±4.46%
12:35:27 url/url-resolve.js n=100000 path='down' href='percent'                                                           ***    -27.30 %       ±3.59%  ±4.78%  ±6.22%
12:35:27 url/url-resolve.js n=100000 path='down' href='short'                                                             ***    -14.63 %       ±4.37%  ±5.82%  ±7.58%
12:35:27 url/url-resolve.js n=100000 path='down' href='ws'                                                                ***     -9.20 %       ±4.04%  ±5.39%  ±7.03%
12:35:27 url/url-resolve.js n=100000 path='foo/bar' href='auth'                                                           ***    -15.90 %       ±4.89%  ±6.51%  ±8.48%
12:35:27 url/url-resolve.js n=100000 path='foo/bar' href='dot'                                                            ***    -20.02 %       ±7.19%  ±9.60% ±12.54%
12:35:27 url/url-resolve.js n=100000 path='foo/bar' href='idn'                                                            ***    -21.22 %       ±7.49% ±10.02% ±13.14%
12:35:27 url/url-resolve.js n=100000 path='foo/bar' href='long'                                                           ***    -14.89 %       ±2.85%  ±3.82%  ±5.05%
12:35:27 url/url-resolve.js n=100000 path='foo/bar' href='percent'                                                        ***    -33.81 %       ±5.96%  ±7.96% ±10.42%
12:35:27 url/url-resolve.js n=100000 path='foo/bar' href='short'                                                          ***    -15.31 %       ±5.60%  ±7.46%  ±9.70%
12:35:27 url/url-resolve.js n=100000 path='foo/bar' href='ws'                                                             ***    -14.00 %       ±6.57%  ±8.74% ±11.38%
12:35:27 url/url-resolve.js n=100000 path='sibling' href='auth'                                                           ***    -13.76 %       ±3.68%  ±4.92%  ±6.44%
12:35:27 url/url-resolve.js n=100000 path='sibling' href='dot'                                                            ***    -17.20 %       ±4.49%  ±6.03%  ±7.96%
12:35:27 url/url-resolve.js n=100000 path='sibling' href='idn'                                                            ***    -18.82 %       ±5.93%  ±7.92% ±10.37%
12:35:27 url/url-resolve.js n=100000 path='sibling' href='long'                                                           ***    -12.82 %       ±2.38%  ±3.18%  ±4.17%
12:35:27 url/url-resolve.js n=100000 path='sibling' href='percent'                                                        ***    -28.80 %       ±3.03%  ±4.06%  ±5.32%
12:35:27 url/url-resolve.js n=100000 path='sibling' href='short'                                                          ***    -14.33 %       ±4.75%  ±6.33%  ±8.24%
12:35:27 url/url-resolve.js n=100000 path='sibling' href='ws'                                                             ***    -16.10 %       ±5.26%  ±7.04%  ±9.24%
12:35:27 url/url-resolve.js n=100000 path='up' href='auth'                                                                ***    -14.18 %       ±3.47%  ±4.62%  ±6.02%
12:35:27 url/url-resolve.js n=100000 path='up' href='dot'                                                                 ***    -17.47 %       ±3.33%  ±4.45%  ±5.83%
12:35:27 url/url-resolve.js n=100000 path='up' href='idn'                                                                 ***    -18.74 %       ±5.00%  ±6.65%  ±8.66%
12:35:27 url/url-resolve.js n=100000 path='up' href='long'                                                                ***    -11.42 %       ±2.09%  ±2.78%  ±3.63%
12:35:27 url/url-resolve.js n=100000 path='up' href='percent'                                                             ***    -24.32 %       ±1.50%  ±2.00%  ±2.60%
12:35:27 url/url-resolve.js n=100000 path='up' href='short'                                                               ***    -22.01 %       ±5.86%  ±7.88% ±10.43%
12:35:27 url/url-resolve.js n=100000 path='up' href='ws'                                                                  ***    -10.69 %       ±3.69%  ±4.92%  ±6.43%
12:35:27 url/url-resolve.js n=100000 path='withscheme' href='auth'                                                        ***    -25.74 %       ±4.61%  ±6.18%  ±8.14%
12:35:27 url/url-resolve.js n=100000 path='withscheme' href='dot'                                                         ***    -25.04 %       ±7.58% ±10.10% ±13.15%
12:35:27 url/url-resolve.js n=100000 path='withscheme' href='file'                                                        ***    -22.49 %       ±1.25%  ±1.67%  ±2.17%
12:35:27 url/url-resolve.js n=100000 path='withscheme' href='idn'                                                         ***    -24.88 %       ±4.27%  ±5.71%  ±7.51%
12:35:27 url/url-resolve.js n=100000 path='withscheme' href='javascript'                                                  ***    -22.87 %       ±3.31%  ±4.45%  ±5.90%
12:35:27 url/url-resolve.js n=100000 path='withscheme' href='long'                                                        ***    -18.05 %       ±3.55%  ±4.73%  ±6.17%
12:35:27 url/url-resolve.js n=100000 path='withscheme' href='noscheme'                                                    ***    -24.32 %       ±3.03%  ±4.06%  ±5.33%
12:35:27 url/url-resolve.js n=100000 path='withscheme' href='percent'                                                     ***    -36.58 %       ±4.60%  ±6.13%  ±7.98%
12:35:27 url/url-resolve.js n=100000 path='withscheme' href='short'                                                       ***    -27.53 %       ±3.16%  ±4.23%  ±5.56%
12:35:27 url/url-resolve.js n=100000 path='withscheme' href='ws'                                                          ***    -18.34 %       ±6.44%  ±8.56% ±11.15%

(Will try adding in Ruben's suggestion and benchmarking again.)
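The `replaceAll()` to `replace()` swap mentioned above can be illustrated as follows. The pattern actually used in the PR is not shown in this conversation, so the character class and the sample hostname below are assumptions; the sketch only shows that the two calls are interchangeable when given a global regular expression.

```javascript
// Hypothetical hostname containing a tab, a newline, and a carriage return.
const hostname = 'exa\tmple\n.co\rm';

// With a regular expression, replaceAll() requires the `g` flag, so both
// calls accept the same pattern and produce the same string.
const viaReplaceAll = hostname.replaceAll(/[\t\n\r]/g, '');
const viaReplace = hostname.replace(/[\t\n\r]/g, '');

console.log(viaReplaceAll === viaReplace, viaReplace);
```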

Throw if ^, |, and some other special characters are in the hostname,
similar to WHATWG URL.

Use punycode/toAscii when % appears in a hostname, like WHATWG URL.

Trott commented Oct 25, 2022
Labels
commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. needs-ci PRs that need a full CI run. url Issues and PRs related to the legacy built-in url module.