Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dns: refactor internal/dns/promises.js #27081

Closed
wants to merge 2 commits into from

Conversation

Trott
Copy link
Member

@Trott Trott commented Apr 4, 2019

Use isIP() instead of isIPv4() since it does the additional
functionality that we were adding after our calls to isIP().

This not-so-incidentally also increases code coverage from tests. At
least one of the replaced ternaries was difficult to cover reliably
because operating system/configuration variances were too unpredictable.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines

Use `isIP()` instead of `isIPv4()` since it does the additional
functionality that we were adding after our calls to `isIP()`.

This not-so-incidentally also increases code coverage from tests. At
least one of the replaced ternaries was difficult to cover reliably
because operating system/configuration variances were too unpredictable.
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot nodejs-github-bot added the dns Issues and PRs related to the dns subsystem. label Apr 4, 2019
@nodejs-github-bot
Copy link
Collaborator

@richardlau
Copy link
Member

richardlau commented Apr 4, 2019

Use isIP() instead of isIPv4() since it does the additional
functionality that we were adding after our calls to isIP().

FWIW isIP() does do an extra regex test in the IPv6 case. Is this a hot path? Do we have any relevant benchmarks?

@Trott
Copy link
Member Author

Trott commented Apr 4, 2019

FWIW isIP() does do an extra regex test in the IPv6 case.

Yeah, I guess it's worth noting that, at least in theory, this can change the behavior. Previously, anything that wasn't IPv4 was automatically deemed IPv6. Now we're actually checking that the address is a legit IPv6, and if it's not, family will be 0. However, in practice, that shouldn't happen. Moreover, if there is a malformed address, I don't think throwing up our hands and calling it IPv6 is the correct behavior. That seems like a bug. Using 0 instead at least makes some sense. Throwing an error might be warranted but even if we decide to go that route, I'd rather do incremental change anyway.

Is this a hot path? Do we have any relevant benchmarks?

Both of these are in callbacks for lookup() so it seems doubtful that it's a hot path. But we do have a lookup benchmark, so let's run it.

https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/318/

@Trott Trott added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Apr 4, 2019
@richardlau
Copy link
Member

Both of these are in callbacks for lookup() so it seems doubtful that it's a hot path. But we do have a lookup benchmark, so let's run it.

https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/318/

https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/319/

@Trott
Copy link
Member Author

Trott commented Apr 4, 2019

Results seem OK to me so far. Will try with a larger 'N' to see if we can get something statistically significant.

18:13:55                                                       confidence improvement accuracy (*)   (**)  (***)
18:13:55  dns/lookup.js n=5000000 all='false' name=''                         -0.06 %       ±2.95% ±3.93% ±5.11%
18:13:55  dns/lookup.js n=5000000 all='false' name='::1'                      -0.83 %       ±1.75% ±2.32% ±3.02%
18:13:55  dns/lookup.js n=5000000 all='false' name='127.0.0.1'                -0.30 %       ±2.04% ±2.71% ±3.53%
18:13:55  dns/lookup.js n=5000000 all='true' name=''                          -1.08 %       ±2.92% ±3.89% ±5.07%
18:13:55  dns/lookup.js n=5000000 all='true' name='::1'                       -0.74 %       ±2.79% ±3.74% ±4.93%
18:13:55  dns/lookup.js n=5000000 all='true' name='127.0.0.1'                 -0.50 %       ±4.57% ±6.09% ±7.93%
18:13:55 
18:13:55 Be aware that when doing many comparisons the risk of a false-positive
18:13:55 result increases. In this case there are 6 comparisons, you can thus
18:13:55 expect the following amount of false-positive results:
18:13:55   0.30 false positives, when considering a   5% risk acceptance (*, **, ***),
18:13:55   0.06 false positives, when considering a   1% risk acceptance (**, ***),
18:13:55   0.01 false positives, when considering a 0.1% risk acceptance (***)
18:13:56 ++ mv output040419-020556.csv /w/bnch-comp

@Trott
Copy link
Member Author

Trott commented Apr 4, 2019

https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/320/ (Same as previous run but bumping the number of times we run each benchmark up from 30 to 128 in the hopes of getting some statistically significant results. Alternatively, if we see some results that show this PR being faster, it will be more obvious that this is a wash.)

@Trott
Copy link
Member Author

Trott commented Apr 4, 2019

Hmmm....

22:21:41 ++ Rscript benchmark/compare.R
22:21:41                                                       confidence improvement accuracy (*)   (**)  (***)
22:21:41  dns/lookup.js n=5000000 all='false' name=''                          0.24 %       ±1.51% ±1.98% ±2.55%
22:21:41  dns/lookup.js n=5000000 all='false' name='::1'                       0.62 %       ±0.91% ±1.20% ±1.54%
22:21:41  dns/lookup.js n=5000000 all='false' name='127.0.0.1'        ***     -4.38 %       ±1.66% ±2.19% ±2.81%
22:21:41  dns/lookup.js n=5000000 all='true' name=''                           1.02 %       ±1.72% ±2.27% ±2.91%
22:21:41  dns/lookup.js n=5000000 all='true' name='::1'                       -0.00 %       ±1.08% ±1.42% ±1.82%
22:21:41  dns/lookup.js n=5000000 all='true' name='127.0.0.1'                 -0.55 %       ±1.97% ±2.60% ±3.33%
22:21:41 
22:21:41 Be aware that when doing many comparisons the risk of a false-positive
22:21:41 result increases. In this case there are 6 comparisons, you can thus
22:21:41 expect the following amount of false-positive results:
22:21:41   0.30 false positives, when considering a   5% risk acceptance (*, **, ***),
22:21:41   0.06 false positives, when considering a   1% risk acceptance (**, ***),
22:21:41   0.01 false positives, when considering a 0.1% risk acceptance (***)

@Trott
Copy link
Member Author

Trott commented Apr 4, 2019

@richardlau
Copy link
Member

It occurs to me that the benchmark probably isn't actually testing the code touched by this PR as it tests the non-promise version of dns.lookup() 😆.

Having said that, there is similar code in

this.callback(null, addresses[0], isIPv4(addresses[0]) ? 4 : 6);
and
family: family || (isIPv4(addr) ? 4 : 6)
which we should probably keep consistent.

@Trott
Copy link
Member Author

Trott commented Apr 4, 2019

You sure the current benchmark isn't affected by this change already?:

23:26:01                                                       confidence improvement accuracy (*)   (**)  (***)
23:26:01  dns/lookup.js n=5000000 all='false' name=''                          0.12 %       ±1.55% ±2.04% ±2.62%
23:26:01  dns/lookup.js n=5000000 all='false' name='::1'                      -0.15 %       ±1.05% ±1.39% ±1.78%
23:26:01  dns/lookup.js n=5000000 all='false' name='127.0.0.1'        ***     -2.44 %       ±1.27% ±1.68% ±2.15%
23:26:01  dns/lookup.js n=5000000 all='true' name=''                          -0.39 %       ±1.80% ±2.37% ±3.04%
23:26:01  dns/lookup.js n=5000000 all='true' name='::1'                       -0.12 %       ±1.44% ±1.90% ±2.44%
23:26:01  dns/lookup.js n=5000000 all='true' name='127.0.0.1'                 -0.36 %       ±2.02% ±2.66% ±3.41%
23:26:01 
23:26:01 Be aware that when doing many comparisons the risk of a false-positive
23:26:01 result increases. In this case there are 6 comparisons, you can thus
23:26:01 expect the following amount of false-positive results:
23:26:01   0.30 false positives, when considering a   5% risk acceptance (*, **, ***),
23:26:01   0.06 false positives, when considering a   1% risk acceptance (**, ***),
23:26:01   0.01 false positives, when considering a 0.1% risk acceptance (***)

@Trott
Copy link
Member Author

Trott commented Apr 4, 2019

(I agree that it doesn't make sense. But at the same time, the results are consistent. Something's up.)

@Trott Trott removed the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Apr 4, 2019
@BridgeAR
Copy link
Member

BridgeAR commented Apr 4, 2019

@Trott this looks like statistical anomalies to me.

@richardlau
Copy link
Member

You sure the current benchmark isn't affected by this change already?:

Yes. At least according to coverage:

-bash-4.2$ rm -rf out/Release/.coverage/
-bash-4.2$ NODE_V8_COVERAGE=out/Release/.coverage tools/test.py test/benchmark/test-benchmark-dns.js
[00:00|% 100|+   1|-   0]: Done
-bash-4.2$ grep -oP '[\w/.]*dns[\w/.]*' out/Release/.coverage/*
out/Release/.coverage/coverage-43652-1554392808703-0.json:dns.js
out/Release/.coverage/coverage-43666-1554392808663-0.json:///home/users/riclau/sandbox/github/nodejs/benchmark/dns/lookup.js
out/Release/.coverage/coverage-43666-1554392808663-0.json:dns.js
out/Release/.coverage/coverage-43666-1554392808663-0.json:internal/dns/utils.js
out/Release/.coverage/coverage-43676-1554392808645-0.json:///home/users/riclau/sandbox/github/nodejs/benchmark/dns/lookup.js
out/Release/.coverage/coverage-43676-1554392808645-0.json:dns.js
out/Release/.coverage/coverage-43676-1554392808645-0.json:internal/dns/utils.js
-bash-4.2$

@Trott
Copy link
Member Author

Trott commented Apr 4, 2019

@Trott this looks like statistical anomalies to me.

But the results are consistent. Twice now, the same one has a very high degree of confidence (low p-value). While it all-but-certainly isn't a reflection of the code change, it is (almost certainly) a reflection of something. An artifact somewhere. Maybe with this benchmark, and the way the operating system caches lookups, the second benchmark will always be slower? In other words, maybe the benchmark doesn't yield valid results ever, at least on the OS where we're running it?

Let's test that theory: Here's a run with no code changes. (The PR is a change to a GOVERNANCE.md only, so the code source should be identical.) https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/322/

@Trott
Copy link
Member Author

Trott commented Apr 4, 2019

Ah, there we go, the benchmark is clearly not providing valid results. Same code shows statistically significant differences in performance:

10:02:54 ++ Rscript benchmark/compare.R
10:02:54                                                       confidence improvement accuracy (*)   (**)  (***)
10:02:54  dns/lookup.js n=5000000 all='false' name=''                          0.67 %       ±1.50% ±1.98% ±2.54%
10:02:54  dns/lookup.js n=5000000 all='false' name='::1'                       0.90 %       ±1.08% ±1.42% ±1.82%
10:02:54  dns/lookup.js n=5000000 all='false' name='127.0.0.1'         **     -1.97 %       ±1.23% ±1.62% ±2.08%
10:02:54  dns/lookup.js n=5000000 all='true' name=''                           1.17 %       ±1.68% ±2.22% ±2.84%
10:02:54  dns/lookup.js n=5000000 all='true' name='::1'                        0.29 %       ±1.00% ±1.32% ±1.70%
10:02:54  dns/lookup.js n=5000000 all='true' name='127.0.0.1'                 -0.82 %       ±2.01% ±2.65% ±3.40%
10:02:54 
10:02:54 Be aware that when doing many comparisons the risk of a false-positive
10:02:54 result increases. In this case there are 6 comparisons, you can thus
10:02:54 expect the following amount of false-positive results:
10:02:54   0.30 false positives, when considering a   5% risk acceptance (*, **, ***),
10:02:54   0.06 false positives, when considering a   1% risk acceptance (**, ***),
10:02:54   0.01 false positives, when considering a 0.1% risk acceptance (***)

@Trott Trott mentioned this pull request Apr 4, 2019
4 tasks
Trott added a commit to Trott/io.js that referenced this pull request Apr 5, 2019
The benchmarks for dns.lookup() include calling it with an empty
hostname which results in a deprecation warning. This benchmark seems to
be subject to some odd side effects (see Ref below) and we probably
generally don't want to benchmark deprecated things by default anyway.
Remove the deprecated value from the default list. Bonus is that this
will speed up the benchmark.

Refs: nodejs#27081 (comment)
Trott added a commit to Trott/io.js that referenced this pull request Apr 7, 2019
The benchmarks for dns.lookup() include calling it with an empty
hostname which results in a deprecation warning. This benchmark seems to
be subject to some odd side effects (see Ref below) and we probably
generally don't want to benchmark deprecated things by default anyway.
Remove the deprecated value from the default list. Bonus is that this
will speed up the benchmark.

Refs: nodejs#27081 (comment)

PR-URL: nodejs#27091
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
@Trott
Copy link
Member Author

Trott commented Apr 7, 2019

OK, the benchmark has been modified slightly, so let's see if we no longer get the false positive on a running-the-same-version-of-node-for-both-benchmarks: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/326/

And, in an optimistic move, let's re-run the benchmark for this PR: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/327/ (scheduled)

If that doesn't have surprising results, then I suppose the next step is to write a benchmark for DNS promises and to add the isIP() change to the callback version, and re-run benchmarks.

@Trott
Copy link
Member Author

Trott commented Apr 7, 2019

Still seeing the false positive in the benchmark run using the same executable:

23:14:21                                                       confidence improvement accuracy (*)   (**)  (***)
23:14:21  dns/lookup.js n=5000000 all='false' name='::1'                       0.46 %       ±0.85% ±1.13% ±1.45%
23:14:21  dns/lookup.js n=5000000 all='false' name='127.0.0.1'          *     -1.83 %       ±1.66% ±2.18% ±2.80%
23:14:21  dns/lookup.js n=5000000 all='true' name='::1'                       -0.44 %       ±0.96% ±1.26% ±1.62%
23:14:21  dns/lookup.js n=5000000 all='true' name='127.0.0.1'                  0.49 %       ±2.19% ±2.88% ±3.70%
23:14:21 
23:14:21 Be aware that when doing many comparisons the risk of a false-positive
23:14:21 result increases. In this case there are 4 comparisons, you can thus
23:14:21 expect the following amount of false-positive results:
23:14:21   0.20 false positives, when considering a   5% risk acceptance (*, **, ***),
23:14:21   0.04 false positives, when considering a   1% risk acceptance (**, ***),
23:14:21   0.00 false positives, when considering a 0.1% risk acceptance (***)

@BridgeAR
Copy link
Member

BridgeAR commented Apr 9, 2019

@Trott if you change the cases mentioned here #27081 (comment) and then restart the benchmark, it'll execute changed code in the expected way. So the outcome should also be more significant. One star as confidence is mostly a statistical anomaly. Yes, there were also three stars with a low percentage value but even that happens from time to time.

@Trott
Copy link
Member Author

Trott commented Apr 16, 2019

Running this change with the benchmark file in #27249 resulted in no statistically-significant difference in performance. I think this is good to land.

$ cat /var/tmp/trott | Rscript benchmark/compare.R 
                                                               confidence improvement accuracy (*)   (**)  (***)
 dns/lookup-promises.js n=5000000 all='false' name='::1'                       0.63 %       ±3.40% ±4.53% ±5.90%
 dns/lookup-promises.js n=5000000 all='false' name='127.0.0.1'                -1.68 %       ±2.72% ±3.63% ±4.74%
 dns/lookup-promises.js n=5000000 all='true' name='::1'                       -0.47 %       ±2.32% ±3.10% ±4.04%
 dns/lookup-promises.js n=5000000 all='true' name='127.0.0.1'                  2.53 %       ±3.48% ±4.64% ±6.08%
 dns/lookup.js n=5000000 all='false' name='::1'                                0.56 %       ±3.38% ±4.50% ±5.85%
 dns/lookup.js n=5000000 all='false' name='127.0.0.1'                          2.14 %       ±4.92% ±6.55% ±8.54%
 dns/lookup.js n=5000000 all='true' name='::1'                                 2.47 %       ±3.17% ±4.22% ±5.51%
 dns/lookup.js n=5000000 all='true' name='127.0.0.1'                          -0.96 %       ±4.11% ±5.47% ±7.12%

Be aware that when doing many comparisons the risk of a false-positive
result increases. In this case there are 8 comparisons, you can thus
expect the following amount of false-positive results:
  0.40 false positives, when considering a   5% risk acceptance (*, **, ***),
  0.08 false positives, when considering a   1% risk acceptance (**, ***),
  0.01 false positives, when considering a 0.1% risk acceptance (***)
$ 

@Trott Trott added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Apr 16, 2019
Copy link
Member

@BridgeAR BridgeAR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Trott it would be good to actually have a consistent handling across the "regular" dns functions and the promise based ones as pointed out here #27081 (comment). Please do not land this for promises only.

@BridgeAR BridgeAR removed the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Apr 16, 2019
@Trott
Copy link
Member Author

Trott commented Apr 16, 2019

@Trott it would be good to actually have a consistent handling across the "regular" dns functions and the promise based ones as pointed out here #27081 (comment). Please do not land this for promises only.

Done. A re-review or two would be good. @jasnell @lpinca @richardlau @BridgeAR

In lib/dns.js, use `isIP()` instead of `isIPv4()` for determining the
`family` property in `lookup()`. If an invalid IP address is returned,
the `family` currently provided is `6`. With this change, it will be
`0`. Update documentation to reflect this.
@nodejs-github-bot
Copy link
Collaborator

@Trott
Copy link
Member Author

Trott commented Apr 16, 2019

Additional benchmark for the callback-based changes. Might not actually be relevant to the change here, but hey, for completeness....

                                                      confidence improvement accuracy (*)   (**)   (***)
 dns/lookup.js n=5000000 all='false' name='::1'                      -1.36 %       ±2.45% ±3.27%  ±4.25%
 dns/lookup.js n=5000000 all='false' name='127.0.0.1'                 2.22 %       ±3.62% ±4.81%  ±6.27%
 dns/lookup.js n=5000000 all='true' name='::1'                        1.37 %       ±3.38% ±4.51%  ±5.90%
 dns/lookup.js n=5000000 all='true' name='127.0.0.1'                  2.94 %       ±7.22% ±9.61% ±12.51%

Be aware that when doing many comparisons the risk of a false-positive
result increases. In this case there are 4 comparisons, you can thus
expect the following amount of false-positive results:
  0.20 false positives, when considering a   5% risk acceptance (*, **, ***),
  0.04 false positives, when considering a   1% risk acceptance (**, ***),
  0.00 false positives, when considering a 0.1% risk acceptance (***)

@Trott Trott added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Apr 16, 2019
@Trott
Copy link
Member Author

Trott commented Apr 16, 2019

Landed in 14df42f...09cdc37

@Trott Trott closed this Apr 16, 2019
Trott added a commit to Trott/io.js that referenced this pull request Apr 16, 2019
Use `isIP()` instead of `isIPv4()` since it does the additional
functionality that we were adding after our calls to `isIP()`.

This not-so-incidentally also increases code coverage from tests. At
least one of the replaced ternaries was difficult to cover reliably
because operating system/configuration variances were too unpredictable.

PR-URL: nodejs#27081
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
Trott added a commit to Trott/io.js that referenced this pull request Apr 16, 2019
In lib/dns.js, use `isIP()` instead of `isIPv4()` for determining the
`family` property in `lookup()`. If an invalid IP address is returned,
the `family` currently provided is `6`. With this change, it will be
`0`. Update documentation to reflect this.

PR-URL: nodejs#27081
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
Trott added a commit to Trott/io.js that referenced this pull request Apr 18, 2019
Adding this benchmark will let us check the performance implications of
nodejs#27081.
Trott added a commit to Trott/io.js that referenced this pull request Apr 18, 2019
Adding this benchmark will let us check the performance implications of
nodejs#27081.

PR-URL: nodejs#27249
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Yongsheng Zhang <[email protected]>
@Trott Trott deleted the refactor-dns-promises branch January 13, 2022 22:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. dns Issues and PRs related to the dns subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants