Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

url library behaves unexpectedly when attempting to set a url's port to a large number #19595

Closed
nodeav opened this issue Mar 25, 2018 · 32 comments
Labels
confirmed-bug Issues with confirmed bugs. whatwg-url Issues and PRs related to the WHATWG URL implementation.

Comments

@nodeav
Copy link
Contributor

nodeav commented Mar 25, 2018

const {URL} = require('url');
let url = new URL('http://127.0.0.1');
url.port // -> ''
url.port = 2.12323232e88;
url.port // -> '2'

@benjamingr
Copy link
Member

I have been able to reproduce it and @nodeav has offered to try and tackle this with a PR :)

@benjamingr benjamingr added confirmed-bug Issues with confirmed bugs. whatwg-url Issues and PRs related to the WHATWG URL implementation. labels Mar 25, 2018
@benjamingr
Copy link
Member

I tagged this as a "confirmed bug" since this is not what our .docs say, a solution could be to either fix the docs or to change the URL implementation to reject incorrect numbers for ports.

@addaleax
Copy link
Member

+1 on this being a bug, but one that also needs fixing in the spec:

Let port be the mathematical integer value that is represented by buffer in radix-10 using ASCII digits for digits with values 0 through 9.

Chromium and Firefox exhibit the same behaviour as Node.

/cc @nodejs/url

@TimothyGu
Copy link
Member

I believe the property is operating as specified and documented. The documentation says:

The port value may be set as either a number or as a String containing a number in the range 0 to 65535 (inclusive)…

If an invalid string is assigned to the port property, but it begins with a number, the leading number is assigned to port.

However I do agree the first sentence quoted can be misleading, since it’s actually meant to be “(number or string containing a number) in the range …” rather than “number or (string containing a number in the range…)”.

Whether or not there is a bug in the spec, the odd parts of the URL Standard (like this one) are usually written for more reasons than just being buggy (compatibility is the most common reason). I don’t know the details on this property but @annevk might.

@addaleax
Copy link
Member

@TimothyGu What led be to believe that this was a bug in the spec was that the spec does contain an explicit range check when setting port, and that that range check 'works' for all number values up until the limit where Number#toString() returns exponential notation.

@TimothyGu
Copy link
Member

@addaleax I’d recommend filing a bug at https://github.com/whatwg/url for further verification if you are convinced that this is indeed a spec bug.

@annevk
Copy link

annevk commented Mar 26, 2018

Most browsers had a limit of sorts and this was the most common one. The port attribute is a little weird in that the setter first converts to a string, and then uses the URL parser to get an integer out of it. So that's why you get "2" above I'm pretty sure (haven't looked), because the URL parser's port bits bail out on the dot.

@nodeav
Copy link
Contributor Author

nodeav commented Mar 26, 2018

I'll post a PR in an hour or two, with a range check in the beginning of the port setter. Do you think it is the right solution? I feel that this is unexpected behavior, regardless of the spec.

@annevk
Copy link

annevk commented Mar 26, 2018

  1. https://notapattern.net/2014/10/14/ways-men-in-tech-are-unintentionally-sexist/
  2. Violating the specification is almost certainly not the way forward. If you want to change the specification somehow, follow the suggestion from @TimothyGu above and please consider https://whatwg.org/working-mode#changes and https://whatwg.org/faq.

@benjamingr
Copy link
Member

@nodeav what @annevk means in point number 1 is that we don't use "guys" for "people" in English in Node.

@annevk worth noting that in @nodeav's native tongue the distinction (using guys for people) doesn't really exist since you would use a gender neutral term (אנשים or חבר׳ה rather than בחורים) which roughly translates to guys but without the gendered part. Thanks for pointing it out - I'm just giving some random context trivia.


About point #2 do you think that's a reasonable change for the spec?

I'm also -1 on changing Node's behavior without changing the spec - the whole point of adding WHATWG URLs to Node was to make browser interoperability easier and settle on a single specification for URLs in both platforms.

@annevk
Copy link

annevk commented Mar 26, 2018

It's not really clear to me what the proposed change is. To make port accept a Number as well as a String and throw if it's out-of-bound? That seems somewhat backwards incompatible. It's also not entirely clear to me it's worth it. It'd help to know when you'd write the kind of code in OP.

@benjamingr
Copy link
Member

@annevk let's say you read the port from a parsed file containing URLS or another third party source:

var port = readPort(); // attacker returns 30 ** 30
// our validation: don't allow opening a connection to a lower-than 1024 port to the server.
if (typeof port !== 'number' || port < 1024) { 
  return false; 
}
serverUrl.port = port;
download(serverUrl); // user connected to port we didn't mean them to be able to connect to

@nodeav
Copy link
Contributor Author

nodeav commented Mar 26, 2018

Sorry, did not mean to offend anyone, I have no intentions to exclude any gender from the discussion (@benjamingr's anecdote is correct, I meant it as a gender-neutral term). I've since edited my comment to exclude the offending word.

Regarding the issue:
This is a snippet from the node v9.9.0 documentation:

The port value may be set as either a number or as a String containing a number in the range 0 to 65535 (inclusive). Setting the value to the default port of the URL objects given protocol will result in the port value becoming the empty string ('').

The proposed change is, thus, to make sure the number is in the valid range specified in the documentation. Another possibility is to change the documentation to match the current behaviour. However, In my opinion at least, something is clearly slightly broken here.

@annevk
Copy link

annevk commented Mar 26, 2018

@benjamingr that's fairly compelling. I'd support trying to change it given the security angle.

@benjamingr
Copy link
Member

Opened issue in whatwg repo whatwg/url#377

@nodeav
Copy link
Contributor Author

nodeav commented Mar 26, 2018

@benjamingr Great. Does that mean I shouldn't open a PR, and await their fix instead?

@benjamingr
Copy link
Member

@nodeav well, ideally you'd take a stab at making a PR to the spec although I get that this might be more intimidating. Alternatively you can wait for the spec to change and then PR into Node.

@nodeav
Copy link
Contributor Author

nodeav commented Mar 26, 2018

I'll look into that, thanks. The docs are still wrong currently; Should I open a PR to fix the docs instead?
(As a user I'd expect the port parameter to accept a number, which conflicts with the docs changes I'm talking about)

@benjamingr
Copy link
Member

@nodeav yes please! :)

nodeav added a commit to nodeav/node that referenced this issue Mar 27, 2018
numbers which are coerced to scientific notation via .toString(),
will behave unexpectedly when assigned to a url's port.

Fixes: nodejs#19595
@nodeav
Copy link
Contributor Author

nodeav commented Apr 2, 2018

I'm still working on the doc change,
however it looks like since the whatwg spec talks specifically about an input String, as opposed to an input Number, a bug still exists here, either in the doc or in the code; The NodeJS docs say the port can be set as a number, which is then range-checked.

a snippet from the whatwg spec:
Let port be the mathematical integer value that is represented by buffer in radix-10 using ASCII digits for digits with values 0 through 9.

Correct me if I'm wrong, but it seems to me that the root of this bug is the inconsistency when converting a number to a string in node (actually, the ECMA standard), not passing numbers in scientific notation to whatwg.
Accepting a number as an input is an addition to the standard, and the inconsistency should be handled.
For example, once the string "1e32" is passed to whatwg for checking, it acts as defined in the spec and returns 1.
Passing 1e32 to node's url.port setter, however, results in unexpected behavior for the user.

@trivikr trivikr closed this as completed in 51c2c51 Apr 9, 2018
@nodeav
Copy link
Contributor Author

nodeav commented Apr 10, 2018

@trivikr I just wanted to make sure the argument I have presented has been considered prior to closing this issue, and that the final decision has been made in regards to changing the code. Thanks.

@trivikr
Copy link
Member

trivikr commented Apr 10, 2018

@nodeav This issue was closed automatically by commit message. Reopening

@trivikr trivikr reopened this Apr 10, 2018
targos pushed a commit that referenced this issue Apr 12, 2018
numbers which are coerced to scientific notation via .toString(),
will behave unexpectedly when assigned to a url's port.

Fixes: #19595

PR-URL: #19645
Reviewed-By: Vse Mozhet Byt <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Tiancheng "Timothy" Gu <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
@nodeav
Copy link
Contributor Author

nodeav commented Apr 22, 2018

Should I create a unit test for this case, Or is it covered by whatwg somehow?

@benjamingr
Copy link
Member

Well, the next step would be to follow up with whatwg

@nodeav
Copy link
Contributor Author

nodeav commented Apr 24, 2018

@benjamingr Not sure how I can help with the whatwg thing, and I also happen to still feel this is a Node-specific bug.
Converting numbers to scientific notation is an ECMA thing, not a whatwg one.
Node should, IMO, handle numbers being passed to the setter instead of strings - it already converts Numbers to Strings, but doesn't handle large numbers correctly in respect to what whatwg specifies as valid input.x

@TimothyGu
Copy link
Member

TimothyGu commented Apr 24, 2018

@nodeav Again, it's not a Node.js bug when it acts according to the WHATWG and W3C specifications, and also browsers. You could try filing an issue against whatwg/url.

@benjamingr
Copy link
Member

As @TimothyGu said - this is required if we want to keep compatibility with browsers - I think we should give the standards way a chance and to proceed in that route.

@nodeav
Copy link
Contributor Author

nodeav commented Apr 25, 2018

But the whatwg standard doesn't say anything about accepting numbers as an input, only strings. Nodejs chooses to accept numbers as an input, and does the conversion to a String according to the ECMA specification - however, this conversion conflicts with whatwg's allowed input, which doesn't accept numbers in scientific notation represented in a string, nor numbers (integers) at all, only strings representing decimal numbers in a straightforward manner.

Node does not provide this. A Python implementation would not encounter this problem. Am I missing something?

from the spec:

> A URL-port string must be zero or more ASCII digits.
> An ASCII digit is a code point in the range U+0030 (0) to U+0039 (9), inclusive.

And the full 'algorithm':

port state
> If c is an ASCII digit, append c to buffer.

Otherwise, if one of the following is true

c is the EOF code point, U+002F (/), U+003F (?), or U+0023 (#)

url is special and c is U+005C ()

state override is given

then:

If buffer is not the empty string, then:

Let port be the mathematical integer value that is represented by buffer in radix-10 using ASCII digits for digits with values 0 through 9.

If port is greater than 2^16 − 1, validation error, return failure.

Set url’s port to null, if port is url’s scheme’s default port, and to port otherwise.

Set buffer to the empty string.

If state override is given, then return.

Set state to path start state, and decrease pointer by one.

Otherwise, validation error, return failure.

@annevk
Copy link

annevk commented Apr 25, 2018

@nodeav yeah, you're missing how IDL works. It specifies the ToString behavior Node.js (and browsers) have.

@nodeav
Copy link
Contributor Author

nodeav commented Apr 25, 2018

@annevk Thanks, I'll go bug the whatwg people instead then!

@benjamingr
Copy link
Member

@nodeav I think you meant to ping Anne and not Anna :)

FWIW Anne is "whatwg people" - we can pursue the issue in whatwg/url#377

@nodeav
Copy link
Contributor Author

nodeav commented Apr 25, 2018

@benjamingr haha, indeed. Let's continue there. Thanks for the patience, everyone

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed-bug Issues with confirmed bugs. whatwg-url Issues and PRs related to the WHATWG URL implementation.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants