Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make http.request correctly parse {host: "hostname:port"} #2271

Closed
wants to merge 1 commit into from

Conversation

nfriedly
Copy link
Contributor

The issue:
http.request() treats host as an alternate form of hostname, which it sort of is, except that host is (normally) allowed to contain a port number.

Before this change, if host contains a port number and there isn't a hostname field to override it, http.request() attempts a DNS lookup on the entire host field, including the port number. This always fails.

The fix:
If a host field is present, split it around ':' and use the first portion as a fallback hostname and the second (if present) as a fallback port. ("Fallback" meaning that the values from the host field are only used when the hostname and port fields aren't set.)

Includes a test to verify the correct behavior.

I originally filed an issue on joyent/node and submitted a patch there, but @jasnell said that this might be a better fit for nodejs/io.js or nodejs/node, so I'm sending the same patch to each. (The node patch is at #72) Please merge into whichever repo is appropriate and close the other one.

@Fishrock123 Fishrock123 added the http Issues or PRs related to the http subsystem. label Jul 29, 2015
// NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
// DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
// OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
// USE OR OTHER DEALINGS IN THE SOFTWARE.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't need this license.

Also, 'use strict'; here please! :)

@nfriedly
Copy link
Contributor Author

Oh, and two things I forgot to mention:

  1. On my laptop, test/parallel/test-net-better-error-messages-port-hostname.js always times out both before and after my change. All other tests pass.

  2. I can see a possible better solution of adding a url.normalize() method, but didn't do that yet because it's a larger change. However, I'd be happy to put that together and update http.request() to use it if there's interest.

@nfriedly
Copy link
Contributor Author

@Fishrock123 thanks, I just pushed a new commit that I believe addresses everything you pointed out. If you want, I'll rebase it into one.

@@ -57,8 +57,10 @@ function ClientRequest(options, cb) {
const defaultPort = options.defaultPort ||
self.agent && self.agent.defaultPort;

var port = options.port = options.port || defaultPort || 80;
var host = options.host = options.hostname || options.host || 'localhost';
const hostParts = (options.host && options.host.split(':')) || [];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will fail on IPv6 address and port, e.g. [::]:80. Could use url here like url.parse('fake://' + str).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'll add a test for that.

@sam-github
Copy link
Contributor

-1

  • : is syntactically valid in domain names (uncommon, disallowed in TLDs and most second levels, but still valid: https://tools.ietf.org/html/rfc2181#page-13)
  • splitting on : completely fails with IPv6 addresses, leading to bizarre lookups and errors
  • this kind of thing has been an exploitable security hole in the past, you just have to convince someone that some string with a ':' is a host, and you can choose the port they go to
  • there is already a well-defined format for encoding host names and ports: URIs, node supports them, as does the request module

@silverwind
Copy link
Contributor

I'm also feeling this is probably too magical, so I have to -1 as well here.

@nfriedly
Copy link
Contributor Author

@sam-github just before you posted, I switched to using url.parse() at @silverwind's recommendation. That should address most of your points.

@nfriedly
Copy link
Contributor Author

There is a bug here, but I agree that this is starting to get a little silly and definitely don't want to make things worse. What would you guys say to a proper require('url').normalize(urlObj) method rather than trying to jam the logic into http.request()?

@nfriedly
Copy link
Contributor Author

Actually, there's already an (undocumented) Url.prototype.parseHost() method: https://github.com/nodejs/io.js/blob/master/lib/url.js#L706

Although using it directly on the options object would change the behavior of http.request by making host override hostname and port.

...It also leaves the brackets around ipv6 addresses, which also breaks dns lookups, so probably not a good option.

@sam-github
Copy link
Contributor

Even with url parsing, I'm personally still -1, I just don't think this is reasonable for the HTTP library, which should focus on getting HTTP right, and let npm add API/use-case sugar. And its trivial to add this on top of http.request() if you want it, there is no need for this in core, or any reason why core is the necessary or even best place for this.

@sam-github
Copy link
Contributor

And doesn't the change to test-http-dns-error.js mean this is a little bit backwards incompatible?

@nfriedly
Copy link
Contributor Author

To give a real-world example of where this bug makes node.js/io.js behave unexpectedly, I have a library that connects to a remote server. When testing, I mock the remote server with one running on localhost:8080, I then change the host value from remote-server.com to localhost:8080 and the tests fails with a dns error. That was how I found this bug.

Yes, I can work around it easily enough, but that doesn't change the fact that it's a bug, or at the very least, doesn't work as expected. In the url.parse/url.format methods, host can and will contain a port number. And ditto in the window.location object that it's based on. So to have a different behavior in a fairly important core API (and one that explicitly states that it should be compatible with url.parse, no less) feels to me like something that should be fixed.

As for the test-http-dns-error.js, that test is expecting an error. With this patch, it would get a different error, but I think that's an acceptable change. That said, I'd be willing to rework it further to avoid that if it's important to you.

The issue:
http.request() treats host as an alternate form of hostname, which
it sort of is, except that host is (normally) allowed to contain a
port number.

Before this change, if host contains a port number and there isn't
a hostname field to override it, http.request() attempts a DNS
lookup on the entire host field, including the port number. This
always fails.

The fix:
If a host field is present, use the url lib to parse it and use
the parsed hostname nad port values as fallbacks if
options.hostname and options.port are unset.

Includes tests for localhost:12346 and [::1]:12346.
@nfriedly
Copy link
Contributor Author

Here we go, I gave it one more shot. This one uses Url.prototype.parseHost() to do the main work, but in a manner that avoids changing the API at all, so hostname is still preferred over host (although I would like to point out that Url.format() prefers host over hostname), and the same error as before is thrown for http.get({host:'********'}), so no changes to test-http-dns-error.js.

I added two lines to unwrap bracketed ipv6 addresses, based on the equivalent lines in Url.prototype.parse() to ensure that hosts like [::1]:8080 will work properly.

I also rebased everything into a single commit to clean up the history.

What do you guys say? If it's still a stack of -1's then I'll give up...

@jasnell
Copy link
Member

jasnell commented Nov 16, 2015

@sam-github @silverwind .. looks like the conversation on this one stalled out. There was an updated commit that was never reviewed. Mind taking a look?

@jasnell jasnell added the stalled Issues and PRs that are stalled. label Nov 16, 2015
@jasnell
Copy link
Member

jasnell commented Nov 16, 2015

@nodejs/http

@mscdex
Copy link
Contributor

mscdex commented Nov 16, 2015

I think this doesn't play nice with IPv6 host values. For example:

function parse(s) {
  var parsedHost = {host: s};
  if (s) {
    url.Url.prototype.parseHost.call(parsedHost);

    // unwrap brackets from ipv6 ip addresses
    const hostname = parsedHost.hostname;
    if (hostname[0] === '[' && hostname[hostname.length - 1] === ']') {
      parsedHost.hostname = hostname.substr(1, hostname.length - 2);
    }
  }
}

console.dir(parse('::1'));
// displays:
// { host: '::1', port: '1', hostname: ':' }

@jasnell
Copy link
Member

jasnell commented Nov 16, 2015

:-) @mscdex .. you read my mind. I had that same kind of test queued up on my todo list. I'm leaning towards a -1 on this particular change

@dougwilson
Copy link
Member

I agree with the concept, but not sure about the particular implementation. Of course, the string ::1 is not a valid host, if you were to think of the HTTP Host header (it should be [::1]), but I think that will probably cause confusion, even if technically correct to not handle bare IPv6 in host (since that's a hostname).

@silverwind
Copy link
Contributor

For some reason, I can't get this particular example to work at all here, any ideas?

> url.Url.prototype.parseHost.call({host: '::1'})
undefined

As for this issue: It has some value I'd say. url.parse for example also includes the port in host, so I'd agree there is a bug here.

@mscdex
Copy link
Contributor

mscdex commented Nov 16, 2015

@dougwilson I'd bet that most people just use host when passing an object to http.request() and use valid IP addresses without thinking about low-level HTTP stuff like that. Even the documentation for http.request() for host currently says:

A domain name or IP address of the server to issue the request to.

So I could easily see people using values like '::1' and other IPv6 addresses.

@mscdex
Copy link
Contributor

mscdex commented Nov 16, 2015

@silverwind The reason is that parseHost() simply modifies the object passed in, it does not return an object.

@dougwilson
Copy link
Member

Hi @mscdex , it's a good thing I noted that it would be too confusing to actually enforce that in my post already :)

@jasnell
Copy link
Member

jasnell commented Mar 22, 2016

Closing given the discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
http Issues or PRs related to the http subsystem. stalled Issues and PRs that are stalled.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants