Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

node does not sanitize http header field values #2602

Closed
alpercelik opened this issue Jan 24, 2012 · 13 comments
Closed

node does not sanitize http header field values #2602

alpercelik opened this issue Jan 24, 2012 · 13 comments

Comments

@alpercelik
Copy link

Noticed first with express and submitted a pull request to express framework but after our discussions about the issue with @visionmedia, it seemed more logical to have it in node.

You may find the pull request for express which explains details here
expressjs/express#974

I did sanitizing in http.js where header field values are stored.
https://github.com/alpercelik/node/commit/667aae596cded9336f50574386683ec39ada43f2

@alpercelik
Copy link
Author

@koichik
Copy link

koichik commented Jan 24, 2012

Actually, CRLF is valid characters of header field value.
RFC-2616 - section 2.2 says:

HTTP/1.1 header field values can be folded onto multiple lines if the
continuation line begins with a space or horizontal tab.

So, I think that it should throw an error if CRLF in the header value is not a part of LWS.

   LWS            = [CRLF] 1*( SP | HT )

@bnoordhuis
Copy link
Member

Thanks, but I don't think this is a good change.

@koichik already pointed out that the patch rejects valid header values. Even if the patch worked as advertised, it could still have unforeseen consequences. It's quite likely that some applications rely on the current behaviour of .setHeader().

Moreover, it's a change where everybody pays the price (expensive per-header regex check) for a feature that only some need. It belongs at the library or application level, not in the core http library.

@alpercelik
Copy link
Author

You are right, my change breaks valid header values. I overlooked validity of CRLF as @koichik mentioned.
I used regex, that was a quick change and your comment is valid about performance.

The issue I was trying to fix was related to http response splitting attack (http://en.wikipedia.org/wiki/HTTP_response_splitting). I was checking if this is possible in express and then noticed that it is, via url parameter to redirect method, which sets Location header with value url parameter passed in (if it is not mapped, a nice feature in feature, but this is also possible even url is generated indirectly).

example : https://gist.github.com/1654555#file_app.js
response : https://gist.github.com/1654555#file_server+response

When a header is set through setHeader function it is possible to add extra header fields/values where value has CRLF (not part of LWS).

I agree that applications that are using current behaviour may break if changed, but I think current applications should be using this function as it is documented. As per documentation :

Sets a single header value for implicit headers. If this header already exists in the to-be-sent headers, its value will be replaced. Use an array of strings here if you need to send multiple headers with the same name.

So if an application is calling this setHeader and intentionally adding more than one header field by CRLFs that is not part of LWS, then application is doing something wrong and better to be fixed. If they already use properly, they should be fine.

Also as per documentation setHeader function should not allow adding new headers or modifying other header fields unintentionally. At this point the solution that @koichik has proposed fits RFC and behaviour as per documentation.

this example will set Location header as well:

res.setHeader('Set-Cookie', 'foo=bar;\r\nLocation:http://www.mysite.com');

About http response splitting attack there are some good resources, this page has nice content with examples and references:
http://cwe.mitre.org/data/definitions/113.html

And in this blog different proof of concepts are presented for reddit.com (also mentions null byte in concept 3):
https://nealpoole.com/blog/2011/01/http-response-splitting-on-reddit-com/

If this check will not be done in core http library then I think it is better to mention in documentation, because it is easy to overlook in library or application level.

At least for "Location" header, I think it is better to fix, I don't see what would be legitimate reason to have CRLF in header value if I am not overlooking.

for CRLF's in redirect url

I feel that such controls fits better in core http library and it also applies to everyone's need. At the end security is also everyones need.

@tj
Copy link

tj commented Jan 25, 2012

it's still an attack vector even if it's not for a redirect so it didn't make sense to fix this in express, but if core doesn't want something to protect against it I probably wouldn't go for it either since so many people still use setHeader() or writeHead(). Best to just filter the input

@isaacs
Copy link

isaacs commented Jan 25, 2012

The setHeader(key, value) API should not let you accidentally set more than one header. \r\n is a valid character for the value, but only if it is followed by whitespace.

// valid
response.setHeader('set-cookie', 'foo=bar;\r\n     Expires=Wed, 09 Jun 2021 10:18:14 GMT')

// invalid
response.setHeader('set-cookie', 'foo=bar\r\nLocation: http://evil.com')

So, value.replace(/\r\n.*$/, '') is not appropriate. But value.replace(/\r\n\S.*$/, '') probably is appropriate, though.

@isaacs isaacs reopened this Jan 25, 2012
@koichik
Copy link

koichik commented Jan 26, 2012

How about koichik/node@2bd9f87? As wrote to my first comment, I would like to throw an error rather than replacing silently.

The following example calls setHeader() one million with and without assertion.

var http = require('http');

var N = 1E6;
var name = 'Content-Type';
var value = 'text/plain; charset=UTF-8';

var server = http.createServer(function(req, res) {
  console.time('assert');
  for (var i = 0; i < N; ++i) {
    res.setHeader(name, value + i);
  };
  console.timeEnd('assert');

  console.time('noAssert');
  for (var i = 0; i < N; ++i) {
    res.setHeader(name, value + i, true);
  };
  console.timeEnd('noAssert');

  res.writeHead(200);
  res.end();
}).listen(3000, function() {
  http.get({port: 3000}, function(res) {
    res.on('end', function() {
      server.close();
    });
  });
});

result:

assert: 870ms
noAssert: 468ms

The use of the RegExp is slow for about 400 milliseconds per one million times. I think that it is reasonable enough.

@tj
Copy link

tj commented Jan 26, 2012

yeah on jsperf that regexp is doing 5,265,491 ops/s :p. -1 for throwing though, adding a try/catch to every header manip would get pretty annoying

@koichik
Copy link

koichik commented Jan 27, 2012

@visionmedia - Because header manipulations throw an error from before, you should have already added a try/catch :)

I think that it is a bug of an application that an error is thrown, it must check its input. Throwing an error makes a user notice it. Probably, handling an error will become easy when Domains is merged.

@alpercelik
Copy link
Author

I agree about throwing error is better because; setHeader cannot successfully do what it is designed to do. And in this case it is caused by invalid input parameters.

@tj
Copy link

tj commented Jan 27, 2012

@koichik ah well then yeah we shouldn't throw, I see that it throws as far as the header already being sent, that's fine because that's a user error, you dont need to try/catch all over for those, but IMO this should not throw. Sure I might want to know with a console.warn when someone is attempting to split the header but I also dont want to litter my apps with error checking for every single setHeader()

@indutny
Copy link
Member

indutny commented Mar 18, 2014

Is it still relevant?

@jasnell
Copy link
Member

jasnell commented Aug 26, 2015

New PR opened on nodejs/node. Will continue to track the conversation there. nodejs/node#2526

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants