Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return error when content-length header does not match bytes transferred #1462

Open
1 task done
kanongil opened this issue Sep 14, 2020 · 13 comments
Open
1 task done
Labels
enhancement This change will extend Got features ✭ help wanted ✭

Comments

@kanongil
Copy link

What problem are you trying to solve?

Safely retrieve content delivered from a server.

Describe the feature

Depending on how the stream is sent from the server, got will happily complete without error, even if the actual bytes transferred does not match the value in the content-length header.

While it is possible to workaround, and detect manually, it is quite cumbersome, especially if decompress is enabled.

FYI, it is not a problem when the server responds with content-encoding: chunked, since node will error if such a stream is interrupted.

Checklist

  • I have read the documentation and made sure this feature doesn't already exist.
@Giotino
Copy link
Collaborator

Giotino commented Sep 14, 2020

Depending on how the stream is sent from the server

Can you make an example?
I tried to reproduce it, but it's quite generic. Also this behaviour doesn't make sense with the HTTP standard (but it could be due to some internal Got component).

@kanongil
Copy link
Author

kanongil commented Sep 14, 2020

Yeah, I'm not sure about producing a test case. Currently, I make it fail by disabling / re-enabling my wifi during a heavy download from a remote server (using node v14.10.1 on macOS 10.15.6).

Looking into it, the issue does seem to be related to HTTP2 downloads, as I am unable to trigger the error with it disabled.

@kanongil
Copy link
Author

I tried running with NODE_DEBUG=http2, and found that the unerrored prematurely closed responses close while readable true, while all other responses have readable false.

HTTP2 93829: Http2Stream 19 [Http2Session client]: headers received
HTTP2 93829: Http2Stream 19 [Http2Session client]: emitting stream 'response' event

<interrupt wifi>

HTTP2 93829: Http2Stream 19 [Http2Session client]: closed with code 0, closed false, readable true
HTTP2 93829: Http2Stream 19 [Http2Session client]: destroying stream

@szmarczak szmarczak added enhancement This change will extend Got features ✭ help wanted ✭ labels Sep 14, 2020
@kanongil
Copy link
Author

I have not been able to create a test case, but I can present more details:

This issue is server dependent. If I use nginx termination instead of envoy, it does not fail. I have not been able to debug the HTTP2 frames, but I have a strong feeling that this issue is caused by insufficient response validation in node. Specifically the content-length header, as suggested in the spec:

A request or response is also malformed if the value of a content-length header field does not equal the sum of the DATA frame payload lengths that form the body

Further code analysis shows that nghttp2 supposedly validates the content-length, before handing over the response to node in http2/core.js. Given this, the bug is likely to be found inside nghttp2.

Digging into the code, the content-length is indeed validated. However, this logic might not always be called. Specifically nghttp2_http_on_remote_end_stream() is only called after receiving a DATA frame with the END_STREAM flag, and not when a RST frame is received. As it turns out, a RST frame can trigger the same on_stream_close_callback, forwarding its embedded error_code. The very same code that is exposed to the node callback, and that could have a value of 0, causing node to just close the stream!

So the theory is that this bug will trigger on HTTP2 streams, that contain a content-length header, and where the server sends a RST frame with an error code of 0, before all data has been sent.

@szmarczak
Copy link
Collaborator

@Giotino The socket is destroyed as usual, but the thing is that you cannot verify whether it was destroyed because the connection was reset or it was successful. I have experienced this a lot on Android.

@kanongil
Copy link
Author

Digging into envoy, I found further evidence to the theory, as it can indeed send RST frames with NGHTTP2_NO_ERROR.

@kanongil
Copy link
Author

Yup, this is an nghttp2 issue, as speculated. Though, any fix will take some time to arrive in node, so it might make sense to add a workaround here?

The issue only applies to streams that have fewer bytes than the content-length indicates, so it should be able to be detected by tracking the consumed bytes, and comparing it to the content-length before finishing the stream.

I'm still at a loss for a test case, since I don't know how to create a test with a server that sends RST_STREAM with error_code 0.

@kanongil
Copy link
Author

I made a failing test using http2 streams here: nodejs/node#35209 (comment).

I also managed to find problems when a server closes the stream with RST_STREAM code 8 (CANCEL):
nodejs/node#35209 (comment)

@kanongil
Copy link
Author

kanongil commented Sep 28, 2020

If you want to work around this last bug, you could check the stream rstCode property when the end event is emitted (it will be 8 when aborted by the server).

@sindresorhus sindresorhus removed this from the Got v12 milestone Jan 5, 2021
@turbopasi
Copy link

turbopasi commented Feb 3, 2021

@kanongil I am facing a similiar problem at the moment . Whenever the source file on the server is being overwritten during a download with got.stream, it just stops the download without any error event being raised. Just the end event. https://stackoverflow.com/questions/65231912/reliable-file-download-with-got/65990239#comment116695967_65990239

You think this rstCode workaround could be used in my case ? And how would that work since there are no parameters given to the end event handler

got.stream().on('end', (event) => {
   console.log(event);
   // undefined
});

@kanongil
Copy link
Author

kanongil commented Feb 3, 2021

@turbopasi It should be a property on the got.stream().

@turbopasi
Copy link

Thanks ! I just tested it real quick , but didn't have any luck :

 const request = got.stream('https://whatever')
    .on('end', () => {
      console.log(Object.keys(request));
      console.log(request.rstCode); // logs undefined
    });

I probably have some misunderstanding here - where should this rstCode exactly be ? 🤔

@kanongil
Copy link
Author

kanongil commented Feb 3, 2021

So the workaround was meant to be used internally in got. As a client you need to find the http2 stream to access it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement This change will extend Got features ✭ help wanted ✭
Projects
None yet
Development

No branches or pull requests

5 participants