Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zlib.createUnzip does not throw error on unexpected end of file #2043

Closed
iliakan opened this issue Jun 23, 2015 · 13 comments
Closed

zlib.createUnzip does not throw error on unexpected end of file #2043

iliakan opened this issue Jun 23, 2015 · 13 comments
Assignees
Labels
confirmed-bug Issues with confirmed bugs. good first issue Issues that are suitable for first-time contributors. zlib Issues and PRs related to the zlib subsystem.

Comments

@iliakan
Copy link

iliakan commented Jun 23, 2015

Here's the code:

var zlib = require('zlib');
var fs = require('fs');

fs.createReadStream('test.gz') 
  .pipe(zlib.createGunzip()) 
  .on('error', function(err) {
    console.log("ERROR", err);
  })
  .pipe(fs.createWriteStream('test')) 
  .on('finish', function() {
    console.log("DONE");
  });

I take a big valid gz file and do:

head -n 100000 ~/valid.gz > test.gz

Now test.gz is an unfinished archive:

> gunzip -k test.gz
gunzip: test.gz: unexpected end of file
gunzip: test.gz: uncompress failed

...But the aforementioned script shows DONE on this (unfinished) archive. No error.
There should be.

@trevnorris
Copy link
Contributor

Does it throw if you do zlib.unzipSync(fs.readFileSync('test.gz'))?

@iliakan
Copy link
Author

iliakan commented Jun 23, 2015

No, it "works" silently. The same behavior as the async version.

But the file is definitely corrupted. It is smaller than the original, and gunzip console utility shows error as written above. The result of such "unarchiving" is invalid.

Node.JS should throw error as gunzip executable does.

@trevnorris
Copy link
Contributor

/cc @indutny

@mscdex mscdex added the zlib Issues and PRs related to the zlib subsystem. label Jun 23, 2015
@iliakan iliakan changed the title zlib.createUnzip dies not throw error on unexpected end of file zlib.createUnzip does not throw error on unexpected end of file Jun 25, 2015
@iliakan
Copy link
Author

iliakan commented Jun 25, 2015

There cause is in https://github.com/nodejs/io.js/blob/master/src/node_zlib.cc#L270-L291.

There are few errors not handled there. In the case I described, there is a fatal Z_BUF_ERROR (it may be not fatal, but in this case it is, no more data).

Maybe it would be beneficial to look at other zlib wrappers, like gz_decomp function in http://www.virtualbox.org/svn/vbox/trunk/src/libs/zlib-1.2.6/gzread.c, how it copes with errors, and extend the CheckError function.

@trevnorris
Copy link
Contributor

Does seem like that error case should be handled. Clean up resources and emit error on async and throw if sync.

@brendanashworth brendanashworth added the confirmed-bug Issues with confirmed bugs. label Jun 27, 2015
@chrisdickinson chrisdickinson added the good first issue Issues that are suitable for first-time contributors. label Jul 8, 2015
@chrisdickinson
Copy link
Contributor

I'd be happy to help any folks new to the project track this down.

@thefourtheye
Copy link
Contributor

@chrisdickinson I was reading this page http://zlib.net/zlib_how.html

The way we tell that deflate() has no more output is by seeing that it did not fill the output buffer, leaving avail_out greater than zero. However suppose that deflate() has no more output, but just so happened to exactly fill the output buffer! avail_out is zero, and we can't tell that deflate() has done all it can. As far as we know, deflate() has more output for us. So we call it again. But now deflate() produces no output at all, and avail_out remains unchanged as CHUNK. That deflate() call wasn't able to do anything, either consume input or produce output, and so it returns Z_BUF_ERROR. (See, I told you I'd cover this later.) However this is not a problem at all. Now we finally have the desired indication that deflate() is really done, and so we drop out of the inner loop to provide more input to deflate().

If I understand this correctly,

  • if the avail_out is not changed after the inflate/deflate call,
  • the next_in didn't change after the inflate/deflate call, and
  • next_in is null

then it means that

  • we have some data to be sent out
  • but stream is not finished and
  • the last inflate/deflate call didn't make any change to the data (possibly because it is expecting more data)
  • and the pointer from which we have to read is null.

Can this mean we have malformed data? I tried this patch

diff --git a/src/node_zlib.cc b/src/node_zlib.cc
index 699d5c4..4085fa6 100644
--- a/src/node_zlib.cc
+++ b/src/node_zlib.cc
@@ -222,6 +222,8 @@ class ZCtx : public AsyncWrap {
   // been consumed.
   static void Process(uv_work_t* work_req) {
     ZCtx *ctx = ContainerOf(&ZCtx::work_req_, work_req);
+    size_t prev_avail_out = ctx->strm_.avail_out;
+    Bytef* prev_next_in = ctx->strm_.next_in;

     // If the avail_out is left at 0, then it means that it ran out
     // of room.  If there was avail_out left over, then it means
@@ -253,6 +261,11 @@ class ZCtx : public AsyncWrap {
             // input.
             ctx->err_ = Z_NEED_DICT;
           }
+        } else if (ctx->err_ == Z_BUF_ERROR &&
+                   ctx->strm_.avail_out == prev_avail_out &&
+                   ctx->strm_.next_in == prev_next_in &&
+                   prev_next_in == nullptr) {
+            ctx->err_ = Z_DATA_ERROR;
         }

And it gives me

ERROR { [Error: Zlib error] errno: -3, code: 'Z_DATA_ERROR' }

for the OP's case.

@thefourtheye
Copy link
Contributor

NVM. That is wrong. It breaks test/parallel/test-zlib-dictionary.js

@misterdjules
Copy link

@chrisdickinson Thanks for being available as a mentor 👍 Would you mind adding some of your contact info for people willing to pick that up so that they can contact you when/if they need guidance?

@thefourtheye
Copy link
Contributor

@chrisdickinson I would love to learn from you as your mentee :-)

@chrisdickinson
Copy link
Contributor

@misterdjules Great idea — my email is chris at neversaw dot us — that is probably the best way to get ahold of me.

@thefourtheye Rad! I can start looking at this issue tomorrow AM and would be happy to answer any questions via email or issue comment. I can also work out a time to make myself available on IRC in #io.js if that helps as well!

@thefourtheye
Copy link
Contributor

@chrisdickinson Cool. I am in Indian Standard Time. I ll start with the mail communication. I am also available in IRC as well :-)

@jhamhader
Copy link
Contributor

The very verbose zlib manual indicates that both inflate and deflate return Z_OK when progress was made but more work is left.
If deflate returns Z_OK when called with the Z_FINISH flush flag (last chunk) then it is an error indicating exactly the described issue. Should CheckError be aware of the flush flag in ctx?
I.E.:
err == Z_OK is not an error iff flush != Z_FINISH

rvagg pushed a commit that referenced this issue Oct 21, 2015
Check for unexpected end-of-file error when decompressing. If the output
buffer still has space after decompressing and deflate returned Z_OK or
Z_BUF_ERROR - that means unexpected end-of-file. Added
test-zlib-truncated.js for the case of truncated input. Fixed the zlib
dictionary test to not end the inflate stream on a truncated output (no
crc) of deflate

Fixes: #2043
PR-URL: #2595
Reviewed-By: Trevor Norris <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed-bug Issues with confirmed bugs. good first issue Issues that are suitable for first-time contributors. zlib Issues and PRs related to the zlib subsystem.
Projects
None yet
Development

No branches or pull requests

8 participants