"corrupt deflate stream" error #251

Open
xudesheng opened this issue Sep 29, 2020 · 7 comments


@xudesheng

Background:
I have a C-based client application that talks to the server side over WebSocket with the permessage-deflate extension. "deflate 1.2.11" is the library used to compress and decompress data.
I want to replace this C-based client application with a Rust one.

Expectation:

  1. I'm not expecting a byte-for-byte identical compression result.
  2. The client side can compress data into the same deflate format, and the server can decompress it without issue.
  3. The client side can decompress compressed content from the server without issue.

Sample Data set 1:

  1. Raw data: 01400000000200000f4c150bc06c0000ff
  2. Compressed Data: 6274606060606260e0f711e53e90c3c0f01f00

Sample Data set 2:

  1. Raw data: 01140000000200000000ffffffff0001066170704b65792438643966363138392d313933392d343534322d393763652d356136666337356130333732
  2. Compressed Data: 621461606060026286ff40c0c0c8965850e09d5aa96291629966666861a96b68696ca96b626a62a46b699e9caa6b9a6896966c6e9a68606c6e0400

Sample Data set 3:

  1. Raw Data: 014000004becffffffff150bc06c0000ff
  2. Compressed Data: 62746060f07e033219a2e23f00

Sample Data Set 4:

  1. Raw Data: 01400000000500000f4cffffffff0000050108656467654e616d6512456467652070726f7065727479206e616d650000010c69735072696d6172794b65790201010d707573685468726573686f6c64394368616e6765207468726573686f6c6420746f2067656e6572617465206576656e7420666f72206e756d657269632070726f706572746965730100000108707573685479706543537472696e6720726570726573656e74696e672070757368207479706520666f72206576656e74206e6f74696669636174696f6e20746f207375627363726962657273000000000100030009546f74616c466c6f770100000000000000000006414c57415953010003000b54656d70657261747572650100000000000000000006414c57415953010003000b4661756c745374617475730100000000000000000006414c57415953010003000a496e6c657456616c76650100000000000000000006414c57415953010003001054656d70657261747572654c696d69740100000000000000000006414c57415953010003000944756d6d79546578740100000000000000000006414c57415953010003000850726573737572650100000000000000000006414c5741595301000300074c6f6766696c650100000000000000000006414c5741595301000300084c6f636174696f6e0100000000000000000006414c5741595300
  2. Compressed Data: 7491db4a0331108613a1baf500e213e435bc2cd582b8486117c5cb743bdd0de44432a9eedb3bd9ba2098e62a997ffe6f0e99408b0c9a7c6cc12bd8f7f0260d3c3cd345f85f83b014628cdfaab80dcac830bec278c1f99d4f7168870071707affb81ea42517ce01814ef4602150350147b0280e2e089b0c04d5cd74059113bb9a58a387758341d95e04f0c421537e645120a913e1c4b27928d54954cee65231ed6217d40e42a4b918a71d2d5b87526fb4fbe26c3e97abfa63f5d964f9a605e3737b294039612393c60629231613ae5fac067c97fa5806dcffa9502ba3b098b57c4ac68c2d7c97e56a4b8b88e77abcaa5d7fa0bf2e5b6b775ad03f95fd00

Cargo.toml:

#miniz_oxide="0.4"
hex-literal = "0.2"
#flate2 = { version = "1.0.17", features = ["zlib-ng-compat"], default-features = false }
#flate2 = "1.0"
flate2 = { version = "1.0.17", features = ["zlib"], default-features = false }
anyhow = "1.0"
log = "0.4"

main.rs:

use std::io::prelude::*;
use flate2::Compression;
use flate2::write::DeflateEncoder as MyEncoder;
use flate2::write::DeflateDecoder as MyDecoder;
use hex_literal::hex;

use anyhow::Result;


fn twx_compress(bytes: &[u8]) -> Result<Vec<u8>> {
    let mut e = MyEncoder::new(Vec::new(), Compression::default());
    e.write_all(bytes)?;
    let compressed_bytes = e.finish()?;

    Ok(compressed_bytes)
}

fn twx_decompress(compressed: &mut [u8]) -> Result<Vec<u8>> {
    let mut deflater = MyDecoder::new(Vec::new());
    deflater.write_all(compressed)?;
    // Append the 00 00 ff ff block end discussed in RFC 7692, section 7.2.
    let block_end = [0x00u8, 0x00, 0xff, 0xff];
    deflater.write_all(&block_end)?;
    let decompressed = deflater.finish()?;

    Ok(decompressed)
}
fn roundtrip(raw: &[u8],processed:&mut [u8]) {
    let compressed_raw = twx_compress(raw).expect("failed to compress!");
    println!("Original raw, len:{},content:{:02X?}",raw.len(),raw);
    println!("Compress raw, len:{},content:{:02X?}\n",compressed_raw.len(),compressed_raw);
    
    
    let decompressed = twx_decompress(processed).expect("Failed to decompress");
    println!("Original compressed, len:{},content:{:02X?}",processed.len(),processed);
    println!("decompressed content,len:{},content:{:02X?}\n",decompressed.len(),decompressed);
}

fn main() -> Result<()> {
    let source = hex!("01400000000200000f4c150bc06c0000ff");
    let mut processed = hex!("6274606060606260e0f711e53e90c3c0f01f00");
    roundtrip(source.as_ref(),processed.as_mut());

    
    let source = hex!("01140000000200000000ffffffff0001066170704b65792438643966363138392d313933392d343534322d393763652d356136666337356130333732");
    let mut processed = hex!("621461606060026286ff40c0c0c8965850e09d5aa96291629966666861a96b68696ca96b626a62a46b699e9caa6b9a6896966c6e9a68606c6e0400");
    roundtrip(source.as_ref(),processed.as_mut());

    let source = hex!("014000004becffffffff150bc06c0000ff");
    let mut processed = hex!("62746060f07e033219a2e23f00");
    roundtrip(source.as_ref(),processed.as_mut());

    let source = hex!("01400000000500000f4cffffffff0000050108656467654e616d6512456467652070726f7065727479206e616d650000010c69735072696d6172794b65790201010d707573685468726573686f6c64394368616e6765207468726573686f6c6420746f2067656e6572617465206576656e7420666f72206e756d657269632070726f706572746965730100000108707573685479706543537472696e6720726570726573656e74696e672070757368207479706520666f72206576656e74206e6f74696669636174696f6e20746f207375627363726962657273000000000100030009546f74616c466c6f770100000000000000000006414c57415953010003000b54656d70657261747572650100000000000000000006414c57415953010003000b4661756c745374617475730100000000000000000006414c57415953010003000a496e6c657456616c76650100000000000000000006414c57415953010003001054656d70657261747572654c696d69740100000000000000000006414c57415953010003000944756d6d79546578740100000000000000000006414c57415953010003000850726573737572650100000000000000000006414c5741595301000300074c6f6766696c650100000000000000000006414c5741595301000300084c6f636174696f6e0100000000000000000006414c5741595300");
    let mut processed = hex!("7491db4a0331108613a1baf500e213e435bc2cd582b8486117c5cb743bdd0de44432a9eedb3bd9ba2098e62a997ffe6f0e99408b0c9a7c6cc12bd8f7f0260d3c3cd345f85f83b014628cdfaab80dcac830bec278c1f99d4f7168870071707affb81ea42517ce01814ef4602150350147b0280e2e089b0c04d5cd74059113bb9a58a387758341d95e04f0c421537e645120a913e1c4b27928d54954cee65231ed6217d40e42a4b918a71d2d5b87526fb4fbe26c3e97abfa63f5d964f9a605e3737b294039612393c60629231613ae5fac067c97fa5806dcffa9502ba3b098b57c4ac68c2d7c97e56a4b8b88e77abcaa5d7fa0bf2e5b6b775ad03f95fd00");
    roundtrip(source.as_ref(),processed.as_mut());

    
    Ok(())
}

Decompression failed for both data set 3 and data set 4:
'Failed to decompress: corrupt deflate stream'
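
One way to narrow this down is to check whether a failing message refers to data that a freshly created decoder has never seen. The sketch below is not part of the original report. `first_out_of_window_ref` is a hypothetical helper that walks only a leading fixed-Huffman block (BTYPE=01 in RFC 1951) and reports the first back-reference that points before the start of the message. It assumes the samples may come from a stream whose sliding window is shared across messages, which the report does not state.

```rust
// Illustrative probe only: handles one leading fixed-Huffman block (RFC 1951).

/// Reads DEFLATE input bit by bit, least-significant bit of each byte first.
struct BitReader<'a> {
    data: &'a [u8],
    pos: usize, // absolute bit position
}

impl<'a> BitReader<'a> {
    fn bit(&mut self) -> Option<u32> {
        let byte = *self.data.get(self.pos / 8)?;
        let bit = (byte >> (self.pos % 8)) & 1;
        self.pos += 1;
        Some(u32::from(bit))
    }

    /// Integer fields: the first bit read is the least significant.
    fn bits(&mut self, n: u32) -> Option<u32> {
        let mut v = 0;
        for i in 0..n {
            v |= self.bit()? << i;
        }
        Some(v)
    }

    /// Huffman codes: the first bit read is the most significant.
    fn code(&mut self, n: u32) -> Option<u32> {
        let mut v = 0;
        for _ in 0..n {
            v = (v << 1) | self.bit()?;
        }
        Some(v)
    }
}

// Length and distance tables from RFC 1951, section 3.2.5.
const LEN_BASE: [usize; 29] = [
    3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 15, 17, 19, 23, 27, 31, 35, 43, 51, 59, 67, 83, 99, 115,
    131, 163, 195, 227, 258,
];
const LEN_EXTRA: [u32; 29] = [
    0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 0,
];
const DIST_BASE: [usize; 30] = [
    1, 2, 3, 4, 5, 7, 9, 13, 17, 25, 33, 49, 65, 97, 129, 193, 257, 385, 513, 769, 1025, 1537,
    2049, 3073, 4097, 6145, 8193, 12289, 16385, 24577,
];
const DIST_EXTRA: [u32; 30] = [
    0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10, 11, 11, 12, 12,
    13, 13,
];

/// Returns (bytes produced so far, distance) for the first back-reference that
/// reaches before the start of `msg`, or None if the first block has none,
/// is not fixed-Huffman, or the input is truncated.
fn first_out_of_window_ref(msg: &[u8]) -> Option<(usize, usize)> {
    let mut r = BitReader { data: msg, pos: 0 };
    let _bfinal = r.bits(1)?;
    if r.bits(2)? != 1 {
        return None; // stored and dynamic blocks are out of scope here
    }
    let mut produced = 0usize;
    loop {
        // Fixed codes are 7, 8 or 9 bits long; extend the code as needed.
        let c7 = r.code(7)?;
        let sym = if c7 <= 0b001_0111 {
            256 + c7
        } else {
            let c8 = (c7 << 1) | r.bit()?;
            match c8 {
                0x30..=0xBF => c8 - 0x30,
                0xC0..=0xC7 => 280 + (c8 - 0xC0),
                _ => 144 + (((c8 << 1) | r.bit()?) - 0x190),
            }
        };
        match sym {
            0..=255 => produced += 1,
            257..=285 => {
                let i = (sym - 257) as usize;
                let len = LEN_BASE[i] + r.bits(LEN_EXTRA[i])? as usize;
                let d = r.code(5)? as usize;
                if d >= DIST_BASE.len() {
                    return None;
                }
                let dist = DIST_BASE[d] + r.bits(DIST_EXTRA[d])? as usize;
                if dist > produced {
                    return Some((produced, dist));
                }
                produced += len;
            }
            _ => return None, // 256 ends the block; 286 and 287 are invalid
        }
    }
}
```

Called on the compressed bytes of data set 3, this reports a match at distance 90 after only 6 output bytes, and it reports nothing for data set 1. That would be consistent with the server reusing its LZ77 window across messages (context takeover in RFC 7692 terms). If so, all messages would need to pass through one long-lived inflater, for example flate2's low-level `Decompress` type, rather than a fresh `DeflateDecoder` per message. Data set 4 starts with a dynamic-Huffman block, which this helper does not decode.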

@alexcrichton
Member

Thanks for the report! I'm not really entirely sure what's going on here, but it could be a bug on either side of things. I don't know much about the library you're using already and how closely it maps to what this crate does. For example I don't know what the block_end write is doing in the example you have.

I'd recommend trying a few different backends to see if your mileage varies perhaps?

@xudesheng
Author

xudesheng commented Oct 2, 2020

> Thanks for the report! I'm not really entirely sure what's going on here, but it could be a bug on either side of things. I don't know much about the library you're using already and how closely it maps to what this crate does. For example I don't know what the block_end write is doing in the example you have.
>
> I'd recommend trying a few different backends to see if your mileage varies perhaps?

Thank you, @alexcrichton for your response.

  1. With or without the block end, the result is the same.
    Section 7.2 of https://tools.ietf.org/html/rfc7692 describes when the block end is needed, even though the code above is not a real implementation of it.

  2. I have tried all backends and the current one has the best outcome but still has the issues I mentioned.

  3. The lib I mentioned is https://zlib.net/ "deflate.c"
    version 1.2.11, " deflate 1.2.11 Copyright 1995-2017 Jean-loup Gailly and Mark Adler ";
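
The framing rule from RFC 7692 section 7.2 mentioned in point 1 can be sketched as two small helpers. This is an illustration rather than code from the report: the names `strip_sync_tail` and `restore_sync_tail` are hypothetical, and the sketch assumes the compressor output was produced with a sync flush so that it ends in 00 00 ff ff.

```rust
/// The four bytes that close the empty stored block emitted by a sync flush.
const SYNC_TAIL: [u8; 4] = [0x00, 0x00, 0xff, 0xff];

/// Sender side: remove the trailing 00 00 ff ff before putting the data on
/// the wire. Returns None when the input does not end with that tail.
fn strip_sync_tail(flushed: &[u8]) -> Option<&[u8]> {
    flushed.strip_suffix(&SYNC_TAIL[..])
}

/// Receiver side: put the tail back before handing the payload to the inflater.
fn restore_sync_tail(payload: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(payload.len() + SYNC_TAIL.len());
    buf.extend_from_slice(payload);
    buf.extend_from_slice(&SYNC_TAIL);
    buf
}
```

Restoring the tail only makes the message decodable up to a flush point. It does not add a final block, so a decoder that insists on seeing the end of the stream may still report an error.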

@alexcrichton
Copy link
Member

Sorry in that case I don't really know what's going on. I think this will require some investigation to see where the binary blobs came from and try to track down where exactly the error is cropping up.

@doivosevic

Any update on this? I'm experiencing the same error in some hard-to-debug code, so I'm fishing for ideas.

@oyvindln
Contributor

oyvindln commented Jun 27, 2021

At least for the compression side, if I understand the RFC correctly, you need to ensure there is an empty block at the end and then remove the last 4 bytes (that is, the len and len complement of the empty stored block, leaving you with just the block header). You may be able to ensure an empty end block by calling .flush() before .finish() (I think that does a sync flush; if not, you would have to use the underlying compression functions). Alternatively, according to section 7.2.3.4 of the RFC, a 0x00 byte can be added to the end after .finish(), which essentially adds an empty stored block header to the end manually.

It's possible that it's failing in both cases because it sees more data than expected, but I'm not sure.
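
To make the second option concrete, here is a rough sketch of inspecting the first block header and appending the extra 0x00 byte. The names `first_block_header` and `payload_from_finished_stream` are hypothetical, and the sketch assumes the input is a raw DEFLATE stream whose last block has BFINAL set, as a finished encoder would produce.

```rust
#[derive(Debug, PartialEq)]
enum BlockType {
    Stored,
    FixedHuffman,
    DynamicHuffman,
    Reserved,
}

/// Decodes the 3 header bits of the first DEFLATE block (RFC 1951):
/// bit 0 is BFINAL, bits 1 and 2 are BTYPE.
fn first_block_header(data: &[u8]) -> Option<(bool, BlockType)> {
    let b = *data.first()?;
    let bfinal = b & 1 == 1;
    let btype = match (b >> 1) & 0b11 {
        0 => BlockType::Stored,
        1 => BlockType::FixedHuffman,
        2 => BlockType::DynamicHuffman,
        _ => BlockType::Reserved,
    };
    Some((bfinal, btype))
}

/// Assumes `finished` ends with a BFINAL=1 block. Appends one 0x00 byte,
/// which holds the header bits of an empty stored block with BFINAL=0.
fn payload_from_finished_stream(mut finished: Vec<u8>) -> Vec<u8> {
    finished.push(0x00);
    finished
}
```

For the samples in this issue, the first bytes 0x62 (sets 1 to 3) and 0x74 (set 4) decode to BFINAL=0 with fixed and dynamic Huffman blocks respectively, so the server does not appear to be using the BFINAL=1 variant, at least not in the first block of each message.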

@ByteAlex

ByteAlex commented Sep 17, 2021

Not sure if this helps, but I have a similar problem and looked through tons of SO posts and tried various hacks.

For a zlib-stream encoded websocket connection, you need a single Inflater object.
That is likely because the websocket is a single keep-alive connection that continuously sends data.
So each message has to be decoded on the same inflater. If the websocket sends framed data, you have to collect those frames until you find the zlib sync-flush marker (0x00,0x00,0xff,0xff) at the end of the message.

Then you may start decoding on those.
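
The frame-collecting step could look roughly like the sketch below. `MessageBuffer` is a hypothetical name, and the sketch assumes a zlib-stream style protocol where every complete message ends with 00 00 ff ff on the wire. That is different from RFC 7692 permessage-deflate, where the sender strips those bytes and message boundaries come from the WebSocket framing.

```rust
/// Marker left at the end of the output by a zlib sync flush.
const SYNC_TAIL: [u8; 4] = [0x00, 0x00, 0xff, 0xff];

/// Accumulates incoming chunks until a complete message has arrived.
#[derive(Default)]
struct MessageBuffer {
    pending: Vec<u8>,
}

impl MessageBuffer {
    /// Appends one chunk. Returns the whole buffered message once it ends
    /// with the sync-flush marker, and leaves the buffer empty afterwards.
    fn push(&mut self, chunk: &[u8]) -> Option<Vec<u8>> {
        self.pending.extend_from_slice(chunk);
        if self.pending.ends_with(&SYNC_TAIL) {
            Some(std::mem::take(&mut self.pending))
        } else {
            None
        }
    }
}
```

Each message returned by `push` would then be fed to the one long-lived inflater for the connection, so that back-references into earlier messages can still be resolved.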

But for my case another issue arises there.

windowBits can also be greater than 15 for optional gzip decoding. Add 32 to windowBits to enable zlib and gzip decoding with automatic header detection, or add 16 to decode only the gzip format (the zlib format will return a Z_DATA_ERROR). If a gzip stream is being decoded, strm->adler is a crc32 instead of an adler32.

I'm assuming that those windowBits restrictions cause issues as well. I'll fork the lib and play around for a bit.

Update:
I didn't have to play with windowBits, but I got a working version; it's probably not the cleanest implementation, but it seems to work for me.
https://github.com/ZeroTwo-Bot/zlib-stream-rs/blob/master/src/lib.rs

@samdenty

Inflate works correctly with pako:

const data = Buffer.from([
  120, 156, 13, 201, 49, 14, 128, 32, 12, 0, 192, 157, 87, 52, 44, 76, 141, 187,
  147, 95, 105, 0, 165, 166, 161, 104, 107, 248, 190, 44, 183, 28, 34, 134, 116,
  92, 143, 84, 179, 45, 11, 167, 29, 6, 121, 110, 1, 87, 4, 18, 209, 9, 220,
  253, 85, 27, 53, 59, 107, 135, 219, 22, 147, 189, 233, 231, 16, 11, 57, 69,
  56, 185, 74, 9, 63, 160,
]);

const pako = require("pako");

const inflator = new pako.Inflate();
inflator.push(data, true);

console.log(inflator.err);
console.log(inflator.result);
console.log(inflator.strm.avail_in);

But it fails with flate2. Does anyone know why? CC @alexcrichton
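
One thing worth checking here is the container format. The buffer above starts with 120, 156 (0x78 0x9C), which is a standard zlib header, so a raw-deflate decoder would be expected to reject it. The comment does not say which flate2 decoder was used, so this is only a guess at the cause. The sketch below shows a simple header sniff; `Container` and `sniff_container` are hypothetical names, not flate2 API.

```rust
#[derive(Debug, PartialEq)]
enum Container {
    Zlib,
    Gzip,
    RawOrUnknown,
}

/// Guesses the wrapper from the first two bytes.
/// zlib (RFC 1950): the low nibble of CMF is 8 and (CMF * 256 + FLG) is a multiple of 31.
/// gzip (RFC 1952): the magic bytes are 0x1f 0x8b.
/// A raw DEFLATE stream can pass the zlib check by coincidence, so this is a hint only.
fn sniff_container(data: &[u8]) -> Container {
    match data {
        [0x1f, 0x8b, ..] => Container::Gzip,
        [cmf, flg, ..]
            if (*cmf & 0x0f) == 8
                && ((u16::from(*cmf) << 8) | u16::from(*flg)) % 31 == 0 =>
        {
            Container::Zlib
        }
        _ => Container::RawOrUnknown,
    }
}
```

If the sniff says zlib, flate2's `ZlibDecoder` would be the matching decoder, whereas `DeflateDecoder` expects a raw stream without the two-byte header and the trailing checksum.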
