Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix data lost on decompression #2547

Merged
merged 2 commits into from
Aug 10, 2019

Conversation

komeda-shinji
Copy link

Which issue(s) this PR fixes:
Fixes #2546

in string_decompress(), gz.read loop break when unused.nil?,
but unused.nil? does not show end of stream.
It cause data lost.

loop shoud be break when eof.

@repeatedly
Copy link
Member

Thanks for the patch!

Could you follow DCO?
And do you have a failure case to check the behaviour in test?

Signed-off-by: KOMEDA Shinji <[email protected]>
@komeda-shinji komeda-shinji force-pushed the fix-compressable-data-lost branch from 4302bc3 to 12d69e7 Compare August 8, 2019 06:09
@komeda-shinji
Copy link
Author

I have failure case data, but I cannot provide because it contains privacy data.
I try to create failure case data, but It is too hard and unsuccessful yet.

@repeatedly repeatedly added the bug Something isn't working label Aug 9, 2019
@komeda-shinji
Copy link
Author

This script generates failure case data for string_decompress easily.

require 'stringio'
require 'zlib'
require 'securerandom'

def compress(data, **kwargs)
  output_io = kwargs[:output_io]
  io = output_io || StringIO.new
  Zlib::GzipWriter.wrap(io) do |gz|
    gz.write data
  end

  output_io || io.string
end

def string_decompress(compressed_data)
  io = StringIO.new(compressed_data)

  out = ''
  loop do
    gz = Zlib::GzipReader.new(io)
    out << gz.read
    unused = gz.unused
    gz.finish

    break if unused.nil?
    adjust = unused.length
    io.pos -= adjust
  end

  out
end

def create_data
  data = []
  10.times do
    data << SecureRandom.base64(1990) + "\n"
  end
  data
end
    
def find_failure
  data = []
  found = false
  loop do
    data = create_data

    indata = ''
    for d in data do
      indata << compress(d)
    end
    
    io = StringIO.new(indata)
    out = ''
    loop do
      gz = Zlib::GzipReader.new(io)
      out << gz.read
      unused = gz.unused
      gz.finish
    
      puts unused: unused.nil? ? "nil" : unused.length, eof: io.eof? ? "EOF" : "not"
      found = true if unused.nil? and not io.eof?
      unless unused.nil?
        adjust = unused.length
        io.pos -= adjust
      end
      break if io.eof
    end
    break if found
  end
  data
end

failure_case = find_failure

indata = ''
for d in failure_case do
  indata << compress(d)
end
outdata = string_decompress(indata)

puts failure_case
puts "---"
puts outdata

io.pos -= adjust
unless unused.nil?
adjust = unused.length
input.pos -= adjust
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use io instead of input here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! It's my msitake, thank you.

Signed-off-by: KOMEDA Shinji <[email protected]>
@repeatedly repeatedly merged commit 9e88103 into fluent:master Aug 10, 2019
@repeatedly
Copy link
Member

Thanks!
I will consider adding test with your script.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

compressable.rb lost some data on decompression
3 participants