Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BinaryCodec.readBytesOrFewer fails if no data is requested AND no date is available in the underlying input stream #1188

Merged
merged 3 commits into from
Oct 5, 2018

Conversation

nh13
Copy link
Member

@nh13 nh13 commented Oct 5, 2018

I ran into this bug when writing an empty string as the last element into a byte buffer (ex binaryCodec.writeString("", true, false)) and then trying to read it back in (ex. binaryCodec.readLengthAndString(false)). The main issue is that InputStream.read will return -1 even if the maximum number of bytes to read is zero (i.e. InputStream.read(buffer, 0, 0) will return -1).

I made a commit with a test to show the bug and a commit to fix it.

…am is a ByteArrayInputStream, the buffer has no more bytes, and the # of bytes to read is zero.
@nh13
Copy link
Member Author

nh13 commented Oct 5, 2018

@lbergelson @jacarey or @tfenne care to take a look.

@lbergelson
Copy link
Member

Strange, input stream documentation says it should return 0 in that case.

If len is zero, then no bytes are read and
0 is returned; otherwise, there is an attempt to read at
least one byte. If no byte is available because the stream is at end of
file, the value -1 is returned; otherwise, at least one
byte is read and stored into b.

@lbergelson
Copy link
Member

Oh, I didn't read carefully enough, it's the combination of the end of stream AND requesting to read nothing that is the issue?

@codecov-io
Copy link

codecov-io commented Oct 5, 2018

Codecov Report

Merging #1188 into master will increase coverage by 0.01%.
The diff coverage is 100%.

@@              Coverage Diff               @@
##              master     #1188      +/-   ##
==============================================
+ Coverage     68.397%   68.407%   +0.01%     
- Complexity      8014      8016       +2     
==============================================
  Files            542       542              
  Lines          32693     32700       +7     
  Branches        5529      5530       +1     
==============================================
+ Hits           22361     22369       +8     
  Misses          8127      8127              
+ Partials        2205      2204       -1
Impacted Files Coverage Δ Complexity Δ
...rc/main/java/htsjdk/samtools/util/BinaryCodec.java 69.266% <100%> (ø) 56 <0> (+1) ⬆️
...a/htsjdk/samtools/util/AbstractProgressLogger.java 45.902% <0%> (+3.309%) 7% <0%> (ø) ⬇️
...samtools/util/AsyncBlockCompressedInputStream.java 76% <0%> (+4%) 13% <0%> (+1%) ⬆️

@nh13
Copy link
Member Author

nh13 commented Oct 5, 2018

@lbergelson

From ByteArrayInputStream

Reads up to len bytes of data into an array of bytes from this input stream. If pos equals count, then -1 is returned to indicate end of file.

which doesn't agree with InputStream.

@@ -415,7 +415,8 @@ public int readBytesOrFewer(final byte[] buffer, final int offset, final int len
throw new IllegalStateException("Calling read method on BinaryCodec open for write.");
}
try {
return inputStream.read(buffer, offset, length);
if (length == 0) return 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use { } around if body

@@ -415,7 +415,8 @@ public int readBytesOrFewer(final byte[] buffer, final int offset, final int len
throw new IllegalStateException("Calling read method on BinaryCodec open for write.");
}
try {
return inputStream.read(buffer, offset, length);
if (length == 0) return 0;
else return inputStream.read(buffer, offset, length);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else unnecessary after return

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about return (length == 0) ? 0 : inputStream.read(buffer, offset, length);

@nh13
Copy link
Member Author

nh13 commented Oct 5, 2018

My interpretation is that InpuStream.read should return zero, and that ByteArrayInputStream.read says otherwise for its implementation.

https://bugs.java.com/view_bug.do?bug_id=6766844

@pshapiro4broad
Copy link
Contributor

Thanks for the note, I was looking at the source for InputStream.read() and was pretty confused by this. Given this behavior of ByteArrayInputStream it makes me wonder how many other cases like this there are, if we want to rely on correct results from read().

return inputStream.read(buffer, offset, length);
// Some implementations of InputStream do not behave well when the buffer is empty and length is zero, for
// example ByteArrayInputStream, so we must check for length equal to zero.
// See: https://bugs.java.com/view_bug.do?bug_id=6766844
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good find, thanks for adding this note.

@lbergelson
Copy link
Member

I love the open java bugs from 2008 that no one has bothered responding too.

@pshapiro4broad
Copy link
Contributor

On one of the linked bugs there's a comment

The inputstream.read(byte[], int, 0) is not always expected return 0 in the case of the bug description. We should not try to read zero bytes from inputstream.

@nh13 nh13 merged commit ff3db93 into master Oct 5, 2018
@nh13 nh13 deleted the nh_binary_codec_empty_string branch October 5, 2018 18:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants