Recommend MultiGzDecoder over GzDecoder in docs #324

Merged on Jul 30, 2023 (8 commits)
18 changes: 7 additions & 11 deletions src/gz/bufread.rs
@@ -167,11 +167,11 @@ impl<R: BufRead + Write> Write for GzEncoder<R> {
}
}

/// A gzip streaming decoder
/// A decoder for a single member of a gzip file. Prefer [MultiGzDecoder] for
/// most uses.
Member

Suggested change
/// A decoder for a single member of a gzip file. Prefer [MultiGzDecoder] for
/// most uses.
/// A decoder for a gzip file with a single member.

I'd agree that we shouldn't just present this as a decoder for gzip in general, and should capture the distinction in the summary, but I'd stop short of telling people to prefer MultiGzDecoder.

///
/// This structure consumes a [`BufRead`] interface, reading compressed data
/// from the underlying reader, and emitting uncompressed data.
/// Use [`MultiGzDecoder`] if your file has multiple streams.
///
Member

Suggested change
///
///
/// This decoder only handles gzipped data with a single stream.
/// Use [`MultiGzDecoder`] for gzipped data with multiple streams.
///

Contributor

Whatever the final text, it should stick to the RFC terminology of "member" rather than using "stream" in some places.

/// [`BufRead`]: https://doc.rust-lang.org/std/io/trait.BufRead.html
///
@@ -397,20 +397,16 @@ impl<R: BufRead + Write> Write for GzDecoder<R> {
}
}

/// A gzip streaming decoder that decodes all members of a multistream
/// A gzip streaming decoder that decodes a full [gzip file].
Byron marked this conversation as resolved.
///
/// A gzip member consists of a header, compressed data and a trailer. The [gzip
/// specification](https://tools.ietf.org/html/rfc1952), however, allows multiple
/// gzip members to be joined in a single stream. `MultiGzDecoder` will
/// decode all consecutive members while [`GzDecoder`] will only decompress
/// the first gzip member. The multistream format is commonly used in
/// bioinformatics, for example when using the BGZF compressed data. It's also useful
/// to compress large amounts of data in parallel where each thread produces one stream
/// for a chunk of input data.
/// A gzip file consists of a series of "members" concatenated one after another.
/// MultiGzDecoder decodes all members of a file, while [GzDecoder] will only decode
/// the first one member. MultiGzDecoder is preferable in most cases.
Byron marked this conversation as resolved.
///
/// This structure exposes a [`BufRead`] interface that will consume all gzip members
/// from the underlying reader and emit uncompressed data.
///
/// [gzip file]: https://www.rfc-editor.org/rfc/rfc1952#page-5
/// [`BufRead`]: https://doc.rust-lang.org/std/io/trait.BufRead.html
///
/// # Examples
Expand Down
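To make the single-member versus multi-member distinction in the doc comment above concrete, here is a minimal std-only sketch. It does not use flate2 and is not how `GzDecoder` or `MultiGzDecoder` are implemented: `crc32`, `stored_member`, `decode_member`, `decode_first` and `decode_all` are hypothetical helpers that only understand members holding one uncompressed ("stored") DEFLATE block and no optional header fields. `decode_first` plays the role of a single-member decoder, `decode_all` the role of a multi-member one.

```rust
/// Bitwise CRC-32 (reflected polynomial 0xEDB88320), the checksum stored in a gzip trailer.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // All ones when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Builds one gzip member whose body is a single stored DEFLATE block.
/// Toy encoder for illustration only: no compression, payload must fit in a u16.
fn stored_member(payload: &[u8]) -> Vec<u8> {
    assert!(payload.len() <= u16::MAX as usize);
    let len = payload.len() as u16;
    // Header: ID1, ID2, CM = 8 (deflate), FLG = 0, MTIME = 0 (4 bytes), XFL = 0, OS = 255 (unknown).
    let mut out = vec![0x1f, 0x8b, 8, 0, 0, 0, 0, 0, 0, 255];
    out.push(0x01); // BFINAL = 1, BTYPE = 00 (stored), padded to a byte boundary
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(&(!len).to_le_bytes()); // NLEN is the one's complement of LEN
    out.extend_from_slice(payload);
    // Trailer: CRC-32 of the payload, then its length (ISIZE), both little-endian.
    out.extend_from_slice(&crc32(payload).to_le_bytes());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out
}

/// Decodes the member at the start of `input`.
/// Returns the payload and the number of input bytes consumed, or `None`
/// if the bytes are not a member in the restricted form produced above.
fn decode_member(input: &[u8]) -> Option<(Vec<u8>, usize)> {
    // 10 header bytes + 5 block-header bytes + 8 trailer bytes is the minimum size.
    if input.len() < 23 || !input.starts_with(&[0x1f, 0x8b, 8, 0]) || input[10] != 0x01 {
        return None;
    }
    let len = u16::from_le_bytes([input[11], input[12]]);
    let nlen = u16::from_le_bytes([input[13], input[14]]);
    if nlen != !len {
        return None;
    }
    let end = 15 + len as usize;
    if input.len() < end + 8 {
        return None;
    }
    let payload = &input[15..end];
    let crc = u32::from_le_bytes([input[end], input[end + 1], input[end + 2], input[end + 3]]);
    let size = u32::from_le_bytes([input[end + 4], input[end + 5], input[end + 6], input[end + 7]]);
    if crc != crc32(payload) || size != len as u32 {
        return None;
    }
    Some((payload.to_vec(), end + 8))
}

/// Single-member behaviour: decode the first member and ignore whatever follows it.
fn decode_first(input: &[u8]) -> Option<Vec<u8>> {
    decode_member(input).map(|(payload, _)| payload)
}

/// Multi-member behaviour: keep decoding members until the input is exhausted.
fn decode_all(mut input: &[u8]) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    while !input.is_empty() {
        let (payload, used) = decode_member(input)?;
        out.extend_from_slice(&payload);
        input = &input[used..];
    }
    Some(out)
}
```

If a file is built by concatenating `stored_member(b"hello ")` and `stored_member(b"world")`, `decode_first` returns only `hello ` and silently drops the second member, while `decode_all` returns `hello world`. That silent truncation is the behaviour the revised summaries are trying to warn about.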
21 changes: 7 additions & 14 deletions src/gz/read.rs
@@ -90,13 +90,11 @@ impl<R: Read + Write> Write for GzEncoder<R> {
}
}

/// A gzip streaming decoder
/// A decoder for a single member of a gzip file. Prefer [MultiGzDecoder] for
/// most uses.
Member

Suggested change
/// A decoder for a single member of a gzip file. Prefer [MultiGzDecoder] for
/// most uses.
/// A decoder for a gzip file with a single member.

I'd agree that we shouldn't just present this as a decoder for gzip in general, and should capture the distinction in the summary, but I'd stop short of telling people to prefer MultiGzDecoder.

///
/// This structure exposes a [`Read`] interface that will consume compressed
/// data from the underlying reader and emit uncompressed data.
/// Use [`MultiGzDecoder`] if your file has multiple streams.
///
/// [`Read`]: https://doc.rust-lang.org/std/io/trait.Read.html
///
Member

Suggested change
///
///
/// This decoder only handles gzipped data with a single stream.
/// Use [`MultiGzDecoder`] for gzipped data with multiple streams.
///

/// # Examples
///
@@ -180,21 +178,16 @@ impl<R: Read + Write> Write for GzDecoder<R> {
}
}

/// A gzip streaming decoder that decodes all members of a multistream
/// A gzip streaming decoder that decodes a full [gzip file].
Member

Suggested change
/// A gzip streaming decoder that decodes a full [gzip file].
/// A gzip streaming decoder that decodes a [gzip file] with multiple members.

///
/// A gzip member consists of a header, compressed data and a trailer. The [gzip
/// specification](https://tools.ietf.org/html/rfc1952), however, allows multiple
/// gzip members to be joined in a single stream. `MultiGzDecoder` will
/// decode all consecutive members while [`GzDecoder`] will only decompress the
/// first gzip member. The multistream format is commonly used in bioinformatics,
/// for example when using the BGZF compressed data. It's also useful
/// to compress large amounts of data in parallel where each thread produces one stream
/// for a chunk of input data.
/// A gzip file consists of a series of "members" concatenated one after another.
/// MultiGzDecoder decodes all members of a file, while [GzDecoder] will only decode
/// the first one member. MultiGzDecoder is preferable in most cases.
///
/// This structure exposes a [`Read`] interface that will consume all gzip members
/// from the underlying reader and emit uncompressed data.
///
/// [`Read`]: https://doc.rust-lang.org/std/io/trait.Read.html
/// [gzip file]: https://www.rfc-editor.org/rfc/rfc1952#page-5
///
/// # Examples
///
Expand Down
17 changes: 8 additions & 9 deletions src/gz/write.rs
@@ -166,11 +166,11 @@ impl<W: Write> Drop for GzEncoder<W> {
}
}

/// A gzip streaming decoder
/// A decoder for a single member of a gzip file. Prefer [MultiGzDecoder] for
/// most uses.
Member

Suggested change
/// A decoder for a single member of a gzip file. Prefer [MultiGzDecoder] for
/// most uses.
/// A decoder for a gzip file with a single member.

I'd agree that we shouldn't just present this as a decoder for gzip in general, and should capture the distinction in the summary, but I'd stop short of telling people to prefer MultiGzDecoder.

///
/// This structure exposes a [`Write`] interface that will emit uncompressed data
/// to the underlying writer `W`.
/// Use [`MultiGzDecoder`] if your file has multiple streams.
///
Member

@joshtriplett, Jul 21, 2023

Suggested change
///
///
/// This decoder only handles gzipped data with a single stream.
/// Use [`MultiGzDecoder`] for gzipped data with multiple streams.
///

/// [`Write`]: https://doc.rust-lang.org/std/io/trait.Write.html
///
@@ -373,17 +373,16 @@ impl<W: Read + Write> Read for GzDecoder<W> {
}
}

/// A gzip streaming decoder that decodes all members of a multistream
/// A gzip streaming decoder that decodes a full [gzip file].
Member

Suggested change
/// A gzip streaming decoder that decodes a full [gzip file].
/// A gzip streaming decoder that decodes a [gzip file] with multiple members.

///
/// A gzip member consists of a header, compressed data and a trailer. The [gzip
/// specification](https://tools.ietf.org/html/rfc1952), however, allows multiple
/// gzip members to be joined in a single stream. `MultiGzDecoder` will
/// decode all consecutive members while `GzDecoder` will only decompress
/// the first gzip member. The multistream format is commonly used in
/// bioinformatics, for example when using the BGZF compressed data.
/// A gzip file consists of a series of "members" concatenated one after another.
/// MultiGzDecoder decodes all members of a file, while [GzDecoder] will only decode
/// the first one member. MultiGzDecoder is preferable in most cases.
///
/// This structure exposes a [`Write`] interface that will consume all gzip members
/// from the written buffers and write uncompressed data to the writer.
///
/// [gzip file]: https://www.rfc-editor.org/rfc/rfc1952#page-5
#[derive(Debug)]
pub struct MultiGzDecoder<W: Write> {
inner: GzDecoder<W>,
Expand Down