Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recommend MultiGzDecoder over GzDecoder in docs #324

Merged
merged 8 commits into from
Jul 30, 2023
18 changes: 7 additions & 11 deletions src/gz/bufread.rs
Original file line number Diff line number Diff line change
Expand Up @@ -167,11 +167,10 @@ impl<R: BufRead + Write> Write for GzEncoder<R> {
}
}

/// A gzip streaming decoder
/// A decoder for a gzip file with a single member.
///
/// This structure consumes a [`BufRead`] interface, reading compressed data
/// from the underlying reader, and emitting uncompressed data.
/// Use [`MultiGzDecoder`] if your file has multiple streams.
///
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
///
///
/// This decoder only handles gzipped data with a single stream.
/// Use [`MultiGzDecoder`] for gzipped data with multiple streams.
///

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whatever the final text, it should stick to the RFC terminology of "member" rather than using "stream" in some places.

/// [`BufRead`]: https://doc.rust-lang.org/std/io/trait.BufRead.html
///
Expand Down Expand Up @@ -397,20 +396,17 @@ impl<R: BufRead + Write> Write for GzDecoder<R> {
}
}

/// A gzip streaming decoder that decodes all members of a multistream
/// A gzip streaming decoder that decodes a [gzip file] with multiple members.
///
/// A gzip member consists of a header, compressed data and a trailer. The [gzip
/// specification](https://tools.ietf.org/html/rfc1952), however, allows multiple
/// gzip members to be joined in a single stream. `MultiGzDecoder` will
/// decode all consecutive members while [`GzDecoder`] will only decompress
/// the first gzip member. The multistream format is commonly used in
/// bioinformatics, for example when using the BGZF compressed data. It's also useful
/// to compress large amounts of data in parallel where each thread produces one stream
/// for a chunk of input data.
/// A gzip file consists of a series of "members" concatenated one after another.
/// MultiGzDecoder decodes all members of a file, while [GzDecoder] will only decode
/// the first member. Learn more
/// [in the introduction](https://docs.rs/flate2/*/flate2/#About-multi-member-Gzip-files).
///
/// This structure exposes a [`BufRead`] interface that will consume all gzip members
/// from the underlying reader and emit uncompressed data.
///
/// [gzip file]: https://www.rfc-editor.org/rfc/rfc1952#page-5
/// [`BufRead`]: https://doc.rust-lang.org/std/io/trait.BufRead.html
///
/// # Examples
Expand Down
22 changes: 9 additions & 13 deletions src/gz/read.rs
Original file line number Diff line number Diff line change
Expand Up @@ -90,13 +90,13 @@ impl<R: Read + Write> Write for GzEncoder<R> {
}
}

/// A gzip streaming decoder
/// A decoder for a gzip file with a single member.
///
/// This structure exposes a [`Read`] interface that will consume compressed
/// data from the underlying reader and emit uncompressed data.
/// Use [`MultiGzDecoder`] if your file has multiple streams.
///
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
///
///
/// This decoder only handles gzipped data with a single stream.
/// Use [`MultiGzDecoder`] for gzipped data with multiple streams.
///

/// [`Read`]: https://doc.rust-lang.org/std/io/trait.Read.html
/// This decoder only handles gzipped data with a single stream.
/// Use [`MultiGzDecoder`] for gzipped data with multiple streams.
///
/// # Examples
///
Expand Down Expand Up @@ -180,21 +180,17 @@ impl<R: Read + Write> Write for GzDecoder<R> {
}
}

/// A gzip streaming decoder that decodes all members of a multistream
/// A gzip streaming decoder that decodes a [gzip file] with multiple members.
///
/// A gzip member consists of a header, compressed data and a trailer. The [gzip
/// specification](https://tools.ietf.org/html/rfc1952), however, allows multiple
/// gzip members to be joined in a single stream. `MultiGzDecoder` will
/// decode all consecutive members while [`GzDecoder`] will only decompress the
/// first gzip member. The multistream format is commonly used in bioinformatics,
/// for example when using the BGZF compressed data. It's also useful
/// to compress large amounts of data in parallel where each thread produces one stream
/// for a chunk of input data.
/// A gzip file consists of a series of "members" concatenated one after another.
/// MultiGzDecoder decodes all members of a file, while [GzDecoder] will only decode
/// the first member. Learn more
/// [in the introduction](https://docs.rs/flate2/*/flate2/#About-multi-member-Gzip-files).
///
/// This structure exposes a [`Read`] interface that will consume all gzip members
/// from the underlying reader and emit uncompressed data.
///
/// [`Read`]: https://doc.rust-lang.org/std/io/trait.Read.html
/// [gzip file]: https://www.rfc-editor.org/rfc/rfc1952#page-5
///
/// # Examples
///
Expand Down
20 changes: 11 additions & 9 deletions src/gz/write.rs
Original file line number Diff line number Diff line change
Expand Up @@ -166,11 +166,13 @@ impl<W: Write> Drop for GzEncoder<W> {
}
}

/// A gzip streaming decoder
/// A decoder for a gzip file with a single member.
///
/// This structure exposes a [`Write`] interface that will emit uncompressed data
/// to the underlying writer `W`.
/// Use [`MultiGzDecoder`] if your file has multiple streams.
///
Copy link
Member

@joshtriplett joshtriplett Jul 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
///
///
/// This decoder only handles gzipped data with a single stream.
/// Use [`MultiGzDecoder`] for gzipped data with multiple streams.
///

/// This decoder only handles gzipped data with a single stream.
/// Use [`MultiGzDecoder`] for gzipped data with multiple streams.
Byron marked this conversation as resolved.
Show resolved Hide resolved
///
/// [`Write`]: https://doc.rust-lang.org/std/io/trait.Write.html
///
Expand Down Expand Up @@ -373,17 +375,17 @@ impl<W: Read + Write> Read for GzDecoder<W> {
}
}

/// A gzip streaming decoder that decodes all members of a multistream
/// A gzip streaming decoder that decodes a [gzip file] with multiple members.
///
/// A gzip member consists of a header, compressed data and a trailer. The [gzip
/// specification](https://tools.ietf.org/html/rfc1952), however, allows multiple
/// gzip members to be joined in a single stream. `MultiGzDecoder` will
/// decode all consecutive members while `GzDecoder` will only decompress
/// the first gzip member. The multistream format is commonly used in
/// bioinformatics, for example when using the BGZF compressed data.
/// A gzip file consists of a series of "members" concatenated one after another.
/// `MultiGzDecoder` decodes all members of a file, while [GzDecoder] will only decode
/// the first member. Learn more
/// [in the introduction](https://docs.rs/flate2/*/flate2/#About-multi-member-Gzip-files).
///
/// This structure exposes a [`Write`] interface that will consume all gzip members
/// from the written buffers and write uncompressed data to the writer.
///
/// [gzip file]: https://www.rfc-editor.org/rfc/rfc1952#page-5
#[derive(Debug)]
pub struct MultiGzDecoder<W: Write> {
inner: GzDecoder<W>,
Expand Down
18 changes: 18 additions & 0 deletions src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -65,12 +65,30 @@
//! `Write` trait if `T: Write`. That is, the "dual trait" is forwarded directly
//! to the underlying object if available.
//!
//! # About multi-member Gzip files
//!
//! While most `gzip` files one encounters will have a single *member* that can be read
//! with the [`GzDecoder`], there may be some files which have multiple members.
//!
//! If these are read with a [`GzDecoder`], only the first member will be consumed and
//! the rest will silently be left alone, which can be surprising.
Byron marked this conversation as resolved.
Show resolved Hide resolved
//!
//! The [`MultiGzDecoder`] on the other hand will decode all *members* of `gzip` file
Byron marked this conversation as resolved.
Show resolved Hide resolved
//! into one consecutive stream of bytes, which hides the underlying *members* entirely
//! while failing if the file does not contain solely `gzip` *members*.
//!
//! It's worth noting that major browser like Chrome, Firefox as well as tool like `curl`
//! will only decode the first member of a `gzip` encoded reply, so what's right to do
//! truly depends on the context, as well the expected input of the library or application.
//!
//! [`read`]: read/index.html
//! [`bufread`]: bufread/index.html
//! [`write`]: write/index.html
//! [read]: https://doc.rust-lang.org/std/io/trait.Read.html
//! [write]: https://doc.rust-lang.org/std/io/trait.Write.html
//! [bufread]: https://doc.rust-lang.org/std/io/trait.BufRead.html
//! [`GzDecoder`]: read/struct.GzDecoder.html
//! [`MultiGzDecoder`]: read/struct.MultiGzDecoder.html
#![doc(html_root_url = "https://docs.rs/flate2/0.2")]
#![deny(missing_docs)]
#![deny(missing_debug_implementations)]
Expand Down