Dynamic ring buffer sizing for BZip2 and ZStandard on non-seekable streams#1273
Conversation
…reams Agent-Logs-Url: https://github.com/adamhathcock/sharpcompress/sessions/b115976a-5d8a-412e-a462-72b4531f2d0a Co-authored-by: adamhathcock <527620+adamhathcock@users.noreply.github.com>
…reams Agent-Logs-Url: https://github.com/adamhathcock/sharpcompress/sessions/b115976a-5d8a-412e-a462-72b4531f2d0a Co-authored-by: adamhathcock <527620+adamhathcock@users.noreply.github.com>
Code Review SummaryStatus: 1 Suggestion Found | Recommendation: Consider for merge Overview
Issue Details (click to expand)SUGGESTION
Files Reviewed (8 files)
Overall Assessment: The PR implements a robust solution for handling BZip2 and ZStandard compressed tar files on non-seekable streams. The changes are well-documented and the approach of using format-specific minimum buffer sizes is clean. The implementation correctly sizes the ring buffer at construction time to avoid reallocation during format detection. Tests are comprehensive and cover the key scenarios. The one suggestion is minor - a potential edge case with negative values for minBufferSize, but this would be a very unusual calling pattern and would still work correctly (just use the default buffer size). Reviewed by minimax-m2.5-20260211 · 516,877 tokens |
There was a problem hiding this comment.
Pull request overview
This PR addresses rewind buffer overflows during TAR format detection on non-seekable streams by ensuring the rewind ring buffer is sized for worst-case compressed blocks (notably BZip2 and ZStandard) before any probe reads occur.
Changes:
- Add per-wrapper minimum rewind buffer sizing to
TarWrapper, plus a computedMaximumRewindBufferSizefor TAR detection paths. - Update TAR reader/factory creation to allocate rewind buffers using
TarWrapper.MaximumRewindBufferSize, and addStartRecording(int? minBufferSize = null). - Add/adjust constants and regression tests covering non-seekable
.tar.bz2and larger rewind scenarios.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/SharpCompress.Test/Tar/TarReaderTests.cs | Adds regression tests for non-seekable .tar.bz2 reading and wrapper minimum buffer sizing. |
| tests/SharpCompress.Test/Streams/SharpCompressStreamSeekTest.cs | Adds tests for StartRecording(minBufferSize) behavior and default sizing. |
| src/SharpCompress/Readers/Tar/TarReader.Factory.cs | Ensures SharpCompressStream.Create uses at least TarWrapper.MaximumRewindBufferSize for TAR probing. |
| src/SharpCompress/IO/SharpCompressStream.cs | Extends StartRecording with optional minimum buffer sizing and allocates ring buffer on-demand. |
| src/SharpCompress/IO/SeekableSharpCompressStream.cs | Updates StartRecording override signature to match the base class. |
| src/SharpCompress/Factories/TarWrapper.cs | Introduces MinimumRewindBufferSize and MaximumRewindBufferSize, with BZip2/ZStandard-specific minima. |
| src/SharpCompress/Factories/TarFactory.cs | Uses StartRecording(TarWrapper.MaximumRewindBufferSize) for TAR detection paths. |
| src/SharpCompress/Compressors/ZStandard/ZstandardConstants.cs | Adds BlockSizeMax and DStreamInSize constants. |
| src/SharpCompress/Common/Constants.cs | Reverts global default rewind buffer size to 80KB and documents wrapper-specific overrides. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
This will fix buffer size issues for Bzip2 and Zstd when using tars. Reduces the default ring buffer size back to 80k |
BZip2 decompresses in whole blocks — at level 9 the compressed first block can approach 900 KB. ZStandard blocks can be up to 128 KB. The default 80 KB ring buffer causes a rewind overflow during
IsTarFileformat detection on non-seekable streams for both formats.Changes
TarWrapper— new optionalminimumRewindBufferSizeconstructor param andMinimumRewindBufferSizeproperty (defaults toConstants.RewindableBufferSize). StaticMaximumRewindBufferSizepre-computes the largest value across all wrappers.BZip2Constants.baseBlockSize * 9= 900 000 bytesZstandardConstants.DStreamInSize= 131 075 bytesZstandardConstants— newBlockSizeMax(131072) andDStreamInSize(131075) constants matchingZSTD_BLOCKSIZE_MAXandZSTD_DStreamInSize.SharpCompressStream.StartRecording(int? minBufferSize = null)— ring buffer is allocated at recording time withmax(minBufferSize, Constants.RewindableBufferSize), avoiding post-construction reallocation.SeekableSharpCompressStream.StartRecording— signature updated to match; parameter ignored (uses native seeking).Constants.RewindableBufferSize— reverted to 81920 (80 KB). Format-specific overrides are now declared at the wrapper level rather than inflating the global default.TarFactoryandTarReader.Factory— allStartRecordingcalls andSharpCompressStream.Createbuffer sizes now useTarWrapper.MaximumRewindBufferSize, ensuring the buffer is correctly sized before any reads occur.