Added IStreamStack for debugging and configurable buffer management. …#930

Merged: adamhathcock merged 9 commits into adamhathcock:master from Nanook:feature/StreamStack on Jul 23, 2025
Nanook (Collaborator) commented Jul 20, 2025

IStreamStack Buffering Enhancement

This pull request introduces a comprehensive buffer management stack that simplifies stream over-reading in compression formats, adds optional stream hierarchy debugging with position tracking (no release performance impact), and establishes a clean foundation for future stream consolidation and simplification.

Warning: This is a fundamental change and needs to be looked over carefully. It was born from having a lot of difficulty debugging stream positions and rewinding. A lot of the files changed are to add a StreamStack interface, other changes are to cater for SharpCompressStream and buffer usage. All unit tests are passing with minimal or zero change.

🔧 Core Changes

1. Enhanced IStreamStack Interface

  • Universal Design Philosophy: Implemented as an interface rather than base class to allow anything to join the stack, not necessarily just streams, while keeping the chain intact
  • Flexible Architecture: Any object can participate in the stream hierarchy by implementing IStreamStack
  • Buffer Management: BufferSize, BufferPosition, DefaultBufferSize properties for systematic buffer placement
  • Debug Support: InstanceId for comprehensive stream tracking (DEBUG_STREAMS only - zero release impact)
  • Chain Integrity: BaseStream() method maintains hierarchy navigation regardless of implementation type
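
To make the hierarchy idea concrete, here is a minimal sketch of how a stack of wrappers could be walked through a BaseStream() method. It is not the SharpCompress code: IStreamStackSketch, PassThroughSketch and Describe are hypothetical names, the property types and the Stream return type are assumptions, and the DEBUG_STREAMS-only InstanceId is left out.

```csharp
using System;
using System.IO;

// Hypothetical interface: member names follow the PR description,
// but the signatures are assumptions and may differ from SharpCompress.
public interface IStreamStackSketch
{
    Stream BaseStream();                 // the next item down the stack
    int BufferSize { get; set; }
    int BufferPosition { get; set; }
    int DefaultBufferSize { get; set; }
}

// Hypothetical read-only wrapper that joins the stack by implementing the interface.
public sealed class PassThroughSketch : Stream, IStreamStackSketch
{
    private readonly Stream _inner;

    public PassThroughSketch(Stream inner) =>
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));

    public Stream BaseStream() => _inner;
    public int BufferSize { get; set; }
    public int BufferPosition { get; set; }
    public int DefaultBufferSize { get; set; }

    public override bool CanRead => _inner.CanRead;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => _inner.Length;
    public override long Position
    {
        get => _inner.Position;
        set => throw new NotSupportedException();
    }

    public override void Flush() { }
    public override int Read(byte[] buffer, int offset, int count) =>
        _inner.Read(buffer, offset, count);
    public override long Seek(long offset, SeekOrigin origin) =>
        throw new NotSupportedException();
    public override void SetLength(long value) =>
        throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) =>
        throw new NotSupportedException();
}

public static class StreamStackSketch
{
    // Builds a root-first path such as "MemoryStream/PassThroughSketch"
    // by following BaseStream() until a plain Stream is reached.
    public static string Describe(Stream top)
    {
        string path = top.GetType().Name;
        Stream current = top;
        while (current is IStreamStackSketch item)
        {
            current = item.BaseStream();
            path = current.GetType().Name + "/" + path;
        }
        return path;
    }
}
```

Calling StreamStackSketch.Describe on two nested PassThroughSketch wrappers around a MemoryStream yields a root-first path, the same ordering the debug output further down uses.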

2. Unified SharpCompressStream

Consolidates the functionality of NonDisposingStream, RewindableStream and CountingWritableSubStream into a single implementation. Future refactors could simplify things further. SharpCompressStream provides:

  • Over-read Protection: Fixed-size buffers safely contain over-read data
  • Position Tracking: Internal position management independent of base stream
  • Stream Resumption: Buffer rewinding (via a fixed buffer that can be provided by a base stream/IStreamStack item) allows base streams to resume at exact boundaries
  • Lifecycle Management: Configurable disposal behavior
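
The over-read and rewind behaviour can be illustrated with a small standalone sketch. This is not the SharpCompressStream implementation: RewindBufferSketch and its members are hypothetical, and it only shows the general technique of filling a fixed-size buffer from the source, tracking a logical position separately, and stepping back within the buffer.

```csharp
using System;
using System.IO;

// Hypothetical illustration of a fixed-size read buffer with rewind support.
public sealed class RewindBufferSketch
{
    private readonly Stream _source;
    private readonly byte[] _buffer;
    private int _bufferPosition;  // index of the next unread byte in _buffer
    private int _bufferedLength;  // number of valid bytes currently in _buffer

    public RewindBufferSketch(Stream source, int bufferSize)
    {
        if (bufferSize <= 0) throw new ArgumentOutOfRangeException(nameof(bufferSize));
        _source = source ?? throw new ArgumentNullException(nameof(source));
        _buffer = new byte[bufferSize];
    }

    // Logical position seen by consumers, independent of the source position.
    public long Position { get; private set; }

    public int Read(byte[] destination, int offset, int count)
    {
        if (_bufferPosition == _bufferedLength)
        {
            // Buffer exhausted: attempt to refill the whole buffer in one read.
            _bufferedLength = _source.Read(_buffer, 0, _buffer.Length);
            _bufferPosition = 0;
            if (_bufferedLength == 0) return 0; // end of source
        }

        // Serve at most what is left in the buffer.
        int available = Math.Min(count, _bufferedLength - _bufferPosition);
        Buffer.BlockCopy(_buffer, _bufferPosition, destination, offset, available);
        _bufferPosition += available;
        Position += available;
        return available;
    }

    // Hands back bytes that were read but not needed. Only bytes still held
    // in the current buffer can be returned, so the range is validated.
    public void Rewind(int count)
    {
        if (count < 0 || count > _bufferPosition)
            throw new ArgumentOutOfRangeException(nameof(count));
        _bufferPosition -= count;
        Position -= count;
    }
}
```

With a 10-byte source and an 8-byte buffer, reading 6 bytes advances the source by 8 while the logical position is 6. After Rewind(2) the logical position is 4, and the next read returns the bytes at offsets 4 and 5 without touching the source again.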

3. 🔍 DEBUG_STREAMS C# define for Debugging Visualization

Activated for .NET 8.0 Debug builds only, DEBUG_STREAMS provides complete stream hierarchy tracking with zero performance impact on release builds:

FileStream[Px0::]/SharpCompressStream#1[Px0:Bx0:Dx0] : Constructed by [Volume..ctor()]
FileStream[Px21e::]/SharpCompressStream#1[Px25:Bx10000:Dx10000]/ZlibBaseStream#2[:Bx0] : Constructed by [ZipFilePart.CreateDecompressionStream()]
FileStream[Px21e::]/SharpCompressStream#1[Px25:Bx10000:Dx10000]/ZlibBaseStream#2[:Bx0]/DeflateStream#3[Px0:Bx0:Dx0] : Constructed by [StreamingZipFilePart.GetCompressedStream()]
FileStream[Px21e::]/SharpCompressStream#1[Px25:Bx10000:Dx10000]/ZlibBaseStream#2[:Bx0]/DeflateStream#3[Px0:Bx0:Dx0]/SharpCompressStream#4[Px0:Bx0:Dx0] : Constructed by [StreamingZipFilePart.GetCompressedStream()]
FileStream[Px21e::]/SharpCompressStream#1[Px25:Bx10000:Dx10000]/ZlibBaseStream#2[:Bx0]/DeflateStream#3[Px0:Bx0:Dx0]/SharpCompressStream#4[Px0:Bx0:Dx0]/EntryStream#5[Px0:Bx0:Dx0] : Constructed by [AbstractReader`2.GetEntryStream()]
FileStream[Px21e::]/SharpCompressStream#1[Px28:Bx10000:Dx10000]/ZlibBaseStream#2[:Bx0]/DeflateStream#3[Px3:Bx0:Dx0]/SharpCompressStream#4[Px1:Bx0:Dx0]/EntryStream#5[Px1:Bx0:Dx0] : Disposed by [ZipReaderTests.Issue_685()]
FileStream[Px21e::]/SharpCompressStream#1[Px28:Bx10000:Dx10000]/ZlibBaseStream#2[:Bx0]/DeflateStream#3[Px3:Bx0:Dx0]/SharpCompressStream#4[Px1:Bx0:Dx0] : Disposed by [EntryStream.Dispose()]
FileStream[Px21e::]/SharpCompressStream#1[Px5e:Bx10000:Dx10000]/ZlibBaseStream#6[:Bx0] : Constructed by [ZipFilePart.CreateDecompressionStream()]
FileStream[Px21e::]/SharpCompressStream#1[Px5e:Bx10000:Dx10000]/ZlibBaseStream#6[:Bx0]/DeflateStream#7[Px0:Bx0:Dx0] : Constructed by [StreamingZipFilePart.GetCompressedStream()]
FileStream[Px21e::]/SharpCompressStream#1[Px5e:Bx10000:Dx10000]/ZlibBaseStream#6[:Bx0]/DeflateStream#7[Px0:Bx0:Dx0]/SharpCompressStream#8[Px0:Bx0:Dx0] : Constructed by [StreamingZipFilePart.GetCompressedStream()]
FileStream[Px21e::]/SharpCompressStream#1[Px5e:Bx10000:Dx10000]/ZlibBaseStream#6[:Bx0]/DeflateStream#7[Px0:Bx0:Dx0]/SharpCompressStream#8[Px0:Bx0:Dx0]/EntryStream#9[Px0:Bx0:Dx0] : Constructed by [AbstractReader`2.GetEntryStream()]
FileStream[Px21e::]/SharpCompressStream#1[Px62:Bx10000:Dx10000]/ZlibBaseStream#6[:Bx0]/DeflateStream#7[Px4:Bx0:Dx0]/SharpCompressStream#8[Px2:Bx0:Dx0]/EntryStream#9[Px2:Bx0:Dx0] : Disposed by [ZipReaderTests.Issue_685()]
FileStream[Px21e::]/SharpCompressStream#1[Px62:Bx10000:Dx10000]/ZlibBaseStream#6[:Bx0]/DeflateStream#7[Px4:Bx0:Dx0]/SharpCompressStream#8[Px2:Bx0:Dx0] : Disposed by [EntryStream.Dispose()]
FileStream[Px21e::]/SharpCompressStream#1[Px9f:Bx10000:Dx10000]/ReadOnlySubStream#10[Px0:Bx0:Dx0] : Constructed by [ZipFilePart.GetCryptoStream()]
FileStream[Px21e::]/SharpCompressStream#1[Px9f:Bx10000:Dx10000]/ReadOnlySubStream#10[Px0:Bx0:Dx0]/EntryStream#11[Px0:Bx0:Dx0] : Constructed by [AbstractReader`2.GetEntryStream()]
FileStream[Px21e::]/SharpCompressStream#1[Pxa1:Bx10000:Dx10000]/ReadOnlySubStream#10[Px2:Bx0:Dx0]/EntryStream#11[Px2:Bx0:Dx0] : Disposed by [ZipReaderTests.Issue_685()]
FileStream[Px21e::]/SharpCompressStream#1[Pxa1:Bx10000:Dx10000]/ReadOnlySubStream#10[Px2:Bx0:Dx0] : Disposed by [EntryStream.Dispose()]
FileStream[Px21e::]/SharpCompressStream#1[Pxcd:Bx10000:Dx10000]/ReadOnlySubStream#12[Px0:Bx0:Dx0] : Constructed by [ZipFilePart.GetCryptoStream()]
FileStream[Px21e::]/SharpCompressStream#1[Pxcd:Bx10000:Dx10000]/ReadOnlySubStream#12[Px0:Bx0:Dx0]/EntryStream#13[Px0:Bx0:Dx0] : Constructed by [AbstractReader`2.GetEntryStream()]
FileStream[Px21e::]/SharpCompressStream#1[Pxdc:Bx10000:Dx10000]/ReadOnlySubStream#12[Pxf:Bx0:Dx0]/EntryStream#13[Pxf:Bx0:Dx0] : Disposed by [ZipReaderTests.Issue_685()]
FileStream[Px21e::]/SharpCompressStream#1[Pxdc:Bx10000:Dx10000]/ReadOnlySubStream#12[Pxf:Bx0:Dx0] : Disposed by [EntryStream.Dispose()]
FileStream[Px21e::]/SharpCompressStream#1[Px21e:Bx10000:Dx10000] : Disposed by [Volume.Dispose()]

Key Insights from This Debug Output:

  • Position Progression: Watch the SharpCompressStream#1 position increment (Px0 → Px25 → Px28 → Px5e → Px62) as it processes successive ZIP entries
  • Buffer Management: Bx10000 (64KB configurable buffer) strategically placed at base level for over-reading protection
  • Stream Lifecycle: Complex multi-layered hierarchies created and properly disposed for each ZIP entry
  • Over-reading Safety: Base FileStream position remains stable while compression streams process data safely

Format: StreamType#InstanceId[Position:BufferSize:DefaultBufferSize]
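
As a reading aid, the sketch below parses one segment of the debug output into its fields. StreamDebugSegment is a hypothetical helper, not part of SharpCompress. It assumes the values after Px, Bx and Dx are hexadecimal (consistent with Bx10000 being described as 64KB above) and that fields may be empty or absent, as in FileStream[Px21e::] and ZlibBaseStream#2[:Bx0].

```csharp
using System;
using System.Globalization;

// Hypothetical parser for one segment such as
// "SharpCompressStream#1[Px25:Bx10000:Dx10000]".
public sealed class StreamDebugSegment
{
    public string TypeName = "";
    public int? InstanceId;          // absent for streams outside the stack, e.g. FileStream
    public long? Position;           // P field
    public long? BufferSize;         // B field
    public long? DefaultBufferSize;  // D field

    public static StreamDebugSegment Parse(string segment)
    {
        int open = segment.IndexOf('[');
        if (open < 0 || !segment.EndsWith("]", StringComparison.Ordinal))
            throw new FormatException("Expected Name[...] but got: " + segment);

        var result = new StreamDebugSegment();

        // "Name#Id" or just "Name"
        string[] nameParts = segment.Substring(0, open).Split('#');
        result.TypeName = nameParts[0];
        if (nameParts.Length > 1)
            result.InstanceId = int.Parse(nameParts[1], CultureInfo.InvariantCulture);

        // Fields are identified by their prefix letter, so empty or
        // missing fields are simply skipped.
        string body = segment.Substring(open + 1, segment.Length - open - 2);
        foreach (string field in body.Split(':'))
        {
            if (field.Length < 3 || field[1] != 'x') continue;
            long value = long.Parse(field.Substring(2), NumberStyles.HexNumber,
                                    CultureInfo.InvariantCulture);
            switch (field[0])
            {
                case 'P': result.Position = value; break;
                case 'B': result.BufferSize = value; break;
                case 'D': result.DefaultBufferSize = value; break;
            }
        }
        return result;
    }
}
```

Under these assumptions, SharpCompressStream#1[Px25:Bx10000:Dx10000] reads as instance 1 at position 37 with a 65536-byte buffer, and FileStream[Px21e::] as a stream at position 542 with no buffer information.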

📈 Benefits

Changes

  • Simplifies over-reading in Deflate, Zlib, Gzip, Lzma etc. Streams can return/rewind unused bytes by repositioning the base buffer
  • Predictable memory usage with fixed-size buffers
  • Enhanced debugging with complete stream hierarchy visibility (debug builds only)

Strategic Foundation

  • Flexible Architecture: Interface-based design allows any object to join stream hierarchies
  • Clean foundation for consolidating additional stream types in future phases
  • Proven approach validates core functionality before broader changes
  • Full backward compatibility ensures existing code works unchanged

🧪 Usage

// Automatic over-reading protection
var options = new ReaderOptions() { BufferSize = 0x10000 }; // 64KB
using var stream = SharpCompressStream.Create(baseStream, bufferSize: options.BufferSize);

Result: Compression streams can over-read safely into managed buffers, then rewind to exact boundaries, allowing base streams to resume correctly for subsequent processing.


This enhancement introduces a comprehensive buffer management stack that solves fundamental stream processing issues while providing optional debugging capabilities with zero release performance impact. The flexible interface-based design allows any object to participate in stream hierarchies, creating a solid foundation for future consolidation phases.

Nanook added 3 commits July 20, 2025 17:35
…Added SharpCompressStream to consolodate streams to help simplify debugging. All unit tests passing.
Nanook requested review from adamhathcock and Copilot on July 20, 2025 17:23

Copilot AI (Contributor) left a comment

Pull Request Overview

This pull request introduces the IStreamStack interface to provide a unified stream hierarchy with configurable buffer management and enhanced debugging capabilities for SharpCompress. The changes consolidate stream functionality from NonDisposingStream, RewindableStream, and CountingWritableSubStream into the new SharpCompressStream class.

Key Changes

  • Introduced IStreamStack interface for universal stream hierarchy management with flexible buffer control
  • Replaced legacy stream classes with SharpCompressStream providing over-read protection and position tracking
  • Added DEBUG_STREAMS conditional compilation for comprehensive debugging with zero release performance impact

Reviewed Changes

Copilot reviewed 84 out of 84 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| src/SharpCompress/IO/IStreamStack.cs | Core interface definition with buffer management and debugging extensions |
| src/SharpCompress/IO/SharpCompressStream.cs | Unified stream implementation replacing multiple legacy stream classes |
| src/SharpCompress/IO/NonDisposingStream.cs | Removed legacy stream class replaced by SharpCompressStream |
| tests/SharpCompress.Test/Streams/SharpCompressStreamTest.cs | New test suite for buffer management functionality |
| Multiple stream implementations | Updated to implement IStreamStack interface for hierarchy participation |

throw new NotSupportedException();
}

public override void WriteByte(byte value)

Copilot AI commented Jul 20, 2025

The WriteByte method does not handle buffering consistently with the Read method. When buffering is enabled, writes should either be buffered or the buffer should be invalidated to maintain data integrity.

Copilot uses AI. Check for mistakes.

Nanook (Collaborator, Author) replied:

The Stream/Buffer would never be used for both purposes in the same instance. This was taken from RewindableStream.

adamhathcock (Owner) replied:

Smells wrong but if it's previous behavior then all good

Nanook (Collaborator, Author) replied:

Currently buffering isn't really used for writing. There's no ability to pass buffer sizes for writing currently either. This is planned to be added to perhaps aid LZMA2 block writing. I was conscious about the scope of the change and keeping it reviewable, getting a stable commit to build on with a bit of confidence :-)

Comment on lines +288 to +294
        ++_internalPosition;
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        Stream.Write(buffer, offset, count);
        _internalPosition += count;

Copilot AI commented Jul 20, 2025

The Write method does not handle buffering consistently with the Read method. When buffering is enabled, writes should either be buffered or the buffer should be invalidated to maintain data integrity.

Suggested change
-         ++_internalPosition;
-     }
-     public override void Write(byte[] buffer, int offset, int count)
-     {
-         Stream.Write(buffer, offset, count);
-         _internalPosition += count;
+         ++_internalPosition;
+         _bufferedLength = 0; // Invalidate the buffer
+         _bufferPosition = 0;
+     }
+     public override void Write(byte[] buffer, int offset, int count)
+     {
+         Stream.Write(buffer, offset, count);
+         _internalPosition += count;
+         _bufferedLength = 0; // Invalidate the buffer
+         _bufferPosition = 0;

Nanook (Collaborator, Author) replied Jul 20, 2025:

Same as previous WriteByte issue.

private byte[]? _buffer;
private int _bufferPosition;
private int _bufferedLength;
private bool _bufferingEnabled;

Copilot AI commented Jul 20, 2025

The buffer state variables (_bufferPosition, _bufferedLength) are not validated for consistency during buffer operations, which could lead to corrupt buffer states.

Suggested change
-     private bool _bufferingEnabled;
+     private bool _bufferingEnabled;
+     private void ValidateBufferState()
+     {
+         if (_bufferPosition < 0 || _bufferPosition > _bufferedLength)
+         {
+             throw new InvalidOperationException("Buffer state is inconsistent: _bufferPosition is out of range.");
+         }
+     }

Nanook (Collaborator, Author) replied:

Updated code.

Copilot AI commented Jul 20, 2025

The Rewind method decrements BufferPosition without validating that the result is non-negative, which could result in an invalid buffer state.

Nanook (Collaborator, Author) replied:

Updated code.

Copilot AI commented Jul 20, 2025

The rented buffer from ArrayPool is never returned, causing a resource leak. ArrayPool.Return should be called in a finally block or using a proper disposal pattern.

Suggested change
- var buffer = ArrayPool<byte>.Shared.Rent(1024);

Nanook (Collaborator, Author) replied:

Removed the line.

Copilot AI commented Jul 20, 2025

Incorrect type passed to DebugConstruct - should be typeof(LzmaStream), not typeof(LZipStream).

Suggested change
- this.DebugConstruct(typeof(LZipStream));
+ this.DebugConstruct(typeof(LzmaStream));

Nanook (Collaborator, Author) replied:

Updated code.

Copilot AI commented Jul 20, 2025

Incorrect type passed to DebugConstruct - should be typeof(LzmaStream), not typeof(LZipStream).

Suggested change
- this.DebugConstruct(typeof(LZipStream));
+ this.DebugConstruct(typeof(LzmaStream));

Nanook (Collaborator, Author) replied:

Updated code.

Copilot AI commented Jul 20, 2025

Incorrect type passed to DebugDispose - should be typeof(LzmaStream), not typeof(LZipStream).

Suggested change
- this.DebugDispose(typeof(LZipStream));
+ this.DebugDispose(typeof(LzmaStream));

Nanook (Collaborator, Author) replied:

Updated code.

Copilot AI commented Jul 20, 2025

Hard-coded magic number 4 should be replaced with a named constant to improve maintainability and explain its purpose (likely sizeof(uint) for header bytes).

Suggested change
- ((IStreamStack)rewindableStream).Rewind(4);
+ ((IStreamStack)rewindableStream).Rewind(UInt32Size);

Nanook (Collaborator, Author) replied:

Updated code.

    // Fill buffer if needed
    if (_bufferedLength == 0)
    {
        _bufferedLength = Stream.Read(_buffer!, 0, _bufferSize);

Copilot AI commented Jul 20, 2025

[nitpick] The buffer refill logic reads the full buffer size even when only a small amount is needed, which could impact performance for large buffers with small reads.

Suggested change
- _bufferedLength = Stream.Read(_buffer!, 0, _bufferSize);
+ int bytesToRead = Math.Min(_bufferSize, count);
+ _bufferedLength = Stream.Read(_buffer!, 0, bytesToRead);

Nanook (Collaborator, Author) replied Jul 20, 2025:

It's supposed to attempt to read the size of the buffer. That change makes lots of things fail.

adamhathcock (Owner) commented:

This is exactly the type of thing I've been hoping for: a way to bring things closer and that it makes sense. Thank you!

I'm gonna sit on it for a couple days to read and think but if the tests pass and there's little to no breaking changes (haven't read yet) then I don't see why not.

Nanook (Collaborator, Author) commented Jul 21, 2025

...and that is the exact response I was hoping for too. Not a rejection because it looks too intrusive and not straight out acceptance :-). I'm cautious about it, but as you say "it makes sense" to attempt it. I'm happy to discuss it and rework parts of it (or for anyone to). You can always reach me on discord too if you want a more productive chat. Cheers.

Nanook (Collaborator, Author) commented Jul 21, 2025

I noticed that SetPosition was misspelled as SetPostion and realised that I'd corrected this already. I'd accidentally merged the commit from getting it working and not the commit after finessing it a bit. You'll also notice that StackSeek no longer has a hard-coded 0 for the position when called by the various factory classes that use it.
