Skip to content

Fix tar archive enumeration after fully reading entry streams#1324

Closed
Copilot wants to merge 182 commits into
releasefrom
copilot/fix-tararchive-missing-entries
Closed

Fix tar archive enumeration after fully reading entry streams#1324
Copilot wants to merge 182 commits into
releasefrom
copilot/fix-tararchive-missing-entries

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented May 14, 2026

TarArchive.OpenArchive(stream) could stop yielding entries when callers iterated archive.Entries, fully read an entry via OpenEntryStream(), and moved on before disposing that stream. In practice, the shared tar stream could remain parked on entry padding, so the next header read started from the wrong offset.

  • Behavior change

    • Update TarReadOnlySubStream to advance to the next tar header as soon as the current entry has been fully consumed.
    • Keep the existing disposal-based cleanup path for partially read entry streams.
  • Why this matters

    • Enumeration no longer depends on callers disposing each entry stream before requesting the next archive entry.
    • This matches the expected sync IArchive usage pattern shown in the issue, where each file is read to EOF during iteration.
  • Regression coverage

    • Add a tar archive test that:
      • writes multiple files plus a directory entry,
      • iterates archive.Entries,
      • opens and fully reads each file entry,
      • delays disposing those entry streams until after enumeration,
      • verifies all entries are still present.
using var archive = TarArchive.OpenArchive(stream);

foreach (var entry in archive.Entries)
{
    if (entry.IsDirectory)
        continue;

    var entryStream = entry.OpenEntryStream();
    entryStream.CopyTo(Stream.Null); // fully consume without immediate dispose
}

var count = archive.Entries.Count(); // now remains stable

adamhathcock and others added 30 commits March 6, 2026 11:22
…atch interface name

Co-authored-by: adamhathcock <527620+adamhathcock@users.noreply.github.com>
Rename IWriteableArchiveFactory.cs to IWritableArchiveFactory.cs
…path

1. Fix off-by-one in Update(BLAKE2SP): pos was masked with 448 (64×7)
   instead of 511 (64×8−1), causing incorrect leaf assignment when update
   chunks are not multiples of 64 bytes. This produced wrong hashes
   whenever streaming reads didn't align to 64-byte boundaries.

2. Fix double-finalization when Read() is called again after returning 0.
   Final() was called unconditionally on each EOF read, re-running
   compression on an already-finalized state and corrupting the hash.
   EnsureHash() guards with a null-check and is idempotent; both sync
   and async Read paths share the fix. _blake2sp is set to null after
   finalization so any erroneous post-final calls fail fast rather than
   silently corrupting state.

3. Fix false-positive CRC check when stream is not fully drained.
   _hash was initialized to fileHeader.FileCrc in the constructor, so
   GetCrc() returned the expected CRC rather than the computed one if
   the stream was abandoned early. The check would then compare the
   expected hash against itself and always succeed, silently accepting
   a corrupt or incomplete file. _hash is now null until the stream is
   fully read; GetCrc() throws if called early rather than returning a
   misleading value.

Additional changes:
- Compress: stackalloc m/v arrays instead of heap; LE fast path via
  MemoryMarshal.Cast replaces 16 BitConverter.ToUInt32 calls per block
- Final(BLAKE2S): write directly to Span<byte>, eliminating MemoryStream
  and BitConverter.GetBytes allocations; 8 leaf digests now stackalloc'd
- Inner classes and fields: private, sealed, readonly where applicable
Fix three BLAKE2sp correctness bugs and eliminate allocations in hot path
# Conflicts:
#	src/SharpCompress/packages.lock.json
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
… in PooledMemoryStream

Agent-Logs-Url: https://github.com/adamhathcock/sharpcompress/sessions/1f2c3c48-1112-43f1-82ce-efb157fbe0d5

Co-authored-by: adamhathcock <527620+adamhathcock@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@JohnCampionJr
Copy link
Copy Markdown

@adamhathcock any chance we can get this merged?

@adamhathcock
Copy link
Copy Markdown
Owner

@copilot add a TarReader and TarReaderAsync and TarArchiveAsync style tests to this PR

Copy link
Copy Markdown
Contributor Author

Copilot AI commented May 29, 2026

@copilot add a TarReader and TarReaderAsync and TarArchiveAsync style tests to this PR

Added matching regression coverage in TarReaderTests, TarReaderAsyncTests, and TarArchiveAsyncTests in ee64868.

adamhathcock and others added 5 commits May 29, 2026 14:43
…ive-missing-entries

# Conflicts:
#	tests/SharpCompress.Test/Tar/TarArchiveAsyncTests.cs
Try to fix global.json to avoid churn in locks
@adamhathcock
Copy link
Copy Markdown
Owner

Closing in favor of #1342

This fix got lost. I'll put it out shortly

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TarArchive.OpenArchive is missing entries

6 participants