Ported interleaving support to System.IO.Packaging #97898
This is essentially KevinCathcart's work, with a test suite and changes required to ensure that these tests pass.
Tagging subscribers to this area: @dotnet/area-system-io
Also removed other unused "using" statements.
Removed use of string.CompareOrdinal when processing part pieces. Replaced byte allocations with ArrayPool Rent/Returns. Made InterleavedZipPackagePartStream use Spans instead of byte arrays. This has increased memory usage when .NET Framework calls Read(byte[], int, int) and Write(byte[], int, int).
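As a rough illustration of the pooling pattern this commit message describes, here is a minimal sketch. `PooledCopy.CopyAll` is a hypothetical helper, not code from the PR, and it assumes a target where the span-based `Stream.Read`/`Stream.Write` overloads exist (.NET Core 2.1 or later).

```csharp
using System;
using System.Buffers;
using System.IO;

public static class PooledCopy
{
    // Hypothetical helper (not from the PR): copies everything from source to
    // destination using a rented buffer instead of allocating a new byte[].
    public static long CopyAll(Stream source, Stream destination, int minimumBufferSize = 4096)
    {
        byte[] rented = ArrayPool<byte>.Shared.Rent(minimumBufferSize);
        long total = 0;
        try
        {
            // Rent may return a larger array than requested; use its full length.
            Span<byte> buffer = rented;
            int read;
            while ((read = source.Read(buffer)) > 0)
            {
                destination.Write(buffer.Slice(0, read));
                total += read;
            }
        }
        finally
        {
            // Always hand the array back, even if a read or write throws.
            ArrayPool<byte>.Shared.Return(rented);
        }
        return total;
    }
}
```

The `try`/`finally` keeps the rent and return paired within one method. That is the simple case; the PR notes that the Write() path rents and returns across loop iterations, which this sketch does not attempt to model.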
Split Read and ReadCore apart, so that a new Read(Span<byte>) method doesn't appear on the .NET Standard 2.0 path. Also adjusted the conditions from "NETCOREAPP2_1_OR_GREATER" to "!NETFRAMEWORK && !NETSTANDARD2_0", to align with the rest of .NET.
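The Read/ReadCore split can be sketched as follows. `ByteArrayReadStream` is a hypothetical, simplified stream invented for illustration (it is not `InterleavedZipPackagePartStream`); only the shape of the conditional override is the point. On .NET Standard 2.0 the `Span<byte>` type itself would come from the System.Memory package, which is an assumption of this sketch.

```csharp
using System;
using System.IO;

// Hypothetical read-only stream over a byte array, used only to show the split.
public sealed class ByteArrayReadStream : Stream
{
    private readonly byte[] _data;
    private int _position;

    public ByteArrayReadStream(byte[] data) => _data = data ?? throw new ArgumentNullException(nameof(data));

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => _data.Length;
    public override long Position
    {
        get => _position;
        set => throw new NotSupportedException();
    }

    // The array overload exists on every target framework and forwards to ReadCore.
    public override int Read(byte[] buffer, int offset, int count)
        => ReadCore(new Span<byte>(buffer, offset, count));

#if !NETFRAMEWORK && !NETSTANDARD2_0
    // Stream.Read(Span<byte>) is only virtual on newer targets, so the
    // override is compiled only there.
    public override int Read(Span<byte> buffer) => ReadCore(buffer);
#endif

    // Shared implementation: a private method, so it introduces no new
    // public Read(Span<byte>) member on the older targets.
    private int ReadCore(Span<byte> buffer)
    {
        int count = Math.Min(buffer.Length, _data.Length - _position);
        new ReadOnlySpan<byte>(_data, _position, count).CopyTo(buffer);
        _position += count;
        return count;
    }

    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```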
…poses I can't see a simple way to do this in the Write() path - we rent and return across some loop iterations
I should probably mention that my work here was mostly just porting back the code from .NET Framework. While essentially all of that code got contributed under MIT between the WPF repository and this repository, portions were ifdefed out, and some was modified in the initial commits, so I did need to reference the actual .NET Framework sources in a few spots (via the referencesource site, not the MIT reference source repo, which lacked needed libraries), to verify what the code originally looked like. I feel like anything taken from there is likely too small to matter, and probably was authorized to be relicensed anyway. Presumably the changes in the initial commits vs framework were not for licensing reasons, but stemmed from a desire to have the initially checked-in code able to actually compile. Of course there was some original work to get things to fit on top of ZipStream instead of MS.Internal.IO.Zip, and any original contributions of mine there are covered under MIT, and my CLA with the .NET Foundation.
@huoyaoyuan can you review please?
I'm not an area owner, nor am I experienced with System.IO.Packaging. I can only provide a review for general practices. Please wait for the area owners.
Thanks - I agree with your comments. The Read logic makes sense, not quite sure what the thought process was there! I'll take a look this evening and resolve them. No worries with the PR timing. It's a little larger than I'd normally want to submit and it's a feature implementation rather than a bugfix. I imagine the preview release has also taken some of the team's bandwidth, so I appreciate it'll take longer to review.
@richlander the code on referencesource - is it all under MIT like the reference source repo is?
* Corrections to comments and spacing.
* Switched to the use of an EndOfStreamException in the correct situation.
* Adjusted the SeekUnderlyingPieceStream method to slightly tweak the mechanism and to make use of a readBuffer which might be larger than BufferSize.
* Small refactor of part piece deletion.
The post-code review changes are completed now, thanks @huoyaoyuan. I'm not convinced that the buffer length calculation in
Shouldn't this (and others in the class) be https://learn.microsoft.com/en-us/dotnet/api/system.argumentnullexception.throwifnull?view=net-7.0 or is it because of compatibility with .NET Standard? https://github.com/dotnet/runtime/blob/e9218122ae717a393198be04c611061a14f6ee88/src/libraries/System.IO.Packaging/src/System/IO/Packaging/PackUriHelper.cs#L142#L144
That's correct, yes - the code needs to compile under .NET Standard 2.0, which excludes that particular helper method. A pattern match would definitely look a bit better, but it wouldn't avoid a cast. This:

    if (!(partUri is ValidatedPartUri validatedUri))
        Console.WriteLine("Hello, World!");

will actually compile down to:

    ValidatedPartUri validatedUri = partUri as ValidatedPartUri;
    if (validatedUri == null)
        Console.WriteLine("Hello, World!");

Eliminating the second cast could be helpful though; I'll make the changes tomorrow morning to incorporate any intermediary changes.
I think all of the comments from the initial code review have been completed. Is anything preventing this going to the @dotnet/area-system-io area owners for review besides a response from @richlander to danmoseley's question about licensing for the .NET Reference Source? It's a larger PR, so I expect it'll take a while to review. As far as I know though, there's nothing more that I need to do to enable it to enter that particular queue. Once it's reviewed, I'll follow through with any requested changes as normal.
Maybe @terrajobst can answer the license question, but I doubt it's a problem.
@edwardneal what is the status of this? Will it be in .NET 9 (and maybe backported)?
I've not got a status update to share unfortunately. I'd be glad to see this in .NET 9, but it's currently waiting on code review from the System.IO area owners. A .NET 6.0/7.0/8.0 project could still reference the updated System.IO.Packaging NuGet package; as a result, the older project should use the .NET Standard 2.0 build within the package and be able to use the fixed functionality. This'll hopefully do what you're looking for, but if it doesn't and the .NET team approve the PR for backport, I'm happy enough to help where needed.
Tagging subscribers to this area: @dotnet/area-system-io-compression |
This was only used to reference ZipPackagePartPiece.TryParse and .Index, so switched the tests to use reflection.
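A reflection-based call of this kind might look like the sketch below. `HiddenPiece` is a hypothetical stand-in defined here so the example is self-contained; it is not the real `ZipPackagePartPiece`, and its `TryParse` signature is an assumption rather than the library's. A real test would look the type up in the System.IO.Packaging assembly by name instead of using `typeof`.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for a non-public type with a TryParse method and an Index property.
internal sealed class HiddenPiece
{
    internal int Index { get; }

    private HiddenPiece(int index) => Index = index;

    internal static bool TryParse(string text, out HiddenPiece piece)
    {
        bool ok = int.TryParse(text, out int index) && index >= 0;
        piece = ok ? new HiddenPiece(index) : null;
        return ok;
    }
}

public static class ReflectionProbe
{
    // Calls the non-public static TryParse and reads the non-public Index property.
    public static int? ParseIndex(string text)
    {
        Type type = typeof(HiddenPiece);
        MethodInfo tryParse = type.GetMethod("TryParse", BindingFlags.Static | BindingFlags.NonPublic);

        // For an out parameter, pass a placeholder; Invoke writes the result back into the array.
        object[] args = { text, null };
        bool ok = (bool)tryParse.Invoke(null, args);
        if (!ok)
        {
            return null;
        }

        PropertyInfo index = type.GetProperty("Index", BindingFlags.Instance | BindingFlags.NonPublic);
        return (int)index.GetValue(args[1]);
    }
}
```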
Thank you so much for your change, @edwardneal . I left some comments for you to consider. The one I care the most about is making sure we are able to handle unseekable streams properly. I left a comment in the test file with a suggestion.
* Using exception helpers for .NET
* Removing #regions
* Fixing nullability in references to ZipPackagePartPiece.PieceDescriptors
* Spacing/spelling/grammar changes to comments
Replaced references to NETFRAMEWORK and NETSTANDARD2_0 with NET and !NET as appropriate.
When is the P7 snap @carlossanlop? We should try to get this one in.
This is ready to merge. P7 snap is next week. This will be part of it.
Thanks so much @edwardneal for working on this, and also thank you for your patience.
Edit: I was just waiting for the CI re-runs to pass, there was a ton of noise.
For the folks tracking this -- it would be great if you could get the preview 7 build as soon as it's out (or any daily build that has this in) and validate that you're getting good results.
You can find the latest package with this fix here: https://dev.azure.com/dnceng/public/_artifacts/feed/dotnet9/NuGet/System.IO.Packaging/overview/9.0.0-preview.7.24371.6

NuGet.Config:

    <?xml version="1.0" encoding="utf-8"?>
    <configuration>
      <packageSources>
        <add key="dotnet9" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet9/nuget/v3/index.json" />
      </packageSources>
    </configuration>

Package reference:

    <PackageReference Include="System.IO.Packaging" Version="9.0.0-preview.7.24371.6" />

Please let us know if it works for you!
Sorry for the delay here, but I've successfully tested with a known-interleaved XPS document from @wstaelens's original issue dotnet/wpf#3546 - the file in

Sample code:

    using System;
    using System.IO;
    using System.IO.Packaging;
    using System.Linq;

    const string FileName = "02.xps";
    const string InterleavedPart = "/FixedDocumentSequence.fdseq";

    // Open the package and look up the interleaved part by its URI.
    using var package = Package.Open(FileName);
    var parts = package.GetParts();
    var namedPart = package.GetPart(new Uri(InterleavedPart, UriKind.Relative));

    Console.WriteLine($"Loaded file {Path.GetFileName(FileName)}");
    Console.WriteLine($"Parts: {parts.Count()}");
    Console.WriteLine($"Found named part \"{InterleavedPart}\": {namedPart is not null}");

    if (namedPart is not null)
    {
        // Read the reassembled part contents as text.
        using var partStream = namedPart.GetStream(FileMode.Open, FileAccess.Read);
        using var reader = new StreamReader(partStream);
        Console.WriteLine("Contents of named part:");
        Console.WriteLine(reader.ReadToEnd());
    }

Thanks to everyone for your reviews, and for pushing the merge through @carlossanlop.
Currently on holiday without PC access and with limited internet access. (Already sent to my colleagues, but I haven't heard from them yet.)
I want to add a woohoo here. I moved to ASP.NET since this issue was opened, but I'm really happy to see it happen, and kudos again for the hard work and contributions of everyone here.
Contributes to (and appears to fix) #51929.
Also contributes to (and appears to fix) dotnet/wpf#3546.
This was previously part of PR #78374 (originally @KevinCathcart's work), which was closed due to poor test coverage in System.IO.Packaging. I've added a set of tests for my changes (although there's some overlap with normal functionality) and corrected a few deviations.
To try to keep the internal state of ZipPackage manageable, I've introduced a behaviour change: users can no longer create an atomic part (i.e. a part backed by a single ZipArchiveEntry) with a URI which belongs to an interleaved part. This means that trying to pass a URI of /[Content_Types].xml/[0].piece to Package.CreatePart will fail. I don't think this'll actually have an impact though - it's a violation of the OPC spec, and would have resulted in inconsistent behaviour anyway.
The OPC spec also states that the piece number should be a single digit. ZipPackage is more forgiving in this PR, for two reasons:
I've made a few changes to KevinCathcart's original code, but most of them are stylistic. The only key differences are in the InterleavedZipPackagePartStream - it would try to seek the underlying streams, and calling SetLength didn't always work because ZipArchiveEntry had a stream left open.
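To make the piece-style URI mentioned above concrete, here is a minimal sketch of recognising such names. `PieceName.TryParse` is a hypothetical helper written for illustration, not the PR's `ZipPackagePartPiece.TryParse`. It assumes the naming shape `<part name>/[<n>].piece`, with `.last.piece` marking the final piece, and it also assumes multi-digit piece numbers and case-insensitive matching are acceptable.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class PieceName
{
    // Assumed shape: "<part name>/[<n>].piece" or "<part name>/[<n>].last.piece".
    private static readonly Regex s_pattern = new Regex(
        @"^(?<part>.+)/\[(?<index>[0-9]+)\](?<last>\.last)?\.piece$",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    // Hypothetical helper: splits a ZIP entry name into its part name, piece number and "last" flag.
    public static bool TryParse(string entryName, out string partName, out int index, out bool isLast)
    {
        partName = null;
        index = 0;
        isLast = false;

        if (entryName == null)
        {
            return false;
        }

        Match match = s_pattern.Match(entryName);
        if (!match.Success)
        {
            return false;
        }

        // NumberStyles.None accepts digits only; values too large for an int are rejected.
        if (!int.TryParse(match.Groups["index"].Value, NumberStyles.None, CultureInfo.InvariantCulture, out index))
        {
            return false;
        }

        partName = match.Groups["part"].Value;
        isLast = match.Groups["last"].Success;
        return true;
    }
}
```

With a check of this kind, a package implementation could reject a request to create an atomic part whose URI parses as a piece name. Which exception the PR's `Package.CreatePart` actually throws in that case is not stated here.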