Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strip NuGet.org signatures from downloaded nupkg files #326345

Open
Smaug123 opened this issue Jul 11, 2024 · 16 comments
Open

Strip NuGet.org signatures from downloaded nupkg files #326345

Smaug123 opened this issue Jul 11, 2024 · 16 comments

Comments

@Smaug123
Copy link
Contributor

Issue description

NuGet.org alters packages during upload by unzipping, adding a signature file to the zip, and re-zipping. This means no local Nix build of a NuGet package can possibly match the version pulled from nuget.org.

https://github.com/NixOS/nixpkgs/blob/8be517b5c9817f2122f27cf8000dda859747adc0/pkgs/build-support/dotnet/make-nuget-deps/default.nix defines how we pull NuGet dependencies.

Would you accept a PR that verifies the downloaded signature if possible (I don't yet know what dotnet nuget verify does, but it may be possible to get it to work in the sandbox), and then strips the signature file signature.p7s from NuGet.org-downloaded packages before writing the nupkg to the store? That way, in principle in a content-addressed Nix world, it would become possible to locally (reproducibly) build a NuGet package (after adjusting for the fact that NuGet packs are nondeterministic due to zip file timestamps) and have the result be equal to the one that fetchNuget gives you.

(This is step 1 of my plan; step 2 would be "make dotnet pack reproducible, either here or upstream", and step 3 would be "demonstrate that a content-addressed Nix can skip downloading from NuGet if I've already built my nupkg locally".)

Maintainers

@corngood is listed in CODEOWNERS; @tie has a bunch of commits that look .NET-specific and therefore might be interested.

@corngood
Copy link
Contributor

I have a big change I've been working on that is essentially doing your step 3, and maybe some of step 2?

What I have:

  • a consistent output for .net packages (currently share/nuget/packages)
  • nupkgs are unpacked and patched in fetch, instead of buildDotnetModule.
  • NUGET_PACKAGES is built by sdk hooks from build the inputs
  • fetch-deps is changed to be based on the package derivation (as is done with VMR already)

The branch is here if you want to take a look. It's still a bit rough: https://github.com/corngood/nixpkgs/tree/dotnet-unpacked-packages

I've learned quite a bit about manually packing and unpacking packages, package directories vs. sources, restore vs tool restore vs paket, etc, etc.

Ideally I'd like to get to the point where we can just automatically configure something like NUGET_FALLBACK_PACKAGES and not have any .nupkg files in the nix store at all, but there are various quirks in the tools that are blocking that.

One consideration with repacking nupkgs is that unless they are uncompressed, runtime dependencies will be obfuscated. In my branch I've just kept the .nupkg files in the same outputs as the unpacked packages, which is now NUGET_PACKAGES is normally structured. It's also an annoying bit of redundancy.

As for your step 1:

then strips the signature file signature.p7s from NuGet.org-downloaded packages before writing the nupkg to the store

I haven't done any work on this yet, but it's something that's been on my mind. If we could get that done, I think it would simplify and speed up nuget-to-nix considerably, since it wouldn't have to search for a deterministic package source. It could even allow mirrors to be determined.

@Smaug123
Copy link
Contributor Author

Cool - it would cause mass rebuilds, but naively I'd just edit make-nuget-deps to do that manual unzip-and-rezip dance to strip out the nondeterministic signature files. I'll give that a go and see what it looks like.

@corngood
Copy link
Contributor

Do you know if it's just the .signature.p7 file you need to strip out for it to be deterministic? I had to strip those out when patching packages because it's not valid when the package is repacked, so it would be nice if they were already gone in the fixed-output derivation.

Do you have any real examples of the same package existing on multiple sources? In my experience it happens most on the packages with lots of sources in nuget.config. roslyn might be a good one to try. You could modify nuget-to-nix to always check all the sources for a package (instead of stopping on the first that has it), and compare them all.

@Smaug123
Copy link
Contributor Author

dotnet pack is nondeterministic because of zip file timestamps; in my experience that's its only nondeterminism. I am fairly sure dotnet nuget push to NuGet.org is nondeterministic solely because NuGet.org adds that signature file.

I don't have any examples of packages existing simultaneously in multiple sources other than the trivial "I built this package locally" vs "I uploaded this package to nuget.org". (I don't think I've ever actually operated with multiple NuGet sources except in rather contrived situations where I was fully in control of the contents of most of the sources.)

@corngood
Copy link
Contributor

If you're not able to find one by the time you have something to test, I'll dig something up.

dotnet pack is nondeterministic because of zip file timestamps;

In my branch I'm planning on using strip-nondeterminism for that. I'd actually like to make a general purpose hook, which could be used on both fetched and built packages.

This is a bit off topic, but my idea for that is something along the lines of:

  1. install (e.g. dotnet pack) puts .nupkg file in $out/share/nuget/packages
  2. preFixup hook unpacks all .nupkgs into NUGET_PACKAGES compatible structure
  3. regular fixup can run and patch shebangs, shared libs etc
  4. postFixup hook (optionally?) re-packs the nupkg

@Smaug123
Copy link
Contributor Author

By the way, the reason dotnet pack is nondeterministic is that when they made it so, they hit an absolutely hilarious error case which even I, jaded after six years of dealing with NuGet, would never have imagined. NuGet/Home#8603

Basically, we should check empirically that the nondeterminism fix doesn't break dotnet restore for computers which are in time zones at a positive offset from UTC.

@Smaug123
Copy link
Contributor Author

dotnet nuget verify is using a frankly hilarious amount of CPU when I put it immediately before the "strip the signature file" operation. I imagine it'll also fail to verify in a sandboxed environment. Think it's OK to just skip any signature verification?

@corngood
Copy link
Contributor

it could be done in nuget-to-nix. once the hash is locked there, there should be no need to check it again. nuget-to-nix is going to need to figure out the deterministic hash of the package one way or another anyway.

It might work offline if it has embedded public keys. I'm not sure how it works exactly. I'm not sure it's even reliably checking signatures currently. When I stripped the signatures I was able to build all of dotnet 8 VMR without any problem, but dotnet 9 complained about an invalid signature somewhere late in the build.

@Smaug123
Copy link
Contributor Author

In my mind the obvious place to put this would be straight in fetchNuget (which is defined in make-nuget-deps/default.nix). No existing hashes in deps files would change (I think we'll need the original hashes anyway for fetchurl?). I'd be relying on content-addressed derivations to short-circuit the "download from NuGet and remove signature file" work if you've built the package locally; I need to check that this does in fact short-circuit.

@corngood
Copy link
Contributor

Sorry, I'm not all that familiar with content-addressed nix, so I think I slightly misunderstood how problems relate to each other.

I'll try to explain the problem I've been trying to fix in some more detail:

  • you have a package with a PackageReference to Foo in the .csproj
  • nuget.config has two sources defined: A and B
  • both A and B contain the same version of Foo, but the actual .nupkg files differ (due to the signature?)

In this situation dotnet restore during fetch-deps will non-deterministically (see e.g. dotnet/source-build#4206) use the package from either A or B, so the unpacked package in $NUGET_PACKAGES may vary.

nuget-to-nix looks at $NUGET_PACKAGES and tries to write a fetchNuget for each one. Even though it has the .nupkg for Foo, and knows which source it came from, it still has to attempt to download the package from each source, and use the first one (deterministically) that contains it.

If we could make fetchNupkg output source-independent (perhaps by just removing the signature?), we could make it a fixed-output derivation. Then nuget-to-nix would need to determine the fixed-output hash, but it could do so with the .nupkg from $NUGET_PACKAGES. It would likely still need to check which sources contain the package in order to write the url/mirror list, but it wouldn't have to download anything.

I guess this is more complex than what you were thinking, but it should also solve the CA problem.

@Smaug123
Copy link
Contributor Author

Right, I see. I believe my proposed change (see the draft #326785) does make fetchNuget source-independent (though I haven't tested that empirically yet), so what you're looking for could be built on top of it. We'll find out!

@Smaug123
Copy link
Contributor Author

Yup, just tested that empirically:

  • built a package
  • observed its SHA by sha256sum ~/Documents/GitHub/TestNuget/result to be f503214402c5a73e6e95811b39b58c06d9a189fd4dc06f1237a4708fb9236aff
  • copied and renamed the result to WoofWare.Test.Thing.0.0.3.nupkg
  • did a vanilla dotnet nuget push 'WoofWare.Test.Thing.*.nupkg' --source https://api.nuget.org/v3/index.json --api-key BLAH interactively on that (note that this is an unlisted package name which I keep around for test purposes, so it won't show up on a NuGet.org search)
  • waited for it to propagate to NuGet.

Then consumed it using the nixpkgs at #326785 (commit a6e32ac9bf59628fb17359d8f448161213d47cde), and observed that indeed the version that made it through fetchNuget is identically equal to the one I pushed:

> sha256sum /nix/store/llilh0vaf5n20ir22znb5zpaqab5y82a-WoofWare.Test.Thing.0.0.3.nupkg
f503214402c5a73e6e95811b39b58c06d9a189fd4dc06f1237a4708fb9236aff   /nix/store/llilh0vaf5n20ir22znb5zpaqab5y82a-WoofWare.Test.Thing.0.0.3.nupkg

So as far as I have been able to test so far, that PR does indeed make the NuGet package source-independent.

@corngood
Copy link
Contributor

I still haven't been able to find any practical examples in nixpkgs of packages that are available from multiple sources. I think some of the places where it used to happen are now using packageSourceMappings.

@corngood
Copy link
Contributor

Ah, here we go. In case this proves useful for something:

  (fetchNuGet { pname = "Microsoft.Azure.WebJobs"; version = "3.0.39"; hash = "sha256-Ndym81F4BFIHsH4NF/weySuRdWgzQQM8hHsjTW8c1vs="; })
  (fetchNuGet { pname = "Microsoft.Azure.WebJobs"; version = "3.0.39"; hash = "sha256-DLFcCFeOQRMnZzPjJlQZEyVXX1iey/QTQKcT1br2q1I="; url = "https://azfunc.pkgs.visualstudio.com/e6a70c92-4128-439f-8012-382fe78d6396/_packaging/21e57804-e42a-44f4-a801-493faaf56251/nuget/v3/flat2/microsoft.azure.webjobs/3.0.39/microsoft.azure.webjobs.3.0.39.nupkg"; })
  (fetchNuGet { pname = "Microsoft.Azure.WebJobs.Core"; version = "3.0.39"; hash = "sha256-R8WFHImU5uctfCCF0cTzBlu+4QORk14Wku5tutxVzyI="; })
  (fetchNuGet { pname = "Microsoft.Azure.WebJobs.Core"; version = "3.0.39"; hash = "sha256-n1yL6u3IYK82N1zWCt9g5oqIRteX/tOI67z0FH8gR/k="; url = "https://azfunc.pkgs.visualstudio.com/e6a70c92-4128-439f-8012-382fe78d6396/_packaging/21e57804

@corngood
Copy link
Contributor

I also confirmed that the first package there has the same content when you remove the signature file from both versions.

@Smaug123
Copy link
Contributor Author

Next up:

  • Get nuget-to-nix to check all available sources for each package and verify that the package is equal in all of them
  • Have nuget-to-nix write out an additional field to the deps file, sourceIndependentHash or similar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants