add maintainer scripts for haskell package generation #121391

Merged: maralorn merged 1 commit into NixOS:haskell-updates from maralorn:regenerate-haskell-packages on May 2, 2021
maralorn (Member) commented May 1, 2021

This is a slightly reworked version of #86699. Thanks again, @hyperfekt.

Introduces a script that can be used to update the Nix expressions for
the Haskell package set. In service of that, also

  • introduces cabal2nix-latest, which pins the hackage2nix version used
  • changes all-cabal-hashes to use fetchFromGitHub
  • adds update-hackage.sh & update-cabal2nix-latest.sh maintainer scripts

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@github-actions github-actions bot added the 6.topic: haskell General-purpose, statically typed, purely functional programming language label May 1, 2021
@ofborg ofborg bot added 8.has: package (new) This PR adds a new package 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. labels May 1, 2021
maralorn (Member, Author) commented May 1, 2021

Okay, as I feared, this PR breaks the callHackage function, because it expects all-cabal-hashes to be a tarball.

maralorn (Member, Author) commented May 1, 2021

There are two ways out. Either change callHackage to not expect a tarball or change this PR to not unpack.

I think the second is the way to go, because we are talking about a lot of files and directories here and it seems quite inefficient. But I may be unaware of other arguments or it might be that we pay the performance cost anyways …

ghost commented May 1, 2021

I've packaged ratarmount and am currently reworking my PR to use the compressed version of all-cabal-hashes instead. Ideally cabal2nix would adopt the same technique cabal does to work directly with the compressed archive (as suggested by peti), but in lieu of that it seems like a workable solution.

@maralorn maralorn force-pushed the regenerate-haskell-packages branch from 4209406 to 2b52879 Compare May 1, 2021 14:33
maralorn (Member, Author) commented May 1, 2021

@hyperfekt Oh, I have already pushed a fixed version of this PR. What do you think?

ghost commented May 1, 2021

If I recall correctly, the last time I tried to unpack that archive on ext4 it took dozens of minutes. ^^
Not to mention that it's 700MB, which is a lot even considering garbage collection.

@maralorn maralorn force-pushed the regenerate-haskell-packages branch from 2b52879 to d0934a4 Compare May 1, 2021 14:40
maralorn (Member, Author) commented May 1, 2021

Yes, but since hackage2nix expects it to be unpacked, we don’t have a choice, do we?

At least this way people using callHackage don’t need to unpack it, only people who run the regenerate script.

maralorn (Member, Author) commented May 1, 2021

But yeah, if the ratarmount solution is cooler, let’s go for it.

ghost commented May 1, 2021

That is why I packaged ratarmount. It allows us to mount the archive without actually unpacking it onto the disk. I'll see in a bit how fast that ends up being.

maralorn (Member, Author) commented May 1, 2021

Yeah, go for it!

maralorn (Member, Author) commented May 1, 2021

Sadly there might be a trade-off. Unpacking with Nix is probably slow on the first run, but maybe it’ll be faster when rerunning?

ghost commented May 1, 2021

Ideally it'll be almost as fast, at least if I can get Hydra to build the index file for the archive once and have that be used by everyone for every run.

sternenseemann (Member) commented

I would rather rework callHackage and have all-cabal-hashes be the unpacked version. Having a compressed store path has no real advantages in that case because we compress them anyway while fetching from a binary cache. However, mounting a tar archive will probably have a significant impact on either RAM usage or access performance compared to having the archive already unpacked in the store.

maralorn (Member, Author) commented May 1, 2021

I don’t have a strong opinion about this. (And the old solution is still in the reflog.)

But it’s a 700MB file that everyone would need to write to disk and keep around. I am not exactly sure what callHackage does with the tarball, but it seems pretty fast. So if we can get hackage2nix to also use that tarball with reasonable speed, I’d prefer that.

I personally can spare the RAM more than the disk space … For a real principled decision we would need measurements. But that’s likely not worth the effort.

ghost commented May 1, 2021

The only reason I got into this in the first place was that the number of files was so high that some file systems had serious trouble with it.
But after measuring it, the ratarmount version turns out to be almost 5 times slower at regenerating hackage-packages.nix.
I guess the slowness for some is just something we'll have to live with, rather than making it this much slower for everyone.

Leaving all-cabal-hashes uncompressed would mean that anyone using callHackage gets 700MB dumped into their store (filesystem compression is presumably without effect because these are tiny files), versus only the people regenerating the Haskell package set.

sternenseemann (Member) commented

callHackage extracts only the necessary subset of the tar:

all-cabal-hashes-component = name: version: buildPackages.runCommand "all-cabal-hashes-component-${name}-${version}" {} ''
  tar --wildcards -xzvf ${all-cabal-hashes} \*/${name}/${version}/${name}.{json,cabal}
  mkdir -p $out
  mv */${name}/${version}/${name}.{json,cabal} $out
'';

I guess ratarmount is fine for now, we can always change things if we run into problems.
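
The selective extraction quoted above can be tried outside Nix on a toy archive. This is only a sketch: the directory layout, the `hashes-demo` and `foo`/`bar` names, and the reliance on GNU tar (for `--wildcards`) are assumptions made for illustration, not taken from nixpkgs.

```shell
set -eu

# Build a toy archive with the assumed layout
# <top-level>/<name>/<version>/<name>.json and <name>.cabal.
work=$(mktemp -d)
for pkg in foo/1.0 bar/2.0; do
  name=${pkg%%/*}
  mkdir -p "$work/src/hashes-demo/$pkg"
  echo "name: $name" > "$work/src/hashes-demo/$pkg/$name.cabal"
  echo '{}' > "$work/src/hashes-demo/$pkg/$name.json"
done
tar -czf "$work/hashes.tar.gz" -C "$work/src" hashes-demo

# Extract only the two files for one package version,
# mirroring what the quoted runCommand does.
name=foo
version=1.0
out="$work/out"   # stands in for $out of the Nix derivation
mkdir -p "$work/extract" "$out"
tar --wildcards -xzf "$work/hashes.tar.gz" -C "$work/extract" \
  "*/$name/$version/$name.json" "*/$name/$version/$name.cabal"
mv "$work"/extract/*/"$name"/"$version"/"$name".json \
   "$work"/extract/*/"$name"/"$version"/"$name".cabal "$out"
ls "$out"
```

Only the files for the requested package and version end up on disk. tar still has to read through the compressed stream, but nothing else is unpacked, which fits the observation in this thread that callHackage seems fast despite the size of the archive.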

Review comment:

This should unpack directly to $out instead of using the unpackPhase, since it takes almost twice as long otherwise due to having to move all files into the store.

Review comment (Member Author):

Good catch.

Review comment (Member Author):

@hyperfekt Does the change I pushed look better?

Review comment:

Yes, that should work well.
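
The suggestion in this review thread, unpacking straight into the output directory instead of unpacking elsewhere and moving the files afterwards, could look roughly like the sketch below. The archive name, its contents, and the use of `--strip-components=1` are assumptions for illustration; the thread does not show the exact command that was pushed.

```shell
set -eu

# Toy archive with a single top-level directory (names are hypothetical).
work=$(mktemp -d)
mkdir -p "$work/src/demo-archive/pkg/1.0"
echo 'name: pkg' > "$work/src/demo-archive/pkg/1.0/pkg.cabal"
tar -czf "$work/archive.tar.gz" -C "$work/src" demo-archive

# "$out" stands in for the $out of a Nix derivation.
out="$work/out"
mkdir -p "$out"

# Write the files directly into $out; --strip-components=1 drops the
# archive's top-level directory so no second move step is needed.
tar -xzf "$work/archive.tar.gz" -C "$out" --strip-components=1
find "$out" -type f
```

With this shape every file is written exactly once, which is the saving the reviewer points at when the archive contains a very large number of small files.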

maralorn (Member, Author) commented May 1, 2021

@sternenseemann Well, this PR takes the middle ground. callHackage stays efficient, but for actually doing the update it will extract into the Nix store. (Which apparently is faster than using ratarmount.)

@maralorn maralorn force-pushed the regenerate-haskell-packages branch from 15180bc to 4e00a36 Compare May 1, 2021 17:45
Review comment:

Ideally a comment would either indicate that an alternative hackage2nix can be passed to this script (which may be useful for developing fixes to hackage2nix), or the functionality would be removed; since the syntax for it isn't very obvious, well known, or googleable and will probably just become arcana without it.

Review comment (Member Author):

Well then let’s remove it until someone needs it. It’s easy enough to add again.

@maralorn maralorn force-pushed the regenerate-haskell-packages branch from 4e00a36 to 5826ae7 Compare May 1, 2021 19:06
Review comment (Member):

Can we add the update script via passthru here?

Review comment (Member Author):

Can you enlighten me as to who or what actually uses the passthru update script?

I am a bit sceptical because at least fetchgit didn’t accept a passthru attribute.

Review comment (Member):

fetchurl accepts passthru. IIRC some automation uses it.

Review comment (Member Author):

Okay, done.

nix-shell maintainers/scripts/update.nix --argstr package all-cabal-hashes is an alias for our script now.^^

Review comment (Member):

Adding the update script here as well would be nice, but should probably be added via an override in non-hackage-packages.nix to avoid a terrible sed hack.

Review comment (Member Author):

Okay, well. Now that I learned that I can use nix-shell maintainers/scripts/update.nix --argstr package haskellPackages.cabal2nix-latest, I wonder if it’s a smart idea to have the update script at such a prominent position as maintainers/scripts/haskell.

Introduces a script that can be used to update the Nix expressions for
the Haskell package set. In service of that, also

- introduces cabal2nix-latest, which pins the hackage2nix version used
- changes all-cabal-hashes to use fetchFromGitHub
- adds update-hackage.sh & update-cabal2nix-latest.sh & update-stackage.sh maintainer scripts

@maralorn maralorn force-pushed the regenerate-haskell-packages branch from 5826ae7 to f3f8485 Compare May 1, 2021 19:55
teto (Member) commented May 1, 2021

The faith is reborn #62105 (comment)

Comment on lines +22 to +41
# Drop restrictions on some tools where we always want the latest version.
sed -r \
    -e '/ cabal-install /d' \
    -e '/ cabal2nix /d' \
    -e '/ cabal2spec /d' \
    -e '/ distribution-nixpkgs /d' \
    -e '/ git-annex /d' \
    -e '/ hindent /d' \
    -e '/ hledger/d' \
    -e '/ hlint /d' \
    -e '/ hoogle /d' \
    -e '/ hopenssl /d' \
    -e '/ jailbreak-cabal /d' \
    -e '/ json-autotype/d' \
    -e '/ language-nix /d' \
    -e '/ shake /d' \
    -e '/ ShellCheck /d' \
    -e '/ stack /d' \
    -e '/ weeder /d' \
    < "${tmpfile}.new" > "${tmpfile}"

Review comment (Member):

Now that we're on Stackage Nightly, I imagine we can drop quite a few of these lines here, since these packages will generally also be the latest version on Hackage. That is to say, the version of the package in Stackage Nightly should also be the latest version of the package on Hackage.

If the packages in Stackage Nightly are not the latest version on Hackage, then the Stackage maintainers have probably done that for a good reason. For instance, the packages aren't compatible and don't compile together.

Although this is not something that should block this PR. We can do this at some point in the future.
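
For anyone revisiting that list later, it may help to see how the patterns behave. Most have a space on both sides of the package name, while `/ hledger/d` and `/ json-autotype/d` have none after it, so they also drop lines for packages whose names merely start with that prefix (which may well be intentional). A small sketch on made-up input follows; the line format and the version numbers are hypothetical, and `sed -r` assumes GNU sed.

```shell
set -eu

work=$(mktemp -d)

# Hypothetical constraint lines; only the "space, name, space" shape matters.
cat > "$work/constraints.new" <<'EOF'
  - hlint ==1.0
  - hledger ==1.0
  - hledger-lib ==1.0
  - shake ==1.0
  - shakespeare ==1.0
  - aeson ==1.0
EOF

# Same style of filter as in the script under review, reduced to three patterns.
sed -r \
    -e '/ hlint /d' \
    -e '/ hledger/d' \
    -e '/ shake /d' \
    < "$work/constraints.new" > "$work/constraints"

cat "$work/constraints"
```

Here the `hledger` and `hledger-lib` lines are both removed by the prefix pattern, while `shakespeare` survives `/ shake /d` because of the trailing space in that pattern.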

Review comment (Member):

We should drop every one of these except the ones that are nixpkgs-related (jailbreak-cabal, cabal2nix, ...).

Review comment (Member Author):

Yeah, let’s clean that up. But that requires a bit of careful deliberation to see if we break anything right now, so I would like this to happen in an independent PR.

cdepillabout (Member) left a comment

I tried out all of the maintainer scripts in this PR and they all seem to work well.

This LGTM.

@maralorn maralorn merged commit 5d4dc79 into NixOS:haskell-updates May 2, 2021
@maralorn maralorn deleted the regenerate-haskell-packages branch May 2, 2021 08:27
maralorn (Member, Author) commented May 2, 2021

Thanks everyone! Special thanks to @teto and @hyperfekt for your earlier attempts at this.

ghost commented May 7, 2021

Thanks a lot to you as well! I really appreciate you taking this over the finish line; and I'm excited for the future of the nixpkgs Haskell package set.
