maintainers/scripts/haskell/exclude.nu: init #441204

Open
wolfgangwalther wants to merge 4 commits into NixOS:haskell-updates from wolfgangwalther:haskell-updates-excluded
@wolfgangwalther
Contributor

@wolfgangwalther wolfgangwalther commented Sep 8, 2025

This script populates excluded.yaml, which allows hackage2nix to skip generating expressions for many old packages.

The script uses very simple logic right now. Any package that is both broken and has not been updated on Hackage for 5 years is filtered out entirely. An exception is made for packages that are also listed in the current Stackage snapshot: these are kept, because otherwise hackage2nix will complain that it can't deal with the constraints on these packages.
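The rule above can be sketched in a few lines. This is a minimal Python illustration, not the Nushell code in exclude.nu; the `Package` fields, the `should_exclude` helper, the package names, and the 365-day approximation of a year are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Package:
    name: str
    broken: bool       # hypothetical flag: marked broken in Nixpkgs
    last_upload: date  # hypothetical field: date of the latest Hackage upload


def should_exclude(pkg: Package, stackage: set[str], today: date,
                   max_age_years: int = 5) -> bool:
    """Exclude packages that are broken AND stale, unless Stackage lists them."""
    cutoff = today - timedelta(days=365 * max_age_years)  # approximate years
    stale = pkg.last_upload < cutoff
    return pkg.broken and stale and pkg.name not in stackage


today = date(2025, 9, 8)
stackage = {"old-but-pinned"}  # made-up snapshot contents
packages = [
    Package("old-and-broken", True, date(2015, 1, 1)),
    Package("old-but-pinned", True, date(2015, 1, 1)),
    Package("old-but-working", False, date(2015, 1, 1)),
    Package("fresh-but-broken", True, date(2024, 6, 1)),
]
excluded = sorted(p.name for p in packages if should_exclude(p, stackage, today))
print(excluded)
```

Only the package that is broken, stale, and absent from the snapshot ends up in the exclusion list; each of the other three fails exactly one of the conditions.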

This does not look at reverse dependencies, so it will very likely result in changes to hackage-packages.nix where some packages that are still left in the set are passed null for their removed dependencies*. I'd argue that this is not a problem per se: these packages were transitively broken before and would now fail to build because of missing dependencies if anyone tried to resurrect them. As part of fixing these packages, the contributor would have to fix that dependency anyway; they'd now need to remove it from the exclusion list to do so.

Not considering reverse dependencies makes the logic considerably simpler.

Another thing this does not consider is when an already excluded package is resurrected on hackage, i.e. gets a new upload. This is left out for now on purpose - should we hit these cases more often, we can still write a script to automate removal from this list as well.

Only works with cabal2nix-unstable pointing at NixOS/cabal2nix#667.

* The affected packages were marked broken again in a second commit.

Some stats:

  • 7932 broken packages removed
  • 298,823 lines removed (40% of hackage-packages.nix)
  • 7.1 MB repo size reduction (hackage-packages.nix: 16MB -> 8.9MB)

cc @emilazy for the diffstat - and the fun!

@nixpkgs-ci nixpkgs-ci bot added 2.status: merge conflict This PR has merge conflicts with the target branch 10.rebuild-linux: 501+ This PR causes many rebuilds on Linux and should normally target the staging branches. 10.rebuild-darwin: 501+ This PR causes many rebuilds on Darwin and should normally target the staging branches. 10.rebuild-darwin: 5001+ This PR causes many rebuilds on Darwin and must target the staging branches. 10.rebuild-linux: 5001+ This PR causes many rebuilds on Linux and must target the staging branches. 6.topic: haskell General-purpose, statically typed, purely functional programming language 6.topic: continuous integration Affects continuous integration (CI) in Nixpkgs, including Ofborg and GitHub Actions backport release-25.05 labels Sep 8, 2025
@wolfgangwalther wolfgangwalther force-pushed the haskell-updates-excluded branch from 69cee18 to a3805ce Compare September 8, 2025 13:48
@nixpkgs-ci nixpkgs-ci bot removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Sep 8, 2025
@wolfgangwalther wolfgangwalther force-pushed the haskell-updates-excluded branch from a3805ce to 178b04b Compare September 8, 2025 16:19
@nixpkgs-ci nixpkgs-ci bot added 10.rebuild-linux: 101-500 This PR causes between 101 and 500 packages to rebuild on Linux. 10.rebuild-darwin: 101-500 This PR causes between 101 and 500 packages to rebuild on Darwin. and removed 10.rebuild-linux: 501+ This PR causes many rebuilds on Linux and should normally target the staging branches. 10.rebuild-darwin: 501+ This PR causes many rebuilds on Darwin and should normally target the staging branches. 10.rebuild-darwin: 5001+ This PR causes many rebuilds on Darwin and must target the staging branches. 10.rebuild-linux: 5001+ This PR causes many rebuilds on Linux and must target the staging branches. labels Sep 8, 2025
@wolfgangwalther wolfgangwalther force-pushed the haskell-updates-excluded branch from 178b04b to a147664 Compare September 8, 2025 16:43
@nixpkgs-ci nixpkgs-ci bot added 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. 10.rebuild-darwin: 1 This PR causes 1 package to rebuild on Darwin. 10.rebuild-linux: 1 This PR causes 1 package to rebuild on Linux. and removed 10.rebuild-linux: 101-500 This PR causes between 101 and 500 packages to rebuild on Linux. 10.rebuild-darwin: 101-500 This PR causes between 101 and 500 packages to rebuild on Darwin. labels Sep 8, 2025
@wolfgangwalther wolfgangwalther marked this pull request as ready for review September 8, 2025 16:55
@nixpkgs-ci nixpkgs-ci bot added 10.rebuild-linux: 101-500 This PR causes between 101 and 500 packages to rebuild on Linux. 10.rebuild-darwin: 101-500 This PR causes between 101 and 500 packages to rebuild on Darwin. and removed 10.rebuild-darwin: 1 This PR causes 1 package to rebuild on Darwin. 10.rebuild-linux: 1 This PR causes 1 package to rebuild on Linux. labels Sep 12, 2025
@nixpkgs-ci nixpkgs-ci bot added 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. 10.rebuild-darwin: 1 This PR causes 1 package to rebuild on Darwin. 10.rebuild-linux: 1 This PR causes 1 package to rebuild on Linux. and removed 10.rebuild-linux: 101-500 This PR causes between 101 and 500 packages to rebuild on Linux. 10.rebuild-darwin: 101-500 This PR causes between 101 and 500 packages to rebuild on Darwin. labels Sep 12, 2025
Member

Why not do a more extreme version of this

lib.optionalAttrs config.allowAliases (lib.genAttrs)

CI would flag issues with this that would otherwise stay hidden and it should be a bit faster.

Member

This also introduces the weird property that stuff will fail to evaluate with aliases, but not without them. I think it is usually not worth it to keep around package expressions for packages that are missing dependencies.

Member

(Not directly related: consider let foo = builtins.fromJSON (builtins.readFile …).foo; in … outside of any lambda, which will benefit from the import cache.)

Contributor Author

Why not do a more extreme version of this

lib.optionalAttrs config.allowAliases (lib.genAttrs)

CI would flag issues with this that would otherwise stay hidden and it should be a bit faster.

I assume you mean it literally like this, so not creating these attributes without aliases. That won't work, because...

This also introduces the weird property that stuff will fail to evaluate with aliases, but not without them. I think it is usually not worth it to keep around package expressions for packages that are missing dependencies.

The problem is that some packages are missing these dependencies but are still building fine. They might miss these dependencies for the benchmark, which is not enabled by default. If we remove these packages, that will remove a ton of actually building packages - and I think some very central packages with a lot of reverse dependencies are that way.

The problem is that "missing argument" throws way too early. We really need something like #442066 to get both of:

  • Not failing eval for the haskellPackages. exposed attributes, because they are not evaluated.
  • But failing eval when any of these dependencies are used from other packages.

Currently we have to go back to null, otherwise eval will just fail on hitting them in haskellPackages..

Member

Hm what if we evaluate the package descriptions with benchmarks disabled in hackage2nix (but not cabal2nix)? I would be surprised if many people actually build the benchmarks from Nixpkgs.

Contributor Author

We could try, but I'd imagine that this only shifts the problem. There could be other cases where it's not the benchmark, but some of the tests. When we explicitly disable these relevant parts in Nixpkgs, then we don't need their dependencies either.

Doing it this way means we'd be in some kind of a deadlock situation:

  • The script would remove the dependency because it's broken and outdated.
  • The script would then remove the package itself.
  • We try to add the package back and the only way to do it might be to do dontCheck.
  • The script can't tell and would remove the package the next time it's run.

So I really think we should not change the package expressions at all and instead pass throw's or null's - and improve on that in the future.

I mean... we could theoretically just take this little piece of the alias PR already and use that independently of config.allowAliases:

https://github.com/NixOS/nixpkgs/pull/442066/files#diff-bfe3742899d9abd995e3b03ca21225dfc95b9d94d49898c45f2dfc435428c140R231-R234

Aka, we'd create a tiny fake "derivation" (so fake that it doesn't even have type = derivation). This would just prevent the eval errors in CI, because it's essentially an attrset without recurseForDerivations. But when this is used as a dependency for any Hackage package, it will trigger the error.
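As a rough analogy only, the idea can be modelled in Python: a placeholder that sits in the package set, is skipped by a CI-style walk because it is not flagged as a derivation, and raises only when something actually uses it as a dependency. This is not the Nix code from the linked PR; `ExcludedPlaceholder`, `Derivation`, `ci_eval`, `is_derivation`, `out_path`, and the `/store/` prefix are all hypothetical names invented for the sketch.

```python
class ExcludedPackageError(Exception):
    """Raised when an excluded package is actually used as a dependency."""


class ExcludedPlaceholder:
    """Stand-in for an excluded package: present in the set, not a derivation."""

    is_derivation = False  # loosely analogous to lacking `type = derivation`

    def __init__(self, name: str) -> None:
        self.name = name

    @property
    def out_path(self) -> str:
        # The error fires lazily, only when the output is requested.
        raise ExcludedPackageError(f"{self.name} is excluded")


class Derivation:
    is_derivation = True

    def __init__(self, name: str, deps=()) -> None:
        self.name = name
        self.deps = list(deps)

    @property
    def out_path(self) -> str:
        # Building needs every dependency's output, so a placeholder raises here.
        for dep in self.deps:
            dep.out_path
        return f"/store/{self.name}"


def ci_eval(package_set: dict) -> list[str]:
    """Evaluate only entries flagged as derivations, as a CI walk might."""
    return [pkg.out_path for pkg in package_set.values() if pkg.is_derivation]


pkgs = {"bar": ExcludedPlaceholder("bar"), "baz": Derivation("baz")}
print(ci_eval(pkgs))  # the placeholder is skipped, so the walk succeeds

try:
    Derivation("foo", deps=[pkgs["bar"]]).out_path
except ExcludedPackageError as err:
    print(f"foo fails to build: {err}")
```

The walk over the top-level set passes, while the dependent package fails with the same error whenever it is forced, which mirrors the behaviour described above.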

Contributor Author

(Not directly related: consider let foo = builtins.fromJSON (builtins.readFile …).foo; in … outside of any lambda, which will benefit from the import cache.)

Done.

Aka, we'd create a tiny fake "derivation" (so fake that it doesn't even have type = derivation). This would just prevent the eval errors in CI, because it's essentially an attrset without recurseForDerivations. But when this is used as a dependency for any Hackage package, it will trigger the error.

Did that. This will throw the same error with and without allowAliases, but still pass CI.

@sternenseemann
Member

I think checking whether a broken package is deprecated is also a good shout, though a lot of deprecated packages are still necessary to build other things.

@sternenseemann
Member

Can't we make exclusion a constraint? exclude.nu would record the version at the time of the exclusion, but hackage2nix would render the package again when a new version is released (i.e. the constraint no longer applies).

Having a script to clean up useless entries in excluded-packages.json sounds better than having to run a script to re-add packages periodically.

With this you could almost be very aggressive and remove most of the broken packages, though it would have the flaw that a Hackage metadata revision wouldn't resurrect a package.
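A minimal sketch of that proposal, assuming the exclusion list maps each package name to the version recorded at exclusion time. The data layout, the function names `packages_to_render` and `stale_entries`, and the sample versions are assumptions for illustration; this is not how hackage2nix or exclude.nu work today.

```python
def packages_to_render(latest_versions: dict[str, str],
                       excluded_at: dict[str, str]) -> set[str]:
    """Packages that should get an expression.

    A package stays excluded only while the latest Hackage version still
    equals the version recorded when it was excluded.
    """
    return {
        name
        for name, version in latest_versions.items()
        if excluded_at.get(name) != version
    }


def stale_entries(latest_versions: dict[str, str],
                  excluded_at: dict[str, str]) -> set[str]:
    """Exclusion entries whose constraint no longer applies (cleanup candidates)."""
    return {
        name
        for name, version in excluded_at.items()
        if latest_versions.get(name) != version
    }


excluded_at = {"bar": "0.1.0", "foo": "1.2"}             # recorded at exclusion
latest = {"bar": "0.2.0", "foo": "1.2", "baz": "3.0"}    # made-up Hackage state

print(sorted(packages_to_render(latest, excluded_at)))
print(sorted(stale_entries(latest, excluded_at)))
```

Here bar comes back automatically because it got a new release, and its entry becomes a cleanup candidate. foo stays excluded even if it had received a metadata revision, because a revision does not change the version number, which is the flaw mentioned above.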

@wolfgangwalther
Contributor Author

I think checking whether a broken package is deprecated is also a good shout, though a lot of deprecated packages are still necessary to build other things.

The way I understand "deprecation" to work on hackage, that is specific to a certain version. Although I remember having seen something else as well. I haven't found the right api endpoints to catch this stuff, yet. IIUC, the version specific deprecation is exposed as "preferred versions" or so? But maybe that's something else again.

As long as we remove the whole package without considering versions, this doesn't apply, yet.

In any case, I think deprecated packages will by definition not be updated anymore, so will end up in any time-related cutoff eventually.

Can't we make exclusion a constraint? exclude.nu would record the version at the time of the exclusion, but hackage2nix would render the package again when a new version is released (i.e. the constraint no longer applies).

Having a script to clean up useless entries in excluded-packages.json sounds better than having to run a script to re-add packages periodically.

That sounds like a good idea, indeed - with the caveat(s) below.

With this you could almost be very aggressive and remove most of the broken packages, though it would have the flaw that a Hackage metadata revision wouldn't resurrect a package.

True. It would also still be a problem for the transitively broken cases, which we are removing as well right now.

Example:

  • foo depends on bar.
  • bar is marked broken.
  • foo is marked transitively broken.
  • both foo and bar were last updated in 2015.

The exclude script will now exclude both of them.

Now, bar is updated. We re-introduce it by either method (version constraints or periodic checks). It builds fine. But we don't know whether foo builds now or not - it is still excluded. I guess we would have to also include all direct reverse dependencies of bar in that case. But.. the same can also happen if bar was just broken, but never removed.
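The "also include all direct reverse dependencies" idea could look roughly like this. The dependency map, the `candidates_to_reinclude` helper, and the extra package qux are hypothetical and exist only to make the example concrete.

```python
def candidates_to_reinclude(updated: str,
                            depends_on: dict[str, set[str]],
                            excluded: set[str]) -> set[str]:
    """The updated package plus its excluded *direct* reverse dependencies."""
    reverse_deps = {pkg for pkg, deps in depends_on.items() if updated in deps}
    return {updated} | (reverse_deps & excluded)


depends_on = {
    "foo": {"bar"},   # foo depends on bar
    "bar": set(),
    "qux": {"foo"},   # hypothetical package that depends on foo only
}
excluded = {"foo", "bar", "qux"}

print(sorted(candidates_to_reinclude("bar", depends_on, excluded)))
```

Only foo is picked up alongside bar. A package like qux, which reaches bar only through foo, would need another round once foo is known to build, which is part of why a periodic "include everything and try" run may be simpler.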

Ultimately... a script to run periodically to just include all packages temporarily and start building might be the simplest solution to this problem. It would catch all cases. Aka, my initial idea of rewriting unbreak.nu.

Created by maintainers/scripts/haskell/exclude.nu.
These packages were previously marked as transitively broken, before the
exclusion of old packages temporarily marked them unbroken. I only
tested the first 10 or so, but they all failed to build. Chances are
high that *all* of them will fail to build, because some of their
dependencies were removed.

In the odd case that one of these would succeed, this would show up in
the next run of unbreak.nu.
@nixpkgs-ci nixpkgs-ci bot removed the 6.topic: continuous integration Affects continuous integration (CI) in Nixpkgs, including Ofborg and GitHub Actions label Sep 14, 2025
@nixpkgs-ci nixpkgs-ci bot added the 2.status: merge conflict This PR has merge conflicts with the target branch label Sep 16, 2025
@MattSturgeon MattSturgeon removed their request for review September 16, 2025 22:33