From 37ec2fa965a7f4e3878fb5851df75d109a4ae021 Mon Sep 17 00:00:00 2001
From: Gabriella Gonzalez
Date: Tue, 22 Nov 2022 13:27:45 -0800
Subject: [PATCH] haskell.lib.incremental: init
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This adds a new `incremental` utility for Haskell CI that supports
incremental builds based on the approach outlined in this blog post:

https://harry.garrood.me/blog/easy-incremental-haskell-ci-builds-with-ghc-9.4/

The basic idea is that instead of Nix doing a full build for a package,
we split every build into two builds:

- A full build at an older point in time (e.g. a daily or weekly time
  boundary)

- An incremental build relative to the last full build

This incremental build reuses the build products left over from the
most recent full build.

In order to do this, though, we need a way to "snap" a package's `git`
source input to an earlier point in time (e.g. a daily boundary or
weekly boundary).  This would allow multiple incremental builds to
share the same full rebuild if they snap to the same time boundary.

The approach I went with to make that possible was to extend Nix's
`builtins.fetchGit` to support a new `date` argument and you can find
the corresponding PR for that here:

https://github.com/NixOS/nix/pull/7362

That is why the `incremental` utility added here requires a
sufficiently new version of Nix (one that would incorporate that
change, presuming it is merged).

This also requires GHC 9.4 or newer in order to pick up a fix to GHC's
change detection logic, as described in more detail in the above
blog post.

However, if you satisfy those requirements then this works exactly
the way you'd expect: all of the incremental builds only have to
build the diff since the last time boundary.  Moreover, if CI caches the
full build then developers can also run `nix build` locally and only
have to build the diff, too.

Lower required version

… so that it works against the upstream PR

I'll change the required version to an official release if the PR
is merged.

s/pkgs/pkg/g

… as caught by @cdepillabout

Co-authored-by: Dennis Gosnell

Skip the use of `tar`

We can store the `dist` directory decompressed, which speeds up
the dist export/import.

This potentially requires more disk space *but* by storing the files
unpacked it may actually improve disk utilization in some cases
if `auto-optimise-store` is enabled by permitting deduplication
of `dist` files.

Add an `installDist` phase

… which is disabled by default

The motivation for this is to bring the behavior of
`enableSeparateDistOutput` more in line with the other options
where it doesn't change *whether* or not something is exported,
but rather *where* it is exported.  Now `installDist` controls
whether or not the `dist` directory is exported.

Based on this discussion:

https://github.com/NixOS/nixpkgs/pull/203499/files#r1034150076

Document `interval` argument

… as suggested by @cdepillabout

s/for use for/for use with/

… based on feedback from @MaxGabriel

Move `installDistPhase` to `postPhases`

There are two reasons for doing this:

- We can get rid of the hack to remove the dist output from the outputs

- We can ensure that any changes that happen in the install phase
  are correctly reflected in the `dist` export

Disable dylib workaround for incremental build

Improve correctness of `incremental` function

Typically we don't want to just roll back the source code that is
the input for the Haskell package because the dependencies for the
package may have changed.

In other words, if you roll back the source code for the top-level
package without also rolling back the Nix-supplied dependencies
for that build then you run the risk of an unexpected build failure
(due to an older version of the Haskell package being built against
a newer version of the Nix-supplied dependencies).

What you actually want to do is to roll back the entire repository
(i.e. the Haskell source code and the supporting Nix code) to ensure
that the Haskell source code and Nix code stay in sync.

This more generalized rollback complicates the UX for the
`incremental` function.  I did my best to try to streamline the UX
so that the user just needs to specify how to locate the matching
(older) package after a rollback.

Make date relative to revision (if possible)

This way if you attempt to incrementally build an older revision
then the full rebuild will be relative to the older revision
instead of being relative to the present.

Add `extraFetchGitArgs` option

This in particular comes in handy if you want to specify
`ref = "main";` to ensure that the older build comes from the
`main` branch of your repository.
---
 .../haskell-modules/lib/compose.nix           | 140 ++++++++++++++++++
 .../haskell-modules/lib/default.nix           |   2 +
 2 files changed, 142 insertions(+)

diff --git a/pkgs/development/haskell-modules/lib/compose.nix b/pkgs/development/haskell-modules/lib/compose.nix
index fa6d2a20a9a23..c756027c5c110 100644
--- a/pkgs/development/haskell-modules/lib/compose.nix
+++ b/pkgs/development/haskell-modules/lib/compose.nix
@@ -504,4 +504,144 @@ rec {
         libraryPkgconfigDepends = propagatedPlainBuildInputs old.libraryPkgconfigDepends or [ ];
         testPkgconfigDepends = propagatedPlainBuildInputs old.testPkgconfigDepends or [ ];
       });
+
+  # The motivation for this utility is for use with CI builds in order to avoid
+  # a full rebuild on every commit to the trunk development branch or every pull
+  # request.  For more details, see:
+  #
+  #     https://harry.garrood.me/blog/easy-incremental-haskell-ci-builds-with-ghc-9.4/
+  #
+  # This accelerates a Haskell package build by building the package
+  # "incrementally", meaning that a "full" rebuild is only done once every
+  # interval and all rebuilds in between are "incremental", meaning that each
+  # incremental build reuses the `dist` directory from the last full rebuild.
+  #
+  # This only works for packages that use `git` for their source.
+  #
+  # The `interval` argument is in seconds.  For example, if you wanted to do a
+  # full rebuild every day, you would specify `interval = 24 * 60 * 60;`.
+  #
+  # This function may require a sufficiently new version of macOS because it
+  # disables the work-around from https://github.com/NixOS/nixpkgs/pull/25537
+  # in order for incremental builds to work on Mac.  However, the work-around
+  # appears to no longer be necessary anyway on newer versions of macOS.  For
+  # example, this was stress-tested successfully without the work-around on
+  # macOS Ventura 13.0.1.
+  #
+  # The type of this function is conceptually:
+  #
+  # ```
+  # incremental
+  #   : { interval : Duration
+  #     , makePreviousBuild : (Derivation → Derivation) → Derivation
+  #     }
+  #   → Derivation
+  #   → Derivation
+  # ```
+  #
+  # Example usage:
+  #
+  # ```
+  # let
+  #   interval = 24 * 60 * 60;  # 1 day
+  #
+  #   makePreviousBuild =
+  #     floorToTimeBoundary:
+  #       import "${floorToTimeBoundary ./path/to/repository}/example.nix";
+  #
+  # in
+  #   incremental { inherit interval makePreviousBuild; } example
+  # ```
+  #
+  # To understand how the above example works, suppose that:
+  #
+  # - you are building a Haskell package named `example`
+  # - `./path/to/repository/example.nix` is a Nix file that builds that package
+  #
+  # Then what `floorToTimeBoundary` does in the above example is it takes the
+  # path to any repository (e.g. `./path/to/repository`) and rolls back that
+  # repository to the last time boundary (e.g. the latest UTC midnight in the
+  # above example, because the `interval` is 1 day).  Then all we need to do
+  # is locate and build the older version of our package stored within that
+  # earlier snapshot of the repository (in the above example by importing
+  # `./example.nix`, although the exact details of how to locate and build the
+  # Haskell package will vary from repository to repository).
+  #
+  # In other words, if you explain to the `incremental` function how to build
+  # the older version of your package then it will take care of automatically
+  # selecting the correct revision to use for the full build.
+  incremental = { interval, makePreviousBuild, extraFetchGitArgs ? { } }: pkg:
+    let
+      requiredNixVersion = "2.12.0pre20221128_32c182b";
+      requiredGHCVersion = "9.4";
+
+      truncate = src:
+        let
+          srcAttributes =
+            if lib.isAttrs src
+            then src
+            else { url = src; };
+
+          url = srcAttributes.url or null;
+          name = srcAttributes.name or null;
+          submodules = srcAttributes.fetchSubmodules or null;
+
+          arguments = {
+            ${ if name == null then null else "name" } = name;
+            ${ if url == null then null else "url" } = url;
+            ${ if submodules == null then null else "submodules" } = submodules;
+          };
+
+          # You might wonder why we don't just do something like:
+          #
+          #     builtins.fetchGit {
+          #       inherit (srcAttributes) rev;
+          #       date = "1 day ago";
+          #     }
+          #
+          # This does not produce the desired behavior because it will not
+          # ensure that each incremental build for a given day shares the same
+          # full build (especially if the prior day had multiple commits, each
+          # of which could potentially be selected as the commit from "1 day
+          # ago").
+          #
+          # Instead, what we want is for each build for a given day (or whatever
+          # time interval) to select the same commit from the prior day to
+          # promote reuse of the same full build.  That's why we need to do this
+          # complicated calculation at evaluation time in Nix instead of reusing
+          # Git's built-in support for relative date specifications.
+          startingTime =
+            if    srcAttributes ? rev
+               && srcAttributes.rev != "0000000000000000000000000000000000000000"
+            then
+              let
+                startingRepository = builtins.fetchGit (arguments // {
+                  inherit (srcAttributes) rev;
+                });
+              in
+                startingRepository.lastModified
+            else
+              builtins.currentTime;
+
+        in
+          builtins.fetchGit (arguments // {
+            date = "${toString ((startingTime / interval) * interval)}";
+          } // extraFetchGitArgs);
+
+      previousBuild =
+        (overrideCabal
+          (old: {
+            doInstallDist = true;
+            enableSeparateDistOutput = true;
+          })
+          (makePreviousBuild truncate)
+        ).dist;
+
+    in
+      if builtins.compareVersions requiredNixVersion builtins.nixVersion == 1 then
+        abort "pkgs.haskell.lib.incremental requires Nix version ${requiredNixVersion} or newer"
+      else if builtins.compareVersions requiredGHCVersion pkg.passthru.compiler.version == 1 then
+        abort "pkgs.haskell.lib.incremental requires GHC version ${requiredGHCVersion} or newer"
+      else
+        overrideCabal (old: { inherit previousBuild; }) pkg;
 }
diff --git a/pkgs/development/haskell-modules/lib/default.nix b/pkgs/development/haskell-modules/lib/default.nix
index ffd9ac0578906..86f11bd0ff00f 100644
--- a/pkgs/development/haskell-modules/lib/default.nix
+++ b/pkgs/development/haskell-modules/lib/default.nix
@@ -354,4 +354,6 @@ rec {
   # same package in the (recursive) dependencies of the package being
   # built. Will delay failures, if any, to compile time.
   allowInconsistentDependencies = compose.allowInconsistentDependencies;
+
+  incremental = pkg: args: compose.incremental args pkg;
 }
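
Note on the time-boundary arithmetic: the `date` passed to `builtins.fetchGit` in `truncate` is computed as `(startingTime / interval) * interval`. The following is a minimal standalone sketch of that one expression, not part of the patch. The helper name `floorToInterval` and both timestamps are hypothetical, chosen only for illustration; the second timestamp corresponds to this commit's author date in UTC.

```nix
# Illustrative sketch only; `floorToInterval` and the sample timestamps are
# hypothetical and do not appear in the patch.
let
  interval = 24 * 60 * 60; # 1 day, in seconds

  # Nix integer division truncates, so for non-negative Unix timestamps
  # dividing and then multiplying by `interval` rounds down to the most
  # recent multiple of `interval`.
  floorToInterval = timestamp: (timestamp / interval) * interval;

  morning = 1669108000; # a time earlier on the same UTC day
  evening = 1669152465; # 2022-11-22 21:27:45 UTC

in
{
  # Both timestamps fall on the same UTC day, so they floor to the same
  # boundary and would therefore select the same full build.
  sameBoundary = floorToInterval morning == floorToInterval evening;

  # The boundary is never later than the input timestamp.
  notInFuture = floorToInterval evening <= evening;
}
```

Saving this as a file and evaluating it with `nix-instantiate --eval --strict` should show both attributes as `true`, which is the property the commit message relies on: every build within one interval snaps to the same boundary and can share one full rebuild.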