Colmena slower than nixos-rebuild, possibly bypassing cache with flakes #235

Open
justinas opened this issue Oct 27, 2024 · 1 comment
@justinas
Contributor

With the following flake (similar to solutions in #60):

{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs }: {
    nixosConfigurations = {
      alpha = nixpkgs.lib.nixosSystem {
        system = "x86_64-linux";
        modules = [
          {
            boot.loader.grub.devices = [ "/dev/sda" ];
            fileSystems."/" = {
              device = "/dev/sda1";
            };
            system.stateVersion = "24.11";
          }
        ];
      };
    };
    colmena =
      let
        confs = self.nixosConfigurations;
      in
      {
        meta = {
          description = "my personal machines";
          # This can be overridden by node nixpkgs
          nixpkgs = import nixpkgs { system = "x86_64-linux"; };
          nodeNixpkgs = builtins.mapAttrs (name: value: value.pkgs) confs;
          nodeSpecialArgs = builtins.mapAttrs (name: value: value._module.specialArgs) confs;
        };
      } // builtins.mapAttrs
        (name: value: {
          imports = value._module.args.modules ++ [
            # The following undoes some Colmena-specific settings, ensuring that `nixos-rebuild` and `colmena build` produce exactly the same result.
            {
              system.nixos.revision = nixpkgs.rev;
              system.nixos.versionSuffix = ".${builtins.substring 0 8 nixpkgs.lastModifiedDate}.${nixpkgs.shortRev}";
            }
          ];
        })
        confs;
  };
}

Colmena and a manual nix build produce the same store path, but Colmena takes about 48% longer in wall-clock time (16.3 s vs. 11.0 s):

$ time nix build .#nixosConfigurations.alpha.config.system.build.toplevel --print-out-paths --quiet --no-warn-dirty --no-eval-cache
/nix/store/7hnmzlgb83rxyn2hpg6p8sdwnc519cx5-nixos-system-nixos-24.11.20240529.ad57eef
nix build .#nixosConfigurations.alpha.config.system.build.toplevel  --quiet    7.39s user 1.88s system 84% cpu 11.041 total
$ time colmena build --on alpha -v
<...>
alpha | /nix/store/7hnmzlgb83rxyn2hpg6p8sdwnc519cx5-nixos-system-nixos-24.11.20240529.ad57eef
alpha | Built "/nix/store/7hnmzlgb83rxyn2hpg6p8sdwnc519cx5-nixos-system-nixos-24.11.20240529.ad57eef"
      | All done!
colmena build --on alpha -v  9.67s user 3.20s system 78% cpu 16.304 total
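For reference, the overhead figure can be recomputed from the two wall-clock totals in the output above. This is only a small sketch; the helper name slowdown_pct is made up for illustration and is not part of Colmena or Nix:

```shell
#!/usr/bin/env bash
# slowdown_pct BASE OTHER: print how much longer OTHER took than BASE,
# as a whole-number percentage. Hypothetical helper, for illustration only.
slowdown_pct() {
  awk -v base="$1" -v other="$2" \
    'BEGIN { printf "%.0f\n", (other / base - 1) * 100 }'
}

# Wall-clock totals taken from the two `time` lines above:
# nix build = 11.041 s, colmena build = 16.304 s.
slowdown_pct 11.041 16.304
```

Note that this compares total elapsed time only; it says nothing about where the extra time is spent.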

More interestingly, Colmena does not seem to use the Nix eval cache: when rebuilding with no changes at all, Colmena still takes the full ~16 seconds to "build", whereas nix build is more or less instantaneous:

$ time nix build .#nixosConfigurations.alpha.config.system.build.toplevel --print-out-paths --quiet --no-warn-dirty
/nix/store/7hnmzlgb83rxyn2hpg6p8sdwnc519cx5-nixos-system-nixos-24.11.20240529.ad57eef
nix build .#nixosConfigurations.alpha.config.system.build.toplevel  --quiet   0.04s user 0.05s system 24% cpu 0.368 total
@zhaofengli
Owner

Right, we need better visibility into the evaluation process. When testing, it's easier to drive the eval.nix machinery manually through the colmenaHive interface (see #228).

This is roughly what Colmena does when you run colmena build --on alpha,beta --experimental-flake-eval:

nix eval .#colmenaHive --json --apply 'with builtins; hive: attrNames hive.nodes'
nix eval .#colmenaHive --json --apply 'with builtins; hive: hive.metaConfig'
nix eval .#colmenaHive --json --apply 'with builtins; hive: hive.deploymentConfigSelected ["alpha" "beta"]'
nix eval .#colmenaHive --json --apply 'with builtins; hive: hive.evalSelectedDrvPaths ["alpha" "beta"]'

Right now, the evaluation cache works at the attribute level (e.g., legacyPackages.x86_64-linux.bash.drvPath) and only when the type is known beforehand (e.g., nix build leads to forceDerivation since it expects the attribute path to contain a derivation). In the commands above, nix eval doesn't know the resulting type (i.e., it always does forceValue), so it cannot benefit from the cache. Even if it could, every command targets the same attribute, colmenaHive, which is a complex attrset.

We could change it to do the following:

nix build .#colmenaHive.toplevel.alpha
nix build .#colmenaHive.toplevel.beta
# ... etc

This does make use of the evaluation cache, but then we can no longer perform chunking, i.e., evaluate multiple nodes at once during a cold evaluation. Another possibility is to link against Nix's C++ code (as Attic does), but that would introduce additional complexity and maintenance burden.
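To make the chunking trade-off concrete, here is a rough sketch that prints (but does not run) one nix eval command per chunk of nodes, reusing the hive.evalSelectedDrvPaths call shown earlier. The function names, the chunk size of 2, and the extra node gamma are all made up for illustration; this is not how Colmena actually picks its chunks.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, for illustration only: they print commands, they do
# not invoke nix.

# emit_eval NODE...: print one `nix eval` command covering all given nodes.
# Each argument is expected to be an already-quoted Nix string, e.g. "alpha".
emit_eval() {
  # "$*" joins the arguments with spaces, which forms the Nix list body.
  printf '%s\n' "nix eval .#colmenaHive --json --apply 'with builtins; hive: hive.evalSelectedDrvPaths [$*]'"
}

# print_chunked_evals CHUNK_SIZE NODE...: group nodes into chunks of at most
# CHUNK_SIZE and print one command per chunk.
print_chunked_evals() {
  local chunk_size=$1
  shift
  local -a chunk=()
  local node
  for node in "$@"; do
    chunk+=("\"$node\"")
    if [ "${#chunk[@]}" -eq "$chunk_size" ]; then
      emit_eval "${chunk[@]}"
      chunk=()
    fi
  done
  # Flush the last, possibly smaller, chunk.
  if [ "${#chunk[@]}" -gt 0 ]; then
    emit_eval "${chunk[@]}"
  fi
}

# Chunk size 2 is an arbitrary example value.
print_chunked_evals 2 alpha beta gamma
```

With three nodes and a chunk size of 2 this prints two commands, whereas the per-node nix build approach would always need one invocation per node, which is the cost being weighed against the eval cache.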
