libexpr: allocate the Exprs themselves in Exprs::alloc by Radvendii · Pull Request #14472 · NixOS/nix

Radvendii · 2025-11-04T14:53:15Z

Motivation

for big-picture motivation, see the tracking issue

This PR moves the Expr objects into Exprs::alloc. In truth, eventually we want them to live outside of the allocator in vectors.

However, we're in a bit of a catch 22. Part of the reason we want to move the Exprs into vectors is so they don't get leaked so that we can do fuzz testing. But it's a big enough change that we're going to want to be able to fuzz-test it!

This is my way of snipping that loop. It gets rid of the leak so we can fuzz test the bigger changes.

Context

tracking issue

Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

Radvendii · 2025-11-04T14:55:34Z

CC @Ericson2314 @xokdvium

Radvendii · 2025-11-04T15:06:33Z

I don't know how to make sense of these performance numbers:

For parsing, we see

-7.6% memory usage
+5.6% instructions
-13.1% cache misses
-1.3% branch misses

The only number that I have an explanation for is -13% cache misses. That could make sense since we're now allocating Exprs (a) near each other, and (b) near the data that comprises them.

$ poop -d 60000 '../master/result/bin/nix-instantiate --parse ../../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix' '../feature/result/bin/nix-instantiate --parse ../../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix'
Benchmark 1 (198 runs): ../master/result/bin/nix-instantiate --parse ../../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           304ms ± 6.22ms     293ms …  355ms          9 ( 5%)        0%
  peak_rss            101MB ±  190KB     101MB …  102MB          1 ( 1%)        0%
  cpu_cycles         1.22G  ± 24.1M     1.20G  … 1.40G          17 ( 9%)        0%
  instructions       3.42G  ± 8.79K     3.42G  … 3.42G           2 ( 1%)        0%
  cache_references   27.5M  ± 3.01M     26.0M  … 51.4M          32 (16%)        0%
  cache_misses       1.78M  ± 18.1K     1.72M  … 1.95M           8 ( 4%)        0%
  branch_misses      3.86M  ± 24.8K     3.82M  … 4.02M           7 ( 4%)        0%
Benchmark 2 (201 runs): ../feature/result/bin/nix-instantiate --parse ../../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           300ms ± 3.16ms     294ms …  316ms          9 ( 4%)        ⚡-  1.3% ±  0.3%
  peak_rss           93.7MB ±  188KB    93.2MB … 94.1MB          0 ( 0%)        ⚡-  7.6% ±  0.0%
  cpu_cycles         1.21G  ± 11.3M     1.19G  … 1.30G           9 ( 4%)          -  0.9% ±  0.3%
  instructions       3.61G  ± 8.76K     3.61G  … 3.61G           0 ( 0%)        💩+  5.6% ±  0.0%
  cache_references   26.9M  ± 1.30M     26.0M  … 39.2M          17 ( 8%)          -  2.2% ±  1.7%
  cache_misses       1.54M  ± 14.6K     1.52M  … 1.60M           6 ( 3%)        ⚡- 13.1% ±  0.2%
  branch_misses      3.81M  ± 47.5K     3.77M  … 4.08M          17 ( 8%)        ⚡-  1.3% ±  0.2%

Overall performance numbers when we look at evaluation of a NixOS configuration are

-1.9% (20MB) memory usage
+2.6% instructions

$ poop '../master/result/bin/nix eval /path/to/my/nixos/flake#nixosConfigurations.smudge.config.system.build.toplevel' '../feature/result/bin/nix eval /path/to/my/nixos/flake#nixosConfigurations.smudge.config.system.build.toplevel'
Benchmark 1 (3 runs): ../master/result/bin/nix eval /path/to/my/nixos/flake#nixosConfigurations.smudge.config.system.build.toplevel
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          7.40s  ± 54.1ms    7.34s  … 7.45s           0 ( 0%)        0%
  peak_rss           1.17GB ±  101KB    1.17GB … 1.17GB          0 ( 0%)        0%
  cpu_cycles         36.6G  ±  161M     36.4G  … 36.8G           0 ( 0%)        0%
  instructions       74.1G  ± 2.89M     74.1G  … 74.1G           0 ( 0%)        0%
  cache_references   1.92G  ± 12.2M     1.91G  … 1.93G           0 ( 0%)        0%
  cache_misses        276M  ± 6.69M      268M  …  282M           0 ( 0%)        0%
  branch_misses       218M  ± 4.00M      214M  …  222M           0 ( 0%)        0%
Benchmark 2 (3 runs): ../feature/result/bin/nix eval /path/to/my/nixos/flake#nixosConfigurations.smudge.config.system.build.toplevel
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          7.34s  ±  301ms    7.07s  … 7.66s           0 ( 0%)          -  0.8% ±  6.6%
  peak_rss           1.15GB ±  223KB    1.15GB … 1.15GB          0 ( 0%)        ⚡-  1.9% ±  0.0%
  cpu_cycles         37.1G  ±  714M     36.4G  … 37.8G           0 ( 0%)          +  1.3% ±  3.2%
  instructions       76.1G  ± 34.7M     76.0G  … 76.1G           0 ( 0%)        💩+  2.6% ±  0.1%
  cache_references   1.92G  ± 15.7M     1.90G  … 1.94G           0 ( 0%)          +  0.2% ±  1.7%
  cache_misses        280M  ± 8.59M      271M  …  288M           0 ( 0%)          +  1.4% ±  6.3%
  branch_misses       213M  ± 5.02M      208M  …  218M           0 ( 0%)          -  2.3% ±  4.7%

Radvendii · 2025-11-04T15:17:24Z

I tried putting [[gnu::always_inline]] in front of the add functions but it didn't change anything. I thought that might have been the source of the added instructions.

src/libexpr/include/nix/expr/nixexpr.hh

src/libexpr/include/nix/expr/parser-state.hh

Radvendii · 2025-11-04T16:49:31Z

I forgot parser-state.hh. good catch.

Ericson2314 · 2025-11-04T16:52:41Z

@Radvendii as to the instruction count:

If the cash misses speed things up enough, who cares! :)
One way to bisect is to introduce the add function, and make everything use that, but make add use regular heap just like before. Than we can separate allocator cost vs new infra cost. (Cause it does look like the new infra ought to be essential free, after inlining, I agree!)

Radvendii · 2025-11-04T16:55:06Z

If the cash misses speed things up enough, who cares! :)

Maybe we could get even more performance!

One way to bisect is to introduce the add function, and make everything use that, but make add use regular heap just like before. Than we can separate allocator cost vs new infra cost. (Cause it does look like the new infra ought to be essential free, after inlining, I agree!)

Good thought. I'll try this later today.

Ericson2314 · 2025-11-04T17:01:57Z

Good thought. I'll try this later today.

And do be clear, feel free to just keep the 3-commit history, and I don't mind merging the first 2 steps (current first commit and the new in-between commit) in the meantime.

Radvendii · 2025-11-04T21:03:20Z

Okay. Things are now in three commmits. Only the last one causes the performance changes.

src/libexpr/include/nix/expr/nixexpr.hh

Ericson2314 · 2025-11-05T04:28:05Z

Great, thanks @Radvendii. Make a PR with the first two commits (with the good perf numbers), and I am happy merging it right away.

Radvendii · 2025-11-06T17:28:59Z

This is proving quite hard to debug (increased instruction count isn't even reproducible with debugoptimized mode), and as @Ericson2314 pointed out, it's not even a problem.

So I think it's best to just move on.

Ericson2314 · 2025-11-06T18:21:14Z

Glad it's merged!

Radvendii requested a review from edolstra as a code owner November 4, 2025 14:53

Ericson2314 reviewed Nov 4, 2025

View reviewed changes

src/libexpr/include/nix/expr/nixexpr.hh Show resolved Hide resolved

Ericson2314 reviewed Nov 4, 2025

View reviewed changes

src/libexpr/include/nix/expr/parser-state.hh Outdated Show resolved Hide resolved

Ericson2314 reviewed Nov 4, 2025

View reviewed changes

src/libexpr/include/nix/expr/parser-state.hh Outdated Show resolved Hide resolved

Radvendii force-pushed the exprs-alloc branch from 1d1ebcb to a82f09f Compare November 4, 2025 16:49

Radvendii force-pushed the exprs-alloc branch from a82f09f to c693e86 Compare November 4, 2025 20:48

Radvendii force-pushed the exprs-alloc branch from c693e86 to 0dff461 Compare November 4, 2025 23:09

Ericson2314 reviewed Nov 5, 2025

View reviewed changes

src/libexpr/include/nix/expr/nixexpr.hh Show resolved Hide resolved

Radvendii mentioned this pull request Nov 5, 2025

parser.y: abstract new into a function on Exprs #14481

Merged

parser.y: allocate Exprs in the allocator

b2f0472

Radvendii force-pushed the exprs-alloc branch from 0dff461 to b2f0472 Compare November 5, 2025 16:10

xokdvium approved these changes Nov 6, 2025

View reviewed changes

xokdvium added this pull request to the merge queue Nov 6, 2025

Merged via the queue into NixOS:master with commit d596b97 Nov 6, 2025
16 checks passed

edolstra mentioned this pull request Dec 9, 2025

2.33 release notes #14752

Merged

Uh oh!

Conversation

Radvendii commented Nov 4, 2025

Motivation

Context

Uh oh!

Radvendii commented Nov 4, 2025

Uh oh!

Radvendii commented Nov 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Radvendii commented Nov 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Radvendii commented Nov 4, 2025

Uh oh!

Ericson2314 commented Nov 4, 2025

Uh oh!

Radvendii commented Nov 4, 2025

Uh oh!

Ericson2314 commented Nov 4, 2025

Uh oh!

Radvendii commented Nov 4, 2025

Uh oh!

Uh oh!

Ericson2314 commented Nov 5, 2025

Uh oh!

Radvendii commented Nov 6, 2025

Uh oh!

Uh oh!

Ericson2314 commented Nov 6, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Radvendii commented Nov 4, 2025 •

edited

Loading

Radvendii commented Nov 4, 2025 •

edited

Loading