libexpr: Add EvalProfiler and use it for FunctionCallTrace #13219

Merged
Mic92 merged 2 commits into NixOS:master from xokdvium:eval-profiler on May 18, 2025

xokdvium (Contributor) commented on May 16, 2025

Motivation

An extensible framework for evaluation profiling is much needed if we want to get flamegraph/tracy-based profilers (#9967, #11373). This patch adds an EvalProfiler interface and a MultiEvalProfiler, which support preFunctionCallHook and postFunctionCallHook; these get called on entering and exiting callFunction. The actual hook invocation is guarded by a simple cached boolean check with an [[unlikely]] annotation. This machinery is enough to reimplement the existing FunctionCallTrace.
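To make the shape of this concrete, here is a minimal, self-contained sketch of the idea. It is not the code from this patch: only the names EvalProfiler, preFunctionCallHook, postFunctionCallHook and callFunction come from the description above. The hook signatures, the `Evaluator` type, the `profilerEnabled` flag and `RecordingProfiler` are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the profiler interface. The method names follow
// the PR description; the signatures are assumptions.
struct EvalProfiler {
    virtual ~EvalProfiler() = default;
    virtual void preFunctionCallHook(const std::string &) {}
    virtual void postFunctionCallHook(const std::string &) {}
};

// Toy evaluator (hypothetical). The flag caches whether a profiler is
// installed, so the hot path only tests a single boolean.
struct Evaluator {
    EvalProfiler * profiler = nullptr;
    bool profilerEnabled = false;

    void setProfiler(EvalProfiler * p) {
        profiler = p;
        profilerEnabled = (p != nullptr);
    }

    int callFunction(const std::string & fn, int arg) {
        if (profilerEnabled) [[unlikely]]
            profiler->preFunctionCallHook(fn);

        int result = arg + 1; // placeholder for evaluating the function body

        if (profilerEnabled) [[unlikely]]
            profiler->postFunctionCallHook(fn);
        return result;
    }
};

// Example profiler that records the order in which hooks fire.
struct RecordingProfiler : EvalProfiler {
    std::vector<std::string> events;
    void preFunctionCallHook(const std::string & fn) override { events.push_back("enter " + fn); }
    void postFunctionCallHook(const std::string & fn) override { events.push_back("exit " + fn); }
};
```

This sketch skips error handling: if the placeholder body threw, the post hook would not run. A real evaluator would have to decide how exits via exceptions are reported.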

I hope this patch can pave the way forward for #9967 and #11373 in an extensible and well-architected manner. Since the actual profiler implementation is hidden behind an interface, multiple profilers can be easily supported, so flamegraph/tracy can hopefully co-exist behind a --eval-profiler flag. (The current trace-function-calls setting could also be absorbed into that hypothetical feature.)
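As a rough illustration of how several profilers could sit behind one interface, the sketch below forwards each hook to every registered back end. Only the names EvalProfiler, MultiEvalProfiler and the two hooks come from this PR. The signatures, `addProfiler`, `empty`, and the two toy back ends (`CallCounter`, `DepthTracker`) are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical interface; signatures are assumptions.
struct EvalProfiler {
    virtual ~EvalProfiler() = default;
    virtual void preFunctionCallHook(const std::string &) {}
    virtual void postFunctionCallHook(const std::string &) {}
};

// Forwards every hook to all registered profilers, in registration order.
class MultiEvalProfiler : public EvalProfiler {
    std::vector<std::shared_ptr<EvalProfiler>> profilers;

public:
    void addProfiler(std::shared_ptr<EvalProfiler> p) { profilers.push_back(std::move(p)); }
    bool empty() const { return profilers.empty(); }

    void preFunctionCallHook(const std::string & fn) override {
        for (auto & p : profilers) p->preFunctionCallHook(fn);
    }
    void postFunctionCallHook(const std::string & fn) override {
        for (auto & p : profilers) p->postFunctionCallHook(fn);
    }
};

// Toy back end: counts how many calls were entered.
struct CallCounter : EvalProfiler {
    int calls = 0;
    void preFunctionCallHook(const std::string &) override { ++calls; }
};

// Toy back end: tracks current and maximum call depth.
struct DepthTracker : EvalProfiler {
    int depth = 0;
    int maxDepth = 0;
    void preFunctionCallHook(const std::string &) override {
        if (++depth > maxDepth) maxDepth = depth;
    }
    void postFunctionCallHook(const std::string &) override { --depth; }
};
```

With this shape the evaluator only ever talks to one EvalProfiler, and a check like `empty()` is one possible source for a cached "is any profiler active" flag.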

(Inlined message from the second commit, which has non-tracing performance measurements)

Note that the branches in which the hooks get called are marked with [[unlikely]]
as a hint to the compiler that this is not a hot path. For non-tracing
evaluation the branch should be 100% predictable, so the performance
cost should be negligible.

Some measurements to support this point:

```
nix build .#nix-cli
nix build github:nixos/nix/d692729759e4e370361cc5105fbeb0e33137ca9e#nix-cli --out-link before
```

(Before)

```
$ taskset -c 2,3 hyperfine "GC_INITIAL_HEAP_SIZE=16g before/bin/nix eval nixpkgs#gnome --no-eval-cache" --warmup 4
Benchmark 1: GC_INITIAL_HEAP_SIZE=16g before/bin/nix eval nixpkgs#gnome --no-eval-cache
  Time (mean ± σ):      2.517 s ±  0.032 s    [User: 1.464 s, System: 0.476 s]
  Range (min … max):    2.464 s …  2.557 s    10 runs
```

(After)

```
$ taskset -c 2,3 hyperfine "GC_INITIAL_HEAP_SIZE=16g result/bin/nix eval nixpkgs#gnome --no-eval-cache" --warmup 4
Benchmark 1: GC_INITIAL_HEAP_SIZE=16g result/bin/nix eval nixpkgs#gnome --no-eval-cache
  Time (mean ± σ):      2.499 s ±  0.022 s    [User: 1.448 s, System: 0.478 s]
  Range (min … max):    2.472 s …  2.537 s    10 runs
```

Context

#9967
#11373


```
 auto duration = std::chrono::high_resolution_clock::now().time_since_epoch();
 auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(duration);
-printMsg(lvlInfo, "function-trace entered %1% at %2%", pos, ns.count());
+printMsg(lvlInfo, "function-trace entered %1% at %2%", state.positions[pos], ns.count());
```
Review comment from xokdvium (Contributor, Author):

Since #13211, operator[] is much cheaper to call because PosTable now caches the line information.
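For readers unfamiliar with the trace format, the following sketch shows how enter/exit lines of this kind could be emitted from a pair of hooks. It is an illustration under assumptions, not the FunctionCallTrace implementation: `FunctionCallTraceSketch` and `nowNs` are hypothetical names, a `std::ostream` stands in for printMsg, positions are passed as plain strings instead of being resolved through state.positions, and the "exited" wording simply mirrors the "entered" message.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Nanoseconds since the clock's epoch, computed the same way as in the
// snippet above.
inline std::int64_t nowNs() {
    auto duration = std::chrono::high_resolution_clock::now().time_since_epoch();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(duration).count();
}

// Hypothetical tracer: writes one line per hook invocation to `sink`.
class FunctionCallTraceSketch {
    std::ostream & sink;

public:
    explicit FunctionCallTraceSketch(std::ostream & out) : sink(out) {}

    void preFunctionCallHook(const std::string & pos) {
        sink << "function-trace entered " << pos << " at " << nowNs() << '\n';
    }

    void postFunctionCallHook(const std::string & pos) {
        sink << "function-trace exited " << pos << " at " << nowNs() << '\n';
    }
};
```

Because each hook formats the position, the cost of looking a position up matters only while tracing is enabled.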

xokdvium added a commit to xokdvium/nix that referenced this pull request on May 17, 2025
Rough initial draft implementation based on the `EvalProfiler`
interface. The interface/options are still TBD, so this serves as a technical
POC.

This depends on NixOS#13219.

Co-authored-by: Jörg Thalheim <joerg@thalheim.io>
xokdvium added 2 commits May 18, 2025 11:55
This patch adds an EvalProfiler and MultiEvalProfiler that can be used
to insert hooks into the evaluation for the purposes of function tracing
(what function-trace currently does) or for flamegraph/tracy profilers.

See the following commits for how this is supposed to be integrated into
the evaluator and performance considerations.

This wires up the {pre,post}FunctionCallHook machinery
in EvalState::callFunction and migrates FunctionCallTrace
to use the new EvalProfiler mechanisms for tracing.

Mic92 merged commit 638b7ec into NixOS:master on May 18, 2025 (12 checks passed)
xokdvium deleted the eval-profiler branch on May 24, 2025 22:45
roberth added the "backports ignored" (Seen but not applied) label on Jul 30, 2025