
libutil: Make computeClosure async#14786

Open
xokdvium wants to merge 3 commits into master from async-compute-closure

Conversation

@xokdvium
Contributor

@xokdvium xokdvium commented Dec 13, 2025

Motivation

This makes computeFSClosure much faster in practice, because the graph is now traversed in an async BFS manner, whereas previously it serialised on every single path info query. For something like

rm ~/.cache/nix/binary-cache-v7.sqlite && nix path-info --store https://cache.nixos.org/ --recursive /nix/store/7zz3zmv2a0ssmgqlfhy4rsb6ii6z475a-stdenv-linux.drv

The difference is huge:

(Before)

Command being timed: "nix path-info --store https://cache.nixos.org/ --recursive /nix/store/7zz3zmv2a0ssmgqlfhy4rsb6ii6z475a-stdenv-linux.drv"
        User time (seconds): 0.07
        System time (seconds): 0.06
        Percent of CPU this job got: 1%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:08.88

(After)

Command being timed: "build/src/nix/nix path-info --store https://cache.nixos.org/ --recursive /nix/store/7zz3zmv2a0ssmgqlfhy4rsb6ii6z475a-stdenv-linux.drv"
        User time (seconds): 0.07
        System time (seconds): 0.04
        Percent of CPU this job got: 22%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.49

Basically a 20x speedup just from processing the queries asynchronously.
The crux of the approach is making our current callback-driven API easier to handle with stackless C++20 coroutines. I experimented with stackful Boost coroutines, and cancellation handling really didn't pan out there; this PR takes the cooperative-cancellation route for interrupts.

Note that for now we have a rather ad-hoc event loop just for
computeClosure. This seems perfectly fine for now, but in the future
we could extend it and possibly have a global event loop serviced by
multiple threads.

For now we have basic rate limiting: at most 64 coroutines in flight, which
is below CURLMOPT_MAX_CONCURRENT_STREAMS (100 by default).

Context

Supersedes #14604.
Depends on #14838.


Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

@xokdvium
Contributor Author

cc @pkpbynum. This is what I mentioned I was working on in #14604 (comment).

Comment on lines +49 to +50
auto info = co_await callbackToAwaitable<ref<const ValidPathInfo>>(
[this, path](Callback<ref<const ValidPathInfo>> cb) { queryPathInfo(path, std::move(cb)); });
Contributor Author


The special sauce that makes this work is here: the callback is invoked on the libcurl thread, and processing stays suspended until it is marshalled back to the executor. At the moment only the binary cache store behaves this way; all other stores invoke the callback synchronously.

@roberth
Member

roberth commented Dec 14, 2025

For now we have basic rate limiting <= 64 coroutines in flight which is
smaller than CURLMOPT_MAX_CONCURRENT_STREAMS (100 by default).

Sounds vaguely ominous, but is it?
If so, perhaps their relationship could be reified in code, e.g. by defining maxConcurrent in terms of CURLMOPT_MAX_CONCURRENT_STREAMS, or adding a warning somewhere if the inequality is the other way around.
(Just some vague ideas; you probably know a better solution)

@xokdvium
Contributor Author

Sounds vaguely ominous, but is it?

Not really; it's just a note about the rate limit, which I've set to 64 in the code. Nothing bad will happen if we increase it further. curl's default is just a rough ballpark, chosen so the limit isn't picked arbitrarily. Ideally it could in the future be defined by the store type, but for now there's no benefit anyway, because only the curl-based store is actually async.

@xokdvium xokdvium force-pushed the async-compute-closure branch from 4cdbeaf to c6166e5 Compare December 19, 2025 03:57
@xokdvium xokdvium force-pushed the async-compute-closure branch from c6166e5 to b1c0f41 Compare December 19, 2025 04:34
@dpulls

dpulls bot commented Dec 26, 2025

🎉 All dependencies have been resolved!

