libstore: Optimize BinaryCacheStore::queryValidPaths#14877
libstore: Optimize BinaryCacheStore::queryValidPaths#14877Zaczero wants to merge 1 commit intoNixOS:masterfrom
Conversation
213aae8 to
1d90cf8
Compare
|
Using an ad-hoc thread-pool like this doesn't make much sense, since the libcurl is already single-threaded. Something more async would make more sense. I have #14786 up, but it might need a bit more work. Would be nice if we had a more general approach to this than sprinkling ad hoc threading everywhere. |
|
@xokdvium I believe I did notice that and I agree the broader libcurl rework is ideal. This PR is an easy win because without specialization, it already uses multiple threads for networking but instead of going through GET+parse it just does HEADs. So it's not like this change adds more threading. |
|
But networking is still single-threaded - it's only the dispatching that's parallel (and that should just be async). Our filetransfer code runs on a singleton curlFileTransfer that only has a single worker thread. The bottleneck here is not the networking, but the usage of blocking completion handlers. |
TBH there isn't much to rework in filetransfer. It's more about making the downstream usages more ergonomic and efficient by avoiding the usage of blocking completion handlers and instead bridging the callback-based handlers into whatever is more appropriate to use for these sort of fork-style concurrency without paying the absolutely unnecessary overhead of spawning threads (though yes, it's quite low), but we are not going to get anywhere if we keep on sprinkling thread pools everywhere that we can't be bothered to avoid using blocking handlers. Ideally, all the |
Let me repeat that the point of this is to perform HEADs instead of GET+parse. The pooling code is mostly copied over from the existing, non-specializing implementation. |
That code is a hack and we shouldn't spread it further.
FWIW the GET does have the benefit of filling the on-disk negative (and positive) pathinfo cache, so that quite minimal cost is paid just once. I think there's general preference to query path infos and not just existence checks for this reason: #14769. |
Ye I saw that but from at a glance check it didn't benefit all queryValidPaths call sites. If that's wanted maybe using HEADs is a mistake in the first place? Maybe this PR should be other way around? Removing HEADs and replacing with GETs? |
Store::queryValidPaths() uses queryPathInfo() per path. For HTTP binary caches, that means GET hash.narinfo and parsing it. Override BinaryCacheStore::queryValidPaths() to use isValidPath() instead. For binary caches, isValidPathUncached() is already implemented in terms of fileExists(hash.narinfo); HttpBinaryCacheStore::fileExists() uses HEAD probes rather than GET+parse.
Motivation
Context
Add 👍 to pull requests you find important.
The Nix maintainer team uses a GitHub project board to schedule and track reviews.