refactor: pypi mapping into cached client #3318
Conversation
#3322 This should fix the
I asked @mariusvniekerk on Zulip to try this. Maybe @tobiasfischer or @matthewfeickert could also give it a go to see if it helps with the slow PyPI mapping issues that you observed.

@wolfv I was too slow in testing the original PR @wolfv linked me to on Discord, and the artifacts had expired. You'd like testing of the "Updating lock file takes forever" Discord question with the build artifacts from this PR, though?
yes, @matthewfeickert - this one is the better implementation. If you could test again with artifacts from this PR here that would be great. |
I've tested it on a slow mapping example, as reported by @mariusvniekerk. After profiling with cargo flamegraph I can't find any samples for
@baszalmstra can we close this PR then? #3079
Yep! |
nichmor left a comment:

awesome work! Works as expected.
Ah okay great. I'll test this now. (Sorry, at CERN this week and very behind on everything.)
Hm. I may be doing something dumb in how I'm testing, but this build is slower than the current pixi release. Using the following manifest from the "Updating lock file takes forever" Discord question on a Linux x86 machine with an NVIDIA GPU.

pixi.toml:

```toml
[workspace]
authors = ["Matthew Feickert <matthew.feickert@cern.ch>"]
channels = ["conda-forge"]
name = "example"
platforms = ["linux-64"]
version = "0.1.0"

[tasks]

[system-requirements]
cuda = "12"

[dependencies]
python = "3.11.*"
pip = "<25.1"
numpy = "<2"
pytorch = ">=2.4"
torchvision = "*"
tqdm = "*"
matplotlib = "*"
scipy = "*"
Pillow = ">=7.1"
git = "*"
natsort = "*"
compilers = "*"
cmake = "*"
xformers = "*"
pytorch-lightning = "*"
pytorch-metric-learning = "*"
timm = "*"
h5py = "*"
scikit-learn = "*"
ftfy = "*"
cython = "*"
protobuf = "*"
termcolor = ">=1.1"
werkzeug = "*"
yacs = ">=0.1.6"
pycocotools = ">=2.0.2"
hydra-core = ">=1.1.0rc1"
grpcio = "*"
fvcore = "*"
cloudpickle = "*"
click = "*"
black = "==21.4b2"
absl-py = "*"
markdown = "*"
omegaconf = ">=2.1.0rc1"
pathspec = "*"
platformdirs = "*"
portalocker = "*"
opencv = "*"
tensorboard-data-server = "*"
tensorboard = "*"
setuptools = "*"
typing_extensions = "4.11.*"
regex = ">=2024.11.6,<2025"
gdown = ">=5.2.0,<6"
pydot = "*"
dataclasses = "*"
future = "*"
tabulate = "*"
pyparsing = "==3.0.9"
lvis = ">=0.5.3,<0.6"

[target.linux-64.dependencies]
cuda-version = "12.*"
pytorch-gpu = "*"
cuda-cudart-dev = "*"
cuda-crt = "*"
cudnn = "*"
libcusparse-dev = "*"
cuda-driver-dev = "*"
cuda-nvcc = "*"
cuda-nvrtc-dev = "*"
cuda-nvtx-dev = "*"
cuda-nvml-dev = "*"
cuda-profiler-api = "*"
cusparselt = "*"
libcublas-dev = "*"
libcudss-dev = "*"
libcufile-dev = "*"
libcufft-dev = "*"
libcurand-dev = "*"
libcusolver-dev = "*"

[pypi-dependencies]
diffdist = ">=0.1,<0.2"
iopath = ">=0.1.7,<0.1.9"
```

This PR build:

```console
$ uname -sm
Linux x86_64
$ command -v pixi
/tmp/check/pixi-bin/pixi
$ pixi --version
pixi 0.42.1
$ time pixi lock
...
real 0m30.242s
user 0m22.262s
sys 0m9.682s
$ time pixi update
✔ Lock-file was already up-to-date
real 0m24.757s
user 0m14.616s
sys 0m7.653s
$ rm pixi.lock
$ time pixi lock
...
real 0m25.713s
user 0m15.182s
sys 0m7.865s
```

current public build: (results in a collapsed section)
@matthewfeickert just to confirm, you are seeing this on a "normal" machine with normal (speedy) internet?
I'm on my normal work laptop and I'm currently on CERN wifi, which (while patchy sometimes) gives me a speedtest result of about 200 Mbps down/up. I can log in to some clusters in the US at the University of Chicago that I have access to and see if I get different results there.
And also, could you run without the PyPI dependencies so that we have a baseline?
University of Chicago cluster login nodes (that don't have GPUs attached, as those are on workers):

This PR build:

```console
$ command -v pixi
/home/feickert/workarea/pixi-debug/pixi-bin/pixi
$ time pixi lock
real 1m38.032s
user 1m22.396s
sys 2m17.010s
$ time pixi update
✔ Lock-file was already up-to-date
real 0m18.171s
user 0m10.260s
sys 0m39.578s
$ rm pixi.lock
$ time pixi lock
...
real 0m18.392s
user 0m10.330s
sys 0m40.193s
```

current public build: (results in a collapsed section)
Thanks a lot for the testing @matthewfeickert, the difference is so extreme that it might have to do with the CI build. Let me test the difference and otherwise provide you with a build using the release profile.
Will do, but will be a few hours given meetings. 👍 |
src/lock_file/update.rs (Outdated)

```rust
let mapping_client = self.mapping_client.unwrap_or_else(|| {
    MappingClient::builder(client)
        .with_concurrency_limit(Arc::new(Semaphore::new(100)))
```
After some testing I found that this could be really fast, but also extremely slow "sometimes": within a single hyperfine session of 10 runs, timings could differ between 0.7 seconds and 10 seconds.
Lowering the value to 10 made it much more consistent, between 1 and 2 seconds, but it wouldn't go below 1 second anymore, making the fastest run less fast.
I think this should reuse the configuration setting --concurrent-downloads, or we add --concurrent-io-actions and use that here too.
I cannot reproduce any of these results on my Windows machine. I'll try to reuse the --concurrent-downloads flag though!
@matthewfeickert I've spent some more time on testing it; I'll get back to it with @baszalmstra and @wolfv to figure out opportunities to make it consistently faster.
(Day later, sorry.)

This PR build:

```console
$ command -v pixi
/tmp/check/pixi-bin/pixi
$ time pixi lock
...
real 0m10.638s
user 0m12.271s
sys 0m9.605s
$ time pixi update
✔ Lock-file was already up-to-date
real 0m2.933s
user 0m4.596s
sys 0m6.754s
$ rm pixi.lock
$ time pixi lock
...
real 0m2.956s
user 0m4.429s
sys 0m6.786s
```

current public build: (results in a collapsed section)
hello everyone! We have benchmarked this PR a little more!

The 4 binaries story:
So we took 4 binaries as a baseline to benchmark. All benchmarks were run using the … These are the results when run … These are the results when running … As we can see, …

The 3 binaries story:
Now that we exclude … This is with clean cache: … This is with warm cache: … As we can see, this …

Other ideas:
Some other ideas left to try: …

(The benchmark tables in this comment were in collapsed sections.)
I added some logging statements that signify the efficiency of the mapping. Cache misses are also reported when running with
This one says:

```
DEBUG resolve_conda{group=default platform=linux-64}:derive_purl{record="libgettextpo-devel-0.23.1-h5888daf_0.conda"}: pypi_mapping: Cache miss on 'https://conda-mapping.prefix.dev/hash-v0/90f29ec7a7e2d758cb61459e643dcb54933dcf92194be6c29b0a1591fcbb163e' (404 Not Found)
```
I forgot to mention, the latest code also honors the
It's a cache miss because the server was contacted, and 404s don't have any cache headers, so they will always be cache misses.
This works as I expect, as far as I can test it! Thanks guys for the in-depth review of the situation!
Thanks for the detailed work and analysis. It is great to see this level of commitment to excellence in projects like this that touch the entire community. 🙇 |
Refactors the PyPI mapping into a more client-based approach. The MappingClient has a function that can be called to amend PURLs. It also serves as the entry point for coalescing of requests and in-memory caching.

Also fixes #2615