Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stream zip archive when fetching non-range-request metadata #1792

Merged
merged 1 commit into from
Feb 21, 2024

Conversation

charliermarsh
Copy link
Member

Summary

If a registry doesn't support range requests, then today, we download the entire wheel to disk and then read the metadata from the downloaded archive. This PR instead modifies the registry client to stream the zipfile and stop as soon as it's seen the metadata, which should be more efficient.

Closes #1596.

Test Plan

Made this the only path for downloading metadata; verified that the test suite passed.

@charliermarsh charliermarsh added the enhancement New feature or request label Feb 21, 2024
@charliermarsh
Copy link
Member Author

Something like 10-20% faster:

puffin on  charlie/async-meta:main [$!⇡] is 📦 v0.1.6 via 🐍 v3.12.1 (.venv) via 🦀 v1.75.0
❯ python -m scripts.bench \
        --uv-path ./target/release/main \
        --uv-path ./target/release/uv \
        ./scripts/requirements/dtlssocket.in --benchmark resolve-cold --min-runs 50
Benchmark 1: ./target/release/main (resolve-cold)
  Time (mean ± σ):     896.5 ms ±  79.0 ms    [User: 404.9 ms, System: 202.6 ms]
  Range (min … max):   726.3 ms … 1075.0 ms    50 runs

Benchmark 2: ./target/release/uv (resolve-cold)
  Time (mean ± σ):     802.7 ms ±  58.6 ms    [User: 420.7 ms, System: 179.8 ms]
  Range (min … max):   722.5 ms … 953.8 ms    50 runs

Summary
  './target/release/uv (resolve-cold)' ran
    1.12 ± 0.13 times faster than './target/release/main (resolve-cold)'
puffin on  charlie/async-meta:main [$!⇡] is 📦 v0.1.6 via 🐍 v3.12.1 (.venv) via 🦀 v1.75.0 took 1m34s
❯ python -m scripts.bench \
        --uv-path ./target/release/main \
        --uv-path ./target/release/uv \
        ./scripts/requirements/black.in --benchmark resolve-cold --min-runs 50
Benchmark 1: ./target/release/main (resolve-cold)
  Time (mean ± σ):     380.8 ms ±  83.5 ms    [User: 102.9 ms, System: 39.9 ms]
  Range (min … max):   281.8 ms … 550.7 ms    50 runs

Benchmark 2: ./target/release/uv (resolve-cold)
  Time (mean ± σ):     325.3 ms ±  44.7 ms    [User: 113.8 ms, System: 26.9 ms]
  Range (min … max):   269.4 ms … 411.9 ms    50 runs

Summary
  './target/release/uv (resolve-cold)' ran
    1.17 ± 0.30 times faster than './target/release/main (resolve-cold)'

@charliermarsh charliermarsh merged commit 5d53040 into main Feb 21, 2024
7 checks passed
@charliermarsh charliermarsh deleted the charlie/async-meta branch February 21, 2024 03:12
charliermarsh added a commit that referenced this pull request Jul 16, 2024
## Summary

When range requests aren't supported, we fall back to streaming the
wheel, stopping as soon as we hit a `METADATA` file. This is a small
optimization, but the downside is that we don't get to cache the
resulting wheel...

We don't know whether `METADATA` will be at the beginning or end of the
wheel, but it _seems_ like a better tradeoff to download and cache the
entire wheel?

Closes: #5088.

Sort of a revert of: #1792.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use async zip streaming for non-range-request supporting registries
1 participant