Implement support for incremental lock resolves. #2044

Open
jsirois opened this issue Jan 19, 2023 · 2 comments

@jsirois
Member

jsirois commented Jan 19, 2023

The basic idea here is to add a new --lock <PATH> flag to pex3 lock create, and to use the lock specified in pex3 lock update <PATH>, to implement faster resolves as follows:

  1. Create a venv from the existing lock that includes the configured Pip version.
  2. Instead of performing an isolated pip download --log ..., perform a pip install --log in the venv created in step 1.
  3. Merge the changes recorded in the pip log into the original lock file.
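
Step 3 can be pictured with a minimal sketch. Everything here is an assumption for illustration: the lock is reduced to a plain name-to-version mapping, the changes are assumed to have already been extracted from the pip log, and `merge_lock_changes` is a hypothetical helper, not a Pex API. A real Pex lock also carries artifacts, hashes and per-platform data.

```python
from typing import Dict, Optional


def merge_lock_changes(
    locked: Dict[str, str],
    changes: Dict[str, Optional[str]],
) -> Dict[str, str]:
    """Apply resolve changes to a simplified lock (hypothetical model).

    `locked` maps project name -> pinned version from the existing lock.
    `changes` maps project name -> new version, or None if the project
    was dropped from the resolve.
    """
    merged = dict(locked)  # Leave the original lock untouched.
    for name, version in changes.items():
        if version is None:
            merged.pop(name, None)  # Removed from the resolve.
        else:
            merged[name] = version  # Added or re-pinned.
    return merged
```

Projects the incremental resolve did not touch keep their existing pins, which is where the time savings would come from.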

It turns out this is a good deal faster than performing isolated pip downloads.

The example from #2036 shows a ~3x speedup, with a warm re-lock today taking 22s vs a warm venv + pip install taking ~7s. This factor will likely drop closer to 2.5x once the additional overhead of processing lock diffs is added in, but it seems likely this will still net a significant win.
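
As a rough check of those figures, here is a small sketch of the arithmetic. The 22s and ~7s timings are the ones quoted above; the helper names are made up for illustration, and the overhead value is only what a 2.5x factor would imply, not a measurement.

```python
def speedup(baseline_s: float, incremental_s: float, overhead_s: float = 0.0) -> float:
    """Speedup of the incremental approach over the baseline re-lock."""
    return baseline_s / (incremental_s + overhead_s)


def overhead_for_target(baseline_s: float, incremental_s: float, target: float) -> float:
    """Extra seconds of lock-diff processing that would yield `target` speedup."""
    return baseline_s / target - incremental_s


print(round(speedup(22, 7), 2))                   # raw factor from the quoted timings
print(round(overhead_for_target(22, 7, 2.5), 2))  # overhead implied by a 2.5x factor
```

With these inputs the raw factor is a little over 3x, and a drop to 2.5x corresponds to just under 2s of added diff processing.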

@thejcannon
Contributor

Throwing two pieces of info into the mix:

  • PEP 658 exists, and ought to make this kind of thing faster. However, the current approach of using pip download to download and then extract the metadata doesn't allow Pex to leverage (directly or indirectly) the new information. I don't think leveraging PEP 658 is really possible from the API we're consuming from pip.
  • pip install --dry-run --report gives a nice report containing (almost) everything that goes into a Pex lockfile today (it's only missing hashes from VCS reqs). This has 3 niceties:
    1. Almost all code responsibility is thrown over the fence to pip.
    2. It is internally able to leverage PEP 658.
    3. You can run it inside a venv. Therefore, I think that in your list of steps, 1 stays, while 2 and 3 get changed (if possible).

Note that this ticket specifically is about incremental lock resolves. Using --report would speed up fresh lockfile installs as well.
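
To make the --report idea concrete, here is a hedged sketch of pulling pins out of such a report. The field names (`install`, `metadata`, `download_info`, `archive_info.hashes`, `vcs_info`) follow my understanding of pip's installation report format and should be checked against the pip version in use; the sample report, its URLs and `pins_from_report` are invented for illustration.

```python
import json
from typing import List, Optional, Tuple

# Hand-written sample shaped like a pip installation report (assumed layout).
SAMPLE_REPORT = json.dumps({
    "version": "1",
    "install": [
        {
            "metadata": {"name": "requests", "version": "2.31.0"},
            "download_info": {
                "url": "https://example.invalid/requests-2.31.0-py3-none-any.whl",
                "archive_info": {"hashes": {"sha256": "abc123"}},
            },
        },
        {
            # VCS requirement: there is a commit id but no archive hash.
            "metadata": {"name": "mylib", "version": "0.1.0"},
            "download_info": {
                "url": "https://example.invalid/mylib.git",
                "vcs_info": {"vcs": "git", "commit_id": "deadbeef"},
            },
        },
    ],
})


def pins_from_report(report_json: str) -> List[Tuple[str, str, Optional[str]]]:
    """Return (name, version, sha256 or None) for each item to be installed."""
    report = json.loads(report_json)
    pins = []
    for item in report.get("install", []):
        metadata = item["metadata"]
        archive_info = item.get("download_info", {}).get("archive_info", {})
        sha256 = archive_info.get("hashes", {}).get("sha256")
        pins.append((metadata["name"], metadata["version"], sha256))
    return pins


for name, version, sha256 in pins_from_report(SAMPLE_REPORT):
    print(name, version, sha256 or "<no hash: VCS requirement>")
```

The VCS entry comes back with no hash, which is the gap mentioned in the second bullet.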

@jsirois
Member Author

jsirois commented Jul 5, 2023

On this 1st bullet point: a typical resolve process with backtracks etc. might visit, say, 10k nodes, with the final solution set having 100 nodes. With Pip supporting PEP 658 and Pex supporting that Pip, 9,900 downloads are saved and 100 are performed at the end. So there is a download savings from using a dry run report, but the relevant question is the 10k vs 100: what is the typical savings? Those numbers are for illustration and surely inaccurate. The other consideration is that the final set of downloads is actually needed anyway in common workflows, so that time is not completely wasted. The pip download is serial, whereas a later lock download is parallel, so there is definitely that.
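
The download arithmetic for the illustrative figures, as a tiny sketch (the 10k and 100 are the made-up numbers from this comment, not measurements, and `downloads_saved` is just a helper name for the example):

```python
def downloads_saved(visited_nodes: int, final_nodes: int) -> int:
    """Downloads avoided when only the final solution set is fetched."""
    return visited_nodes - final_nodes


visited, final = 10_000, 100
saved = downloads_saved(visited, final)
print(saved, f"{saved / visited:.0%}")  # saved downloads and their share of all visits
```

How large that share is in practice depends entirely on how much backtracking a real resolve does.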
