Implement support for incremental lock resolves. #2044

jsirois · 2023-01-19T00:11:49Z

The basic idea here is to add a new --lock <PATH> flag to pex3 lock create, and to use the lock specified for pex3 lock update <PATH>, to implement faster resolves as follows:

1. Create a venv from the existing lock that includes the configured Pip version.
2. Instead of performing an isolated pip download --log ..., perform a pip install --log in the venv created in step 1.
3. Merge the changes recorded in the pip log to the original lock file.

It turns out this is a good deal faster than performing isolated pip downloads.

The example from #2036 shows ~3x speedup with a warm re-lock today taking 22s vs a warm venv + pip install taking ~7s. This factor will likely drop closer to 2.5 once the additional overheads of processing lock diffs are added in, but it seems likely this will still net a significant win.
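As a rough illustration of the merge in step 3, here is a minimal sketch under a deliberately simplified assumption: a lock is modeled as a plain mapping of project name to pinned version. A real Pex lock file is much richer (artifacts, hashes, per-platform resolves), and `merge_lock` is a hypothetical helper, not part of Pex. How the observed versions would be recovered from the pip log is not shown.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy model of a lock: project name -> pinned version.
Lock = Dict[str, str]
# name -> (old version or None if newly added, new version)
Diff = Dict[str, Tuple[Optional[str], str]]


def merge_lock(original: Lock, observed: Lock) -> Tuple[Lock, Diff]:
    """Merge versions observed after an incremental install into a lock.

    Entries that did not change are carried over untouched; the returned
    diff records only what was added or re-pinned.
    """
    merged = dict(original)
    diff: Diff = {}
    for name, version in observed.items():
        old = original.get(name)
        if old != version:
            diff[name] = (old, version)
            merged[name] = version
    return merged, diff


if __name__ == "__main__":
    original = {"requests": "2.28.1", "idna": "3.4"}
    observed = {"requests": "2.28.2", "idna": "3.4", "certifi": "2022.12.7"}
    merged, diff = merge_lock(original, observed)
    print(merged)
    print(diff)
```

The point of returning the diff separately is that only the changed entries need further processing, which is where the "additional overheads of processing lock diffs" would show up.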

thejcannon · 2023-07-05T18:13:07Z

Throwing 2 infos into the mix:

- PEP 658 exists and ought to make this kind of thing faster. However, the current approach of using pip download to download and then extract the metadata doesn't allow Pex to leverage (directly or indirectly) the new information. I don't think leveraging PEP 658 is really possible from the API we're consuming from pip.
- pip install --dry-run --report gives a nice report containing (almost) everything that goes into a Pex lockfile today (it's only missing hashes from VCS reqs). This has 3 niceties:
  1. Almost all code responsibility is thrown over the fence to pip.
  2. It is internally able to leverage PEP 658.
  3. You can run it inside a venv.

Therefore I think in your list of steps, 1 stays. 2 and 3 get changed (if possible).

Note that this ticket specifically is about incremental lock resolves. Using --report would speed up fresh lockfile installs as well.
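To make the --report idea concrete, here is a minimal sketch of pulling pin-like records out of a pip installation report. It assumes the report's JSON layout with a top-level `install` list whose items carry `metadata` and `download_info`; `pins_from_report` and the record shape it returns are hypothetical and are not Pex's lock format.

```python
from typing import Dict, List, Optional


def pins_from_report(report: dict) -> List[Dict[str, Optional[str]]]:
    """Extract name, version, URL and sha256 from a pip installation report.

    VCS requirements carry `vcs_info` instead of `archive_info`, so their
    sha256 comes back as None.
    """
    pins = []
    for item in report.get("install", []):
        metadata = item["metadata"]
        download_info = item["download_info"]
        archive_info = download_info.get("archive_info", {})
        # Newer reports expose a `hashes` mapping; older ones only expose
        # a single `hash` string of the form "<algorithm>=<digest>".
        sha256 = archive_info.get("hashes", {}).get("sha256")
        if sha256 is None and "hash" in archive_info:
            algorithm, _, digest = archive_info["hash"].partition("=")
            if algorithm == "sha256":
                sha256 = digest
        pins.append(
            {
                "name": metadata["name"],
                "version": metadata["version"],
                "url": download_info["url"],
                "sha256": sha256,
            }
        )
    return pins
```

A report to feed into this could be produced with pip install --dry-run --report report.json followed by the requirements, then loaded with `json.load`.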

jsirois · 2023-07-05T18:24:53Z

On the 1st bullet point: a typical resolve process, with backtracks etc., might visit, say, 10k nodes while the final solution set has 100 nodes. With Pip supporting PEP 658 and Pex supporting that Pip, 9,900 downloads are saved and 100 are performed at the end. So there is a download savings from using a dry-run report, but the relevant question is what the typical savings actually is; the 10k vs. 100 numbers are for illustration and surely inaccurate. The other consideration is that the final-set downloads are needed anyway in common workflows, so that time is not completely wasted. That said, the pip download is serial whereas a later lock download is parallel, so there is definitely that.
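The arithmetic behind that illustration can be written out as a tiny sketch. It assumes, as the comment does, that without PEP 658 every visited node costs one download, and that with it only the final solution set is downloaded; `downloads_saved` is a hypothetical helper.

```python
def downloads_saved(visited: int, final: int) -> int:
    """Downloads avoided when only the final solution set is fetched."""
    return visited - final


if __name__ == "__main__":
    visited, final = 10_000, 100  # illustrative numbers from the comment
    saved = downloads_saved(visited, final)
    print(saved, f"{saved / visited:.0%}")  # prints: 9900 99%
```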

jsirois added enhancement resolver feature request performance labels Jan 19, 2023

jsirois mentioned this issue Jan 19, 2023

lock create with "large" set of dependencies spends 95+% of time in sequential pip download #2036

Closed

cosmicexplorer mentioned this issue Aug 6, 2023

make use of pip install --report JSON output #2210

Open

jsirois mentioned this issue Feb 18, 2024

Consider using uv as an optional alternate resolver. #2371

Open

jsirois mentioned this issue Jul 28, 2024

Performance Tracking: generate-lockfiles pantsbuild/pants#21223

Open
