New resolver: Build automated testing to check for acceptable performance #8664

Open
brainwane opened this issue Jul 30, 2020 · 51 comments
Labels
C: dependency resolution (About choosing which dependencies to install); state: needs discussion (This needs some more discussion)

Comments

@brainwane
Contributor

brainwane commented Jul 30, 2020

Our new dependency resolver may make pip a bit slower than it used to be.

Therefore I believe we need to pull together some extremely rough speed tests and decide what level of speed is acceptable, then build some automated testing to check whether we are meeting those marks.

I just ran a few local tests (on a not-particularly-souped-up laptop) to do a side-by-side comparison:

$ time pip install --upgrade pip
Requirement already up-to-date: pip in [path]/python3.7/site-packages (20.2)

real	0m0.867s
user	0m0.680s
sys	0m0.076s

$ time pip install --upgrade pip --use-feature=2020-resolver
Requirement already satisfied: pip in [path]/python3.7/site-packages (20.2)

real	0m1.243s
user	0m0.897s
sys	0m0.060s

Or, in 2 different virtualenvs:

$ time pip install --upgrade chardet
Requirement already up-to-date: chardet in [path].virtualenvs/form990/lib/python3.7/site-packages (3.0.4)

real	0m0.616s
user	0m0.412s
sys	0m0.053s

$ time pip install --upgrade chardet --use-feature=2020-resolver
Requirement already satisfied: chardet in [path].virtualenvs/ical3/lib/python3.7/site-packages (3.0.4)

real	0m1.137s
user	0m0.404s
sys	0m0.053s

These numbers will add up with more complicated processes, dealing with lots of packages at a time.
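The automated check proposed above could start from something like the following Python sketch. It is only an illustration under stated assumptions: the helper names (`median_runtime`, `within_budget`) and the budget value are hypothetical, not goals or tooling defined by the pip project.

```python
import statistics
import subprocess
import sys
import time

# Harmless stand-in command for trying the helpers without touching pip.
NOOP_CMD = [sys.executable, "-c", "pass"]


def median_runtime(cmd, runs=3):
    """Run cmd several times and return the median wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def within_budget(cmd, budget_seconds, runs=3):
    """Return True when the median runtime of cmd stays within the budget."""
    return median_runtime(cmd, runs) <= budget_seconds


# Hypothetical, generous budget just to show the call shape.
print(within_budget(NOOP_CMD, budget_seconds=60.0))
```

In a real check, `cmd` would be one of the invocations timed above, for example `[sys.executable, "-m", "pip", "install", "--upgrade", "pip", "--use-feature=2020-resolver"]`, with a budget chosen once acceptable speeds have been agreed on.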

Related to #6536 and #988.


Edit by @brainwane: As of November 2020 we have defined some speed goals and the new resolver has acceptable performance, so I've switched this issue to be about building automated testing to ensure that we continue to meet our goals in the future.


Edit by @uranusjr: Some explanation for people landing here. The new resolver is generally slower because it checks the dependencies between packages more rigorously, and tries to find alternative solutions when dependency specifications conflict. The legacy resolver, on the other hand, just picks the one specification it likes best without verifying, which of course is faster but also irresponsible.

Feel free to post examples here if the new resolver runs slowly for your project. We are very interested in reviewing all of them to identify possible improvements. When doing so, however, please make sure to also include the pip install output, not just your requirements.txt. The output is important for us to identify what pip is spending time on, and to suggest workarounds if possible.

@brainwane added the "state: needs discussion" and "C: new resolver" labels Jul 30, 2020
@stefanv

stefanv commented Jul 30, 2020

For SkyPortal, we use pip to verify that all required Python packages are present. This takes about 2 seconds on the old pip, and 20 with the resolver enabled.

Will it be possible to revert to the old behavior in the future (i.e., switch off the resolver)?

pip install -r requirements.txt  1.16s user 0.20s system 54% cpu 2.492 total
pip install -r requirements.txt --use-feature=2020-resolver  16.67s user 0.26s system 84% cpu 20.057 total

Perhaps it would be possible to do a quick "first check" to see if all installed packages already satisfy the requirements, and enable the resolver only if they don't?

Our requirements.txt:

supervisor>=4
numpy>=1.12.1
scipy>=0.16.0
pandas>=0.17.0
dask>=0.15.0
joblib>=0.11
seaborn>=0.10.0
bokeh==0.12.9
pytest-randomly>=2.1.1
factory-boy==2.11.1
astropy>=4.0
aplpy>=1.1.1
reproject>=0.7
avro-python3==1.8.2
fastavro==0.21.7
tqdm>=4.23.2
matplotlib>=3
astroquery>=0.4
sqlalchemy-utils
apispec>=3.2.0
marshmallow>=3.4.0
marshmallow-sqlalchemy>=0.21.0
marshmallow-enum>=1.5.1
Pillow>=6
sncosmo>=2.1.0
tdtax>=0.1.1
healpix-alchemy>=0.1.2
jsonschema
jsonpath_ng>=1.5.1
pytest-rerunfailures>=9.0
astroplan>=0.6

@uranusjr
Member

uranusjr commented Jul 30, 2020

I believe both resolvers already do a scan to check whether packages are already satisfied. The problem is that the new resolver is slower to determine what is needed to satisfy the dependencies, since the checks are much more involved than the naive legacy logic.

In your particular use case, if you always list all requirements (instead of relying on pip to discover transitive dependencies), you can use the --no-deps option to skip the dependency discovery part entirely, which would make the operation lightning fast in both implementations. (OK, that’s an exaggeration, nothing in pip is lightning fast. But it’ll be a lot faster.)

@sbidoul
Member

sbidoul commented Aug 1, 2020

I reported a similar performance issue in #8675.

@minusf

minusf commented Aug 18, 2020

I have simplified the test case a bit and measure only the pip running time. Download times are not included, as prefer binary is used. Requirements:

Django==3.0.9
django-auth-ldap
django-cors-headers
django-debug-toolbar
django-extensions
django-uwsgi
django-haystack==3.0b2
hyperkitty==1.3.3
mailman
mailman-hyperkitty
postorius
psycopg2-binary==2.8.5
supervisor
uWSGI==2.0.19.1
whoosh

Please find the 2 log files in this gist: https://gist.github.com/minusf/bd0edfeaf5975980917f2d0792677b52

old: sh -x tvenv.sh 2>&1  19.82s user 6.60s system 92% cpu 28.517 total

new: sh -x tvenv.sh 2>&1  59.04s user 7.78s system 95% cpu 1:10.22 total

@stefanv

stefanv commented Aug 18, 2020

@uranusjr I don't see the --no-deps option listed in the help, even when enabling the new resolver. Is this expected?

@pradyunsg
Member

@stefanv It's the 4th option in pip install --help's "Install Options" section in basically any reasonably new pip (say >= 20.0).

$ pip install --help

Usage:
[snipped for brevity]
Description:
[snipped for brevity]
Install Options:
  --no-clean                  Don't clean up build directories.
  -r, --requirement <file>    Install from the given requirements file. This option can be used multiple times.
  -c, --constraint <file>     Constrain versions using the given constraints file. This option can be used multiple times.
  --no-deps                   Don't install package dependencies.
[snipped for brevity]

@stefanv

stefanv commented Aug 18, 2020

Thanks @pradyunsg! But I see now that this would cause problems too, since it would require us to list all dependencies in our requirements.txt file.

@tlandschoff-scale

We also tried to enable the new resolver (and actually fixed a number of dependency conflicts by using it, so that's good!)

But the performance is abysmal in the usual developer case where, after switching git branches, I'll just run pip install -r requirements.txt to ensure everything is at the expected version.

Compare the runtime of the new resolver:

$ /usr/bin/time pip install --use-feature=2020-resolver -r requirements.txt
77.60user 0.75system 3:38.77elapsed 35%CPU (0avgtext+0avgdata 249740maxresident)k
19160inputs+18912outputs (0major+87724minor)pagefaults 0swaps

with the runtime of the old resolver:

$ /usr/bin/time pip install -r requirements.txt
1.75user 0.12system 0:01.87elapsed 100%CPU (0avgtext+0avgdata 41828maxresident)k
40inputs+8outputs (0major+25404minor)pagefaults 0swaps

Granted, this is for 211 installed packages. Many of those are internal and rely on other internal and PyPI packages, so the dependency graph is far from trivial. But a slowdown by a factor of ~100 appears a bit too much.

Interestingly, the new resolver takes 1-2 seconds to check each already installed package and even checks many packages multiple times:

$ cat pip-output.txt | sort | uniq -c | sort -n | tail -n 20
      4 Requirement already satisfied: cython==0.23.4 in ./.../site-packages (from -r etc/requirements/common.txt (line 14)) (0.23.4)
      4 Requirement already satisfied: dynapp==2.7.3 in ./.../site-packages (from -r etc/requirements/common.txt (line 16)) (2.7.3)
      4 Requirement already satisfied: requests==2.21.0+scale1 in ./.../site-packages (from -r etc/requirements/common.txt (line 48)) (2.21.0+scale1)
      4 Requirement already satisfied: scale.toolbelt==0.1.0 in ./.../site-packages (from -r etc/requirements/common.txt (line 69)) (0.1.0)
      5 Requirement already satisfied: enum34 in ./.../site-packages (from cryptography==1.3.4+scale1->-r etc/requirements/common.txt (line 13)) (1.1.6)
      5 Requirement already satisfied: lxml==3.8.0 in ./.../site-packages (from -r etc/requirements/common.txt (line 25)) (3.8.0)
      5 Requirement already satisfied: python-dateutil==2.7.5 in ./.../site-packages (from -r etc/requirements/loco2-common.txt (line 5)) (2.7.5)
      5 Requirement already satisfied: pytz in ./.../site-packages (from spyne==2.9.3+scale5->-r etc/requirements/common.txt (line 94)) (2018.7)
      5 Requirement already satisfied: scale.util.pubsub==1.3.0 in ./.../site-packages (from -r etc/requirements/common.txt (line 78)) (1.3.0)
      6 Requirement already satisfied: cryptography==1.3.4+scale1 in ./.../site-packages (from -r etc/requirements/common.txt (line 13)) (1.3.4+scale1)
      6 Requirement already satisfied: scale.program-manager==1.1.4 in ./.../site-packages (from -r etc/requirements/common.txt (line 64)) (1.1.4)
      6 Requirement already satisfied: scale.util.event==2.0.2 in ./.../site-packages (from -r etc/requirements/common.txt (line 70)) (2.0.2)
      6 Requirement already satisfied: scale.util.progress-monitor==1.3.2 in ./.../site-packages (from -r etc/requirements/common.txt (line 77)) (1.3.2)
      7 Requirement already satisfied: scale.util.threadutil==1.2.1 in ./.../site-packages (from -r etc/requirements/common.txt (line 80)) (1.2.1)
      7 Requirement already satisfied: setuptools>=11.3 in ./.../site-packages (from cryptography==1.3.4+scale1->-r etc/requirements/common.txt (line 13)) (44.1.1)
      8 Requirement already satisfied: pytest==3.7.4 in ./.../site-packages (from -r etc/requirements/common-test.txt (line 18)) (3.7.4)
      8 Requirement already satisfied: scale.util.exception==1.2.0 in ./.../site-packages (from -r etc/requirements/common.txt (line 71)) (1.2.0)
      9 Requirement already satisfied: scale.util.osutils==1.2.3 in ./.../site-packages (from -r etc/requirements/common.txt (line 76)) (1.2.3)
     10 Requirement already satisfied: scale.util.i18n==2.0.0 in ./.../site-packages (from -r etc/requirements/common.txt (line 74)) (2.0.0)
     15 Requirement already satisfied: six==1.12.0 in ./.../site-packages (from -r etc/requirements/common.txt (line 93)) (1.12.0)
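For readers without a Unix shell, the same tally can be sketched in Python. This is a rough equivalent of the `sort | uniq -c` pipeline above, not part of the original report; the function name and the shortened sample lines are made up for illustration.

```python
from collections import Counter


def count_repeated_lines(lines, prefix="Requirement already satisfied"):
    """Count how often each line starting with prefix occurs, most frequent first."""
    counts = Counter(line.strip() for line in lines if line.startswith(prefix))
    return counts.most_common()


# Hypothetical, shortened sample of pip output lines.
sample = [
    "Requirement already satisfied: six==1.12.0 in ./site-packages (1.12.0)",
    "Requirement already satisfied: pytz in ./site-packages (2018.7)",
    "Requirement already satisfied: six==1.12.0 in ./site-packages (1.12.0)",
]
for line, count in count_repeated_lines(sample):
    print(count, line)
```

To run it on a saved log, pass the open file instead of `sample`, for example `count_repeated_lines(open("pip-output.txt"))`.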

@adamchainz

I benchmarked on a medium-sized Django project and found the slowdown was from 1.6 seconds to 41 seconds (again when all packages are already installed locally at the correct versions):

$ wc -l requirements.txt
     171 requirements.txt
$ time python -m pip install --no-deps -r requirements.txt
Requirement already satisfied: ...
python -m pip install --no-deps -r requirements.txt  0.65s user 0.10s system 92% cpu 0.809 total
$ time python -m pip install -r requirements.txt
Requirement already satisfied: ...
python -m pip install -r requirements.txt  1.52s user 0.12s system 99% cpu 1.646 total
$ time python -m pip install --use-feature=2020-resolver -r requirements.txt
Requirement already satisfied: ...
python -m pip install --use-feature=2020-resolver -r   37.05s user 0.69s system 91% cpu 41.121 total

I profiled the project with py-spy:

$ sudo py-spy record --threads --idle --rate 1000 --format speedscope --output pip-install.speedscope -- /path/to/project/venv/bin/pip install --use-feature=2020-resolver -r requirements.txt

This resulted in a speedscope file - see attached. It can be used at https://www.speedscope.app/ to investigate the profile.

pip-install-redacted.speedscope.zip

Most of the time - 93,547 out of 99,225 frames - was unsurprisingly under Resolver.resolve():

[Screenshot: py-spy profile in speedscope]

Tracing it down, I noticed there are a lot of invocations of parse_links. From this I surmised that the new resolver is parsing HTML index pages a lot.

Indeed when I turned off my internet connection and tried again, using --no-deps or the old resolver, pip install can succeed entirely with the local set of information. But the new resolver makes requests that fail almost immediately - trying to get the PyPI page for a requirement after already printing that it has been satisfied:

$ time python -m pip install --use-feature=2020-resolver -r requirements.txt
Requirement already satisfied: aiohttp==3.6.2 in ./venv/lib/python3.8/site-packages (from -r requirements.txt (line 7)) (3.6.2)
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x104ccd940>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known')': /simple/aiohttp/
WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x104ccd820>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known')': /simple/aiohttp/
WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x104ccda00>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known')': /simple/aiohttp/
WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x104ccd580>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known')': /simple/aiohttp/
WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x104ccd6a0>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known')': /simple/aiohttp/
Requirement already satisfied: aioredis==1.3.1 in ./venv/lib/python3.8/site-packages (from -r requirements.txt (line 8)) (1.3.1)

These requests don't seem necessary as the resolution continues just fine (although I didn't wait until the end)... I hope this can help.

@adamchainz

I was just told about --use-feature=fast-deps to use range requests for dependencies. I tried combining it with the new resolver but it didn't make it any faster:

$ time python -m pip install --use-feature=2020-resolver --use-feature=fast-deps -r requirements.txt
WARNING: pip is using lazily downloaded wheels using HTTP range requests to obtain dependency information. This experimental feature is enabled through --use-feature=fast-deps and it is not ready for production.
Requirement already satisfied: ...
python -m pip install --use-feature=2020-resolver --use-feature=fast-deps -r   37.98s user 0.83s system 90% cpu 42.658 total

@antoncohen

Pip version: 20.2.3
Python version: 3.8.1
Number of requirements: 269
Computer: MacBook Pro with Core i9
Internet: ~170 Mbps home internet in San Francisco

With a fully frozen requirements.txt (all packages specified, all with ==), that installs 269 requirements, in a virtual environment where all the requirements are already satisfied:

Classic resolver:

$ time pip install -r requirements.txt

real	0m1.635s
user	0m1.483s
sys	0m0.144s

2020-resolver:

$ time pip --use-feature=2020-resolver install -r requirements.txt

real	4m5.993s
user	1m56.048s
sys	0m3.692s

2020-resolver with --no-deps:

$ time pip --use-feature=2020-resolver install --no-deps -r requirements.txt

real	1m33.484s
user	0m42.211s
sys	0m1.923s

That is 2 seconds for the classic resolver, 246 seconds for 2020-resolver (123x slower), 94 seconds for 2020-resolver with --no-deps (47x slower). Poetry does the same in about 8 seconds (I realize Poetry is different because it keeps a fully resolved local lock file).

I really like what the 2020-resolver does. I'd be happy to take the performance penalty in CI, in a new virtual env, to ensure correctness. But for local development, where users may be expected to run tox to update multiple existing virtual envs and run tests against multiple versions of Python, adding 4 minutes per virtual env isn't very nice.

I don't want --no-deps, I want something more like --no-deps-for-already-satisfied-requirements. It can be hard with pip to have a fully frozen requirements.txt when supporting multiple versions of Python, because some packages have "fancy" setup.py files that calculate requirements in Python instead of using environment markers. So for new requirements (not already satisfied in the virtual env) I want deps to be installed. But for existing packages there is no need to calculate deps because the package and its deps are already installed.
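One way to approximate the "only look at requirements that are not already satisfied" idea outside pip is sketched below. This is a hypothetical helper, not a pip feature. It assumes plain `name==version` lines, compares versions as raw strings (so `1.0` and `1.0.0` would count as different), and hands anything it does not understand back to pip.

```python
from importlib import metadata


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def unsatisfied_pins(requirement_lines, lookup=installed_version):
    """Return the requirement lines that still need pip's attention.

    Only plain `name==version` lines are understood. Anything else (ranges,
    extras, markers, URLs) is returned unchanged so that pip still handles it.
    """
    pending = []
    for raw in requirement_lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep or any(ch in line for ch in "<>~!;[@ "):
            pending.append(line)  # not a simple pin: leave it to pip
        elif lookup(name.strip()) != version.strip():
            pending.append(line)  # pinned version is not what is installed
    return pending
```

The remaining lines could then be passed to a normal `pip install`, so that new or changed requirements still get their dependencies resolved. Note that this skips any consistency check for the packages it considers satisfied.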

@pfmoore
Member

pfmoore commented Sep 22, 2020

@antoncohen Can I just check I understand your example here? You have a requirements.txt that contains a list of every package that gets installed, with an exact equality constraint forcing precisely one version for every one. Every package is already installed, so there's nothing for pip to do? That seems like on the one hand, it's a completely artificial case, so not representative of real-world situations, but on the other hand, something that the new resolver should certainly be able to handle better than what you're seeing.

Assuming I haven't misunderstood, there's something odd going on here. If we have a requirement foo==1.0.0, that should generate only a single candidate, because packages are unique by name and version. Once we see that requirement, the resolver should have a solution set with just that one version in it, and there are no choices to be made. All of the requirements in requirements.txt are in the root set, so they should be applied first - so even though the finder may return multiple candidates, we'll drop them straight away.

If my reasoning above is correct, then we should never even see candidates that don't get installed.

As everything is pre-installed, we should pick the installed version over a new install, and we can get metadata by a simple filesystem lookup. So there's no need to go to PyPI (or any other index) at all.
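The expectation described in the last few paragraphs can be illustrated with a toy model. This is a sketch only, not the resolver's actual data structures: `Candidate` and `candidates_for_pin` are hypothetical names, and versions are compared as plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    version: str
    installed: bool  # True when this is what is already in the environment


def candidates_for_pin(found, name, pinned_version):
    """Keep only candidates matching the pin, listing an installed one first."""
    matching = [c for c in found if c.name == name and c.version == pinned_version]
    # sorted() is stable and False sorts before True, so negate the flag
    # to move installed candidates to the front.
    return sorted(matching, key=lambda c: not c.installed)


found = [
    Candidate("foo", "1.1.0", installed=False),
    Candidate("foo", "1.0.0", installed=False),
    Candidate("foo", "1.0.0", installed=True),
]
best = candidates_for_pin(found, "foo", "1.0.0")[0]
print(best)
```

With a pin like this, every other version is discarded up front, and the installed copy wins over a fresh download, so no index lookup should be needed to make the choice.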

It's possible there's a genuine bug here, and the resolver is not constraining the candidates based on the root set early enough. To prove that, we'd likely need to instrument a run of the problem case and see exactly what order the code is doing things. That would mean getting a reproducible example, though.

If your test case is genuinely made up of fully pinned requirements (foo==1.0) and no local directories, URL/VCS links, etc, then it should be reproducible, or at least it should be possible to create an artificial version of it. To help me try to create a reproducer for this, could you post somewhere the actual requirements file you used, and the full dependency list of every package in that list? (You can get the dependency data from the installed metadata using something like grep Requires-Dist (dir $env:VIRTUAL_ENV\Lib\site-packages\*.dist-info\METADATA) - that's Powershell syntax, Unix shouldn't be that dissimilar).
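A platform-independent way to collect the same dependency data is the standard library's `importlib.metadata` (Python 3.8+). The sketch below is an alternative to the grep command, not something from the original comment; the function name is made up. It reads the metadata of whatever environment the interpreter runs in.

```python
from importlib import metadata


def requires_dist_map():
    """Map each installed distribution name to its declared Requires-Dist entries."""
    result = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip broken installs without a readable name
            result[name] = sorted(dist.requires or [])
    return result


for name, requirements in sorted(requires_dist_map().items()):
    for requirement in requirements:
        print(f"{name}: Requires-Dist: {requirement}")
```

Run it with the virtual environment's own interpreter so that it lists the packages installed there.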

If I've misunderstood how your example is set up, my analysis above is wrong. In that case don't bother with the detail data. But I would be glad to know what I didn't understand about your test case 🙂

@adamchainz

You have a requirements.txt that contains a list of every package that gets installed, with an exact equality constraint forcing precisely one version for every one. Every package is already installed, so there's nothing for pip to do? That seems like on the one hand, it's a completely artificial case, so not representative of real-world situations, but on the other hand, something that the new resolver should certainly be able to handle better than what you're seeing.

I know you weren't asking, but this is the case I tested. I don't think it's 'completely artificial' - I quite often run the command when switching branches or pulling latest changes on a project just in case dependencies changed.

@pfmoore
Member

pfmoore commented Sep 22, 2020

I know you weren't asking, but this is the case I tested. I don't think it's 'completely artificial' - I quite often run the command when switching branches or pulling latest changes on a project just in case dependencies changed.

But the example has everything, including dependencies, pinned. So "just in case dependencies changed" doesn't apply. It's a situation where we know absolutely, up front, that pip won't install anything. Or are you expecting pip install to fail with a "cannot resolve" error as a way of reporting that the dependencies changed?

Unless I'm misunderstanding 🙂

Anyway, the main point here is that we really need a reproducible test case. At the moment, we don't have one, so I'm mostly just trying to get enough information to construct one (that can be run with all local files, so we can avoid network/cache effects).

@tlandschoff-scale

So "just in case dependencies changed" doesn't apply.

He is talking about the dependencies listed in the requirements.txt of his project. Changing branches will change the contents of requirements.txt, so to work on the new branch one has to update the virtual environment to work in.

I do that about 20 times a day. And sometimes I forget to run it with the result that our application does not come up. For this reason some colleagues like to add a hook to git to run pip install -r requirements.txt after each checkout.
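A hook like the one mentioned above can be written in any language. Below is a minimal Python sketch of a `post-checkout` hook that only calls pip when requirements.txt actually differs between the two commits. It assumes the file lives at the repository root and that the hook is run by the virtual environment's interpreter; the function names are hypothetical.

```python
import subprocess
import sys

REQUIREMENTS = "requirements.txt"  # assumed location of the requirements file


def diff_command(old_ref, new_ref, path=REQUIREMENTS):
    """Build a git command that exits with 1 when path differs between the refs."""
    return ["git", "diff", "--quiet", old_ref, new_ref, "--", path]


def should_reinstall(branch_flag, diff_exit_code):
    """Reinstall only on branch checkouts where the requirements file changed."""
    return branch_flag == "1" and diff_exit_code == 1


def main(argv):
    # Git calls post-checkout with: previous HEAD, new HEAD, branch-checkout flag.
    old_ref, new_ref, branch_flag = argv[1:4]
    diff = subprocess.run(diff_command(old_ref, new_ref))
    if should_reinstall(branch_flag, diff.returncode):
        subprocess.run([sys.executable, "-m", "pip", "install", "-r", REQUIREMENTS])
    return 0


# Git always passes three arguments; do nothing when they are missing.
if __name__ == "__main__" and len(sys.argv) >= 4:
    sys.exit(main(sys.argv))
```

Saved as an executable `.git/hooks/post-checkout`, this avoids paying the resolver cost on checkouts that leave the requirements untouched.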

@pfmoore
Member

pfmoore commented Sep 22, 2020

But again, the example I was responding to said "in a virtual environment where all the requirements are already satisfied". So again this is a different situation.

I don't want to dismiss your use cases. They are just harder to analyze, because if pip might find it needs to install things, that introduces extra work the resolver has to do. The significant advantage of @antoncohen's case is that it doesn't have those complexities, making it easier to analyze. The disadvantage is that it is (or at least seems to be) more unrealistic than the sorts of case you're talking about.

@dstufft
Member

dstufft commented Sep 22, 2020

If we have a requirement foo==1.0.0 that should generate only a single candidate, because packages are unique by name and version.

This isn't exactly True right? I haven't dug into the new code at all yet, but presumably you can have an sdist and multiple wheels that all have the same version and different dependencies? Also in PEP 440, ==1.0.0 can match multiple versions, if local versions are being used (banned on PyPI).
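The local-version point can be shown with a deliberately simplified matcher. It models only the PEP 440 rule that a pin without a local label ignores the candidate's local label. It does not implement normalization or zero padding (under which `1.0` and `1.0.0` are equal), and the function name is made up for illustration.

```python
def exact_pin_matches(pin, candidate):
    """Simplified PEP 440 `==` check that only models local version labels.

    If the pin has no local label (no "+..."), a candidate's local label is
    ignored. If the pin has one, the comparison is strict.
    """
    if "+" in pin:
        return candidate == pin
    public_part = candidate.split("+", 1)[0]
    return public_part == pin


for candidate in ["1.0.0", "1.0.0+local1", "1.0.1"]:
    print(candidate, exact_pin_matches("1.0.0", candidate))
```

So `foo==1.0.0` can be satisfied by both `1.0.0` and `1.0.0+local1`, which is why a pin alone does not strictly guarantee a single candidate.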

@pfmoore
Member

pfmoore commented Sep 22, 2020

@dstufft Yes, it's not entirely accurate. I'm assuming no local versions are involved. And the finder will give back a list of compatible files for the given version, but we should pick just one to hand to the resolver, based on things like --prefer-binary etc. The resolver should only see one file, though, it doesn't allow for different "candidate" objects for the same name/version.

Apologies, I'm doing this from memory at the moment, it's a few weeks since I've gone into the code in depth. My main interest at the moment is pinning down the reported behaviour well enough to replicate it locally. Once I've got that, I'd intend to fire a test case at an instrumented version of the code, and really dig into precisely what's happening.

To get the sort of slowdowns being reported suggests that the resolver is backtracking badly, or otherwise doing a lot of unnecessary work. If the situation is as described, that may be a bug - because the described situation is so constrained that there's nothing to backtrack to. So either we have a bug or the description is failing to make clear where the source of additional options is coming from. Hopefully someone can come up with enough detail that we can establish which is the case here.

@antoncohen

@pfmoore, thanks for the response!

If your test case is genuinely made up of fully pinned requirements (foo==1.0) and no local directories, URL/VCS links

In my initial test 6 of the 200+ requirements were directory tarballs, I consider them frozen because they are referenced by hash and don't change. I removed them so truly 100% of requirements.txt is fully pinned like foo==1.0. The result is the same:

$ time pip --use-feature=2020-resolver install -r requirements-no-tar.txt

...
Requirement already satisfied: google-auth==1.21.2 in /path/to/lib/python3.8/site-packages (from -r requirements-no-tar.txt (line 84)) (1.21.2)
Requirement already satisfied: pytz==2017.2 in /path/to/lib/python3.8/site-packages (from -r requirements-no-tar.txt (line 211)) (2017.2)
Requirement already satisfied: protobuf==3.13.0 in /path/to/lib/python3.8/site-packages (from -r requirements-no-tar.txt (line 169)) (3.13.0)
...

real	4m10.779s
user	1m54.720s
sys	0m3.405s

The output is all "Requirement already satisfied", and every line of "Requirement already satisfied" takes about a second.

You have a requirements.txt that contains a list of every package that gets installed, with an exact equality constraint forcing precisely one version for every one. Every package is already installed, so there's nothing for pip to do? That seems like on the one hand, it's a completely artificial case, so not representative of real-world situations

Everyone has different use cases. But in my experience, dealing with applications that get deployed to production, this is the 99% use case. Every production application that uses requirements.txt will usually have a fully pinned and resolved requirements.txt. Usually that requirements.txt is generated from looser constraints with something like pip-compile, poetry export, or pipenv lock -r.

In local development there is almost always an existing virtual environment with dependencies installed. pip install is used to update the dependencies. Most of the time there will be no updates, some of the time there will be a few updates. Installing all packages would be rare. But you don't know what packages need installing until pip install checks.

In CI often times there will be fresh installs. But when people try to optimize CI build times they might end up caching virtual envs or layering images.

To help me try to create a reproducer for this, could you post somewhere the actual requirements file you used, and the full dependency list of every package in that list?

I can't provide the exact requirements.txt because it includes private packages. But I can construct one. I searched Google for [django open source projects], found taiga-back, grabbed their requirements.txt, added a bunch of random large packages, and used Poetry to export a locked requirements.txt.

One important note, my testing that takes 4 minutes has some packages that come from a private PyPI repo. I noticed that even if no packages come from the private PyPI, having --extra-index-url makes the pip install take 2x longer. So I searched Google for [pypi simple] and found a public mirror (Alibaba Cloud) to use for testing with a second PyPI.

This gist contains the requirements.txt and the grep Requires-Dist *.dist-info/METADATA output:

https://gist.github.com/antoncohen/ace9499dc881fc472873c4c0da97663c

Here are the timings I got:

No extra-index-url:

$ time pip --use-feature=2020-resolver install -r random-django-requirements.txt

real	0m34.021s
user	0m28.477s
sys	0m0.503s

With extra-index-url:

$ time pip --use-feature=2020-resolver install --extra-index-url https://mirrors.aliyun.com/pypi/simple -r random-django-requirements.txt

real	1m9.045s
user	0m56.234s
sys	0m0.870s

Classic resolver:

$ time pip install --extra-index-url https://mirrors.aliyun.com/pypi/simple -r random-django-requirements.txt

real	0m0.992s
user	0m0.856s
sys	0m0.125s

Our actual requirements.txt is twice as large, and our private PyPI is probably slower than Alibaba Cloud. But hopefully this example where it takes over a minute will be helpful.

@tlandschoff-scale

The output is all "Requirement already satisfied", and every line of "Requirement already satisfied" takes about a second.

Same here for our internal project.

I noticed that even if no packages come from the private PyPI, having --extra-index-url makes the pip install take 2x longer.

That's good to know because we use an internal devpi install to take some load from pypi.org and to provide our internal packages. pip of course checks both indexes as it seems - not sure if it is possible to disable pypi lookups?!

I can't provide the exact requirements.txt because it includes private packages. But I can construct one.

Same here. Maybe I could provide it but not the packages...

But I constructed a sufficiently large requirements file by taking an older project, dropping private packages and adding some from pypi. For reproducibility I created a Dockerfile to run this independently of the local setup.

You can find Dockerfile and requirements.txt here: https://gist.github.com/tlandschoff-scale/83a95661e40bf4b51c32c0f990e15a37

Run time here:

Step 8/9 : RUN echo "This is the old resolver:" && time pip install -r requirements.txt
 ---> Running in ac4b41b4498b
This is the old resolver:
...
2.17user 0.16system 0:02.44elapsed 95%CPU (0avgtext+0avgdata 43108maxresident)k
456inputs+344outputs (1major+16502minor)pagefaults 0swaps

compared with the new resolver:

Step 9/9 : RUN echo "This is the new resolver:" && time pip install --use-feature=2020-resolver -r requirements.txt
 ---> Running in 2bd39d69ad0b
This is the new resolver:
40.78user 0.41system 0:49.47elapsed 83%CPU (0avgtext+0avgdata 77680maxresident)k
0inputs+16768outputs (0major+31328minor)pagefaults 0swaps

Out of curiosity I added the extra index from Alibaba and did an extra run:

Step 10/10 : RUN echo "This is the new resolver, extra index:" && time pip install --extra-index-url https://mirrors.aliyun.com/pypi/simple --use-feature=2020-resolver -r requirements.txt
 ---> Running in b1536adc0e11
This is the new resolver, extra index:
Looking in indexes: https://pypi.org/simple, https://mirrors.aliyun.com/pypi/simple
...
82.18user 0.69system 4:02.02elapsed 34%CPU (0avgtext+0avgdata 102144maxresident)k
32inputs+33488outputs (0major+41613minor)pagefaults 0swaps
Removing intermediate container b1536adc0e11

@pfmoore
Member

pfmoore commented Sep 23, 2020

Thanks @antoncohen for taking the time to provide a reproducer and for the explanation of your use case. Please understand, I'm not dismissing your situation at all, my only thought was that it may be sufficiently specialised that if we get into a trade-off where we have to make something else slower to speed this up, we will need to consider the question of what is the common case we should optimise (and that's always very hard to determine, as we get very conflicting reports of what counts as the "common case" from people with radically different workflows).

I'll do some investigation of your reproducer over the next few days and see what I can find.

@adamchainz

adamchainz commented Sep 23, 2020

requirements.txt
Here's my reproduction. Attached is the requirements.txt file - everything is pinned with == thanks to pip-compile.

Setup, on Python 3.8.5:

python -m venv venv
source venv/bin/activate
python -m pip install -U pip wheel
python -m pip install -r requirements.txt

Again testing, with old resolver:

$ time python -m pip install -r requirements.txt
Requirement already satisfied: aiohttp==3.6.2 in ./venv/lib/python3.8/site-packages (from -r requirements.txt (line 7)) (3.6.2)
...
Requirement already satisfied: pip>=10.0.0 in ./venv/lib/python3.8/site-packages (from pip-lock==2.1.1->-r requirements.txt (line 101)) (20.2.3)
python -m pip install -r requirements.txt  1.59s user 0.18s system 91% cpu 1.935 total

With new resolver:

$ time python -m pip install --use-feature=2020-resolver -r requirements.txt
Requirement already satisfied: aiohttp==3.6.2 in ./venv/lib/python3.8/site-packages (from -r requirements.txt (line 7)) (3.6.2)
...
Requirement already satisfied: hyperlink==20.0.1 in ./venv/lib/python3.8/site-packages (from -r requirements.txt (line 66)) (20.0.1)
python -m pip install --use-feature=2020-resolver -r requirements.txt  39.06s user 0.95s system 90% cpu 44.422 total
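
A side-by-side comparison like this could be scripted roughly as follows. This is only a sketch: the helper names `time_command` and `pip_install_argv` are hypothetical, and taking the best of three runs is an assumption, not something agreed in this thread.

```python
import subprocess
import sys
import time


def time_command(argv, repeats=3):
    """Run argv `repeats` times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True makes a failing command raise instead of yielding a bogus timing.
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        best = min(best, time.perf_counter() - start)
    return best


def pip_install_argv(requirements, extra_args=()):
    """Build a `python -m pip install ... -r FILE` command for this interpreter."""
    return [sys.executable, "-m", "pip", "install", *extra_args, "-r", requirements]


# Example (not run here, since it needs an already populated virtualenv):
#   old = time_command(pip_install_argv("requirements.txt"))
#   new = time_command(
#       pip_install_argv("requirements.txt", ["--use-feature=2020-resolver"])
#   )
#   print(f"old={old:.2f}s new={new:.2f}s ratio={new / old:.1f}x")
```

Using `sys.executable -m pip` keeps the measurement tied to the interpreter of the active virtualenv, matching the `python -m pip` invocations above.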

@pradyunsg
Member

pradyunsg commented Sep 23, 2020

Spending some more time to debug this... pip's new resolver is hitting the network even when the currently installed version does satisfy the version requested. Further, it's also hitting the same index page (i.e. https://pypi.org/simple/{project}) each time we see it during the graph exploration, which is obviously the wrong thing to do.

That's 100% a genuine bug, and I'll file a new issue for tracking that.
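
To illustrate the repeated-fetch part of the problem: the general remedy is to memoise index page lookups per project for the duration of a run. The sketch below is not pip's code or its eventual fix; `fetch_index_page` is a hypothetical stand-in for the HTTP request, and the project list is invented for the demonstration.

```python
import functools

fetch_count = 0


def fetch_index_page(project):
    """Stand-in for an HTTP GET of https://pypi.org/simple/{project} (hypothetical)."""
    global fetch_count
    fetch_count += 1
    return f"<html>links for {project}</html>"


@functools.lru_cache(maxsize=None)
def cached_index_page(project):
    """Fetch each project's index page at most once per run."""
    return fetch_index_page(project)


# During graph exploration the same project can be seen many times.
for project in ["boto3", "botocore", "boto3", "boto3", "botocore"]:
    cached_index_page(project)

print(fetch_count)  # 2 fetches instead of 5
```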

@sbidoul
Member

sbidoul commented Oct 25, 2020

@uranusjr I see, thanks for the explanation. Do you think the same reasoning could explain a catastrophic degradation of pip wheel -r requirements.txt --no-deps --use-feature=2020-resolver on Python 2, with all wheels in cache? I'm examining a case right now that seems to take forever (> 3 hours) on Python 2, while similar cases work fine on Python 3.

Is the plan to make the new resolver the default for Python 2 too?

@uranusjr
Member

I’m not sure. TBH I don’t personally use pip wheel much (and almost always with --no-deps when I do), and I don’t understand its internals well enough to say what it’s doing differently to cause a performance degradation that is not present in pip install.

@sbidoul
Member

sbidoul commented Oct 26, 2020

@uranusjr besides the python 2 issue, I see the new resolver has a visible performance impact on pip wheel under python 3 too, even when using --no-deps. I've not had time to dig into the issue enough to pinpoint it. If you are interested I can PM you a reproducer.

@uranusjr
Member

That would be awesome. Are you on Zulip? It should be easiest to DM there since all of the people working on the resolver can be reached.

@brainwane
Contributor Author

I definitely want us to know more and look into the pip wheel performance issue.

Per our Python 2 support policy, pip 20.3 users who are using Python 2 and who have trouble with the new resolver can choose to switch to the old resolver behavior using the flag --use-deprecated=legacy-resolver. Then in pip 21.0 in January 2021 this question will be moot as pip will drop support for Python 2 altogether.

@McSinyx
Contributor

McSinyx commented Oct 27, 2020

@sbidoul, may I take a look at the reproducer for pip wheel's performance regression as well?

In addition, I'll be profiling pip's basic functionality (comparing the legacy and new resolvers) over the next few days. It's for a university course (scientific communication), so there'll be quite some time and people available to take a deeper look. Is there anything anyone here wants us to focus on? Otherwise we'll just go for {install,download,wheel} on combinations of the most popular packages.

@sbidoul
Member

sbidoul commented Oct 28, 2020

@McSinyx I sent you the reproducer too.

@pradyunsg
Member

pradyunsg commented Nov 1, 2020

Update:

@brainwane
Contributor Author

Based on the benchmarking and progress from the past several weeks I believe pip's performance with the new resolver is now fine to ship as default. Moving to "needs triage" so we can decide whether to close, or to refactor this issue into something more useful for the next phase.

@McSinyx
Contributor

McSinyx commented Nov 2, 2020

I believe pip's performance with the new resolver is now fine to ship as default.

My friends and I have just run a benchmark and the result agrees with this 100%: as of 20.3.0b1, there's virtually no difference in performance between the two resolvers. Here is our poster. It's far from perfect and we would love to have feedback on our work, since it's the first time we've made a scientific poster. Please feel more than free to use it to promote the new resolver roll-out process!

@bersbersbers

bersbersbers commented Nov 12, 2020

The current release of pip, pip 20.2.4, includes the performance improvements from #8932 and #8912 plus a few other improvements, so please feel free to try it out and to spread the word.

I have posted an example where the new resolver in 20.2.4 is about 6 times as slow as the old one (55 vs 9 seconds) in #9126. This difference is down to less than 2 times (16 seconds) in the current dev version.

@bersbersbers

My friends and I have just run a benchmark and the result agrees with this 100%: as of 20.3.0b1, there's virtually no difference in performance between the two resolvers.

I don't really agree with your summary.

First, your requirement sets are tiny (9 at most), so it's hard to draw conclusions about larger ones, as effort may increase superlinearly.

Then, Figure 1. If you disable the download cache, you are including download times in your measurements, and this will dominate execution times.

Finally, Figure 2. I have found no way to use the old resolver in 20.3.0b1, so I think you are comparing apples with basically the same apples. It's no surprise to me you don't see a difference.
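
One way to check the superlinearity concern, once timings for several requirement-set sizes exist, is to fit the slope of log(time) against log(size): a slope near 1 suggests linear growth, and a clearly larger slope suggests superlinear growth. The sketch below uses synthetic, exactly quadratic data purely to show that the fit recovers the exponent; none of the numbers are measurements of pip, and `scaling_exponent` is a name made up here.

```python
import math


def scaling_exponent(sizes, times):
    """Least-squares slope of log(time) against log(size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Synthetic data only: times generated as 0.01 * n**2, not measured from pip.
sizes = [10, 20, 40, 80]
times = [0.01 * n**2 for n in sizes]
print(f"estimated exponent: {scaling_exponent(sizes, times):.2f}")
```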

@McSinyx
Contributor

McSinyx commented Nov 12, 2020

First, your requirement sets are tiny (9 at most), so it's hard to draw conclusions about larger ones, as effort may increase superlinearly.

Agreed, the use case I examined is different from yours: one is what people do on their workstations (incremental installations) and the other is recreating an environment. I don't think the poster is by any means complete, but it might give an end user an idea of what to expect. I suppose for really long requirement sets GH-9082 might be one of the reasons for the poorer performance.

If you disable the download cache, you are including download times in your measurements, and this will dominate execution times.

Yes, but apparently 20.2.4 did even more downloads, which made it a lot slower in many cases: while in 20.3.0 it's almost the graph of the identity function, 20.2.4 is obviously above it:

2020-11-12T18:20:15

(I'm sorry that the graph is not very clearly annotated. It should be interpreted as new resolver performance in 20.2.4 and 20.3.0b1 compared to old resolver performance, which hasn't really changed in the last several months.)

I have found no way to use the old resolver in 20.3.0b1

IIRC you can use --use-deprecated=legacy-resolver to force the legacy resolver. FYI, the old resolver figures are from 20.2.4, regardless of whether they are compared to 20.2.4's or 20.3.0b1's new resolver.

@joshlk

joshlk commented Nov 17, 2020

Just a note, as it took me a while to find: you need to install pip version 20.3.0b1 to be able to use the --use-deprecated=legacy-resolver switch.

@brainwane changed the title from "New resolver: What performance is acceptable, and are we there yet?" to "New resolver: Build automated testing to check for acceptable performance" on Nov 18, 2020
@jcrist

jcrist commented Nov 30, 2020

Note: I was urged to comment here about our experience from twitter.

We (prefect) are a bit late on testing the new resolver (only getting around to it with the 20.3 release). We're finding that install times are now in the 20+ min range (I've actually never had one finish); previously this was at most a minute or two. The issue here seems to be the large search space (prefect has loads of optional dependencies, and for CI and some docker images we install all of them) coupled with backtracking.

I enabled verbose logs to try to figure out what the offending package(s) were but wasn't able to make much sense of them. I'm seeing a lot of retries for some dependencies with different versions of setuptools, as well as different versions of boto3. For our CI/docker builds we can add constraints to speed things up (as suggested here), but we're reluctant to increase constraints in our setup.py as we don't want to overconstrain downstream users. At the same time, we have plenty of novice users who are used to doing pip install prefect[all_extras] - telling them they need to add additional constraints to make this complete in a reasonable amount of time seems unpleasant. I'm not sure what the best path forward here is.

I've uploaded verbose logs from one run here (killed after several minutes of backtracking). If people want to try this themselves, you can run:

pip install "git+https://github.com/PrefectHQ/prefect.git#egg=prefect[all_extras]"

Any advice here would be helpful - for now we're pinning pip to 20.2.4, but we'd like to upgrade once we've figured out a solution to the above. Happy to provide more logs or try out suggestions as needed.
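
For CI jobs hit by this kind of runaway backtracking, one generic guard is to put a deadline on the install so the job fails fast instead of hanging. A minimal sketch, assuming nothing about pip itself: `run_with_deadline` is a hypothetical helper built on the standard library's `subprocess.run(..., timeout=...)`, and the deadline value is for each project to choose.

```python
import subprocess
import sys


def run_with_deadline(argv, seconds):
    """Return True if argv exits successfully within `seconds`, False on timeout."""
    try:
        # On timeout, subprocess.run kills the child and raises TimeoutExpired.
        subprocess.run(argv, check=True, timeout=seconds)
    except subprocess.TimeoutExpired:
        return False
    return True


# Example (not run here, since it needs network access):
#   ok = run_with_deadline(
#       [sys.executable, "-m", "pip", "install", "prefect[all_extras]"],
#       seconds=600,
#   )
```

A non-zero exit still raises `CalledProcessError`, so a failed install is not mistaken for a fast one.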

Thanks for all y'all do on pip and pypa!

@brainwane
Contributor Author

To keep things a bit easier to manage: we're going to have this issue (#8664) be about building automated testing to check for acceptable performance, and we've made #9187 the issue to "centralize incoming reports of situations that seemingly run for a long time" - including the question in #9187 (comment) :
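
As a starting point for the automated check, the core comparison could look something like the sketch below. The scenario names, the `max_ratio=2.0` budget, and the helper names are all assumptions for illustration; the actual speed goals are not spelled out in this thread. The sample timings are wall-clock figures quoted earlier in this issue.

```python
import statistics


def exceeds_budget(samples, baseline, max_ratio=1.5):
    """Return True if the median of `samples` is more than max_ratio times baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return statistics.median(samples) > max_ratio * baseline


def regressions(results, baselines, max_ratio=1.5):
    """Map scenario name to median time for every scenario that is over budget."""
    return {
        name: statistics.median(samples)
        for name, samples in results.items()
        if exceeds_budget(samples, baselines[name], max_ratio)
    }


# Old-resolver timings as baselines, new-resolver timings as results (seconds).
baselines = {"pinned-requirements": 1.935, "upgrade-chardet": 0.616}
results = {"pinned-requirements": [44.422], "upgrade-chardet": [1.137]}
failed = regressions(results, baselines, max_ratio=2.0)
print(failed)  # {'pinned-requirements': 44.422}
```

Using the median of several samples rather than a single run should make the check less sensitive to one-off network or disk hiccups.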

Do we have a good sense of whether these cases where it takes a really long time to solve are typically cases where there is no answer and it's taking a long time to exhaustively search the space because our slow time per candidate means it takes hours.. or are these cases where there is a successful answer, but it just takes us awhile to get there?

Donald moved a relevant comment from here to there #9187 (comment) . Sorry for accidentally misdirecting you @jcrist!

@stefanv

stefanv commented Dec 8, 2020

I just wanted to close the loop here from the SkyPortal side. When the new beta resolver was made available, it was unworkable for us. @brainwane reached out, filed this issue, and within a few months our problems were addressed.

A huge shoutout to the team for soliciting community feedback, taking it seriously, and doing such dedicated work to make pip better. I know that many (most?) of you are volunteers, and your efforts are so appreciated. 🙏

@pradyunsg added the "C: dependency resolution" label (About choosing which dependencies to install) and removed the "C: new resolver" label on Oct 12, 2021