Poetry cannot correctly select dependencies #5896

Closed
3 tasks done
mihirsamdarshi opened this issue Jun 22, 2022 · 14 comments
Labels
kind/bug Something isn't working as expected

Comments

@mihirsamdarshi

  • I am on the latest Poetry version.
  • I have searched the issues of this repo and believe that this is not a duplicate.
  • If an exception occurs when executing a command, I executed it again in debug mode (-vvv option).
  • OS version and name: macOS 12.4
  • Poetry version: 1.1.13
[tool.poetry]
name = "poetry-fail"
version = "0.0.1"
description = "Repro of project that doesn't work"
authors = ["mihirsamdarshi"]

[tool.poetry.dependencies]
python = "^3.10"
bjoern = "^3.2.1"
boto3 = "^1.24.14"
boto3-stubs = { version = "^1.24.14", extras = ["ec2", "s3"] }
caper = "^2.2.0"
Flask = "^2.1.2"
Flask-Cors = "^3.0.10"
Flask-RESTful = "^0.3.9"
Flask-SQLAlchemy = "^2.5.1"
google-cloud-storage = "^2.4.0"
pandas = "^1.4.2"
PyMySQL = "^1.0.2"
requests = "^2.28.0"
smart-open = "^6.0.0"
SQLAlchemy = "^1.4.37"
Werkzeug = "^2.1.2"
wsgi-request-logger = "^0.4.6"

[tool.poetry.dev-dependencies]
pytest = "^7.1.2"
black = "^22.3.0"
pylint = "^2.14.3"
hypothesis = "^6.47.0"
jupyter = "^1.0.0"
flake8 = "^4.0.1"

Issue

With this particular pyproject.toml, Poetry is unable to select a version of awscli, regardless of whether I run poetry update, poetry install, or poetry lock.

When running with -vvv it hangs with the following repeated message:

   1: derived: not awscli (==1.21.1)
   1: fact: awscli (1.21.0) depends on botocore (1.22.0)
   1: fact: awscli (1.21.0) depends on docutils (>=0.10,<0.16)
   1: fact: awscli (1.21.0) depends on s3transfer (>=0.5.0,<0.6.0)
   1: fact: awscli (1.21.0) depends on PyYAML (>=3.10,<5.5)
   1: fact: awscli (1.21.0) depends on colorama (>=0.2.5,<0.4.4)
   1: fact: awscli (1.21.0) depends on rsa (>=3.1.2,<4.8)
   1: derived: not awscli (==1.21.0)
   1: fact: awscli (1.20.65) depends on botocore (1.21.65)
   1: fact: awscli (1.20.65) depends on docutils (>=0.10,<0.16)
   1: fact: awscli (1.20.65) depends on s3transfer (>=0.5.0,<0.6.0)
   1: fact: awscli (1.20.65) depends on PyYAML (>=3.10,<5.5)
   1: fact: awscli (1.20.65) depends on colorama (>=0.2.5,<0.4.4)
   1: fact: awscli (1.20.65) depends on rsa (>=3.1.2,<4.8)
   1: derived: not awscli (==1.20.65)
   1: fact: awscli (1.20.64) depends on botocore (1.21.64)
   1: fact: awscli (1.20.64) depends on docutils (>=0.10,<0.16)
   1: fact: awscli (1.20.64) depends on s3transfer (>=0.5.0,<0.6.0)
   1: fact: awscli (1.20.64) depends on PyYAML (>=3.10,<5.5)
   1: fact: awscli (1.20.64) depends on colorama (>=0.2.5,<0.4.4)
   1: fact: awscli (1.20.64) depends on rsa (>=3.1.2,<4.8)
   1: derived: not awscli (==1.20.64)
   1: fact: awscli (1.20.63) depends on botocore (1.21.63)
   1: fact: awscli (1.20.63) depends on docutils (>=0.10,<0.16)
   1: fact: awscli (1.20.63) depends on s3transfer (>=0.5.0,<0.6.0)
   1: fact: awscli (1.20.63) depends on PyYAML (>=3.10,<5.5)
   1: fact: awscli (1.20.63) depends on colorama (>=0.2.5,<0.4.4)
   1: fact: awscli (1.20.63) depends on rsa (>=3.1.2,<4.8)
   1: derived: not awscli (==1.20.63)
   1: fact: awscli (1.20.62) depends on botocore (1.21.62)
   1: fact: awscli (1.20.62) depends on docutils (>=0.10,<0.16)
   1: fact: awscli (1.20.62) depends on s3transfer (>=0.5.0,<0.6.0)
   1: fact: awscli (1.20.62) depends on PyYAML (>=3.10,<5.5)
   1: fact: awscli (1.20.62) depends on colorama (>=0.2.5,<0.4.4)
   1: fact: awscli (1.20.62) depends on rsa (>=3.1.2,<4.8)
   1: derived: not awscli (==1.20.62)
   1: fact: awscli (1.20.61) depends on botocore (1.21.61)
   1: fact: awscli (1.20.61) depends on docutils (>=0.10,<0.16)
   1: fact: awscli (1.20.61) depends on s3transfer (>=0.5.0,<0.6.0)
   1: fact: awscli (1.20.61) depends on PyYAML (>=3.10,<5.5)
   1: fact: awscli (1.20.61) depends on colorama (>=0.2.5,<0.4.4)
   1: fact: awscli (1.20.61) depends on rsa (>=3.1.2,<4.8)
   1: derived: not awscli (==1.20.61)
   1: fact: awscli (1.20.60) depends on botocore (1.21.60)
   1: fact: awscli (1.20.60) depends on docutils (>=0.10,<0.16)
   1: fact: awscli (1.20.60) depends on s3transfer (>=0.5.0,<0.6.0)
   1: fact: awscli (1.20.60) depends on PyYAML (>=3.10,<5.5)
   1: fact: awscli (1.20.60) depends on colorama (>=0.2.5,<0.4.4)
   1: fact: awscli (1.20.60) depends on rsa (>=3.1.2,<4.8)
   1: derived: not awscli (==1.20.60)
   1: fact: awscli (1.20.59) depends on botocore (1.21.59)
   1: fact: awscli (1.20.59) depends on docutils (>=0.10,<0.16)
   1: fact: awscli (1.20.59) depends on s3transfer (>=0.5.0,<0.6.0)
   1: fact: awscli (1.20.59) depends on PyYAML (>=3.10,<5.5)
   1: fact: awscli (1.20.59) depends on colorama (>=0.2.5,<0.4.4)
   1: fact: awscli (1.20.59) depends on rsa (>=3.1.2,<4.8)
   1: derived: not awscli (==1.20.59)
   1: fact: awscli (1.20.58) depends on botocore (1.21.58)
   1: fact: awscli (1.20.58) depends on docutils (>=0.10,<0.16)
   1: fact: awscli (1.20.58) depends on s3transfer (>=0.5.0,<0.6.0)
   1: fact: awscli (1.20.58) depends on PyYAML (>=3.10,<5.5)
   1: fact: awscli (1.20.58) depends on colorama (>=0.2.5,<0.4.4)
   1: fact: awscli (1.20.58) depends on rsa (>=3.1.2,<4.8)
   1: derived: not awscli (==1.20.58)
@mihirsamdarshi mihirsamdarshi added kind/bug Something isn't working as expected status/triage This issue needs to be triaged labels Jun 22, 2022
@dimbleby
Contributor

dimbleby commented Jun 22, 2022

That's not a "repeated" message, poetry is (slowly) working its way through the many versions of awscli that are not compatible with the selections it has previously made at that point in the search.

There are some performance improvements in the latest beta that might help a bit, but your best bet is almost certainly to specify that you want some recent version of awscli. That will allow poetry to fail much faster - and backtrack the search and find a solution.
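To illustrate why a lower bound shortens the search, here is a minimal sketch. It is not Poetry's resolver: the release list is invented, and `parse` and `candidates` are hypothetical helpers that only count how many versions would remain to be ruled out one by one.

```python
from __future__ import annotations

# Toy illustration only: not Poetry's resolver, and the release list is invented.


def parse(version: str) -> tuple[int, ...]:
    """Turn '1.20.58' into (1, 20, 58) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def candidates(all_versions: list[str], lower_bound: str | None = None) -> list[str]:
    """Return the versions a resolver may still try, newest first."""
    allowed = [
        v for v in all_versions
        if lower_bound is None or parse(v) >= parse(lower_bound)
    ]
    return sorted(allowed, key=parse, reverse=True)


# Hypothetical release history: 1.20.0 .. 1.20.65, then 1.21.0 and 1.21.1.
releases = [f"1.20.{n}" for n in range(66)] + ["1.21.0", "1.21.1"]

unbounded = candidates(releases)
bounded = candidates(releases, lower_bound="1.20.60")

print(len(unbounded))  # every release is a candidate
print(len(bounded))    # only releases at or above the bound remain
```

With a real package that has thousands of releases, the gap between the two counts is what turns a long backtracking run into a quick failure and retry.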

@mihirsamdarshi
Author

Thanks, I mean that it was repeatedly trying to solve. Adding in the awscli dev and figuring out where the conflict lay did it.

@mkniewallner mkniewallner removed the status/triage This issue needs to be triaged label Jun 25, 2022
@mm-matthias

We have been having this problem for more than half a year. Last time, clearing the Poetry caches helped; this time we had to do more.
In our case we had PyYAML = "*" and awscli = "*" as dependencies. Poetry got stuck just as the OP outlined. When we removed PyYAML, the (seemingly) infinite loop ended and the poetry.lock file was created in seconds. The other fix that worked was to use awscli = ">=1.25.26", which cut that infinite loop short. This is basically the same solution that @dimbleby suggested.

@mm-matthias

I checked with the latest poetry version 1.2.0b2 and it did not help. Setting a minimum version for awscli was the only real solution for us.

@zyv
Contributor

zyv commented May 9, 2023

@mihirsamdarshi @mkniewallner could you please explain why you closed this issue?

We are still facing the same problem with the latest Poetry release, and it is reliably reproducible. The changes in the other issues didn't solve the problem, and no other open issue is linked.

If there is no other issue tracking this performance issue, could you please reopen this one?

zaytsev@parallels:~$ poetry --version
Poetry (version 1.4.2)

@mihirsamdarshi mihirsamdarshi reopened this May 9, 2023
@camerondavison

I am not sure whether this is the problem for this specific issue, but I am on Poetry 1.5.1 and am able to get Poetry to spin for a long time in a brand new project just by running

poetry init -n
poetry add 'urllib3@*' 'boto3@^1'

My understanding is that if any package selects a urllib3 version that conflicts with botocore (i.e. >= 2.0), then you're stuck downloading every boto3 release between the day you first added the library (because Poetry put ^1.x.x for whatever version was current on that day) and now (boto3 publishes new versions very often).

Would it be possible to do some kind of shallow discovery for libraries that publish a lot of versions, in order to select a new transitive dependency version? In this case it still resolves after a long time of downloading hundreds of boto versions, but only because it rules out all versions of boto and restarts at the top with a lower urllib3 version.
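The behaviour described here can be sketched with a toy model. This is not Poetry's solver: the version lists are invented, and `boto3_accepts` encodes the simplifying assumption that every boto3 release needs urllib3 < 2. The two hypothetical functions differ only in which package is fixed first, and each counts one lookup per boto3 release examined.

```python
from __future__ import annotations

# Toy model only: invented data, not Poetry's solver.
URLLIB3_VERSIONS = [(2, 0), (1, 26)]      # newest first
BOTO3_VERSIONS = list(range(300, 0, -1))  # 300 hypothetical releases, newest first


def boto3_accepts(boto3_version: int, urllib3_version: tuple[int, int]) -> bool:
    """Assumption for this toy: every boto3 release needs urllib3 < 2."""
    return urllib3_version < (2, 0)


def solve_urllib3_first() -> tuple[tuple[int, int], int, int]:
    """Fix urllib3 first, then search boto3. Returns (urllib3, boto3, lookups)."""
    lookups = 0
    for urllib3_version in URLLIB3_VERSIONS:
        for boto3_version in BOTO3_VERSIONS:
            lookups += 1  # one metadata lookup per boto3 release examined
            if boto3_accepts(boto3_version, urllib3_version):
                return urllib3_version, boto3_version, lookups
    raise RuntimeError("no solution")


def solve_boto3_first() -> tuple[tuple[int, int], int, int]:
    """Fix boto3 first, then pick a urllib3 that it accepts."""
    lookups = 0
    for boto3_version in BOTO3_VERSIONS:
        lookups += 1  # one metadata lookup per boto3 release examined
        for urllib3_version in URLLIB3_VERSIONS:
            if boto3_accepts(boto3_version, urllib3_version):
                return urllib3_version, boto3_version, lookups
    raise RuntimeError("no solution")


print(solve_urllib3_first())  # rules out every boto3 release before backtracking
print(solve_boto3_first())    # finds the same answer after a single lookup
```

Both orders reach the same solution; only the number of boto3 releases examined along the way differs.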

@finswimmer
Member

Dependency resolution is known to be quite slow if awscli and/or boto3 are involved, as they have an enormous number of versions.

Poetry's options to improve this are limited. The best way, as others explained above, is to limit the version range that Poetry should try.

@finswimmer finswimmer closed this as not planned Oct 30, 2024
@zyv
Contributor

zyv commented Oct 30, 2024

@finswimmer, would you please be so kind as to comment on what exactly limits Poetry's options to improve this, so that people facing this problem can understand why it's closed as not planned and they have to introduce an artificial lower limit to limit the packages that are scanned?

My vague understanding was that there is no metadata service, so in some cases Poetry has to download the packages from the beginning of the day to extract the metadata first to make sure they don't match, and this causes the slowness.

Is this understanding correct? If so, would it make sense to bring this up with the PyPI / PyPA people?

@zyv
Contributor

zyv commented Oct 30, 2024

P.S. Just saw that in #8823 @dimbleby explained:

it's just good or bad luck whether your solver happens first to explore a path that fixes boto3 first (when it is very easy to find a satisfactory urllib3) or a path that fixes urllib3 first (when it is very hard to find a satisfactory boto3)

the most useful thing you can do right now, for future-you and the rest of the ecosystem, is to go and offer merge requests to django-distill or fiftyone or whoever, putting a (recent) lower bound on their boto3 dependency.

Then no installer is exposed to having to backtrack through the thousands of versions of boto3 that amazon release

I think that's as close to an explanation as I can get, but I'm still confused as to why backtracking through thousands of versions is a problem. It seems to me that if you have all the relevant constraints at hand, it shouldn't be difficult to solve. So is it true that constraints can only be obtained by downloading the packages in question, and that's the root cause of the issue?

@radoering
Member

So is it true that constraints can only be obtained by downloading the packages in question, and that's the root cause of the issue?

It depends:

  • If you use PyPI as index, then dependencies of a package can be obtained without downloading the package if wheels are provided (thanks to the PEP 658 backfill). If it is an sdist only release, then (in most cases) the sdist has to be downloaded. However, even without downloading wheels or sdists, it still takes some time to backtrack thousands of versions and fetch the metadata of each version.
  • If you use another index than PyPI, then most likely wheels/sdists have to be downloaded (at least partially).
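As a rough sketch of what obtaining dependencies without downloading the package can look like: under PEP 658 an index serves a distribution's core metadata at the distribution's URL with `.metadata` appended, and the dependencies are the `Requires-Dist` fields of that file. The helper names, the sample metadata, and the example URL below are made up for illustration; this is not how Poetry itself is implemented.

```python
from __future__ import annotations

from email.parser import Parser


def metadata_url(wheel_url: str) -> str:
    """PEP 658: the metadata file lives at the distribution URL plus '.metadata'."""
    return wheel_url + ".metadata"


def requires_dist(metadata_text: str) -> list[str]:
    """Extract the Requires-Dist entries from a core-metadata (METADATA) document."""
    message = Parser().parsestr(metadata_text)
    return message.get_all("Requires-Dist") or []


# Invented sample in the core-metadata format (RFC 822 style headers).
SAMPLE_METADATA = """\
Metadata-Version: 2.1
Name: example-package
Version: 1.0.0
Requires-Dist: botocore (==1.22.0)
Requires-Dist: docutils (<0.16,>=0.10)
"""

# Hypothetical wheel location; no network request is made in this sketch.
print(metadata_url("https://files.example.invalid/example_package-1.0.0-py3-none-any.whl"))
print(requires_dist(SAMPLE_METADATA))
```

Even in this best case, a resolver still needs one such small request per version it examines, which is why thousands of versions remain slow.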

@zyv
Contributor

zyv commented Oct 30, 2024

So is it true that constraints can only be obtained by downloading the packages in question, and that's the root cause of the issue?

It depends:

  • If you use PyPI as index, then dependencies of a package can be obtained without downloading the package if wheels are provided (thanks to the PEP 658 backfill). If it is an sdist only release, then (in most cases) the sdist has to be downloaded. However, even without downloading wheels or sdists, it still takes some time to backtrack thousands of versions and fetch the metadata of each version.

But OP is using PyPI and boto3 provides wheels, so the problem is really in the efficiency of Python code in this case?

@radoering
Member

Not sure if it is about Python or just network requests or the algorithm itself (independent from the programming language).

@Secrus
Member

Secrus commented Oct 30, 2024

TL;DR there are many moving parts between network, algorithms, and Python, but boto having daily releases and a ton of versions to check doesn't help.

@zyv
Contributor

zyv commented Oct 30, 2024

Not sure if it is about Python or just network requests or the algorithm itself (independent from the programming language).

Well, I think it makes a huge difference.

I remember my colleagues complaining about two years ago that Poetry was saturating our 500 Mbit link, so it definitely felt like it was downloading half of the world. If nowadays it's "just" network requests and/or resolution code, then in theory something could be done about it on the Poetry side.

But I see, I guess I can only get these detailed answers by looking at the code myself :( Thank you for the pointers!

9 participants