Poetry doesn't try public pypi when private pypi included #3855
Comments
Any details? Is your private repository set as "secondary"? Why do you think the public PyPI is not checked anymore, does your project have dependencies that should be downloaded from the public PyPI? |
Seems to be related to #3306. Yeah, there are packages that are not on our private repository, and they fail because Poetry tries to pull from only the private one. Adding the public one as a source is a workaround that unblocked me. |
I have this exact issue, you can see my pyproject.toml file here |
The way we have been solving it is by
Otherwise if you look at the lockfile the source for all packages is our private repo. This puts more work on our private repo because there isn't any reason to check it for the public packages since we are not running a full clone of pypi. |
@damienrj given this is related to #3306, do you want to check to see if the behavior is still present in https://github.com/python-poetry/poetry/releases/tag/1.2.0a1? Seems #3306 (PR #3406) made the release notes in 1.2.0a1 and may be cherry-picked for an upcoming patch release per #4241 |
I'm affected by this. Tried in 1.2.0a2 for good measure… At first glance it looks fixed: poetry.lock no longer updates with the incorrect default repository. However, it's apparent that poetry still checks the non-default repository for every package even if a valid package has been found on the default one. This might be intended, but it also slows operations down significantly. Common scenario: App depends on a handful of packages in a private repository, and a bunch of packages in pypi. A quick This makes |
The above happens even if |
Experiencing the same issue as @jleclanche has mentioned. I'm on poetry 1.1.8. During I have the following pypi configuration in my
And this is the log piece I'm getting (repeated pattern for every package) for
It looks like poetry doesn't respect Changing config to
as was suggested above doesn't help. |
I'm also finding this is an issue... |
Are you sure? What I observe is that those dependencies where BUT, I do think that explicitly specifying the "default" source should NOT be necessary in either case. I think the culprit is this loop
    packages = []
    for repo in self._repositories:
        packages += repo.find_packages(dependency)
where packages are collected from every registered repo without regard to priorities.
     packages = []
     for repo in self._repositories:
    -    packages += repo.find_packages(dependency)
    +    packages = repo.find_packages(dependency)
    +    if packages:
    +        break
Unfortunately, this could produce incorrect results due to what I'd consider a bug |
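To make the difference between the two loops concrete, here is a minimal, self-contained sketch. `FakeRepository`, its `lookups` counter, the package names, and the version strings are all hypothetical stand-ins invented for illustration; this is not Poetry's actual pool or repository code, only a model of the loop quoted in the comment.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRepository:
    """Hypothetical stand-in for a package index (not Poetry's class)."""
    name: str
    index: dict                      # package name -> list of version strings
    lookups: int = field(default=0)  # number of times this repo was queried

    def find_packages(self, dependency):
        self.lookups += 1
        return [(self.name, dependency, v) for v in self.index.get(dependency, [])]


def find_in_all(repositories, dependency):
    """Loop as quoted in the thread: query every registered repository."""
    packages = []
    for repo in repositories:
        packages += repo.find_packages(dependency)
    return packages


def find_first_match(repositories, dependency):
    """Proposed variant: stop at the first repository that returns anything."""
    for repo in repositories:
        packages = repo.find_packages(dependency)
        if packages:
            return packages
    return []


def make_repos():
    # Illustrative data: two public packages, one private package.
    pypi = FakeRepository("pypi", {"requests": ["2.27.1"], "click": ["8.1.3"]})
    private = FakeRepository("private", {"internal-lib": ["1.0.0"]})
    return pypi, private


pypi, private = make_repos()
all_hits = [find_in_all([pypi, private], d) for d in ("requests", "click")]
lookups_all = (pypi.lookups, private.lookups)

pypi, private = make_repos()
first_hits = [find_first_match([pypi, private], d) for d in ("requests", "click")]
lookups_first = (pypi.lookups, private.lookups)

print(lookups_all)    # (2, 2): the private repo is queried for public packages
print(lookups_first)  # (2, 0): the private repo is never queried here
```

In this toy setup both strategies return the same packages, and the first-match variant simply skips the private index. That equivalence only holds because no package exists in both repositories, which is exactly the case where stopping early could change the result.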
@mehes-kth yes I'm fairly certain. I'm still working on that same project and the poetry install command is exceedingly slow because of this, despite only having one single private dependency. |
@jleclanche I experience the excruciating slowness, too. All I'm saying is that any dependency explicitly marked as BTW, I'm wondering why Poetry seems to assume that any non-PyPI index is private. |
Ah, you might be right that it's caused by the recursive dependencies. It's been a while since I tested this so I don't remember exactly, but I did see lookups in the private repo when setting everything to source pypi. |
Will there be a fix supplied soon or some workaround? I am pretty stuck with trying to lock my deps with private repo and pypi. |
Also having the same issue here, it only works when I set my private repo as Thank you! |
Actually, it worked for me. I just had to add only my private repository with the |
When you combine e.g.:
There's no need to redefine the public pypi in this case. The |
If this is true, then the docs should be updated to mention |
@hugoantunes @jonapich This did not work for me. Could you share the environment you did this in, and what python/poetry versions and OSes you did this with? My experiences so far: The above does not work with a Docker build based on python:3.10, poetry 1.1.13. What did work was setting |
Poetry 1.1.12 and Windows, python 3.9. I just tested it again: Given this:
The lock doesn't contain any repository information. But once I do this:
Then the lock contains the repository information for the requests package exclusively. Once I change to this:
Then suddenly every single package in the lock contains my repository information (this seems to be a bug!). This seems to work though:
With the above, I don't see any repo information. When I add |
@jonapich An update/rectification from my side - setting both That leaves me agreeing with you that using only |
Poetry by default searches all sources for a package unless the package explicitly specifies a source ( If a source is set to |
@SimonVerhoek |
This issue isn't reproducible on poetry@master. Closing. Note that setting repository to default will disable default PyPI. https://python-poetry.org/docs/master/repositories/#project-configuration |
@abn setting Basically we try to use a toml like this:
Notice how the Now set It feels like the description of
|
There are two factors at play here. Default and secondary. This leads to the following scenarios.
Clarifications for the docs welcome. |
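For readers trying to keep the two flags apart, the following is a small model of the lookup order for a project with a single custom source. It reflects my reading of the Poetry 1.x repository documentation linked earlier in the thread, not Poetry's source code; `search_order` is a hypothetical helper, and the behaviour when both flags are set is an assumption made only to keep the function total.

```python
def search_order(name, default=False, secondary=False):
    """Model the order in which one custom source and PyPI are consulted.

    Assumptions (not taken from Poetry's code):
    - default=True disables the implicit PyPI source.
    - secondary=True places the custom source after PyPI.
    - with neither flag, the custom source comes before PyPI.
    - if both flags are set, this model lets `default` win.
    """
    if default:
        return [name]
    if secondary:
        return ["PyPI", name]
    return [name, "PyPI"]


for flags in (
    {},
    {"secondary": True},
    {"default": True},
):
    print(flags, "->", search_order("private", **flags))
```

Note that this only models ordering. As stated above, every listed source is still searched for a package unless the dependency names a source explicitly.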
Clarifies default vs secondary (see discussion in python-poetry#3855)
That's exactly what's causing major slow-down when a private repo is not a full mirror, |
It adds extra requests during locking, yes; however, "major slow-down" is probably an overstatement. If I recall correctly, the extra request happens once per package, when searching for the package while an update for the package is allow-listed (eg:
This was discussed recently on discord when discussing #5442 with @tgolsson. One of the more common use cases for enterprises when using a self-hosted PEP 503 repository is to provide target environment specific wheels. This means, they set private repositories as A concrete example - PyPI has a source published (py3-none) but repo.org has prebuilt (py38-windows). If we say that the secondary source is only used when the package is not found (ie. it is a fallback only), then Poetry will select only the As I have stated multiple times in various issues and discord discussions, modifying the current behaviour is not the right approach; the right approach is rather to allow for an explicit feature that caters to the scenario where sources should only be used when the package explicitly requires this source. However, this also has other issues. As discussed on the discord thread, eg: how do we handle transitive dependencies then? There are far too many edge cases here. If you are really bothered by the extra requests at present, perhaps using Hope this helps clarify why things are the way they are. |
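The environment-specific wheel scenario can be sketched as follows. The candidate lists, the `tag` strings (taken from the example in the comment), and the `pick` rule are simplified assumptions for illustration; real artifact selection in Poetry and pip follows the full compatibility-tag rules, which this does not attempt to reproduce.

```python
# One release of a hypothetical package "pkg", as seen by each source.
PYPI = [{"source": "PyPI", "tag": "py3-none"}]
REPO_ORG = [{"source": "repo.org", "tag": "py38-windows"}]


def candidates_merged(sources):
    """Current behaviour as described: gather candidates from every source."""
    merged = []
    for source in sources:
        merged.extend(source)
    return merged


def candidates_fallback(sources):
    """Fallback-only idea: use later sources only if earlier ones are empty."""
    for source in sources:
        if source:
            return list(source)
    return []


def pick(candidates, env_tag):
    """Toy selection rule: prefer an exact tag match, else the first candidate."""
    for candidate in candidates:
        if candidate["tag"] == env_tag:
            return candidate
    return candidates[0] if candidates else None


order = [PYPI, REPO_ORG]  # repo.org configured as secondary, so PyPI first
merged_choice = pick(candidates_merged(order), "py38-windows")
fallback_choice = pick(candidates_fallback(order), "py38-windows")

print(merged_choice["source"])    # repo.org: the prebuilt artifact is visible
print(fallback_choice["source"])  # PyPI: the prebuilt artifact is never seen
```

Under the fallback-only rule the secondary index is never consulted once PyPI returns anything, so the prebuilt artifact cannot be chosen, which is the concern raised in the comment.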
Has the situation significantly improved since the issue was filed? My recollection is it made a difference of one to two orders of magnitude in seconds for my project. |
Cannot speak for your project but based on the above example given in #3855 (comment) here is an unscientific evaluation.
$ poetry source show
No sources configured for this project.
$ poetry lock --no-cache
Updating dependencies
Resolving dependencies... (0.9s)
Writing lock file
$ poetry source add --secondary fake-private https://pypi.org/simple/
Adding source with name fake-private.
$ poetry source show
name : fake-private
url : https://pypi.org/simple/
default : no
secondary : yes
$ poetry lock --no-cache
Updating dependencies
Resolving dependencies... (1.0s)
Writing lock file
$ |
Our private registry is configured to redirect to I just tested locking a huge project. Locking with our registry first took 4m40, but configuring it as non-default, secondary and targeting only the few relevant packages brought that down to 4m14 (that's The huge difference though, is that we were able to slash down the size of our private registry server with this one easy trick 😅 when too many clients are doing a I improved (I think) the documentation in #5605 but I think some effort should be made to support this use case better. The options are simply misleading for anyone who didn't take the time to carefully read that documentation section. This is the dark side of the rules:
We have to think about the developer's thought process here. When you add I would say that if the user provided
Simply don't. If the user wants a transitive to use the private registry, it can be added to the dependencies with the I think that setting a registry as the first one to be checked should be the "you need to specify an option" way, and using the registry only when |
I remember what the issue was now: I was using a gitlab private pypi repository which 1) didn't have all packages (as @jonapich points out can be an issue) and 2) doesn't support the more recent package metadata protocol improvements, only the "simple" protocol (or something like that; I don't remember the internal details exactly) |
I agree. Was not going for anything else. I do appreciate you doing the actual test on a real project where the impact can be seen.
Additionally, this situation should be much improved once #5442 is merged, but perhaps not in your case if you are proxying public packages - you might end up with a huge index page. Fwiw, I am not advocating any particular solution here. However, I am trying to clarify the status quo. I am not saying that the way it is is fine and we shouldn't change anything.
Personally, I agree that the default behaviour should be similar to adding an extra index in
While it might work in the case you have identified, I do not think this is universally applicable. I can recall environments where they did prefer it to be the other way around. The question would be how to handle that, and what the right defaults should be. Further, doing this will also potentially leave unwanted packages in the project metadata. As an example, if A depends on B and B depends on C, and we add C to A as you suggest, then when B later drops its dependency on C you are left with an unused dependency in your project and/or your lockfile. Sure, you can work around this by adding it to a group instead of the main one. But similar issues apply for the version constraints used as well. What is the right thing to do when B changes C's requirements? etc. All that said, I'd suggest that we move off this issue for this discussion. Might be more constructive to discuss the change of "default" behaviour of adding a package source. Alternatively, discuss addition of an option disabling package searches unless explicitly used. |
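The A, B, C example can be checked mechanically with a small graph walk. Everything here is hypothetical: `reachable` and `orphaned_pins` are illustrative helpers, and the dependency graphs are hand-written dicts, not anything read from a lockfile.

```python
def reachable(graph, roots):
    """Return every package reachable from `roots` via dependency edges."""
    seen = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return seen


def orphaned_pins(direct, imported_by_project, graph):
    """Direct dependencies the project neither imports nor needs transitively."""
    orphans = set()
    for dep in direct:
        if dep in imported_by_project:
            continue
        others = [d for d in direct if d != dep]
        if dep not in reachable(graph, others):
            orphans.add(dep)
    return orphans


# Project A declares B (used in code) and C (added only to pin its source).
direct = ["B", "C"]
imported = {"B"}

before = {"B": ["C"], "C": []}   # B still depends on C
after = {"B": [], "C": []}       # B has dropped its dependency on C

print(orphaned_pins(direct, imported, before))  # set()
print(orphaned_pins(direct, imported, after))   # {'C'}
```

Once B stops requiring C, the explicit entry for C is the only thing keeping it in the project, which is the leftover dependency described above.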
You're right, only secondary=true is needed. I think that was maybe an old bug, or just a manipulation error when I played around with this months ago.
The same problem occurs when you need to specifically pin a version of a transitive dependency because reasons, no need to involve private registries to fall into this trap. I can't vouch for everyone's best practices, but if you need to add such an edge case to your In fact, the same problem occurs if someone adds dependency A for new python code, then someone alters the code later and removes the usage. Unless you actively search the code base for more usages of some random import you just removed, you're going to be left with one unused library. My opinion is that it's a non-issue / user-error, this isn't something poetry should be concerned about.
👍🏻 |
Resolves some discussion in #3855
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Issue
When using a private pypi repo, the public repo is no longer being checked. I can get everything working again by adding
But that wasn't needed in the past.