Install from lock file always updates when using private sources, regression from 1.1.13 #5360
Issue
When using a custom source, installs from the lock file always perform an "update". For projects with a very large number of dependencies, this can result in an install taking multiple minutes, vs. the seconds that are expected when there haven't been any changes to the packages in the lock file.
Example custom source:
What did I expect to see
After installing once, subsequent installs should "skip" since the lock file hasn't changed, and the packages on disk have not changed.
Example success output:
What actually happened
When using a custom source, the packages are re-downloaded every time
What is causing this behavior in 1.2.0
I've narrowed this down to a difference in behavior when building the package information for the packages installed on disk, which is triggering an additional conditional in the logic for determining if two packages are the same.
In the is_same_package_as method, if one of the packages has a source_type configured, it runs some additional checks to ensure they are the same.
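To illustrate the shape of that check, here is a minimal sketch. It is not Poetry's actual code: the Pkg class, the is_same_package function, and the path and URL values are all made up for illustration, and it assumes the extra comparison is triggered by the source_type of the package the method is called on (the lock file package in this case).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pkg:
    """Hypothetical stand-in for a package record (not Poetry's Package class)."""

    name: str
    version: str
    source_type: Optional[str] = None
    source_url: Optional[str] = None


def is_same_package(locked: Pkg, installed: Pkg) -> bool:
    """Simplified model: compare names, then sources when the locked side has one."""
    if locked.name != installed.name:
        return False
    # Assumption: the extra source checks only run when the package the
    # method is called on (the lock file package here) has a source_type.
    if locked.source_type:
        if locked.source_type != installed.source_type:
            return False
        if locked.source_url != installed.source_url:
            return False
    return True


# Made-up values for illustration.
installed = Pkg("requests", "2.27.1", "file", "/path/to/requests.whl")
locked_private = Pkg("requests", "2.27.1", "legacy", "https://example.invalid/simple")
locked_pypi = Pkg("requests", "2.27.1")

print(is_same_package(locked_private, installed))  # False: "legacy" != "file"
print(is_same_package(locked_pypi, installed))  # True: no source_type, no extra checks
```

In this model, only the lock file package from the private source is reported as a different package, which would match the behavior described in this issue.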
This method is invoked as part of the solver/Transaction, where it compares an installed package to a package from the lock file. Notice how, if installed_package.source_type is None, this condition would have failed, and the package would be correctly skipped (because the lock file package does have the type of legacy).
I manually added some logs to see what these inputs were, and how they differed when using a private source vs. using pypi.
First, here is how the installed packages look, regardless of the source used to install (I trimmed for brevity).
Notice the source_type=file, source_url=... values.
Here is the list of packages from the lock file, first from an install that used pypi:
And now that same list of packages from the lock file, but this time installed using my custom source:
Again, notice the source_type='legacy', source_url=... values.
So, when using pypi, the lock file packages don't have a source_type set, and therefore they don't trigger the if condition that causes the package to not be skipped.
What does the behavior look like in 1.1.13, and what changed?
I added some similar logs into the source in 1.1.13, and there are some obvious differences with the data of the packages.
I added two log lines in the solver
Notice how the lock file package still has source_type='legacy', source_url..., BUT the installed package has _source_type:None, _source_url:None.
You can see later down in the _solve source that there is a similar condition for checking the source_type, where if it is empty it will bail early.
The check in 1.1.13
The check in 1.2.0b1
The conditions themselves are very similar, and should mostly behave the same, assuming they both exit as soon as they encounter the installed package with source_type None.
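As a rough model of that comparison, here is a sketch of the two checks as I understand them from the description in this issue. It is not copied from Poetry's source: the function names, the Pkg tuple, and the crude same_package stand-in are hypothetical, and the exact form of the 1.1.13 condition is an assumption.

```python
from typing import NamedTuple, Optional


class Pkg(NamedTuple):
    """Hypothetical minimal package record: version and source_type only."""

    version: str
    source_type: Optional[str]


def update_1_1_13(installed: Pkg, locked: Pkg) -> bool:
    """Assumed shape of the 1.1.13 check: bail early if installed has no source_type."""
    if locked.version != installed.version:
        return True
    return bool(installed.source_type) and locked.source_type != installed.source_type


def update_1_2_0b1(installed: Pkg, locked: Pkg, same_package: bool) -> bool:
    """Assumed shape of the 1.2.0b1 check, as described in this issue."""
    if locked.version != installed.version:
        return True
    source_relevant = bool(installed.source_type) or locked.source_type != "legacy"
    return source_relevant and not same_package


locked = Pkg("2.27.1", "legacy")
for installed_type in (None, "file"):
    installed = Pkg("2.27.1", installed_type)
    # Crude stand-in for is_same_package_as: only compares source types.
    same = installed_type == locked.source_type
    print(
        installed_type,
        update_1_1_13(installed, locked),
        update_1_2_0b1(installed, locked, same),
    )
# prints:
# None False False
# file True True
```

Under these assumptions both versions skip the package when the installed source_type is None, and both request an update once it is set to file, so the difference would come from the installed package data rather than from the check itself.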
In summary
I believe the most important change is the non-None source_type field on installed packages.
When it is None in the 1.1.13 branch, it causes a miss on this condition (pkg is the installed package, package is the lock file package).
In the 1.2.0b1 branch, if the installed package source had been None, we would have failed the non-empty check for installed_package.source_type, and the subsequent check for result_package.source_type != "legacy" would have also failed, as the custom sources DO have their source_type set to legacy.
You can see that in 1.1.13, the InstalledRepository class does not set source_type=file.
I believe this is the commit that made that change in the 1.2.X branch
If you follow the invocation paths in 1.2.0b1, we can trace back to where the installed packages are constructed from (sorry for the long list, I hope it's helpful):
self._installed
self._installed_repository
self._installed_repository is populated as part of the Installer init method
InstalledRepository.load(env)
create_package_from_distribution
create_package_from_pep610
source_type is populated
Reproducing this
If you need help creating a fully reproducible test case, let me know. I used bandersnatch to create a minimal pypi mirror with requests and a few other libraries, and then used docker compose with an nginx container to serve the repo, along with an interactive python container for invoking poetry against the project.