
improve performance of version solver for dependencies with many versions #8191

Closed
2 tasks done
ahvigil opened this issue Jul 12, 2023 · 4 comments · Fixed by #8256
Labels: kind/feature (Feature requests/implementations), status/triage (This issue needs to be triaged)

Comments


ahvigil commented Jul 12, 2023

  • I have searched the issues of this repo and believe that this is not a duplicate.
  • I have searched the FAQ and general documentation and believe that my question is not already covered.

Feature Request

The current dependency solver can perform suboptimally for packages with many frequent releases, which create long chains of versions to explore. botocore in particular seems to be problematic: it can produce dead ends in the graph of possible package combinations that poetry takes a long time to back out of.

See:

The workarounds for these are generally of the "pin package X" variety, which works well but doesn't make for the best user experience, and it adds requirements to a pyproject.toml that a developer doesn't necessarily care about. The documentation section on "why-is-the-dependency-resolution-process-slow" also suggests adding pins as a way to speed up resolution.
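For illustration, such a pin might look like the following in pyproject.toml (the botocore range here is a hypothetical example, not a recommendation):

[tool.poetry.dependencies]
# hypothetical pin: narrowing the range shrinks the set of candidate
# versions the solver has to explore
botocore = ">=1.29.0,<1.30.0"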

Investigate improving the performance of the version solver in these cases, preferably without too much impact on non-pathological cases.

@ahvigil added the kind/feature (Feature requests/implementations) and status/triage (This issue needs to be triaged) labels Jul 12, 2023
@dimbleby
Contributor

the code you've linked has nothing much to do with the main part of poetry's search, which is neither depth-first nor breadth-first.

what you want to think about is changing the heuristic by which poetry chooses the next package to resolve -

# Prefer packages with as few remaining versions as possible,
# so that if a conflict is necessary it's forced quickly.
# In order to provide results that are as deterministic as possible
# and consistent between `poetry lock` and `poetry update`, the return value
# of two different dependencies should not be equal if possible.
def _get_min(dependency: Dependency) -> tuple[bool, int, int]:
    # Direct origin dependencies must be handled first: we don't want to resolve
    # a regular dependency for some package only to find later that we had a
    # direct-origin dependency.
    if dependency.is_direct_origin():
        return False, Preference.DIRECT_ORIGIN, 1

    is_specific_marker = not dependency.marker.is_any()

    use_latest = dependency.name in self._provider.use_latest
    if not use_latest:
        locked = self._provider.get_locked(dependency)
        if locked:
            return is_specific_marker, Preference.LOCKED, 1

    num_packages = len(
        self._dependency_cache.search_for(
            dependency, self._solution.decision_level
        )
    )

    if num_packages < 2:
        preference = Preference.NO_CHOICE
    elif use_latest:
        preference = Preference.USE_LATEST
    else:
        preference = Preference.DEFAULT

    return is_specific_marker, preference, num_packages

e.g. most of the cases that currently go wrong are made much faster by reversing the preference for packages with fewer available versions. However: there's every chance that making that change will introduce new pathological cases - there is no silver bullet.
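As a minimal sketch of that reversal (assuming the `_get_min` key above; negating the count is just one way to invert the ordering, not necessarily the change that eventually landed), the final return could become:

# sketch only: _get_min is a min-based key, so negating the count makes
# the solver prefer dependencies with MORE remaining versions; a package
# with hundreds of releases is then decided early instead of being
# backtracked through late in the search
return is_specific_marker, preference, -num_packages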

Feel free to experiment.


ahvigil commented Jul 13, 2023

that's a great pointer, thanks! I must admit I'm not too familiar with the poetry codebase except as a user; it seems I was looking in the wrong place.

some of the test cases in tests/mixology/version_solver/test_backtracking.py look like interesting threads to pull on. I'll fiddle with those to see if I can define the issue more exactly and suggest an improvement.


ahvigil commented Jul 13, 2023

using the backtracking unit test cases as a model, I came up with a case that reproduces the performance issue I've been circling around:

def test_i_can_reproduce_the_issue(
    root: ProjectPackage, provider: Provider, repo: Repository
) -> None:
    root.add_dependency(Factory.create_dependency("a", "*"))
    root.add_dependency(Factory.create_dependency("b", "*"))

    # "a" accepts either version of "z", but every one of the 500
    # versions of "b" requires the older one
    add_to_repo(repo, "a", "1.0.0", deps={"z": "<=2.0.0"})
    for i in range(500):
        add_to_repo(repo, "b", "%i.0.0" % i, deps={"z": "<=1.0.0"})
    add_to_repo(repo, "z", "1.0.0")
    add_to_repo(repo, "z", "2.0.0")

    # i is 499 after the loop, so the expected "b" is its latest version
    check_solver_result(root, provider, {"a": "1.0.0", "b": "%i.0.0" % i, "z": "1.0.0"})

this test passes, but the solver takes quite a long time:

======================== 1 passed in 199.29s (0:03:19) =========================

as suggested, flipping the preference for packages with fewer versions speeds up solving tremendously:

============================== 1 passed in 1.70s ===============================

I'll fiddle to see if I can figure out a way to get around this without breaking the solver for more common cases.

@ahvigil changed the title from "add a breadth first search solver" to "improve performance of version solver for dependencies with many versions" Jul 13, 2023

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions bot locked as resolved and limited conversation to collaborators Feb 29, 2024