Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instability with preferences and forks: Lockfile changes on the second run #5180

Closed
konstin opened this issue Jul 18, 2024 · 2 comments · Fixed by #5485
Closed

Instability with preferences and forks: Lockfile changes on the second run #5180

konstin opened this issue Jul 18, 2024 · 2 comments · Fixed by #5485
Assignees
Labels
bug Something isn't working preview Experimental behavior

Comments

@konstin
Copy link
Member

konstin commented Jul 18, 2024

The lockfile can change on the second uv lock, collapsing previously forking dependencies into a single one. I believe this due to how preferences are handled.

Minimal example:

[project]
name = "transformers"
version = "4.39.0.dev0"
requires-python = ">=3.9.0"
dependencies = [
  "datasets",
  "librosa",
  "onnxruntime",
]
$ rm -f uv.lock && cargo run -q lock --preview && wc -l uv.lock && cargo run -q lock --preview && wc -l uv.lock && cargo run -q lock --preview && wc -l uv.lock
Resolved 60 packages in 169ms
1462 uv.lock
Resolved 58 packages in 33ms
1375 uv.lock
Resolved 58 packages in 28ms
1375 uv.lock

We have 3 different forkings until it converges:

$ rm -f uv.lock && cargo run -q lock --preview -v 2>&1 | rg -i split
DEBUG Adding split to cover possibly incomplete markers: python_version >= '3.11' and python_version != '3.11' and python_version < '3.12'
DEBUG Splitting resolution on pandas==2.2.2 over numpy
DEBUG Pre-fork split universal took 0.007s
DEBUG Solving split python_version >= '3.12' (requires-python: python_version >= '3.9' and python_full_version >= '3.9.0')
DEBUG Split (python_version >= '3.12') resolution took 0.003s
DEBUG Solving split python_version == '3.11' (requires-python: python_version >= '3.9' and python_full_version >= '3.9.0')
DEBUG Split (python_version == '3.11') resolution took 0.001s
DEBUG Solving split python_version < '3.11' (requires-python: python_version >= '3.9' and python_full_version >= '3.9.0')
DEBUG Split (python_version < '3.11') resolution took 0.001s
DEBUG Solving split python_version < '3.12' and python_version >= '3.11' and (python_version < '3.11' or python_version > '3.11') (requires-python: python_version >= '3.9' and python_full_version >= '3.9.0')
DEBUG Split (python_version < '3.12' and python_version >= '3.11' and (python_version < '3.11' or python_version > '3.11')) resolution took 0.002s
$ cargo run -q lock --preview -v 2>&1 | rg -i split
DEBUG Adding split to cover possibly incomplete markers: python_version > '3.11' and python_version < '3.12'
DEBUG Splitting resolution on pandas==2.2.2 over numpy
DEBUG Pre-fork split universal took 0.001s
DEBUG Solving split python_version >= '3.12' (requires-python: python_version >= '3.9' and python_full_version >= '3.9.0')
DEBUG Split (python_version >= '3.12') resolution took 0.001s
DEBUG Solving split python_version <= '3.11' (requires-python: python_version >= '3.9' and python_full_version >= '3.9.0')
DEBUG Split (python_version <= '3.11') resolution took 0.001s
DEBUG Solving split python_version < '3.12' and python_version > '3.11' (requires-python: python_version >= '3.9' and python_full_version >= '3.9.0')
DEBUG Split (python_version < '3.12' and python_version > '3.11') resolution took 0.001s
$ cargo run -q lock --preview -v 2>&1 | rg -i split
DEBUG Split universal resolution took 0.002s
$ cargo run -q lock --preview -v 2>&1 | rg -i split
DEBUG Split universal resolution took 0.002s

I think we need #4926 (to not create duplicate dependencies in the first run) and we need to marker- or fork-tag preferences if that heuristic fails.

@konstin konstin added bug Something isn't working preview Experimental behavior labels Jul 18, 2024
@konstin konstin self-assigned this Jul 24, 2024
@konstin
Copy link
Member Author

konstin commented Jul 24, 2024

A detailed write-up on the problem can be found in astral-sh/packse#203.

My plan to solve this is to:

  • Merge forks with identical resolutions so we have as few forks as possible in the end
  • Solve Prioritize forks  #4926 as a follow-up so we have diverging forks less often
  • When we still have diverging versions, preserve the fork markers from the resolution and for packages with diverging versions in the lockfile
  • When resolving from an existing lockfile, start already forked into the forks from the lockfile
  • Change the preferences so that we only use those applicable to the current fork. When we use the preserved forks, this will be a 1-to-1 mapping, what we do otherwise TBD
  • Add Add forking scenarios that are unstable with preferences packse#203 to the test suite

@konstin
Copy link
Member Author

konstin commented Jul 26, 2024

Background reading: astral-sh/packse#203

The root problem is that preferences can influence which fork points we get and thereby which forks we have in the eventual merge resolution.

In the worst case we could have a bistable system that flip-flops for every uv lock invocation: We start with preference I for lockfile I. These contain foo==1, and starting with foo==1 causes us to fork on python_version. The two python_version forks both end up rejecting foo==1 and we write lockfile II with foo==2. In the next uv lock invocation, we read preferences II from lockfile II with foo==2. Starting with foo==2 causes us to fork on sys_platform instead. With those forks, we end up rejecting foo==2 and write lockfile I with foo==1 instead; we're back in the other state, switching on every uv lock.

The solution is to lock store the forks in the lockfile and to start the resolution with the locked forks. For simplicity and to make it easier to understand for users why there are multiple versions per package in their lockfile, for each package with multiple versions, we write write the forks a version belongs to next to the fork. When we resolve, in each fork, we will find exactly one matching preference, the one that is the result in the fork in the previous resolution. This way, we get a stable resolution.

We need to both start the resolution with the locked forks, because we may not see the same fork points again with preferences that skip over them (diverging or not), and have per-fork preferences to ensure that in each fork, we're doing the same resolution as before

konstin added a commit that referenced this issue Jul 26, 2024
With our previous eager union, we were losing the fork markers. We now
carry this information into the resolution graph construction and, in
the next step, can read the markers there.

Part of
#5180 (comment)
@konstin konstin changed the title Lockfile changes on the second run Instability with preferences and forks: Lockfile changes on the second run Jul 26, 2024
konstin added a commit that referenced this issue Jul 26, 2024
Add tests for the instabilities fix

Part of #5180 (comment)
konstin added a commit that referenced this issue Jul 29, 2024
Add tests for the instabilities fix

Part of #5180 (comment)
konstin added a commit that referenced this issue Jul 29, 2024
Add tests for the instabilities fix

Part of #5180 (comment)
konstin added a commit that referenced this issue Jul 29, 2024
Add tests for the instabilities fix

Part of #5180 (comment)
konstin added a commit that referenced this issue Jul 29, 2024
Add tests for the instabilities fix

Part of #5180 (comment)
konstin added a commit that referenced this issue Jul 29, 2024
Add tests for the instabilities fix

Part of #5180 (comment)
konstin added a commit that referenced this issue Jul 30, 2024
Add the forks to the lockfile, without using them yet, which we'll add
in the next PR.

Please review commit-by-commit

Part of
#5180 (comment)
konstin added a commit that referenced this issue Jul 30, 2024
Add tests for the instabilities fix

Part of #5180 (comment)
konstin added a commit that referenced this issue Jul 31, 2024
Add tests for the instabilities fix

Part of #5180 (comment)
konstin added a commit that referenced this issue Jul 31, 2024
Add tests for the instabilities fix

Part of #5180 (comment)
konstin added a commit that referenced this issue Jul 31, 2024
)

By resolving for each fork from the lockfile individually and by adding
using preferences for the current fork, we solve the instability #5180.
I've tested the locally and will add the packse test scenarios upstack.

Part of
#5180 (comment)
konstin added a commit that referenced this issue Jul 31, 2024
Add tests for the instabilities fix

Part of #5180 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working preview Experimental behavior
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

1 participant