Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

filter dependencies by tracking markers on resolver forks #4339

Merged
merged 7 commits into from
Jun 17, 2024

Conversation

BurntSushi
Copy link
Member

This PR adds the tracking of marker expressions across forked resolver
states. These marker expressions are then used to exlcude dependencies
with non-overlapping marker expressions from consideration in
corresponding forks only.

This PR also fixes a bug I found in marker disjoint checking as part of
testing this change.

Lots of packse tests were added here:
astral-sh/packse#194

Closes #4194

@BurntSushi BurntSushi requested a review from konstin June 15, 2024 14:10
@zanieb zanieb self-requested a review June 15, 2024 14:25
Copy link
Member

@zanieb zanieb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks good to me; I didn't review the lock snapshots.

/// always `true` for every `MarkerEnvironment`.
///
/// In non-universal mode, forking never occurs and so this marker
/// expression is always `true`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That needs some additional context about the intermediary step, where any marker A on a requirement is never disjoint with an empty marker tree (the default here), so we consider the marker tree here truthy

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm. Doesn't that immediately follow from the fact that the marker tree is always true for any possible marker environment?

Also, I'm not using "empty marker tree" because there are two conceivable variants of an empty marker tree: MarkerTree::And(vec![]) and MarkerTree::Or(vec![]). The former is always true (which is what matters here) while the latter is always false.

I'll try to add something to the docs here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I added a paragraph. Hopefully it makes its use a little clearer:

    /// The marker expression that created this state.
    ///
    /// The root state always corresponds to a marker expression that is always
    /// `true` for every `MarkerEnvironment`.
    ///
    /// In non-universal mode, forking never occurs and so this marker
    /// expression is always `true`.
    ///
    /// Whenever dependencies are fetched, all requirement specifications
    /// are checked for disjointness with the marker expression of the fork
    /// in which those dependencies were fetched. If a requirement has a
    /// completely disjoint marker expression (i.e., it can never be true given
    /// that the marker expression that provoked the fork is true), then that
    /// dependency is completely ignored.

///
/// Note that in universal mode, it is possible and allowed for multiple
/// `PubGrubPackage` values in this list to have the same package name.
/// These conflicts are resolved via `Dependencies::fork`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can multiple dependencies of the same name also exist without forking, e.g. when foo has:

Requires-Dist: bar >= 1
Requires-Dist: bar < 2

These cases exist e.g. with extras, but sometimes also with non-conflicting environment markers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah great question. I was careful to make Dependencies::fork infallible so that it didn't try to be too eager about reporting errors. The idea is that if there aren't any requirements with the same package with non-overlapping markers, then the result returned by Dependencies::fork is unchanged from what it was. It's just a flat list of requirements. That is, no forks. So something like the scenario you outlined will work just as it always has.

One hiccup here though is if you have this:

Requires-Dist: bar >= 1 ; sys_platform == 'linux'
Requires-Dist: bar < 2 ; sys_platform == 'darwin'

Even though the version constraints are non-conflicting, because the marker expressions are non-overlapping, this will actually provoke a fork. The fork means that the bar >= 1 requirement is free to choose, say, bar==2.0.0 if it's available because the fork exists in isolation from the bar < 2 requirement. However, if no fork were to happen, then both the bar >= 1 and bar < 2 requirements would be active simultaneously such that bar==2.0.0 couldn't be chosen. So in this case, there exists a resolution without forking, but it is distinct from the resolution we get if we fork.

I think we could detect this case by looking at whether the version constraints themselves are conflicting. That is, if there exists an overlap in the version constraints, then we don't fork even if the marker expressions are non-overlapping. I think a decision like that would be predicated on the idea that it could be possible to find one resolution that works across multiple platforms. But it might not be possible. It feels like forking in this case, even if it isn't always necessary, results in more freedom for the resolver to find a resolution. But... maybe it's surprising to users in real cases. I'm not sure.

I added tests for some of these cases here: astral-sh/packse#196

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created an issue for the non-conflicting requirements with non-overlapping markers scenario: #4357

crates/uv-resolver/src/resolver/mod.rs Outdated Show resolved Hide resolved
There are some key invariants that I had to re-learn by reading the
code. This hopefully makes those invariants easier to discover by future
me (and others).
This commit adds marker expressions to our `Fork` type, which are in
turn passed down into `PubGrubDependencies::from_requirements` to filter
our any dependencies with markers that are disjoint from the fork's
marker expression.

This is necessary to avoid visiting packages in the dependency graph
that can never actually be installed. This is because when a fork is
created in the resolver, it always happens when there are two sibling
dependency specifications on a package with the same name, but with
non-overlapping marker expressions. Each fork corresponds to each
such conflicting dependency specification, and each fork assumes the
the corresponding marker expression as a pre-condition for any future
dependencies considered by it. That is, since the fork represents an
installation path that can only be taken when the corresponding
dependency specification (and its marker expression) is actually used,
it also therefore follows that the marker expression is true. Therefore,
any dependency visited in that fork with a marker expression that cannot
possibly be true when the markers of the fork are true can and ought to
be completely ignored.
I found this while testing the tracking of marker expressions across
resolver forks. Namely, given

    sys_platform == 'darwin' and implementation_name == 'pypy'

And:

    sys_platform == 'bar' or implementation_name == 'foo'

These should be disjoint, but the disjointness checker was reporting
them as overlapping. I fixed this by giving handling of disjunctions
higher precedence than conjunctions, although I am not 100% confident
that this is correct for all cases.
BurntSushi added a commit to astral-sh/packse that referenced this pull request Jun 17, 2024
BurntSushi added a commit to astral-sh/packse that referenced this pull request Jun 17, 2024
@BurntSushi BurntSushi merged commit e264637 into main Jun 17, 2024
47 checks passed
@BurntSushi BurntSushi deleted the ag/universal-forked-markers branch June 17, 2024 13:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

universal-lock: ignore dependencies that are disjoint with a forked resolver state's marker expressions
3 participants