Skip to content

Validate lockfile (rather than re-resolve) in uv lock#6091

Merged
charliermarsh merged 4 commits intomainfrom
charlie/val
Aug 15, 2024
Merged

Validate lockfile (rather than re-resolve) in uv lock#6091
charliermarsh merged 4 commits intomainfrom
charlie/val

Conversation

@charliermarsh
Copy link
Member

@charliermarsh charliermarsh commented Aug 14, 2024

Summary

Historically, in order to "resolve from a lockfile", we've taken the lockfile, used it to pre-populate the in-memory metadata index, then run a resolution. If the resolution didn't match our existing resolution, we re-resolved from scratch.

This was an appealing approach because (in theory) it didn't require any dedicated logic beyond pre-populating the index. However, it's proven to be really hard to get right, because it's a stricter requirement than we need. We just need the current lockfile to satisfy the requirements provided by the user. We don't actually need a second resolution to produce the exact same result. And it's not uncommon that this second resolution differs, because we seed it with preferences, which fundamentally changes its course. We've worked hard to minimize those "instabilities", but they're still present.

The approach here is intended to be much simpler. Instead of resolving from the lockfile, we just check if the current resolution satisfies the state of the workspace. Specifically, we check if the lockfile (1) contains all the relevant members, and (2) matches the metadata for all dependencies, recursively. (We skip registry dependencies, assuming that they're immutable.)

This may actually be too conservative, since we can have resolutions that satisfy the requirements, even if the requirements have changed slightly. But we want to bias towards correctness for now.

My hope is that this scheme will be more performant, simpler, and more robust.

Closes #6063.

@charliermarsh charliermarsh force-pushed the charlie/val branch 19 times, most recently from 09d926a to 4cdd532 Compare August 14, 2024 21:16
Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we should move forward in this direction. I still think there's probably an opportunity to limit what we need to write to the lock file, but I'd rather be in a position where our lock file is more bloated than it needs to be but we're more confident in its stability.

//
// // Run a final clean resolution without a lockfile to ensure identical results.
// let _ = fs_err::remove_file(&$context.temp_dir.join("uv.lock"));
// $($x)*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a STOPSHIP?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, going back to doing deterministic checking.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replaced all the usages of this in lock with a second call to .lock with --locked. The deterministic! macro just makes it really hard to update the snapshots since you have to manually edit the tests, and --locked will fail more aggressively since it doesn't just require that the string contents changed.

(I've uncommented this though, and left it in the ecosystem checks.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does removing the with_duplicates fix the snapshot updating issue?

);
return Ok(false);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these are all fine, but I think will result in reporting "not satisfied" in cases where the lock file does actually satisfy the requirements. But I think this is something we can improve in follow-up PRs. Specifically, even if the dependency specification in the pyproject.toml changes, the version that's in the lock file can still satisfy it such that no actual updates are needed. But I think this implementation will say an update is needed if any change occurs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this may actually be too conservative... I mostly care about this capability to ensure that uv run is really fast / cheap, so I'm ok with having to do a resolution if the user edits the dependencies.

@charliermarsh charliermarsh force-pushed the charlie/val branch 2 times, most recently from 4b91b76 to a00d621 Compare August 14, 2024 21:25
@charliermarsh charliermarsh added internal A refactor or improvement that is not user-facing lock Related to universal resolution and locking labels Aug 14, 2024
@charliermarsh charliermarsh marked this pull request as ready for review August 14, 2024 21:30
@charliermarsh
Copy link
Member Author

I think the code changes here might be roughly a wash? But I added a lot of tests.

@charliermarsh
Copy link
Member Author

#6063 no longer occurs.

Copy link
Member

@ibraheemdev ibraheemdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little unfortunate that we can't rely on the lockfile for metadata since that had a lot of potential in terms of performant incremental resolves (though I'm not sure this PR necessarily removes that as a future option). However, I agree that this approach is more obviously correct and probably our best option.

@charliermarsh
Copy link
Member Author

It's possible that we could still support that... But the framing would have to be a little different. Like, our goal would be to use metadata from the lockfile in the normal resolve, not the noop resolve.

Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunate, but I think this is the right way to go.

Also, you did this crazy fast.

}),
// TODO(charlie): The use of `CWD` here is incorrect. These should be resolved relative
// to the workspace root, but we don't have access to it here. When comparing these
// sources in the lockfile, we replace the URL anyway.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subtly, this is the worst part of the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

internal A refactor or improvement that is not user-facing lock Related to universal resolution and locking

Projects

None yet

Development

Successfully merging this pull request may close these issues.

uv lock sometimes changes the lock file when run for a second time

3 participants