Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: lake update should update the lean-toolchain #2752

Closed
eric-wieser opened this issue Sep 20, 2023 · 24 comments · Fixed by #5684
Closed

RFC: lake update should update the lean-toolchain #2752

eric-wieser opened this issue Sep 20, 2023 · 24 comments · Fixed by #5684
Labels
enhancement New feature or request Lake Lake related issue

Comments

@eric-wieser
Copy link
Contributor

eric-wieser commented Sep 20, 2023

Current behavior

lake update updates all inherited: false dependencies, even if doing so would move to a newer toolchain.
It does not change the value in lean-toolchain, not does it emit any diagnostic to indicate that this might need doing manually.

This frequently bites users of projects which depend on mathlib, as lake update gives them a mathlib version that does not compile with (or at least, has no cache for) their active toolchain

Expected behavior

lake update should attempt to match the lean-toolchain of the new dependency versions. The design could be any of:

  1. Look at all the versions of inherited: false dependencies. If there is just one entry in this set, use it. If there is more than one, prompt the user.
  2. Look at all the versions of inherited: false dependencies. Attempt to pick the most recent of the options.
  3. Add a tracks_toolchain key to the manifest, to indicate precisely which dependency the toolchain should be tracked from. For many users, this would be set to mathlib4.
@Shreyas4991
Copy link

About the second option, is it always true that a later toolchain can compile code written for an earlier toolchain?

@eric-wieser
Copy link
Contributor Author

No, but nor is the reverse true, and if you run lake update you probably are more willing to risk moving forwards than backwards. Certainly a diagnostic is in order for any case except "If there is just one entry in this set".

@Shreyas4991
Copy link

Shreyas4991 commented Sep 20, 2023

Per the discussion on the thread, why can't this be implemented on the vscode extension as a behaviour specific to projects where mathlib is a non-inherited dependency and there are no others? Eventually lake would have some versioning for dependencies that makes this problem go away right?

@tydeu tydeu added the enhancement New feature or request label Sep 20, 2023
@tydeu
Copy link
Member

tydeu commented Sep 21, 2023

@Shreyas4991 Dependency versioning is very difficult in Lean. I think it will be a long while before we have something like semantic versioning. This RFC strikes me as potentially a pretty good general solution for the short term.

@Shreyas4991
Copy link

Shreyas4991 commented Sep 21, 2023

I agree with that. I am aware that dependency management is going to take a while. Which is why I was just wondering if this could be implemented as a vscode command called "update mathlib" as discussed in the thread. (Ref: the first message in the thread linked in my previous comment here). This is fundamentally a mathlib specific workflow request that can be accommodated in a vscode command. The gist of that thread is:

A GUI panel for lake commands is already planned. It is feasible to add mathlib specific options. The people most likely to be unable to solve the issue with lake update are also likely to be uncomfortable with CLIs. So GUI commands that perform safe default actions might be a shorter fix. It would also address in a limited way, the concerns of others that lake is doing many things, including very mathlib specific ones.

@eric-wieser
Copy link
Contributor Author

This is fundamentally a mathlib specific workflow request that can be accommodated in a vscode command

I don't agree with this at all. The problem as described will affect anyone who depends on a package that changes lean version.

@Shreyas4991
Copy link

Shreyas4991 commented Sep 21, 2023

This is fundamentally a mathlib specific workflow request that can be accommodated in a vscode command

I don't agree with this at all. The problem as described will affect anyone who depends on a package that changes lean version.

Then this could be part of the update command on vscode.

My issue with the current RFC is that for any project with more than one non-inherited dependency (which we must not discount, just because it might be a fraction of current projects), it it not clear that it will work. As Mac points out, dependency management will take time. So, for however long this takes, we have no guarantees that such a project can be built with lake. Further, the 99% who do benefit from this are also likely to be mostly GUI users.

If this RFC is implemented, options 1 and 2 seem to be the best, but option 1 can confuse CLI-averse users with its options. Implementing 2 or 3 as a GUI command on vscode hides the scary CLI stuff for those who don't like CLIs. This is also a temporary fix, so eventually the GUI button can be withdrawn/modified.

EDIT: On second thoughts, if we do take the GUI route for option 2 or 3, then implementing option 1 in the lake CLI will be great for CLI users.

@tydeu
Copy link
Member

tydeu commented Sep 21, 2023

@Shreyas4991

My issue with the current RFC is that for any project with more than one non-inherited dependency (which we must not discount, just because it might be a fraction of current projects), it it not clear that it will work.

My intended resolution to this RFC would be to check the lean-toolchain of each of the root dependencies and then the following:

  • If there is only one distinct toolchain, update the workspace's. If it is clearly newer than the workspace's, print an info that this occurred; otherwise, print a warning.
  • If there are multiple toolchains, find the maximum, up the workspace's toolchain with and print a warning.
  • If Lake cannot produce a total ordering of the toolchains, do nothing and print a warning.
  • Use a new tracks_toolchain require setting to filter which dependencies are checked. (Optional add-on)

However, it is worth nothing that none of these solutions will fix the case where lake update fails because Lake is too old. I am not sure what you and @eric-wieser think about that case.

@Shreyas4991
Copy link

Shreyas4991 commented Sep 21, 2023

Overall, sounds good.
Two details:

If lake cannot produce a total ordering...

Is a total ordering necessary here? Isn't what we need a partial ordering with a least upper bound?

none of these solutions will fix the case where lake update fails because Lake is too old.

Could lake be decoupled from the toolchain and managed by elan? cargo version appears to be managed by rustup for example. Why is it currently bundled with the toolchain?

@tydeu
Copy link
Member

tydeu commented Sep 21, 2023

@Shreyas4991

Could lake be decoupled from the toolchain and managed by elan? cargo version appears to be managed by rustup for example. Why is it currently bundled with the toolchain?

It is managed by elan, the toolchain version is how it is managed.

@Shreyas4991
Copy link

What I mean is whether we could decouple the two: Let elan manage lake separately from the rest of the toolchain.

@eric-wieser
Copy link
Contributor Author

and print a warning.

I'm happy with this compromise as long as the warning includes all the lean-toolchain versions (and the packages that they come from) as part of the warning. It would be especially great if it could show the versions before and after the update).

@tydeu
Copy link
Member

tydeu commented Sep 21, 2023

@Shreyas4991 We want Lake to be part of the toolchain, though. Otherwise, breaking changes would leave old Lake projects unbuildable.

@eric-wieser
Copy link
Contributor Author

I am not sure what you and @eric-wieser think about that case.

I have no opinions on this case, as I have not seen it happen.

@tydeu
Copy link
Member

tydeu commented Sep 21, 2023

@eric-wieser

the warning includes all the lean-toolchain versions (and the packages that they come from) as part of the warning. It would be especially great if it could show the versions before and after the update).

Why does it need this much information? If Lake has no clue, I doubt the user will either. In such a case, they will likely have do some manually checking across the packages for what changes were made and which one needs what. Or, if they just forgot to update their toolchain to specific version already in mind, a simple warning should be enough to make them remember.

@tydeu
Copy link
Member

tydeu commented Sep 21, 2023

More specifically, I was thinking of a warning of the following form (if the dependencies are incomparable):

warning: could not construct an ordering of the dependencies' toolchains; they may be incompatible; you should check their documentation to figure out what toolchain is necessary to make them work together (if any exists)

@eric-wieser
Copy link
Contributor Author

eric-wieser commented Sep 21, 2023

If Lake has no clue, I doubt the user will either.

Lake has easy access to information that the user doesn't; the version of the toolchains that the dependencies were using before they ran lake update. I would be happier with something more like:

warning: the direct dependencies have possibly-incompatible toolchains:
  mathlib          : lean-4.0.0 -> lean-4.1.0
  my_other_package : lean-4.0.0 -> lean-4.1.0-my-hacky-pr
The current package's toolchain has been updated as
  this_package     : lean-4.0.0 -> lean-4.1.0
which may be incompatible; you should check the documentation of the dependencies to figure out what toolchain is necessary to make them work together (if any exists).

where this tells the user "oh, I probably want to paste lean-4.1.0-my-hacky-pr into my lean-toolchain".

@tydeu
Copy link
Member

tydeu commented Sep 21, 2023

@eric-wieser Fair enough. Minor note: in the situation for this warning, the workspace toolchain would not be updated.

@eric-wieser
Copy link
Contributor Author

Minor note: in the situation for this warning, the workspace toolchain would not be updated.

Indeed; the template was intended as inspiration for all the other "print a warning" options in your list, some of which I believe do involve an update.

@tydeu tydeu transferred this issue from leanprover/lake Oct 25, 2023
@tydeu tydeu added the Lake Lake related issue label Oct 25, 2023
@kim-em
Copy link
Collaborator

kim-em commented Nov 8, 2023

I think this can now be closed in favour of leanprover-community/mathlib4#7846

@tydeu
Copy link
Member

tydeu commented Nov 8, 2023

@semorrison That is just mathlib-specific. The goal for this issue is to provide a general solution toolchain updates as part of lake update.

@eric-wieser
Copy link
Contributor Author

In particular, anyone using Std but not mathlib will still experience this issue

@Kha
Copy link
Member

Kha commented Nov 8, 2023

There are other situations where a lean-toolchain sync would be important as well, such as adding new dependencies to the lakefile. Should this issue be generalized or should this be tracked separately?

@kim-em
Copy link
Collaborator

kim-em commented Jul 1, 2024

Updating lean-toolchain from Lake is now on the critical path to moving Mathlib from a lakefile.lean to a lakefile.toml. The only obstacle remaining is Mathlib's post-update script. (We've removed the meta if for doc-gen4 already.)

github-merge-queue bot pushed a commit that referenced this issue Nov 4, 2024
Lake will now update a package's `lean-toolchain` file on `lake update`
if it finds the package's direct dependencies use a newer compatible
toolchain. To skip this step, use the `--keep-toolchain` CLI option.

Closes #2582. Closes #2752. Closes #5615.

### Toolchain update details

To determine "newest compatible" toolchain, Lake parses the toolchain
listed in the packages' `lean-toolchain` files into four categories:
release , nightly, PR, and other. For newness, release toolchains are
compared by semantic version (e.g., `"v4.4.0" < "v4.8.0"` and
`"v4.6.0-rc1" < "v4.6.0"`) and nightlies are compared by date (e.g.,
`"nightly-2024-01-10" < "nightly-2014-10-01"`). All other toolchain
types and mixtures are incompatible. If there is not a single newest
toolchain, Lake will print a warning and continue updating without
changing the toolchain.

If Lake does find a new toolchain, Lake updates the workspace's
`lean-toolchain` file accordingly and restarts the update process on the
new Lake. If Elan is detected, it will spawn the new Lake process via
`elan run` with the same arguments Lake was initially run with. If Elan
is missing, it will prompt the user to restart Lake manually and exit
with a special error code (4).

### Other changes

To implement this new logic, various other refactors were needed. Here
are some key highlights:

* Logs emitted during package and workspace loading are now eagerly
printed.
* The Elan executable used by Lake is now configurable by the `ELAN`
environment variable.
* The `--lean` CLI option was removed. Use the `LEAN` environment
variable instead.
* `Package.deps` / `Package.opaqueDeps` have been removed. Use
`findPackage?` with a dependency's name instead.
* The dependency resolver now uses a pure breadth-first traversal to
resolve dependencies. It also resolves dependencies in reverse order,
which is done for consistency with targets. Latter targets shadow
earlier ones and latter dependencies take precedence over earlier ones.
**These changes mean the order of dependencies in a Lake manifest will
change after the first `lake update` on this version of Lake.**
JovanGerb pushed a commit to JovanGerb/lean4 that referenced this issue Nov 4, 2024
Lake will now update a package's `lean-toolchain` file on `lake update`
if it finds the package's direct dependencies use a newer compatible
toolchain. To skip this step, use the `--keep-toolchain` CLI option.

Closes leanprover#2582. Closes leanprover#2752. Closes leanprover#5615.

### Toolchain update details

To determine "newest compatible" toolchain, Lake parses the toolchain
listed in the packages' `lean-toolchain` files into four categories:
release , nightly, PR, and other. For newness, release toolchains are
compared by semantic version (e.g., `"v4.4.0" < "v4.8.0"` and
`"v4.6.0-rc1" < "v4.6.0"`) and nightlies are compared by date (e.g.,
`"nightly-2024-01-10" < "nightly-2014-10-01"`). All other toolchain
types and mixtures are incompatible. If there is not a single newest
toolchain, Lake will print a warning and continue updating without
changing the toolchain.

If Lake does find a new toolchain, Lake updates the workspace's
`lean-toolchain` file accordingly and restarts the update process on the
new Lake. If Elan is detected, it will spawn the new Lake process via
`elan run` with the same arguments Lake was initially run with. If Elan
is missing, it will prompt the user to restart Lake manually and exit
with a special error code (4).

### Other changes

To implement this new logic, various other refactors were needed. Here
are some key highlights:

* Logs emitted during package and workspace loading are now eagerly
printed.
* The Elan executable used by Lake is now configurable by the `ELAN`
environment variable.
* The `--lean` CLI option was removed. Use the `LEAN` environment
variable instead.
* `Package.deps` / `Package.opaqueDeps` have been removed. Use
`findPackage?` with a dependency's name instead.
* The dependency resolver now uses a pure breadth-first traversal to
resolve dependencies. It also resolves dependencies in reverse order,
which is done for consistency with targets. Latter targets shadow
earlier ones and latter dependencies take precedence over earlier ones.
**These changes mean the order of dependencies in a Lake manifest will
change after the first `lake update` on this version of Lake.**
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Lake Lake related issue
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants