-
Notifications
You must be signed in to change notification settings - Fork 453
Reuse dependencies between project and tools #12526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reuse dependencies between project and tools #12526
Conversation
|
Do you mind rebasing? I'll do a more complete review then. |
0838ad3 to
f5ceafc
Compare
|
I'm still in the process of updating tests, but I'm interested in feedback of my general approach. |
95ef51a to
09f1449
Compare
Leonidas-from-XIV
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea of making a package be represented by some kind of hashed representation seems like a quite good solution.
One could probably even go a step further and make the hash the only representation of the package in the .pkg folder.
There's a bit of nitpicking about names (I don't think introducing new exotic terms is a good idea), but the main concern from the code is that the way the summaries are computed seems to be in the wrong place as that leads to cycles which need to be broken and rather fragile-seeming code that does some side-effects to make it work.
src/dune_pkg/lock_dir.ml
Outdated
| [Digest_feed.compute_digest_with_hasher] because the hasher is a | ||
| singleton. *) | ||
| iter_all_versions_of_non_dune_dependencies (fun dep -> | ||
| let _dep_slug : Pkg_slug.t = Pkg.slug dep in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is odd and the comment does not make it clearer. Why is it calculating the summary only to discard it? If this is necessary I'd say that the hasher API needs to be revised to avoid having code that seemingly does nothing to trigger a side effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it calculating the summary only to discard it?
This is moot now anyway since this code has been removed, but here's why it was necessary if you're curious:
It was so that the lazy cell would be guaranteed to be populated when the information is used later on while computing a a Dune_digest.t. Computing a slug requires computing a Dune_digest.t, and while computing a Dune_digest.t we reuse a singleton "hasher" (for performance reasons) but this means that you can't nest the computation of a digest. This meant we had to be careful about the order in which these lazy computations were forced.
src/dune_pkg/lock_dir.ml
Outdated
| ; ( "exported_env" | ||
| , Dyn.list (Action.Env_update.to_dyn String_with_vars.to_dyn) exported_env ) | ||
| ; "enabled_on_platforms", Solver_env_disjunction.to_dyn enabled_on_platforms | ||
| ; "slug", Dyn.option Dyn.opaque slug |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it opaque here? Pkg_slug.to_dyn seems to exist and not have issues with the conversion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was because the slug field contained a lazy computation that computed the slug, and I didn't want to force the computation just to compute the Dyn.t.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This code is removed now.)
src/dune_rules/pkg_rules.ml
Outdated
| ~human_readable_description:(fun () -> | ||
| Pp.text | ||
| "A map associating package slugs with package metadata with entries for all \ | ||
| dev tools with lockdirs in the current project.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| dev tools with lockdirs in the current project.") | |
| dev tools with lock directories in the current project.") |
| then | ||
| (* Dev tools are built in the default context, so allow their | ||
| dependencies to be shared with the project's if it too is being | ||
| built in the default context. *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes me wonder, if the build is not in the default context, does it work when both the project and dependency depend on a package with the same slug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each context has a different directory under _build/_private/ where package source and compiled artifacts are placed, and dune doesn't share dependencies across contexts. If the context is not the default context the dune won't share built artifacts with dev tools, even if the dependencies have the same slug. This is something we might consider changing in the future since package digests should make it possible to share dependencies across contexts. It would require restructuring the build directory further to remove context names from paths.
7dbe4dd to
ab33cba
Compare
|
Thanks for the review @Leonidas-from-XIV! I'll address your comments tomorrow, but regarding:
That would certainly be possible however I think there's still value in keeping the package names and versions in the directory names for easier debugging. |
rgrinberg
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea of a "slug" is sound, but I think you need to change where and how it is computed. My suggestion is to compute in the rules after you've evaluated the lock file for the package. The slug should be computed from the evaluated representations of dependencies rather than their AST form. This will also you to avoid the various awkwardness around handling multi platform lock directories.
|
I agree that it's awkward to have the package slug be a field in For example given the target I'm going to remove the slug field from |
ab33cba to
e1c34a7
Compare
|
Ok I've moved the computation of slugs to pkg_rules and renamed them to The only feedback I've applied thus far is moving the slug computation into pkg_rules and the consequences of this change. I'd appreciate another look from @rgrinberg and @Leonidas-from-XIV that the general structure is ok before I address the more fine-grained feedback and continue fixing the tests. I'll also note that I've added a new command |
src/dune_rules/pkg_rules.ml
Outdated
| |> Option.value ~default:[] | ||
| |> List.filter_map | ||
| ~f:(fun { Dune_pkg.Lock_dir.Dependency.name; loc = dep_loc } -> | ||
| if Package.Name.equal name Dune_pkg.Dune_dep.name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't you be removing all "provided" deps here?
rgrinberg
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM apart from some minor issues
f8fea2d to
689ce38
Compare
src/dune_rules/pkg_rules.ml
Outdated
| { pkg; deps; pkg_digest }) | ||
| in | ||
| Package.Name.Map.values pkgs_by_name | ||
| |> List.map ~f:(compute_entry ~seen_set:Package.Name.Set.empty ~seen_list:[]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should should use Name.Map.map here and just return a map. The caller can transform the map into the digest map they need. That would also allow you to get rid of the linear search elsewhere.
7247eff to
acbb8db
Compare
|
The remaining test failures are because some tests create lockfiles containing a path inside the test sandbox, which means their package digests will be influenced by the sandbox path and thus different on different machines. Many test currently print the path to a package's contents which includes the digest. I'm investigating how best to solve this and I'm open to suggestions. |
|
@gridbugs I would sanitise the hashes so they aren't shown. We do this in multiple places especially for things that look like digests. |
|
Good idea @Alizter. I'll try that. |
acbb8db to
95050e8
Compare
This allows the dependencies of the project to be shared with dev tools, preventing unnecessarily rebuilding dependencies while building a tool if an identical version of the package has already been built or vice versa. This requires storing build artifacts of dependencies of dev tools and the project (when built in the "default" context) in a single directory, since their relative path within the _build directory is part of the key used to cache build artifacts. Previously the build artifacts of a package were placed in a directory named after that package, however since dev tools and the project may depend on different versions of the same package, directories needed to be renamed to avoid collisions. Even if the project and dev tools rely on the same version of some package, it may still need to be rebuilt on occasion due to the presence/absence of depopts, or due to manual modification to lockfiles. For this reason, the directories containing package build artifacts contain in their name, a hash of the contents of the lockfile of the metadata in the package's transitive dependency closure. This change introduces the concept of a "package digest" which is a unique directory name for a package comprising its name, version, and dependency closure hash. The DB type used to store metadata while building packages has been updated to be keyed based on package digests, and to store package metadata for dev tools and the project's default build context in the same table to allow dependencies to be shared. Signed-off-by: Stephen Sherratt <[email protected]>
95050e8 to
7047ce0
Compare
|
The ocaml 5.4.0 test is failing on other PRs so I doubt it's affected by this change. Merging now. |
* 'main' of github.com:/ocaml/dune: (147 commits) cram test: test only parameter flags in merlin generation fix(oxcaml): import eta-expansion changes from opam-repo (ocaml#12563) address review comments Mask the path to the stdlib fix(oxcaml): generate merlin config for library parameters fix(melange + include_qualified): track correct `.cmj` dependencies in emit (ocaml#12531) refactor: remove some unused code in [Path] (ocaml#12558) dep_rules: don't run (transitive) `ocamldep` on single module buildables (ocaml#12555) fix(pkg): ignore project settings for building packages test(pkg): reproduce ocaml#12131 melange: add a test for module cycle checks (ocaml#12554) chore: lint check for new changes entries (ocaml#12553) feature(cram): allow for conflict detection (ocaml#12538) ci: update for ocaml 5.4 release (ocaml#12552) chore(script): generate changelog from structure (ocaml#12516) Reuse dependencies between project and tools (ocaml#12526) Introduce Io.overwrite_file test: fix dune install requiring a mandir Enable package management for more tests Add a `dune tools env` command to add dev tools to PATH (ocaml#12521) ...
This allows the dependencies of the project to be shared with dev tools, preventing unnecessarily rebuilding dependencies while building a tool if an identical version of the package has already been built or vice versa.
This requires storing build artifacts of dependencies of dev tools and the project (when built in the "default" context) in a single directory, since their relative path within the _build directory is part of the key used to cache build artifacts. Previously the build artifacts of a package were placed in a directory named after that package, however since dev tools and the project may depend on different versions of the same package, directories needed to be renamed to avoid collisions. Even if the project and dev tools rely on the same version of some package, it may still need to be rebuilt on occasion due to the presence/absence of depopts, or due to manual modification to lockfiles. For this reason, the directories containing package build artifacts contain in their name, a hash of the contents of the lockfile of the metadata in the package's transitive dependency closure.
This change introduces the concept of a "package slug" which is a unique directory name for a package comprising its name, version, and dependency closure hash. The DB type used to store metadata while building packages has been updated to be keyed based on slugs, and to store package metadata for dev tools and the project's default build context in the same table to allow dependencies to be shared.
Fixes #12491