
Hashed dependencies of metadata into the metadata of a lib #4469

Merged · 6 commits · Sep 9, 2017

Conversation

nipunn1313 (Contributor)

This fixes one part of #3620. To my understanding, the more fundamental fix is more challenging.

@rust-highfive

r? @matklad

(rust_highfive has picked a reviewer for you, use r? to override)

```rust
dep_metadatas.sort();
for metadata in dep_metadatas {
    metadata.hash(&mut hasher);
}
```
nipunn1313 (Contributor Author)

Is it cleaner to just do `dep_metadatas.hash(&mut hasher)` and rely on `Vec`'s impl of `Hash`?

As a follow-up, would it be nicer to use a sorted data structure (like `BTreeSet`) instead of calling `.sort()` on a `Vec`? I'm certain it doesn't matter from a perf perspective, so whatever the team likes stylistically is fine by me.
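For illustration, a minimal, self-contained sketch of the two alternatives being weighed here (the `u64` values are made-up placeholders, not real Cargo metadata):

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeSet;
use std::hash::{Hash, Hasher};

fn main() {
    // Option 1: sort a Vec, then hash it wholesale via Vec's Hash impl
    // (which hashes the length followed by each element in order).
    let mut hasher = DefaultHasher::new();
    let mut dep_metadatas: Vec<u64> = vec![0x4ed3e3cf, 0xf93178e8, 0x4dcab0f1];
    dep_metadatas.sort();
    dep_metadatas.hash(&mut hasher);
    println!("via Vec:      {:x}", hasher.finish());

    // Option 2: a BTreeSet keeps its elements ordered as they are inserted,
    // so no explicit sort() is needed before hashing. Note the semantic
    // difference: a set also deduplicates, which a Vec does not.
    let mut hasher = DefaultHasher::new();
    let deps: BTreeSet<u64> = [0x4ed3e3cf, 0xf93178e8, 0x4dcab0f1].iter().copied().collect();
    deps.hash(&mut hasher);
    println!("via BTreeSet: {:x}", hasher.finish());
}
```

Either way the hash is order-independent with respect to how the dependencies were discovered, which is the property that matters for a stable fingerprint.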



```rust
// Build caller1. This should build the dep library. Because the features
// are different from the full workspace's, it rebuilds.
```
nipunn1313 (Contributor Author)

It would be beautiful if building the entire workspace simply covered this, but as-is it doesn't. According to #3620, this is difficult to fix? I couldn't find an easy way.

Member

I just left this comment, which I think is related. I do believe fixing this will be relatively invasive.

@nipunn1313 (Contributor Author)

This got a bit nastier after getting the doctests to pass, because `dep_targets()` produces different targets for doctests vs. regular tests (reasonably so), but the `target_metadata` is expected to be the same.

Here, I pull the central code inside `dep_targets` (sans filtering) up to the metadata-calculation layer, and things worked out. There's probably a cleaner way to factor this, but I'm running it through CI and requesting feedback before embarking on that.

@matklad (Member) commented Sep 5, 2017

Huh, oddly enough I've hit this issue myself a couple of days ago: #4463.

However, I feel that the problem here is deeper than just spurious rebuilds: you actually get different artifacts depending on how you build your workspace (see #4463), and this seems pretty bad to me.

@alexcrichton, if workspaces share the same dep graph, perhaps we should always activate the union of all features of the workspace? Currently for features we loop over requested summaries, but it shouldn't be that difficult to loop over all of them?

@nipunn1313 I think that if you add `cargo test --all --no-run` before the crate loop in your CI script, you won't get rebuilds.

@nipunn1313 (Contributor Author)

@matklad unfortunately, running `cargo test --all --no-run` is insufficient. The test I added in this diff actually highlights the issue (it runs `cargo build` on the entire workspace, and then in the sub-crates). The test fails on master, but passes with my diff.

I also thought about having `cargo build` within a crate activate the union of all features in the workspace, but might that increase the startup time for `cargo build` in large workspaces? It's not too bad, but it's on the order of 20 seconds for us from the root, yet near instantaneous within individual crates. We have 70 crates in our workspace with 207 vendored deps (cargo-vendor style).

@matklad (Member) commented Sep 5, 2017

> it runs `cargo build` on the entire workspace, and then in the sub-crates

`cargo build` won't build the whole workspace; `cargo build --all` will, so I still think that `cargo test --all --no-run` should help here :)

@matklad (Member) commented Sep 5, 2017

> I still think that `cargo test --all --no-run` should help here :)

Huh, I am 100% wrong, sorry for the noise then :)

@matklad (Member) commented Sep 5, 2017

> that might increase the startup time for `cargo build` in large workspaces?

I think in theory this should not be the case, because the workspace shares the dependency graph and the lockfile anyway, so even if you compile a single package, the whole workspace needs to be loaded. And that makes me think the huge difference in startup time you are observing between a single crate and the whole workspace probably indicates some issue in Cargo.

However, I would say that the primary problem here is correctness (producing different artifacts for foo with `cargo build -p foo` versus `cargo build --all`), and that we should probably fix it first, and then try to regain lost ground in terms of performance, if any. For example, I can imagine caching the result of feature selection just like any other build artifact.

@alexcrichton (Member)

Thanks for the PR @nipunn1313! I've long wanted to implement this!

@matklad I think we'll want this solution no matter what for a number of reasons. Let's say you're working on just one crate and you do:

```sh
cargo build
# edit files ...
cargo build --features a
# edit files ...
cargo build
```

I'd personally expect the third build (the second usage of `cargo build`) to essentially do an incremental compilation based on what's been edited. What happens today, though, is that this could trigger a full recompilation starting with some super-deep dependency and working its way back up the tree. The solution here, I believe, is to separately cache artifacts based on feature sets. That is, the `cargo build --features a` will recompile everything it needs to, but all compiled crates will be cached in different locations based on the feature sets activated, ensuring that we don't oscillate on what's being cached.

This has come up a good number of times in a whole slew of situations! Often the solution is to stop oscillating on features and instead just unify what's used everywhere, but that's not always possible. In any case, though, I think this boils down to not a workspace problem but a "how Cargo caches dependencies" problem (workspaces aren't required to reproduce this issue; they just make it worse sometimes).
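To make that caching idea concrete, here is a hedged sketch (not Cargo's actual code): hashing the activated feature set into the artifact's metadata hash gives each feature combination its own cache slot, so alternating builds stop clobbering each other.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Sketch: derive the `-C metadata`/`-C extra-filename` hash from the
/// package id plus its activated features, so `--features a` and the
/// default build land in different cached artifacts.
fn artifact_hash(package_id: &str, mut features: Vec<&str>) -> u64 {
    let mut hasher = DefaultHasher::new();
    package_id.hash(&mut hasher);
    features.sort(); // feature order must not affect the hash
    features.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let default = artifact_hash("itertools-0.6.2", vec![]);
    let with_a = artifact_hash("itertools-0.6.2", vec!["a"]);
    assert_ne!(default, with_a); // distinct cache slots, so no oscillation
    println!("{:x} vs {:x}", default, with_a);
}
```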

```rust
// Mix in the target-metadata of all the dependencies of this target
if let Ok(deps) = self.used_deps(unit) {
    let mut deps_metadata = deps.into_iter().map(|dep_unit| {
        self.target_metadata(&dep_unit)
```
Member

So runtime-wise we've had a lot of issues in the past with this sort of recursive calculation causing an exponential amount of code to run instead of linear (w.r.t. the dependency graph). For example, here I think that if we call `target_metadata` for all targets, this goes from linear (currently) to exponential (after this PR), right?

Perhaps we can introduce a cache for `target_metadata`? (That's what we do for everything else!)

This tends to not show up much in small crate graphs, but projects like Servo (and in theory Dropbox w/ 200+ crates) may end up showing this pretty badly.

nipunn1313 (Contributor Author)

Yeah, that sounds like a great solution. Envisioning a big hashmap from `Unit -> Metadata` in the ctx? We could probably even precalculate it if we walk the units in dependency order.

Member

Sounds great! I'm fine with doing the caching lazily or doing it all at once when we walk in dependency order. I know we have a few "prepopulate" passes at the beginning to walk the graph, but I sort of forget what they're doing. Basically, whatever's easiest is fine by me.
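A minimal sketch of that memoization idea, using simplified stand-ins for Cargo's `Unit` and `Metadata` types and an assumed placeholder hash combination:

```rust
use std::collections::HashMap;

// Simplified stand-ins for Cargo's real types, for illustration only.
#[derive(Clone, PartialEq, Eq, Hash)]
struct Unit(&'static str);
#[derive(Clone, Copy)]
struct Metadata(u64);

struct Context {
    deps: HashMap<Unit, Vec<Unit>>, // assumed to be an acyclic graph
    metas: HashMap<Unit, Metadata>, // the memoization cache
}

impl Context {
    /// Compute metadata once per unit. Without the cache, a unit reachable
    /// along many paths is recomputed on every path, which is what makes
    /// the naive recursion exponential; with it, each unit is visited once,
    /// so the whole walk is linear in the size of the graph.
    fn target_metadata(&mut self, unit: &Unit) -> Metadata {
        if let Some(&m) = self.metas.get(unit) {
            return m;
        }
        let dep_units = self.deps.get(unit).cloned().unwrap_or_default();
        // Placeholder combination of the deps' metadata; the real
        // calculation mixes many more inputs into a hasher.
        let mut acc = unit.0.len() as u64;
        for dep in &dep_units {
            acc = acc.wrapping_mul(31).wrapping_add(self.target_metadata(dep).0);
        }
        let meta = Metadata(acc);
        self.metas.insert(unit.clone(), meta);
        meta
    }
}

fn main() {
    let mut cx = Context {
        deps: HashMap::from([
            (Unit("itertools"), vec![Unit("either")]),
            (Unit("either"), vec![]),
        ]),
        metas: HashMap::new(),
    };
    println!("{:x}", cx.target_metadata(&Unit("itertools")).0);
}
```

Precalculating in dependency order, as suggested above, would fill the same map bottom-up instead of recursing; the cache-hit behavior is identical either way.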

```rust
    return self.doc_deps(unit);
}

fn used_deps(&self, unit: &Unit<'a>) -> CargoResult<Vec<Unit<'a>>> {
```
Member

This seems like a somewhat unfortunate addition in the slew of already-too-many ways to calculate dependencies :(

I didn't quite follow your explanation earlier, mind detailing a bit more what was going on?

nipunn1313 (Contributor Author)

Yeah, I agree. At least this one is private-only. The issue here is the fork on `if unit.profile.run_custom_build` and `if unit.profile.doc && !unit.profile.test`.

Specifically, `OUT_DIR` was set incorrectly in the build phase of doctests because doctests had a different dependency tree, but `OUT_DIR` was expected to be the same.

Member

Hm, I'm still not 100% following... In any case, I'll check this out locally and poke around.

nipunn1313 (Contributor Author)

Yeah, it's easier if you poke around. I'll give it another shot, though.

Without this refactor:

  • Compiling the build.rs script for a doctest vs. for a regular test ends up with different metadata. This causes `OUT_DIR` to get set to a different (nonexistent) directory during doctests, which shows up as a test failure.
  • Example failure: https://travis-ci.org/rust-lang/cargo/jobs/271875681

With this refactor:

  • Doctests and regular tests have the same `used_deps` despite having different `dep_targets`.

Overall, it definitely feels like some elements are repeated here and there is some unnecessary architectural complexity, but I don't understand it well enough right now to find a way out. I think we need one function that returns deps for the package vs. deps for the unit of build? Something like the sketch below.
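To illustrate the shape being described (hypothetical, heavily simplified types; not the code in this PR): one unfiltered dependency listing that metadata calculation can rely on, with the mode-specific filtering layered on top.

```rust
#[derive(Clone, Debug, PartialEq)]
struct Unit {
    name: &'static str,
    doc: bool,  // stand-in for unit.profile.doc
    test: bool, // stand-in for unit.profile.test
}

struct Context {
    all_deps: Vec<Unit>, // toy stand-in for the real dependency walk
}

impl Context {
    // Unfiltered list: identical for doctests and regular tests, so anything
    // derived from it (target_metadata, and hence OUT_DIR) stays stable.
    fn used_deps(&self, _unit: &Unit) -> Vec<Unit> {
        self.all_deps.clone()
    }

    // Build-planning view: applies the mode-specific filtering on top.
    fn dep_targets(&self, unit: &Unit) -> Vec<Unit> {
        let mut deps = self.used_deps(unit);
        if unit.doc && !unit.test {
            // e.g. plain doc builds drop test-only dependencies
            deps.retain(|d| !d.test);
        }
        deps
    }
}

fn main() {
    let cx = Context {
        all_deps: vec![
            Unit { name: "dep", doc: false, test: false },
            Unit { name: "testdep", doc: false, test: true },
        ],
    };
    let doctest = Unit { name: "lib", doc: true, test: false };
    let regular = Unit { name: "lib", doc: false, test: true };
    // The filtered views differ, but the metadata-relevant list is the same.
    assert_ne!(cx.dep_targets(&doctest), cx.dep_targets(&regular));
    assert_eq!(cx.used_deps(&doctest), cx.used_deps(&regular));
}
```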

Member

Ah, ok, that sounds very surprising! The build script should be constant here and shouldn't change metadata, but let me poke around to see if I can't find where the difference is arising from.

@alexcrichton (Member)

@matklad:

> @alexcrichton, if workspaces share the same dep graph, perhaps we should always activate the union of all features of the workspace?

I don't think this'd be too hard to implement, but I'm not sure it's what we'd want implemented per se. If one target of a workspace doesn't want a particular feature activated, wouldn't it be surprising if some other target far away in the workspace activated that feature?

@nipunn1313:

> It's not too bad, but it's on the order of 20 seconds for us from the root, yet near instantaneous within individual crates

20 seconds in Cargo definitely sounds like a bug to me! I'd love to help investigate this and speed that up if it's a problem, but we can probably take that off this PR :)

@matklad (Member) commented Sep 5, 2017

@alexcrichton yeah, totally agree that the fix here is needed in general!

@nipunn1313 (Contributor Author)

@alexcrichton We actually do already cache the target separately if the features change. The issue is that we cache the target the same even if the features of a dep change (see the discussion in #3620). It's a bit more subtle, but still in the same realm as the issue you're describing.

E.g., with `x -> y`, when the features with which `x` depends on `y` change:

  • `y` compiles to a different hash
  • `x` compiles to the same hash but links against the new `y`

Copying my example from #3620, where x = itertools and y = either:

```
Running `rustc --crate-name itertools itertools-0.6.2/src/lib.rs
--crate-type lib --emit=dep-info,link -C debuginfo=2
-C metadata=4ed3e3cf3bc8df3d -C extra-filename=-4ed3e3cf3bc8df3d
--out-dir target/debug/deps
-L dependency=target/debug/deps
--extern either=target/debug/deps/libeither-f93178e8a5af0b1d.rlib
--cap-lints allow`
```

vs.

```
Running `rustc --crate-name itertools itertools-0.6.2/src/lib.rs
--crate-type lib --emit=dep-info,link -C debuginfo=2
-C metadata=4ed3e3cf3bc8df3d -C extra-filename=-4ed3e3cf3bc8df3d
--out-dir target/debug/deps
-L dependency=target/debug/deps
--extern either=target/debug/deps/libeither-4dcab0f19fb09534.rlib
--cap-lints allow`
```

FWIW, I think #3620 and #4463 may be duplicates. Lots of good discussion in both.

```diff
@@ -483,6 +483,15 @@ impl<'a, 'cfg> Context<'a, 'cfg> {
 // when changing feature sets each lib is separately cached.
 self.resolve.features_sorted(unit.pkg.package_id()).hash(&mut hasher);
```
nipunn1313 (Contributor Author)

We've always mixed in features for the package itself, just not for the deps. See this line.

@nipunn1313 left a comment

I'll work on the cache to replace the recursive call.


@alexcrichton (Member)

@nipunn1313 ah yeah, I believe you and I are worried about the same case! Long ago, when you added "hash the feature selection into the metadata", I forgot to also account for the transitive case :(

@nipunn1313 (Contributor Author)

Cool. Just worked out the cache. Realized as I was writing it that I worked on one of the other caches (for `target_filenames`) last year. Had forgotten, heh.

@alexcrichton (Member) commented Sep 5, 2017

Ok, so one (existing) bug I've found is that the `target_metadata` for a build script changes over time. Basically, when you have internal mutability, things go wrong.

I've fixed that with this diff. There's one remaining failure, however, with the doctest issue you were mentioning; looking into that now.

@alexcrichton (Member)

Ok, turns out the next bug is actually the same. After we've compiled everything, a `Compilation` structure is built up which saves off `OUT_DIR`, but at that point the `build_state` is all filled in, so there's a different set of listed dependencies for build scripts than before, causing the `OUT_DIR` that `rustdoc --test` uses to differ from the main compilation's.

I fixed that test specifically by moving this line below this line, but that unfortunately breaks the `output_depinfo` line just above it.

I think the tl;dr here is that the internal mutability causing a difference in `dep_targets` is the "root of all evil". Perhaps `Context` could build up, very early on, a map of which units are overridden, and then deterministically skip or not skip them in all invocations of `dep_targets`? I think the `custom_build::build_map` function is likely the best place to build such a map (and feel free to shove it anywhere on `Context`).
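A small sketch of that suggestion, with assumed simplified types (in real Cargo the map would live on `Context` and be populated around `custom_build::build_map`):

```rust
use std::collections::{HashMap, HashSet};

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Unit(&'static str);

struct Context {
    // Computed once, up front, from static configuration rather than from
    // mutable build state, so dep_targets answers identically whether it is
    // called before or after the build scripts have run.
    overridden: HashSet<Unit>,
    deps: HashMap<Unit, Vec<Unit>>,
}

impl Context {
    /// Deterministic dependency listing: consults only the precomputed
    /// `overridden` set, never state that fills in during compilation.
    fn dep_targets(&self, unit: &Unit) -> Vec<Unit> {
        self.deps
            .get(unit)
            .cloned()
            .unwrap_or_default()
            .into_iter()
            .filter(|d| !self.overridden.contains(d))
            .collect()
    }
}

fn main() {
    let cx = Context {
        overridden: HashSet::from([Unit("build-script")]),
        deps: HashMap::from([(Unit("lib"), vec![Unit("build-script"), Unit("either")])]),
    };
    // The overridden unit is skipped every time, not just after the
    // build state happens to have been populated.
    assert_eq!(cx.dep_targets(&Unit("lib")), vec![Unit("either")]);
}
```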

nipunn1313 and others added 6 commits September 9, 2017 13:46

Previously it depended on dynamic state that was calculated throughout a compilation, which ended up causing different fingerprints showing up in a few locations, so this makes the invocation deterministic throughout `cargo_rustc`.
@alexcrichton (Member)

@bors: r+

Alright, I pushed up some small fixes; let's see what @bors thinks.

@bors (Contributor)

bors commented Sep 9, 2017

📌 Commit f90d21f has been approved by alexcrichton

bors added a commit that referenced this pull request Sep 9, 2017
Hashed dependencies of metadata into the metadata of a lib

This fixes one part of #3620. To my understanding, the more fundamental fix is more challenging.
@bors (Contributor)

bors commented Sep 9, 2017

⌛ Testing commit f90d21f with merge 921c4a5...

@bors (Contributor)

bors commented Sep 9, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing 921c4a5 to master...

@bors bors merged commit f90d21f into rust-lang:master Sep 9, 2017
@nipunn1313 (Contributor Author)

Nice find Alex!
Thanks for patching it up and pushing it through.

@nipunn1313 nipunn1313 deleted the workspace_features branch August 6, 2021 18:03
@ehuss ehuss added this to the 1.22.0 milestone Feb 6, 2022