@ranger-ross ranger-ross commented Nov 27, 2025

What does this PR try to resolve?

This PR un-reverts the changes from #16230 and adds a bug fix for the crash reported in #16297

Note: The bug fix relies on behavior in the reverted code, so I added that back before the bug fix

How to test and review this PR?

I added a new check test that first reproduces the existing behavior (the crash) and is then updated to verify that no files are added to the target dir and no crash occurs.

r? @epage
cc: @stewartadam (thanks, again for the bug report)

Fixes #16305

@rustbot rustbot added A-layout Area: target output directory layout, naming, and organization Command-clean S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 27, 2025
@epage epage added this pull request to the merge queue Dec 1, 2025
Merged via the queue into rust-lang:master with commit 2a1789f Dec 1, 2025
26 checks passed
@rustbot rustbot removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 1, 2025
bors added a commit to rust-lang/rust that referenced this pull request Dec 3, 2025
Update cargo submodule

14 commits in 2a7c4960677971f88458b0f8b461a866836dff59..bd979347d814dfe03bba124165dbce9554d0b4d8
2025-11-25 19:58:07 +0000 to 2025-12-02 16:03:50 +0000
- fix(completion): Put host-tuple before actual tuples (rust-lang/cargo#16327)
- fix(lints): use plural form correctly (rust-lang/cargo#16324)
- fix(completions): include `all` in `cargo tree --target` candidates (rust-lang/cargo#16322)
- fix(lints): show lint error number (rust-lang/cargo#16320)
- chore(deps): update compatible (rust-lang/cargo#16318)
- chore(deps): update crate-ci/typos action to v1.40.0 (rust-lang/cargo#16316)
- Do not lock the artifact-dir for check builds + fix uplifting (rust-lang/cargo#16307)
- Properly validate crate names in `cargo install` (rust-lang/cargo#16314)
- Support --filter-platform=host for cargo metadata rust-lang/cargo#9423 (rust-lang/cargo#16312)
- Update to mdbook 0.5 (rust-lang/cargo#16292)
- refactor(clean): Better divide old / new layout (rust-lang/cargo#16304)
- update: silent failure on non-matching package specs with --breaking (rust-lang/cargo#16276)
- fix(log): break timing-info message to multiple (rust-lang/cargo#16303)
- fix(clean): Clean hosts builds with new layout (rust-lang/cargo#16300)
@rustbot rustbot added this to the 1.93.0 milestone Dec 3, 2025
github-merge-queue bot pushed a commit that referenced this pull request Dec 3, 2025
### What does this PR try to resolve?

Previously, if the rustc version mismatched,
Cargo removed all `doc` directories, including those for target platforms
that are not part of the build.

This PR makes it `--target` aware and stops excessive cleanup,
by putting `.rustdoc_fingerprint.json` in each target platform
directory.
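The per-target layout described above can be sketched as follows. This is a minimal illustration, not Cargo's actual layout code: the function name and directory shapes are assumptions based on the description of putting `.rustdoc_fingerprint.json` under each target platform directory.

```rust
use std::path::{Path, PathBuf};

// Illustrative sketch: compute where the rustdoc fingerprint would live
// for a given target platform, so cleanup can be scoped per platform
// instead of wiping every `doc` directory in the workspace.
fn rustdoc_fingerprint_path(target_root: &Path, platform: Option<&str>) -> PathBuf {
    match platform {
        // Docs built with `--target <triple>` live under that triple's dir
        Some(triple) => target_root
            .join(triple)
            .join("doc")
            .join(".rustdoc_fingerprint.json"),
        // Host docs live directly under the target root
        None => target_root.join("doc").join(".rustdoc_fingerprint.json"),
    }
}
```

With this shape, a version mismatch for one platform's docs only invalidates that platform's `doc` directory.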

### How to test and review this PR?

I'd like to reuse this file to track doc parts directories in
<#16309>,
and noticed that this file is for the entire workspace rather than
per-target,
which is not compatible with mergeable cross-crate info JSONs.

For concurrent safety,
`build-dir` should be locked already since Cargo locks every intent
except check (see <#16307>).
This file is touched only under `UserIntent::Doc`.
github-merge-queue bot pushed a commit that referenced this pull request Dec 30, 2025
This PR adds fine grain locking for the build cache using build unit
level locking.
I'd recommend reading the design details in this description and then
reviewing commit by commit.
Part of #4282

Previous attempt: #16089

## Design decisions / rationale

- Still hold `build-dir/<profile>/.cargo-lock`
  - to protect against `cargo clean` (exclusive)
  - changed from exclusive to shared for builds
- Using build unit level locking with a single lock per build unit.
- Before checking fingerprint freshness we take a shared lock. This
prevents reading a fingerprint while another build is active.
- For units that are dirty, when the job server queues the job we take
an exclusive lock to prevent others from reading while we compile.
- This is done by dropping the shared lock and then acquiring an
exclusive lock, rather than downgrading the lock, to protect against
deadlock, see
#16155 (comment)
- After the unit's compilation is complete, we downgrade back to a
shared lock allowing other readers.
  - All locks are released at the end of the entire build process
- artifact-dir was handled in #16307.

For the rationale behind this design, see the discussion [#t-cargo > Build
cache and locking design @
💬](https://rust-lang.zulipchat.com/#narrow/channel/246057-t-cargo/topic/Build.20cache.20and.20locking.20design/near/561677181)

## Open Questions

- [ ] Do we need rlimit checks and dynamic rlimits?
#16155 (comment)
- [ ] Proper handling of blocking message
(#16155 (comment))
- Update Dec 18 2025: With updated impl, we now get the blocking message
when taking the initial shared lock, but we get no message when taking
the exclusive lock right before compiling.
- [ ] Reduce parallelism when blocking
- [x] How do we want to handle locking on the artifact directory?
- We could simply continue using coarse grain locking, locking and
unlocking when files are uplifted.
- One downside of locking/unlocking multiple times per invocation is
that artifact-dir is touched many times across the compilation process
(for example, there is a pre-rustc [clean up
step](https://github.com/rust-lang/cargo/blob/master/src/cargo/core/compiler/mod.rs#L402)).
Also we need to take into account other commands like `cargo doc`.
- Another option would be to only take a lock on the artifact-dir for
commands that we know will uplift files (e.g. `cargo check` would not
take a lock on the artifact-dir but `cargo build` would). This would mean that
2 `cargo build` invocations would not run in parallel because one of
them would hold the artifact-dir lock (blocking the other). This might
actually be ideal to avoid 2 instances fighting over the CPU while
recompiling the same crates.
    - Solved by #16307
- [ ] What should our testing strategy for locking be?
- My testing strategy thus far has been to run cargo on dummy projects
to verify the locking.
- For the max file descriptor testing, I have been using the Zed
codebase as a testbed as it has over 1,500 build units which is more
than the default ulimit on my linux system. (I am happy to test this on
other large codebase that we think would be good to verify against)
- It’s not immediately obvious to me as to how to create repeatable unit
tests for this or what those tests should be testing for.
- For performance testing, I have been using hyperfine to benchmark
builds with and without `-Zbuild-dir-new-layout`. With the current
implementation I am not seeing any perf regression on linux but I have
yet to test on windows/macos.

---

<details><summary>Original Design</summary>

- Using build unit level locking instead of a temporary working
directory.
- After experimenting with multiple approaches, I am currently leaning
towards build unit level locking.
- The working directory approach introduces a fair bit of uplifting
complexity, and the further along I pushed my prototype, the more I ran
into unexpected issues.
- mtime changes in fingerprints due to uplifting/downlifting order
- tests/benches need to be run before being uplifted OR uplifted and
locked during execution, which leads to more locking design being needed.
(Also, running pre-uplift introduces other potential side effects, like the
path displayed to the user being deleted since it is temporary.)
- The trade off here is that with build unit level locks, we need a more
advanced locking mechanism and we will have more open locks at once.
- The reason I think this is a worthwhile trade-off is that the locking
complexity can largely be contained to a single module, whereas the uplifting
complexity would be spread throughout the cargo codebase anywhere we do
uplifting. The increased lock count, while unavoidable, can be mitigated
(see below for more details).
- Risk of too many locks (file descriptors)
- On Linux 1024 is a fairly common default soft limit. Windows is even
lower at 256.
- Having 2 locks per build unit makes it possible to hit the limit with a
moderate number of dependencies
- There are a few mitigations I could think of for this problem (that
are included in this PR)
- Increasing the file descriptor limits based on the number of build
units (if the hard limit is high enough)
- Share file descriptors for shared locks across jobs (within a single
process) using a virtual lock
            - This could be implemented using reference counting.
- Falling back to coarse grain locking if some heuristic is not met

### Implementation details

- We have a stateful lock per build unit made up of multiple file locks,
`primary.lock` and `secondary.lock` (see the `locking.rs` module docs for
more details on the states)
    - This is needed to enable pipelined builds
- We fall back to coarse grain locking if fine grain locking is
determined to be unsafe (see `determine_locking_mode()`)
- Fine grain locking continues to take the existing `.cargo-lock` lock
as RO shared to continue working with older cargo versions while
allowing multiple newer cargo instances to run in parallel.
- Locking is disabled on network filesystems. (keeping existing behavior
from #2623)
- `cargo clean` continues to use coarse grain locking for simplicity.
- File descriptors
- I added functionality to increase the file descriptors if cargo
detects that there will not be enough based on the number of build units
in the `UnitGraph`.
- If we aren’t able to increase a threshold (currently `number of build
units * 10`) we automatically fallback to coarse grain locking and
display a warning to the user.
- I picked 10 times the number of build units as a conservative estimate
for now. I think lowering this number may be reasonable.
- While testing, I was seeing a peak of ~3,200 open file descriptors
while compiling Zed. This is approximately x2 the number of build units.
- Without the `RcFileLock` I was seeing peaks of ~12,000 open fds, which
I felt was quite high even for a large project like Zed.
- We use a global `FileLockInterner` that holds on to the file
descriptors (`RcFileLock`) until the end of the process. (We could
potentially add it to `JobState` if preferred, it would just be a bit
more plumbing)
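The fallback heuristic described above can be sketched as a pure decision function. The constant, enum, and the `determine_locking_mode` signature below are illustrative (the PR names a function of that name but its actual inputs are not shown here); the `* 10` threshold matches the description.

```rust
// Illustrative sketch of the coarse-grain fallback heuristic.
const FDS_PER_UNIT: u64 = 10; // conservative estimate from the PR description

#[derive(Debug, PartialEq)]
enum LockingMode {
    FineGrained,
    CoarseGrained,
}

// Fall back to coarse grain locking when neither the current soft limit
// nor the raisable hard limit can cover `units * FDS_PER_UNIT` descriptors.
fn determine_locking_mode(units: u64, soft_limit: u64, hard_limit: u64) -> LockingMode {
    let needed = units.saturating_mul(FDS_PER_UNIT);
    if soft_limit >= needed || hard_limit >= needed {
        // If only the hard limit suffices, the real implementation would
        // first raise the soft limit (e.g. via setrlimit) and warn on failure.
        LockingMode::FineGrained
    } else {
        LockingMode::CoarseGrained
    }
}
```

For example, 1,500 build units (the Zed-scale case mentioned above) would need 15,000 descriptors under this estimate, exceeding a 1024 soft / 4096 hard limit and triggering the coarse-grain fallback with a user-facing warning.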

See #16155 (comment)
for the proposal to transition away from this to the current scheme.

</details>
Successfully merging this pull request may close these issues.

cargo check uplifts dylibs to artifact-dir when proc-macro depends on dylib

3 participants