Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reconsider RUSTFLAGS artifact caching. #8716

Open
ehuss opened this issue Sep 19, 2020 · 13 comments
Open

Reconsider RUSTFLAGS artifact caching. #8716

ehuss opened this issue Sep 19, 2020 · 13 comments
Labels
A-rebuild-detection Area: rebuild detection and fingerprinting A-rustflags Area: rustflags E-hard Experience: Hard S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted.

Comments

@ehuss
Copy link
Contributor

ehuss commented Sep 19, 2020

When RUSTFLAGS is changed between invocations, this can trigger a complete rebuild. This can be annoying, and the only workaround is to use a separate target directory for each RUSTFLAGS setting. I think we should revisit how this works, and see if there is a way to address the problems.

Motivation:

  • Changing RUSTFLAGS can be very expensive time-wise, and frustrating if your project relies on being able to swap between different settings.
  • Supporting a shared build directory cache would be thwarted any time someone changes RUSTFLAGS.

History:

Solutions?
I am uncertain what a good approach would be. Some rough ideas:

  • Go back to hashing RUSTFLAGS, but only in the -C extra-filename, and not -C metadata. This should (probably) avoid the PGO problems, since the symbols won't change. However, this could still cause problems with reproducible builds if the rlib filename somehow leaks into the resulting binary (I don't have a reason to think it does, but it seems risky).
  • Add a mechanism for specifying RUSTFLAGS so that the user can decide whether or not it gets hashed. (Maybe a separate env var, or some string in the current one, or a side setting?) I feel like there is a problem that specifying RUSTFLAGS is already too complicated (considering the multiple overrides) and at the same time not flexible enough.
  • Provide first-class support for some of the things that people use RUSTFLAGS for today. For example, Cargo could have a "reproducible" mode where it tries to create a reproducible build, making whatever sacrifices are necessary. First class support for remap-path-prefix and PGO could also alleviate some of the problems.
@ehuss ehuss added A-rebuild-detection Area: rebuild detection and fingerprinting A-rustflags Area: rustflags labels Sep 19, 2020
@jyn514
Copy link
Member

jyn514 commented Sep 19, 2020

Provide first-class support for some of the things that people use RUSTFLAGS for today.

My main pain point with RUSTFLAGS is xd009642/tarpaulin#416 (comment), which has some niche flags and wouldn't be helped much by this. AFAIK all of those flags are codegen/linking related though, and don't affect .rlibs.

Add a mechanism for specifying RUSTFLAGS so that the user can decide whether or not it gets hashed.

Maybe there can be a way for rustc itself to tell cargo whether an option requires a re-compile? Something like rustc -Z needs-recompile "$RUSTFLAGS". I agree this shouldn't be left up to the user, my main complaint about make other build systems is they require you to specify build dependencies yourself.

I don't know enough about -C metadata and -C extra-filename to have an opinion.

@alexcrichton
Copy link
Member

However, this could still cause problems with reproducible builds

I believe this does because the filenames of object files show up as the codegen unit names in dwarf debug info (and probably PDB?). I suspect that everything else in the binary will be the same, but builds differing only in extra-filename are likely to still produce different binaries.

Add a mechanism for specifying RUSTFLAGS so that the user can decide whether or not it gets hashed

I think this will work but I don't think it'll be too feasible if the default is to not hash RUSTFLAGS. The pain of rebuilding too much you only discover after-the-fact that you should have been hashing rustflags all along. I don't think we can hash-by-default though since it breaks the use cases you've mentioned.

Provide first-class support for some of the things that people use RUSTFLAGS for today

I do think a --reproducible flag or --pgo is something we should add to Cargo no matter what, regardless of how this RUSTFLAGS issue is settled.


Another possible solution could be adding a new variable like RUSTFLAGS_HASHED? That way if tooling sets it the weird env name is fine (maybe the case for tarpaulin?)

tony84727 added a commit to tony84727/youki that referenced this issue Jun 14, 2022
building static-linked runtimetest requires additional RUSTFLAGS env
var. Unfortunately, the additional env var invalids the build cache,
leading cargo rebuilding everytime.

See rust-lang/cargo#8716

A temporary solution is to build runtimetest in a seperate
`runtimetest-target` directory.

Signed-off-by: Tony Duan <[email protected]>
tony84727 added a commit to tony84727/youki that referenced this issue Jun 14, 2022
building static-linked runtimetest requires additional RUSTFLAGS env
var. Unfortunately, the additional env var invalids the build cache,
leading cargo rebuilding everytime.

See rust-lang/cargo#8716

A temporary solution is to build runtimetest in a seperate
`runtimetest-target` directory.

Signed-off-by: Tony Duan <[email protected]>
jschwe added a commit to jschwe/corrosion that referenced this issue Jul 12, 2022
For some scenarios it is better to not set Rustflags for all crates
in the dependency graph and instead only set it for the top-level
crate.
For example rust-lang/cargo#8716
can be avoided in some scenarios by setting the rustflags via
rustc, which allows for faster rebuilds in such cases.
jschwe added a commit to jschwe/corrosion that referenced this issue Sep 6, 2022
For some scenarios it is better to not set Rustflags for all crates
in the dependency graph and instead only set it for the top-level
crate.
For example rust-lang/cargo#8716
can be avoided in some scenarios by setting the rustflags via
rustc, which allows for faster rebuilds in such cases.
jschwe added a commit to jschwe/corrosion that referenced this issue Sep 7, 2022
For some scenarios it is better to not set Rustflags for all crates
in the dependency graph and instead only set it for the top-level
crate.
For example rust-lang/cargo#8716
can be avoided in some scenarios by setting the rustflags via
rustc, which allows for faster rebuilds in such cases.
jschwe added a commit to jschwe/corrosion that referenced this issue Sep 10, 2022
For some scenarios it is better to not set Rustflags for all crates
in the dependency graph and instead only set it for the top-level
crate.
For example rust-lang/cargo#8716
can be avoided in some scenarios by setting the rustflags via
rustc, which allows for faster rebuilds in such cases.
jschwe added a commit to jschwe/corrosion that referenced this issue Sep 10, 2022
For some scenarios it is better to not set Rustflags for all crates
in the dependency graph and instead only set it for the top-level
crate.
For example rust-lang/cargo#8716
can be avoided in some scenarios by setting the rustflags via
rustc, which allows for faster rebuilds in such cases.
luckysori added a commit to get10101/10101 that referenced this issue Mar 8, 2023
We've used a similar pattern in past projects.

One thing that has changed is that we use
`[target.'cfg(feature = "cargo-clippy")']` instead of `[build]`. This
allows us to enforce this lint configuration locally and not just on
CI.

Using `rustflags` does mean that devs might encounter unexpected full
rebuilds when switching between IDE and command line when using `cargo
clippy`. This is a known issue[1]. For emacs, I just solved this by
setting a different Cargo `target-directory` to the one used by
default. A similar approach should work for other IDEs.

[1]: rust-lang/cargo#8716.
luckysori added a commit to get10101/10101 that referenced this issue Mar 8, 2023
We've used a similar pattern in past projects.

One thing that has changed is that we use
`[target.'cfg(feature = "cargo-clippy")']` instead of `[build]`. This
allows us to enforce this lint configuration locally and not just on
CI.

Using `rustflags` does mean that devs might encounter unexpected full
rebuilds when switching between IDE and command line when using `cargo
clippy`. This is a known issue[1]. For emacs, I just solved this by
setting a different Cargo `target-directory` to the one used by
default. A similar approach should work for other IDEs.

[1]: rust-lang/cargo#8716.
bors bot added a commit to get10101/10101 that referenced this issue Mar 9, 2023
235: Add Clippy lint workspace configuration r=luckysori a=luckysori

We've used a similar pattern in past projects.

One thing that has changed is that we use `[target.'cfg(feature = "cargo-clippy")']` instead of `[build]`. This allows us to enforce this lint configuration locally and not just on CI.

Using `rustflags` does mean that devs might encounter unexpected full rebuilds when switching between IDE and command line when using `cargo clippy`. This is a known [issue]. For emacs, I just solved this by setting a different Cargo `target-directory` to the one used by default. A similar approach should work for other IDEs.

[issue]: rust-lang/cargo#8716.

Co-authored-by: Lucas Soriano del Pino <[email protected]>
luckysori added a commit to get10101/10101 that referenced this issue Mar 9, 2023
We've used a similar pattern in past projects.

One thing that has changed is that we use
`[target.'cfg(feature = "cargo-clippy")']` instead of `[build]`. This
allows us to enforce this lint configuration locally and not just on
CI.

Using `rustflags` does mean that devs might encounter unexpected full
rebuilds when switching between IDE and command line when using `cargo
clippy`. This is a known issue[1]. For emacs, I just solved this by
setting a different Cargo `target-directory` to the one used by
default. A similar approach should work for other IDEs.

[1]: rust-lang/cargo#8716.
bors bot added a commit to get10101/10101 that referenced this issue Mar 9, 2023
235: Add Clippy lint workspace configuration r=luckysori a=luckysori

We've used a similar pattern in past projects.

One thing that has changed is that we use `[target.'cfg(feature = "cargo-clippy")']` instead of `[build]`. This allows us to enforce this lint configuration locally and not just on CI.

Using `rustflags` does mean that devs might encounter unexpected full rebuilds when switching between IDE and command line when using `cargo clippy`. This is a known [issue]. For emacs, I just solved this by setting a different Cargo `target-directory` to the one used by default. A similar approach should work for other IDEs.

[issue]: rust-lang/cargo#8716.

Co-authored-by: Lucas Soriano del Pino <[email protected]>
@jyn514
Copy link
Member

jyn514 commented Apr 22, 2023

Maybe there can be a way for rustc itself to tell cargo whether an option requires a re-compile? Something like rustc -Z needs-recompile "$RUSTFLAGS".

Another option is for cargo to inspect RUSTFLAGS and avoid rebuilding for flags it knows can only affect leaf crates or diagnostics, like -C link-args or -W lint flags. I know cargo has a policy of not inspecting RUSTFLAGS but I don't know the history or rationale for that decision.

@stephanemagnenat
Copy link

stephanemagnenat commented Jun 22, 2023

A note from a user: this issue also appears when one wants to build multiple target flavors of WASM, for example to have two versions, one basic and one with -C target-feature=+simd128 to be imported on runtime depending on the browser support.

@Tokarak
Copy link

Tokarak commented Jul 12, 2023

Maybe there can be a way for rustc itself to tell cargo whether an option requires a re-compile? Something like rustc -Z needs-recompile "$RUSTFLAGS".

Another option is for cargo to inspect RUSTFLAGS and avoid rebuilding for flags it knows can only affect leaf crates or diagnostics, like -C link-args or -W lint flags. I know cargo has a policy of not inspecting RUSTFLAGS but I don't know the history or rationale for that decision.

I agree that something like this should be done. As rust-lang/rust#110654 points out, current behaviour can be annoying.

First of all, one of the main focuses on the PR — changes in rustflags should not invalidate the cache. Bullet point 1 seems like the only way to solve it.

Go back to hashing RUSTFLAGS, but only in the -C extra-filename, and not -C metadata. This should (probably) avoid the PGO problems, since the symbols won't change. However, this could still cause problems with reproducible builds if the rlib filename somehow leaks into the resulting binary (I don't have a reason to think it does, but it seems risky).

As pointed out:

However, this could still cause problems with reproducible builds

I believe this does because the filenames of object files show up as the codegen unit names in dwarf debug info (and probably PDB?). I suspect that everything else in the binary will be the same, but builds differing only in extra-filename are likely to still produce different binaries.

There seems to be no road forward without refactoring something or adding something new - but what? For over 99% of users (it seems like just those who need reproducible builds AND care about the filenames of the units) the solution is one small tweak away. Can we expect the user to know if they care about filenames? Maybe this flag can propagate from the dependencies to the leaves?

What if units were simlinked (or hardlinked) to something not including the extra-filename? The links can be overwriten at neglible cost, while debuggers get a constant filename to work with. Maybe I'm missing the point of extra-filename.

Secondly, not all RUSTFLAGS values need to be fingerprinted.

Another option is for cargo to inspect RUSTFLAGS and avoid rebuilding for flags it knows can only affect leaf crates or diagnostics, like -C link-args or -W lint flags.

The great thing about this is that building a hashing blacklist is incremental and non-breaking. Even if some values are forgotten, it would still be better than now. It does not solve the problem for when something rebuilds, but it does make rebuilds happen less often. I see no downsides.

@epage
Copy link
Contributor

epage commented Jul 12, 2023

I know cargo has a policy of not inspecting RUSTFLAGS but I don't know the history or rationale for that decision.

CLI parsing is complex and having cargo parse another program's CLI means cargo needs to duplicate the knowledge of that other programs CLI. Trying to parse a subset of known flags has its own caveats that for when it falls down flat. Compared to rustup, we at least have the advantage that these CLIs ship together.

@epage
Copy link
Contributor

epage commented Jul 12, 2023

Another angle to this is that for #5931, I'd like us to avoid dealing with UX problems of the cache being poisoned, so I'd favor accidentally including too much in the fingerprint than not enough.

@Tokarak
Copy link

Tokarak commented Jul 12, 2023

I know cargo has a policy of not inspecting RUSTFLAGS but I don't know the history or rationale for that decision.

CLI parsing is complex and having cargo parse another program's CLI means cargo needs to duplicate the knowledge of that other programs CLI. Trying to parse a subset of known flags has its own caveats that for when it falls down flat. Compared to rustup, we at least have the advantage that these CLIs ship together.

Does this mean that currently changing the order of rustflags, or changing -C to --codegen, etc., will change the fingerprint?

@epage
Copy link
Contributor

epage commented Jul 12, 2023

Does this mean that currently changing the order of rustflags, or changing -C to --codegen, etc., will change the fingerprint?

From my reading of the code, yes, superficial changes like that will cause rebuilds.

One of the potential solutions in the original issue was to provide first-class support for some of the features. #12115 is an example of doing that and came from the pain I had from trying to use RUSTFLAGS for lint levels. Granted, that protects against order changes and -A vs --allow changes but we still fingerprint lints. For some, they would rather not. For others, they would expect that adding a new lint would fail the build if tit should trigger so we need to force a rebuild. One nice aspect is that it isn't recursive so for #5931 it won't cause extra dependencies to rebuild if you change your lint levels.

@RobJellinghaus
Copy link

RobJellinghaus commented Sep 6, 2023

Given that the current semantics of RUSTFLAGS is "this affects every crate you ever build anywhere" down to the leafmost level, it's hard to see how to change that without breaking everyone.

But my question is, isn't the current behavior the biggest possible hammer? Could a solution involve providing a limited-scope version of RUSTFLAGS? For example, if you wanted your RUSTFLAGS setting to affect only your mybinary crate, perhaps something like RUSTFLAGS_CRATE_MYBINARY to set flags for that crate only? All other crates would ignore that environment variable.

I don't know whether this scales to many of the RUSTFLAGS scenarios, and I'm handwaving about things like multiple registries, but certainly for anything affecting rootmost binary linking, it seems viable?

Edit: Thinking about it more, this is architecturally orthogonal to the issue of not wanting some particular flags to affect the symbol hashes of the crate, because that's an issue for the root binary. So this idea maybe helps with reducing the "blast zone" of RUSTFLAGS, but not the hash stability issue.

@RobJellinghaus
Copy link

RobJellinghaus commented Sep 7, 2023

Another variant of the above idea would be RUSTFLAGS_BINARY which would be RUSTFLAGS that apply to binary crates only. This would cover cases like PGO where you're just altering the final link.

We could potentially even say that ordinary RUSTFLAGS are included in the metadata/hash again, but that RUSTFLAGS_BINARY are not, and are treated like PGO flags today. This would let us get back to side-by-side cacheability for the relatively few cases where you really need RUSTFLAGS that apply to every crate you're building, while preserving the current property that PGO flags don't affect the metadata/hash.

I know it's another knob and we hate those, but really we need something less pervasive than RUSTFLAGS don't we?

@weihanglo
Copy link
Member

weihanglo commented Sep 7, 2023

Another variant of the above idea would be RUSTFLAGS_BINARY which would be RUSTFLAGS that apply to binary crates only. This would cover cases like PGO where you're just altering the final link.

cargo rustc can already achieve that, no?

Edit: perhaps this as well rust-lang/rfcs#3310? But if you only concern binaries not final artifacts, my point is not relevant.

@RobJellinghaus
Copy link

RobJellinghaus commented Sep 7, 2023

You're absolutely right, that RFC is very much related to this idea. I'm not sure which one would work better for more scenarios, but I'll consider it.

cargo rustc can already achieve that, no?

It seems that it can, but I didn't know about it and it seems little used in the community, or else why would the "PGO options in RUSTFLAGS" be such a problem? Is there some reason why cargo rustc hasn't already been used for PGO options more commonly?

Your point's well taken though, and I wonder whether cargo rustc is actually a better substitute for the "RUSTFLAGS for the root crate" RFC.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-rebuild-detection Area: rebuild detection and fingerprinting A-rustflags Area: rustflags E-hard Experience: Hard S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted.
Projects
None yet
Development

No branches or pull requests

8 participants