Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Add target configuration #2991

Closed
wants to merge 2 commits into from
Closed

Conversation

nvzqz
Copy link
Contributor

@nvzqz nvzqz commented Sep 27, 2020

This proposes a new cfg: target, which matches the entire target triple string (e.g. arm-unknown-linux-gnueabihf). This also adds a CARGO_CFG_TARGET environment variable for parity with other CARGO_CFG_* variables.

Rendered

increases compile time and makes a crate incompatible with certain build
systems.

Otherwise, all available components would need to be specified separately:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vaguely remember that the target isn't necessarily the concatenation of the various target_{foo} parts. That said, I'm having trouble finding the part of the docs that says this...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x86_64-fuchsia is an example (it doesn't concatenate unknowns into the result).

cc @petrhosek @tmandry

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clang/LLVM treats empty components, none and unknown as equivalent, so for example x86_64-fuchsia, x86_64-unknown-fuchsia and x86_64-none-fuchsia are all considered equivalent and internally normalized to x86_64-unknown-fuchsia.

We prefer the shortest spelling for convenience, but if Rust prefers always using normalized triples, we could switch to using x86_64-unknown-fuchsia, I'd be fine with that. This wasn't possible in the past when LLVM normalized empty components inconsistently, but that issue has since been resolved.

@jonas-schievink jonas-schievink added A-cfg Conditional compilation related proposals & ideas A-target Target related proposals & ideas T-lang Relevant to the language team, which will review and decide on the RFC. labels Sep 27, 2020
- `"x86_64-pc-windows-msvc"`
- `"x86_64-unknown-linux-gnu"`

# Drawbacks
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another drawback might be that technically this could be a breaking change with anybody that conditionally adds a --cfg target=... cfg either via RUSTFLAGS or cargo:rustc-cfg=target=... from a build script.

This goes for your target_abi PR also.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so if a person's build.rs had lines like say

  let target = env::var("TARGET").expect("Couldn't read `TARGET`");
  println!("cargo:rustc-env=TARGET={}", target);

Then technically they'd hit a small break ;P

Copy link
Member

@thomcc thomcc Sep 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that should be fine. The code would have to be:

let target = env::var("TARGET").expect("Couldn't read `TARGET`");
println!("cargo:rustc-cfg=target={}", target);

in order to have a break... Actually, this would define the target identically to what it was before, so they'd have to define it as something different (unless their definition would take priority?), or only in some cases, I guess.

This is a problem adding any new predefined cfg faces, and stagnating the set of cfgs feels pretty undesirable to me, so I don't know if it's a case worth caring about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The string is equal to the TARGET environment var anyway, so target would not change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's what I meant by:

Actually, this would define the target identically to what it was before, so they'd have to define it as something different (unless their definition would take priority?), or only in some cases, I guess.

It might be a breaking change for:

let target = env::var("CARGO_CFG_TARGET_ARCH").expect("Couldn't read `CARGO_CFG_TARGET_ARCH`");
println!("cargo:rustc-cfg=target={}", target);

or

if some_condition {
    let target = env::var("TARGET").expect("Couldn't read `TARGET`");
    println!("cargo:rustc-cfg=target={}", target);
}

But these are pretty obscure cases...

string (e.g. `arm-unknown-linux-gnueabihf`). This also adds a `CARGO_CFG_TARGET`
environment variable for parity with other `CARGO_CFG_*` variables.

# Motivation
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the motivation section provides enough of a compelling argument for why this is desirable. Fleshing this out with specific examples of existing crates that use workarounds to achieve this and why they do so would be useful.

# Drawbacks
[drawbacks]: #drawbacks

- Configuring against specific targets can be overly strict and could make
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the biggest drawback in my experience. People will often reach for the most accessible way to target a platform even if it's the wrong one. We saw this a lot in the Firefox codebase where people would use the #ifdef for "using the GTK widget implementation" when they really meant "this is Linux" or other similar things.

`aarch64-unknown-none-softfloat`, yet one would likely want to include ABI
variants. The same concern applies to the target vendor.

A potential solution would be to allow glob matching (e.g.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How feasible would it be to allow specifying individual components of the target triple string by keyword inside a target cfg, something like:

#[cfg(target(arch="aarch64", os="none"))]

which would be equivalent to:

#[cfg(all(target_arch="aarch64", target_os="none"))]

?

I definitely agree with the assertion that "matching the entire target with the current set of cfg options is extremely verbose", but I'm not convinced that "allow matching the entire target string directly" is the right way to address that.

@ehuss
Copy link
Contributor

ehuss commented Oct 1, 2020

One problem with matching against target names is that it doesn't work for JSON targets. This is a problem with the standard library which has several build scripts when look at the target name, and that is something we'd specifically like to move away from. I think it would be a problem if it was allowed to use them in cfg expressions, as that makes it harder to forbid using target names.

It might be helpful to collect data on how common it is to have cfg expressions that match more than 3 target fields or otherwise needs to match the full target name. In my experience, that seems rare.

@thomcc
Copy link
Member

thomcc commented Oct 1, 2020

it doesn't work for JSON targets

Can they not specify their own target name? Or even if they do does it not work (which would sound like a bug to me)

@ehuss
Copy link
Contributor

ehuss commented Oct 1, 2020

The problem is that the target name is whatever the filename is minus the .json extension. You can name it similarly to whatever you are trying to use (for example, x86_64-unknown-linux-gnu.json), but that is awkward and can be misleading if it isn't exactly the same as the named target (and something few people know about). To make JSON targets work properly, nothing should be inspecting the target name.

@withoutboats
Copy link
Contributor

@rfcbot merge

We discussed this in lang today and we feel this is a pretty obvious extension based on the features we currently have. That is, we let you cfg on each member of the triple, we should also let you cfg on the triple as a whole.

We also wanted to tag @rust-lang/cargo to be aware of this RFC because it adds a cargo env variable.

@rfcbot
Copy link
Collaborator

rfcbot commented Oct 28, 2020

Team member @withoutboats has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. labels Oct 28, 2020
@nvzqz
Copy link
Contributor Author

nvzqz commented Oct 30, 2020

@withoutboats did the team discuss @luser's suggestion of #[cfg(target(arch="aarch64", os="none"))]? If so, I'm curious about what the conclusion was. I'm still in favor of matching the entire target triple (as described in the RFC).

@newpavlov
Copy link
Contributor

Relevant issue: rust-lang/rust#63217

@nikomatsakis
Copy link
Contributor

@nvzqz

We didn't talk about that, actually, and I think I'd like to see the RFC extended to cover the alternative. but before that, I'd like to hear your thoughts on why you prefer to match the entire target string (I have some thoughts but I'd be curious to hear yours first).

I see @luser also raised the concern that this makes matching the entire target string too accessible, and that folks will reach for it first. Plausible, though I think ultimately I'm not too worried about it. It seems like something to solve in documentation, by emphasizing the more accessible forms first, and presenting "match the precise target ABI" as a kind of fallback.

@nikomatsakis
Copy link
Contributor

@rfcbot concern match-target-string
@rfcbot reviewed

I'm going to go ahead and register a concern that we ought to include @luser's proposed form in the alternatives (at least) and describe why we chose not to adopt it.

@luser
Copy link

luser commented Nov 2, 2020

I see @luser also raised the concern that this makes matching the entire target string too accessible, and that folks will reach for it first. Plausible, though I think ultimately I'm not too worried about it. It seems like something to solve in documentation, by emphasizing the more accessible forms first, and presenting "match the precise target ABI" as a kind of fallback.

On further thought, cargo does already allow matching on the full target for specifying platform-specific dependencies, so perhaps it's not a huge problem in practice. Given that we do already have most of the components of the target already available for #[cfg(...)] usage it seems likely that most people will continue to use the individual component that meets their needs.

@thomcc
Copy link
Member

thomcc commented Nov 14, 2020

Worth noting that the arch isn't really the first part of the target string always. e.g. target_arch = "x86" is true for i586-unknown-linux-gnu and i686-unknown-linux-gnu.

@ehuss
Copy link
Contributor

ehuss commented Dec 4, 2020

On further thought, cargo does already allow matching on the full target for specifying platform-specific dependencies, so perhaps it's not a huge problem in practice.

As I mentioned above, the fact that Cargo allows matching on the target name causes problems today. I personally think this would be an anti-pattern (that is, I think it should be forbidden in libstd for example). I don't see any data or examples for a motivation for this. The vast majority of cfg expressions I've seen only match on one or two components.

@newpavlov
Copy link
Contributor

I don't see any data or examples for a motivation for this. The vast majority of cfg expressions I've seen only match on one or two components.

One example for which I wanted target cfgs is special-casing the wasm32-unknown-unknown target in the getrandom crate and in the attempt to integrate it into std. Unfortunately this target is quite "special" in the sense that, in my opinion, it should not have had std implemented for it, since strictly speaking it's a bare bone target without any information about execution environment. To be feature-proof against potential future WASM targets (such as WASI), it would be really nice to properly special-case its handling.

@joshtriplett
Copy link
Member

Following up on this: I share the same concern as @nikomatsakis, and I think I'd prefer that alternative. In particular, if this is primarily about wasm32-unknown-unknown, I would rather see that matched via a combination of target_family="wasm" and target_os="none", with some shorthand to make that simpler.

@joshtriplett
Copy link
Member

@rfcbot concern shorthand-for-target-by-components

I would prefer the proposed alternative that provides a shorthand.

@thomcc
Copy link
Member

thomcc commented Jun 1, 2021

As mentioned the shorthand is similar, but not exactly the same. The first component isn't quite the target arch, and it's valuable to match on it in many cases.

@nikomatsakis
Copy link
Contributor

@thomcc are you referring to this comment? Can you elaborate on what is enabled by this approach that would not otherwise work? Sorry if you're repeating yourself.

@m-ou-se
Copy link
Member

m-ou-se commented Jun 9, 2021

For most cross-platform libaries in the ecosystem it's probably an antipattern to match on a specific target triple. But for other software it's quite common that it's built for two or three specific targets (e.g. a company making software for specific hardware). In those cases, it would be a lot more readable to match exactly on the few target triples that are supported, rather than having to translate those into separate components.

(And stabilizing cfg(target = "a-b-c") would still allow a future cfg(target(x = "a", y = "b", z = "c")).)

@joshtriplett
Copy link
Member

We discussed this in today's @rust-lang/lang meeting.

Our consensus was that we'd like to see both the ability to match on full target strings as proposed in this RFC, and the shorthand to match components of the target.

We'd like to see the shorthand proposal added to the RFC: cfg(target(a="x", b="y")) should be equivalent to cfg(all(target_a="x", target_b="y")). With that added, I think all the concerns can be dropped and this RFC should get merged.

@m-ou-se
Copy link
Member

m-ou-se commented Jun 15, 2021

Though, that means that cfg(target_somenewproperty = "123") will keep working (just evaluate to false) in older versions before that flag was added, while cfg(target(somenewproperty = "123")) would break older versions. But I suppose it already breaks in the period during which target_somenewproperty is behind a nightly-only feature gate, and produces an error on stable.

@joshtriplett
Copy link
Member

We'd like to see the shorthand proposal added to the RFC: cfg(target(a="x", b="y")) should be equivalent to cfg(all(target_a="x", target_b="y")).

Note that it should probably only do that for target_ flags that actually exist. cfg!(target_a = "hey") compiles fine right now, without even a warning.

Agreed. I was trying to briefly describe the shorthand, but yes, this should only work for target_ predicates that exist.

@nikomatsakis
Copy link
Contributor

My assumption is that this would be true only for the target_foo properties that the compiler defines, not some random properties the user may have defined on their own.

@nagisa
Copy link
Member

nagisa commented Jun 20, 2021

Presuming we added an alias to x86_64-unknown-linux-gnu that looks like x86_64-pc-linux-gnu, would #[cfg(target="x86_64-unknown-linux-gnu")] still match the latter? Would x86_64-pc-linux-gnu match the former?

If we made, within the compiler, the target code actually attempt to understand the triple string, and e.g. --target=x86_64-linux would select the usual linux target, would all the various cfg(target="...") values correctly get set, still?


I would also like to see the concern about json targets from above figured out before this merges.

@joshtriplett
Copy link
Member

joshtriplett commented Jun 20, 2021

@nagisa I would propose that x86_64-pc-linux-gnu become an indistinguishable alias to x86_64-unknown-linux-gnu as early as possible, such that Rust code can't tell the difference, as if the target was passed as x86_64-unknown-linux-gnu to begin with. target_vendor="unknown" should match and target_vendor="pc" shouldn't. target="x86_64-unknown-linux-gnu" should match and target="x86_64-pc-linux-gnu" shouldn't.

I'm not necessarily suggesting we should start elaborating/canonicalizing target strings in general; this (and similarly i686-pc-linux-gnu -> i686-unknown-linux-gnu) is a specific target alias that I think we should add because it's widespread and other tools understand it, and some Linux distributions use it and expect it to work.

@nagisa
Copy link
Member

nagisa commented Jun 20, 2021

The pc ←→ unknown was just an example I had on hand, since I remembered the RFC during the discussion about potentially making these aliases.

My concern is broader though – I would like this RFC to clarify what the plan would be if we made the relationship between target string and target definition not a (usually) one-to-one mapping, but rather a (usually) many-to-one mapping.

In reaction to your response, I would also like to posit that it would be quite confusing if #[cfg(target="x86_64-pc-linux-gnu")] did not activate the code this attribute is associated with even after user has specified --target=x86_64-pc-linux-gnu. It'd feel a lot like a leak of an implementation detail, to be honest.

@thomcc
Copy link
Member

thomcc commented Jun 20, 2021

It'd feel a lot like a leak of an implementation detail, to be honest.

Perhaps, but it's really the only way that this RFC stays viable — otherwise there are a potentially unbounded number of target strings that you'd have to match. I think there needs to be a notion of a "canonical" name for the target, which the other names are normalized to (as Josh said) as early as possible.

That said, I kind of feel like it's plausible that this is not this RFC's responsibility, so much as the responsibility of whatever RFC causes the currently one-to-one mapping to become a many-to-one mapping. That said, I don't feel that strongly on that point

@chorman0773
Copy link

chorman0773 commented Jun 20, 2021

@nagisa I would propose that x86_64-pc-linux-gnu become an indistinguishable alias to x86_64-unknown-linux-gnu as early as possible, such that Rust code can't tell the difference, as if the target was passed as x86_64-unknown-linux-gnu to begin with. target_vendor="unknown" should match and target_vendor="pc" shouldn't. target="x86_64-unknown-linux-gnu" should match and target="x86_64-pc-linux-gnu" shouldn't.

I'm not necessarily suggesting we should start elaborating/canonicalizing target strings in general; this (and similarly i686-pc-linux-gnu -> i686-unknown-linux-gnu) is a specific target alias that I think we should add because it's widespread and other tools understand it, and some Linux distributions use it and expect it to work.

This wouldn't work if you were matching target strings for toolchain purposes, though (however, at that point, it may or may not be reasonable to just use/forward the TARGET environment variable).

@joshtriplett
Copy link
Member

I honestly think it'd be a leak of an implementation detail if you could detect which non-canonical alias were used. I think if we're going to have aliases at all (rather than just saying that you must pass the canonical name of the target), then those aliases should be translated to the canonical target name as early as possible, and then only that canonical target name and its components should ever be exposed to Rust code.

@chorman0773
Copy link

chorman0773 commented Jun 21, 2021

I honestly think it'd be a leak of an implementation detail if you could detect which non-canonical alias were used. I think if we're going to have aliases at all (rather than just saying that you must pass the canonical name of the target), then those aliases should be translated to the canonical target name as early as possible, and then only that canonical target name and its components should ever be exposed to Rust code.

What would you do, then, if you wanted to access the toolchain corresponding to the actual target? If I'm invoking with --target x86_64-linux-gnu (hypothetically), and I want ld for host, the program I need to find is x86_64-linux-gnu-ld not x86_64-unknown-linux-gnu-ld. Though I guess at this level it would be something for reading the TARGET environment variable from cargo and possibly passing it into the compilation process (though, depending on choices made for aliasing targets, using CARGO_CFG_target could be a better choice).

@joshtriplett
Copy link
Member

joshtriplett commented Jun 21, 2021

What would you do, then, if you wanted to access the toolchain corresponding to the actual target?

Cargo has a target.(target string).linker option; use that. If we need to expose that in more places, we should. We should also have an equivalent for the C compiler. And we could improve our handling of that to automatically find it in more cases. But build scripts should not need to guess.

If I'm invoking with --target x86_64-linux-gnu (hypothetically), and I want ld for host, the program I need to find is x86_64-linux-gnu-ld not x86_64-unknown-linux-gnu-ld.

If those both exist and refer to different programs, something has gone terribly wrong. Until we have automatic detection, it may make sense for a build script (or more likely its helper library, such as cc-rs) to simply check for names in order until it finds one that exists. For instance, try x86_64-unknown-linux-gnu-ld and fall back to x86_64-pc-linux-gnu-ld.

@chorman0773
Copy link

chorman0773 commented Jun 21, 2021 via email

@joshtriplett
Copy link
Member

joshtriplett commented Jun 21, 2021 via email

@chorman0773
Copy link

I'd also note that this applies to things other than the basic toolchain (linker, assembler, c/++ compiler). I highly doubt cargo has need to expose things like objcopy, or custom tools.

@luser
Copy link

luser commented Jun 21, 2021

What would you do, then, if you wanted to access the toolchain corresponding to the actual target? If I'm invoking with --target x86_64-linux-gnu (hypothetically), and I want ld for host, the program I need to find is x86_64-linux-gnu-ld not x86_64-unknown-linux-gnu-ld.

This feels very much out of scope for this RFC. It would be great for cargo to provide access to the various parts of the toolchain (I'm quite sure I've filed issues for similar things in the past), but that's an entirely separate problem.

@joshtriplett
Copy link
Member

We talked about this in today's @rust-lang/lang meeting. We decided that this RFC doesn't depend on any specific decision about how to handle target aliases, and we can discuss how to handle target aliases with target_vendor and target matchers in the future if and when we add target aliases.

@nikomatsakis
Copy link
Contributor

@rfcbot resolve match-target-string

@cramertj
Copy link
Member

@joshtriplett did you intend to resolve your concern?

@joshtriplett
Copy link
Member

joshtriplett commented Jun 25, 2021

@cramertj No, per #2991 (comment) I'm waiting for the RFC to be updated to cover the cfg(target(os="xyz", arch="abc", family="def")) shorthand.

@nikomatsakis
Copy link
Contributor

@rfcbot fcp cancel

This hasn't moved in a long enough time that I am going to cancel the FCP. I'd love for someone to pick this up and make the edits that @joshtriplett suggested, perhaps in a separate PR, it seems like we're very close to getting resolution here!

@rfcbot
Copy link
Collaborator

rfcbot commented Oct 19, 2021

@nikomatsakis proposal cancelled.

@rfcbot rfcbot removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. labels Oct 19, 2021
@GuillaumeGomez
Copy link
Member

We just found a use case for it in rust-lang/log#490. I'll try to pick this up because we need it.

@GuillaumeGomez
Copy link
Member

I opened #3239 and added what was requested.

@nvzqz
Copy link
Contributor Author

nvzqz commented Mar 18, 2022

I am closing this in favor of #3239 since I have not been shepherding this RFC very well, and I trust @GuillaumeGomez to follow it through to acceptance.

@nvzqz nvzqz closed this Mar 18, 2022
@nvzqz nvzqz deleted the cfg-target branch March 18, 2022 18:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-cfg Conditional compilation related proposals & ideas A-target Target related proposals & ideas T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.