-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add exhaustive_match_output
RFC
#3340
Conversation
I love it. It's simple, self-explanatory even if you haven't seen it, and helps with fiddly parsers/decoders. |
Co-authored-by: Josh Triplett <[email protected]>
The alternatives section misses the obvious current solution: the dead code lint, which warns that "variant Also, since it doesn't affect the match semantics and is just a lint, it should be introduced as a lint. Perhaps an allow-by-default How does it interact with |
Good point, I'll mention it. As you say, the dead code lint doesn't cover all such scenarios like this so I think this attribute is still useful.
I considered this, but it seemed like a poor fit for clippy. Most clippy lints seem like you're intended to turn them on for the whole project, not just one match statement.
Perhaps this should be made incompatible with external |
A built-in attribute is a significant burden. If something can be made just a lint, I see no reason not to make it a lint. It's true that I can't think of any lints which are expected to be attached to a specific expression, but that's mostly because you don't normally know where an error could occur. In principle, I don't see a reason why a lint couldn't work that way. In fact, now that you mention it, I see even stronger reasons to make it a lint. Why not attach the attribute to the whole function, so that several individual matches inside of it (e.g. in different branches) can be checked for exhaustive outputs? Why not attach it to a module, or even to the crate? There are low-level crates and modules which mostly consist of several huge definitions of data enums, and their parsing/serialization logic. We could enable the lint once at the crate root, and perhaps disable it on several specific functions which are not expected to give exhaustive outputs. |
Is there something I can read to understand why this is? |
Ok, maybe "significant" is an over-exaggeration on my part (but I really think so). My opinion is that the attributes are accepted too lightly, without giving proper weight to the fact that each attribute is a small language extension, a burden on all readers of the code (which can't be expected to know by heart all attributes in the Rust docs, and there is no way to search for attribute definition in code) and that there is no way to manage the complexity of growing number of attributes, since they all exist at the global scope and can't be abstracted in any way. In particular, all built-in attributes can conflict with any users' custom attributes (either proc macros, or their auxiliary attributes). For that reason, if it's possible to scope the attribute, then it's best to do it. For example, if I see In my opinion, it's just too easy to dump any language extensions into attributes, since you don't need to invent extra syntax or think about interaction with alternative parsers (in macros or tooling). This causes a tendency to push all language extensions into attributes, turning them into that weird meta-Rust, with horrible syntax and potentially unforseen interactions with other language features (it's just so easy to forget about some obscure attribute which may not be present in code, rather than stable syntax, which is at least explicitly defined in the parsers and the reference). |
This is true of most abstractions yes? Abstractions, be they functions, macros, attributes, syntax, or lints have to earn their place. The overhead involved in understanding them needs to be worth the power or expressiveness that they introduce. I can express the documentation for this particular attribute in a paragraph or two, and it helps with some really common use cases, namely parsers/decoders of any sort. In order to use mental burden as justification for rejecting the RFC I think we need to demonstrate two things.
I don't believe we have justification for point 2, parsers/decoders are very common. As for point 1, I don't have much ideological issue with the lint living in some other tool, such as I agree it'd be useful to namespace |
lints without a prefix are implicitly rustc lints. it's not a great system, but it's the system. As to this being in clippy or rustc, in general it's always fine for any lint of a more narrow nature to start in clippy and graduate to rustc if it turns out to be essential. Or it can live forever in clippy, there's not much harm in that either. |
That makes sense to me and I'm willing to transition this to a clippy lint, if someone involved with a |
Josh T, the first to reply in this PR, is on T-lang if that's what you mean. I'm sure they'll chime in again when available. The contribution guide for clippy is here: https://github.com/rust-lang/rust-clippy/blob/master/CONTRIBUTING.md And the most important detail of clippy development, at least for this discussion, is that it does not have tightly controlled requirements for new lints. You can just open an issue in the repo about a new lint idea, and generally it can be added to clippy without any huge fanfare or oversight process, just a fairly normal PR review cycle. |
I like it! My only concern is that, as an attribute, it's not very discoverable but that is something that needs to be fixed in other places anyways imo, so it shouldn't be a blocker. |
one reason to not just have this be a lint is that a lint would apply to all match statements rather than just the one match statement with the attribute, e.g.: pub enum MyEnum {
ZeroZero,
ZeroOne,
One,
Two,
Other,
}
pub fn demo(iter: &mut impl Iterator<Item = u8>) -> Option<MyEnum> {
Some(#[warn(exhaustive_match_output)]
match iter.next()? {
0 => match iter.next()? {
// undesirably warns on inner match, because lint attributes apply to
// everything inside rather than just what's directly attributed.
0 => MyEnum::ZeroZero,
1 => MyEnum::ZeroOne,
_ => MyEnum::Other,
},
1 => MyEnum::One,
2 => MyEnum::Two,
_ => MyEnum::Other,
})
}
pub fn demo2(iter: &mut impl Iterator<Item = u8>) -> Option<MyEnum> {
Some(#[exhaustive_match_output]
match iter.next()? {
0 => match iter.next()? {
// doesn't warn on inner match -- as desired
0 => MyEnum::ZeroZero,
1 => MyEnum::ZeroOne,
_ => MyEnum::Other,
},
1 => MyEnum::One,
2 => MyEnum::Two,
_ => MyEnum::Other,
})
} |
Can't we just make it an allow by default lint if it shouldn't apply to all match statements by default? |
I think the issue here is that when you apply the deny attribute to the outer match, it also cascades into the inner match. |
@programmerjake I'd say it's very debatable whether your example is working as intended in either case. |
That may be, but the point they were trying to illustrate still stands. Here's a different example pub enum MyEnum {
Zero(Inner)
One,
Two,
Other,
}
pub enum Inner {
Zero,
One,
Two,
Other,
}
pub fn demo(iter: &mut impl Iterator<Item = u8>) -> Option<MyEnum> {
Some(#[warn(exhaustive_match_output)]
match iter.next()? {
0 => MyEnum::Zero(match iter.next()? {
// undesirably warns on inner match, because lint attributes apply to
// everything inside rather than just what's directly attributed.
0 => Inner::Zero,
1 => Inner::One,
_ => Inner::Other,
}),
1 => MyEnum::One,
2 => MyEnum::Two,
_ => MyEnum::Other,
}
)
}
pub fn demo2(iter: &mut impl Iterator<Item = u8>) -> Option<MyEnum> {
Some(#[exhaustive_match_output]
match iter.next()? {
0 => MyEnum::Zero(match iter.next()? {
// doesn't warn on inner match -- as desired
0 => Inner::Zero,
1 => Inner::One,
_ => Inner::Other,
}),
1 => MyEnum::One,
2 => MyEnum::Two,
_ => MyEnum::Other,
}
)
} |
This is basically the other half of https://doc.rust-lang.org/nightly/rustc/lints/listing/allowed-by-default.html#non-exhaustive-omitted-patterns -- that was about matching things, and this is about producing things -- so my default intuition here is that this should work exactly the same: allow-by-default lint that will never be on-by-default, but can be turned on in specific situations as needed. I agree we should avoid attribute additions in general, especially when they're about static analysis and not semantics. There's an infinite number of possible things that might want attributes to mark them, so those things would be best as lints or tool attributes or ghost code whenever possible. |
# Motivation | ||
[motivation]: #motivation | ||
|
||
Oftentimes it is necessary to convert unrestricted values into enum variants. Unrestricted values can include integers, strings, UUIDs, even other much larger enums. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that integers are exhaustive -- we even use that in matching, where you don't need wildcards to match integers (or char
s, for that matter!).
So shouldn't this use that same exhaustiveness checking? What if I wrote
#[exhaustive_output]
match my_u8 {
0 => 255,
x @ 1..=254 => x,
255 => 1, // Oops, I meant `0`
}
Could that give me a warning that I wasn't exhaustive?
More plausibly, if we get something like #[niche]
, then arguably anything we can support for enums we should be supporting for niched integers too -- otherwise it's arguably better to #[repr(u8)] enum Foo { _1 = 1, _2 = 2, _3 = 3 }
than to use RefinedU8<1..=3>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose so, though from an implementation perspective this complicates things a lot as the compiler has to infer the possible outputs of x @ 1..=254 => x,
, and pretty much every other integer pattern.
This has been requested as a clippy lint previously. As mentioned in that issue as a lint it's hard to know what to lint on specifically, and the only suggested solution was some sort of attribute like proposed here. |
|
edit: nevermind |
This was suggested as an attribute specifically because the behavior it introduces is undesirable for many match statements. I don't see how it's possible or desirable to implement this without specifically marking the |
The RFC uses the phrase "arm(s) for (some) variant" (emph mine). I don't think this concept of an arm being "for" a variant is properly defined in the RFC. It should be. I guess the intended meaning is something like "the arm is precisely an enum struct expression for the variant, possibly in curly brackets". The handling of With that definition, the use cases are fairly limited. For example, if any of the arms need to contain an inner Under "alternatives", the possibility should be discussed of solving the problem ("we might forget to add an arm") through a A limitation of this proposal is that it can't be used where the match wants to return the enum value and also some other information (a separate categorisation, or some additional data, or something). An alternative which would do this would be a way to annotate an expression with a requirement that somewhere lexically within that expression, all variants of a nominated enum must be constructed. (I don't think this is a particularly good idea, but it should be in "alternatives" probably.) Ie, not limit it to |
Agreed! I consider an arm to be "for" the variant if the arm contains a return expression with that variant. This means it's technically possible for an arm to contain multiple variants, I think this is good. Now, my next question is "how far should the compiler go in trying to determine what variants might be returned by a given expression?" i.e. if an arm contains a function call which returns a variant, do we count all possible variants that function might return as well? I think there isn't a clear answer here. If we say "yes, the compiler should recursively enter functions in return position to determine what variants they return" then we have a few problems
However if we say "no, the compiler should consider a function call as though it were no variants" then the lint suddenly isn't compatible with helper methods or enum constructors. This feels like a pretty significant limitation, but maybe the better option is to just accept that this is how the lint must work.
I think probably the best answer here is "ignore the presence or absence of a guard when evaluating an arm". I believe the lint should be satisfied that it is possible to output a variant. It's not the responsibility of the lint to check your logic in the guard.
Right, and in the event that multiple enums are being returned it's not clear which enum this lint should apply to. We could arbitrarily pick one, but any choice must be arbitrary. So, I think the correct answer is "refuse the question". The return type of the match must be an enum, not a type containing an enum. I think it's likely that's why you say "I don't think this is a particularly good idea" Thanks for your feedback! |
What happens if:
It seems that either (1) the attribute must be placed on a match of the exact form |
My first thought on that is only supporting constant-expression right hands, since most cases like that can be rewritten like match (int, bool) {
(1, true) => Variant1,
(2, false) => Variant2,
...
} Which seems like it could be supported |
Actually apparently that was already discussed. For whatever reason my first look here had GitHub not listing any comments. But now my wider objection is that this is "warn if you don't cover all variants but only if there are no function calls, you don't return anything else, you never return multiple variants in a branch, and you don't use if guards" This seems more like something that should just be a normal lint, "it looks like you are trying to convert a whatever to this enum but you are missing..." wherever it can be applied already, rather than something where a human figures out where it might be applied and remembers to do so. If a human user is expected to remember it, that seems to me to be the same human-memory-space of remembering that strum exists, or that maybe you should search for a crate etc. So instead if we want this, I'd make the human user either globally allow it if it's on by default, allow it in specific cases if it's on by default, or let the specialized subset of people who are doing this frequently just enable it for their code. The only other Rust feature I can think of offhand with that many edge cases is const generics and I guess also GAT, but both of those are that way because they're not "finished" in some sense and not by design. Now all that said I would actually use this. I see the utility in having a mechanism, I just don't know that I agree with the mechanism because the mechanism is weirdly restricted and doesn't have a good model. I suppose one last alternative is to make the compiler warn on any match statement it can't fully handle, but that kind of feels like punting on the problem. Lastly, if we do this attribute, how is it being taught? This seems like the kind of thing that gets buried in the reference or something, though at this point perhaps someone has a complete C FFI guide that it'd fit into. |
So, I've thought about this some more and I've come to a few conclusions.
I recognize that I've described an incredibly complex feature here, and it will almost certainly need further clarification and definition. For starters, I'd like to get some feedback on how the community feels about my general ideas. |
i think it should not do analysis across function boundaries, however, if it does, imho there is no good reason to not analyze |
Okay, small clarification, that statement isn't meant to apply to |
The group should know I have no current plans to progress this RFC as-is. It's clearly not refined enough to describe clearly exactly what it is going to do and my attempts to refine it have led to multiple approaches with no clear winners. I am left wondering if this issue would be better resolved by macro crates in the crates io ecosystem because any possible solution to these issues is bound to be controversial for some reason or another. |
Rendered