-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Promote !
to a type.
#1216
Promote !
to a type.
#1216
Conversation
language they should apply equally well to `Never`/`Void` so I assume the old | ||
`ty_bot` was trying to be something crazier than this RFC's `!` (such as a | ||
subtype of all types, given the name). Could someone who was around back then | ||
clarify this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It had to be handled everywhere to prevent all sorts of weird errors and crashes, and it eventually became unmanageable.
Right now we mark an inference variable as diverging, to default it to ()
if unconstrained.
But it otherwise is a regular inference variable, and I believe you could fully emulate it after defaulting is implemented, e.g:
fn give_up<T = ()>() -> T {
panic!("I give up")
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What sort of weird errors and crashes and why don't we have the same problem with Never
/Void
? Am I right that ty_bot
was a bottom type in the sense of it being a subtype of all types?
It's the defaulting to ()
that I don't like. If an expression doesn't return any value then it doesn't return the value ()
. So how does it make sense to pretend it has type ()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might be able to find dozens of rust-lang/rust issues mentioning ty_bot
.
Yes, it was a bottom type, that's what "bot" and the bang sign refer to ("!" is supposed to look like "⊥").
Void
doesn't have significant issues because the compiler doesn't care: passing values of it around does not count as unreachable control-flow, nor can Void
be used in lieu of any other type.
Defaulting to ()
only happens if nothing constraints the type.
I guess you do run into trouble if you have, e.g. panic!() + 0
, as (): Add<i32>
does not hold.
We could prune obligations that do not need to be satisfied due to diverging inference variables and remove those paths from the CFG to avoid attempts at translating code that doesn't otherwise type-check.
But you have to be extra careful there, and you'd likely just run into corner cases everywhere, like we used to, although there's more tests now to get feedback from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it was a bottom type, that's what "bot" and the bang sign refer to ("!" is supposed to look like "⊥").
It's important to note that the !
I'm advocating here is not a bottom type. It behaves exactly like Void
except that it can be silently cast to any other type (which IMHO is still ugly but it's necessary for backwards-compatibility). I can only imagine a bottom type being a pain in the ass. You'd need to have Vec<!> <: Vec<i32>
, fn() -> ! <: fn() -> i32
etc.
Void doesn't have significant issues because the compiler doesn't care: passing values of it around does not count as unreachable control-flow,
It probably should. If we know code is dead it would be good to be able to prune it.
Defaulting to () only happens if nothing constraints the type.
Well it shouldn't, that doesn't make sense. It should default to an empty type like Void
or !
.
I guess you do run into trouble if you have, e.g.
panic!() + 0
, as(): Add<i32>
does not hold.
With this RFC you could implement !: Add<i32>
. More generally:
impl<T> Add<T> for ! {
type Output = !;
fn add(self, rhs: T) -> ! {
self
}
}
and panic!() + 0
would compile fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The subtyping was the easiest part of it all, just fwiw. It's like one line in the right type-relating match block.
Your magical silent cast cannot be achieved through coercion because coercions cannot be triggered in every possible corner case.
You're left with subtyping, either directly or through inference variables.
But really, what is the value in having unreachable code type-check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Ericson2314 Resolution is not "propagated", inference variables are being created and updated during the traversal of a function, and then the types of all the nodes and side-table miscellanea are resolved (which queries the state of the inference variables where necessary).
I don't understand why <rvalue> as _
is necessary, when just creating a new diverging inference variable would have the same effect, if diverging inference variables are always defaulted to !
instead of ()
.
However, that might be backwards-incompatible if any code is depending on the current default of ()
(via trait impls).
This is testable: create an enum BottomDefault {}
lang item with no impls and use that as the default instead of ()
.
Optionally, modify rustc::middle::traits::select
to consider BottomDefault: Trait
as ambiguous (because unconstrained blanket impls would otherwise match).
That is a pretty small change. Once you've made it, ask @brson to do a crater run to evaluate the impact on the ecosystem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@canndrew "Keep track of where every !
came from" - types don't carry around any such identifiable information, except for inference variables.
Also, the type of nodes cannot change after they've been type-checked, again due to information loss and a need for efficiency: you can't re-evaluate every constraint in the function to ensure nothing broke due to an introduced coercion.
Most constraints are transient, resulting from the expressions being checked, and thus are not kept anywhere.
Maybe a MIR desugaring would have an easier time with after-the-fact mutations, but they still would be quite inefficient.
What you are describing is almost exactly the current implementation using inference variables, but with defaulting to !
. See my above comment for more on that scheme.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolution is not "propagated", inference variables are being created and updated during the traversal of a function, and then the types of all the nodes and side-table miscellanea are resolved (which queries the state of the inference variables where necessary).
Ah, so quasi-unification in a side table as the traverse happens.
I don't understand why
<rvalue> as _
is necessary, when just creating a new diverging inference variable would have the same effect, if diverging inference variables are always defaulted to ! instead of ().
There is two things going on here. The first is limiting whatever magic we do to rvalues, which I think is good idea no matter what:
struct Baz;
struct Foo { bar: Baz }
fn asdf() {
let x: ! = panic();
x.bar = Baz; // WTF
}
The second is adding this coercion under the hood. I agree that is not necessary, and basically an implementation strategy that shouldn't be noticeable. But I suspect it might actually help with the implementation: as opposed to have seemingly normal expressions be type checked rather oddly, we associated all the magic with one AST/IR node (the conversion).
Also, I'd immediately like to immediately deprecate the coercion once ! becomes a real type. Perhaps if panic!
and friends can be changed to be for<T> fn() -> T
, there ergonomic fallout will be acceptable. Pruning the no-op ! -> !
conversions from the AST/IR, and then warning on the others seems like the easiest way to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would x.bar = Baz;
compile, given that x
has an undetermined type?
Maybe you could create such an example with a trait that has only one impl, as we assume that impl applies unconditionally.
Again, I don't see the point of introducing conversions to inference variables instead of instantiating !
as inference variables that default to !
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, didn't realize it wouldn't default to Foo
, but remain unconstrained. How about this?
struct Baz;
fn foo(_: &Baz) { }
fn bar(_: Baz) { }
fn asdf() {
let x: ! = panic!();
foo(&x) // WTF
bar(x) // still arguably WTF, but needed for back-compat
}
Overall I'm super for this! One thing to keep in mind: let x = SOMETHING as *mut !;
let y = unsafe { *x }; I suppose it's UB, but this could be some especially bad UB if the code is considered unreachable. |
@Ericson2314 It would be more like let x = unsafe { SOMETHING } as *mut !;
let y = *x; If you have a You could write something like: let x: () = ();
let y: ! = unsafe { mem::transmute(x) };
match y { // undefined behaviour! Where are we executing now!?
} (Assuming rust continues to treat empty types as having size 0). But you can already do that with enum Wub {
A = 0,
B = 1,
}
let x: u8 = 123;
let y: Void = unsafe { mem::transmute(x) };
match y { // undefined behaviour! Where are we executing now!?
Wub::A => ...
Wub::B => ...
} |
@canndrew I think @Ericson2314's I'm also +1000 for this proposal, presuming we can achieve some clarity on what the actual problems with the old |
Sorry, that was careless reading. I saw |
From a type-theoretic perspective, I absolutely agree Rust should have an empty type just like it has a singleton type. It should also be consistent about not calling |
@RalfJung Where is |
@eddyb : Certainly above someone wrote "Currently empty types are represented as ()". I did not check the documentation, and if Rust is already consistent about this, and it's just about doing more education, even better! |
@RalfJung Ah, that explains the confusion. The phrase "X is represented as Y" is not the same as "X is synonymous with Y". In particular, it is a particular implementation artifact that empty types are represented as having zero-size (and thus have the same quote-unquote 'representation in memory" (which is a bit of a joke since they do not actually have any bits in the memory at all)). This means for example that:
Update: see also for examples of such "fun": |
I would say that the phrase "representation in memory" is pretty much meaningless for empty types. A representation says for each value of the type how it is stored (a function from values to lists of bytes, for example). If there is no value, there's not much to talk about (a function from the empty set to something, there is exactly one unique function like that). In particular, A If this is all done consistently, there shouldn't be any problems. In fact, there's an entire class of languages that crucially relies on this kind of behavior for empty types, namely proof systems like Coq or Agda. These languages don't have unsafe operations, but by making the creation of an inhabitant of the empty type UB, that part is taken care of. |
I think the sentiment of this RFC is reasonable, but it is underspecified and not a particularly pressing problem. I agree with the idea that a bottom type might be an interesting extension, but reasoning thoroughly about it seems to add significant complexity, with relatively little benefit -- particularly since one can model it using your own enum. Moreover, the coercion/subtyping semantics from the RFC seem quite underspecified. In general, the code for our coercion and subtyping is too complex as it is, and I am very reluctant to try adding extensions for this use case, at least until we done a better job of formalizing and isolating the coercions we already do. |
I don't understand the details of the compiler well enough to specify how the coercion/subtyping stuff could work. Either someone who knows more would have to expand the RFC or I'll have to have a crack at implementing it myself and see what I can come up with - but I'll need help with that. Also, from my (probably niave) point of view I'm not convinced that this makes things more complicated. Rust already has empty types but it also this |
|
My understanding of type theory is limited, but it seems to me you're confusing the bottom type and the void type. The bottom type is for functions that never return, thus where The return type of I see the value of your proposal to allow |
I would argue that the void type and the unit type are both the same: They are inhabited (functions returning it can return), but there's no information to be gained from an inhabitant of the type. (The Void type is not a common term in type theory, but the empty type and the unit type are.) The bottom type is the type that has every type as a subtype, whereas the top type is the type that is a subtype of every type. A priori, both of these terms are only about the subtyping relation, and not at all about how or whether the type is inhabited.
EDIT:
Well, actually, both of these are fine. The point is, the code will never come to the point where |
And Also it's worth noting that the type proposed in this RFC is not the bottom type. The bottom type is a subtype of all types so making |
Regardless of where the "fault" lies, it seems clear that adding further coercions to the existing system will increase complexity. At the end of the day, I don't think there is sufficient motivation for such a change at this time -- particularly as we are unsatisfied with the existing coercion code and rules today, which are not capable of inferring all the coercions we would like. This matter of inference is relevant to this RFC as well: in particular, this change would be a breaking change unless we can reliably infer the necessary coercions, which I am pretty sure we would not. Therefore, I am inclined to close this RFC until we have improved the inference situation around coercions and/or a stronger motivation arises. |
This doesn't seem clear at all. The whole point of this RFC is to simplify the type system. Being able to write stuff like The fact that Is there a reason that That's not rhetorical. The guys who made C kept their language simple by not making
If you're thinking about adding more inference then that's all the more reason to think about adding
I hope you're wrong. Can you give a concrete example where we couldn't? |
TL;DR If you think this RFC is about adding super-useful features at the expense of more complexity then you've got everything backwards. |
Why would anyone want to write
Yes,
Empty type, bottom type, zero type, all refer to the bottom type which by definition is the subtype of all types. The empty type is theoretically any type you want it to be, by the same logic that on an empty set you can make any statement true. I just reread your RFC and I'm still failing to understand the advantages of |
I am sorry, but you are wrong here. Assigning the empty type to "break", as "calling , without ever returning here". This analogy is known to work perfectly, one can build compilers using CPS internally and translating that to assembly pretty easily. Clearly, "break" and "continue" are jumps in assembly, and indeed on CPS, they become continuations. So their natural return type is "!". Actually, if you use CPS, then all these high-level language features
There is no reason to add "! <= T" to the subtyping relationship. The |
They almost never would. They might want to write macro code that expands to something like that.
|
@Virtlink If I understand your reasoning correctly, you're arguing that @canndrew mentions that This outlines in what a weird state The way I understand it, the coercion mechanics proposed are merely a simplification. I think the compiler's existing special treatment of diverging expression could be kept entirely instead (the coercions would make it obsolete). This could address @nikomatsakis's concerns about the problems regarding correct inference of the required coercions, but I'm not nearly familiar enough with the compiler to assess this. Edit: Spent a few minutes writing this, so sorry for the overlap with those who were faster than me. |
By the way, how would this interact with I probably misunderstood what exactly the coercions are supposed to achieve. |
Well, I'm not sure if my argument is strong, but here it is. |
@alexbool But " |
Oddly enough, I consider this to be the best argument in favor of It makes many of those "poor questions" (from recent |
Agreed - fundamentally, a function marked with |
@alexbool |
You can now play with this on rust nightly! Just add It still has bugs, so please find them and report them. |
…komatsakis Add docs for "!" Never type (rfc 1216) Pull Request: rust-lang/rfcs#1216 Tracking Issue: rust-lang#35121
Promote
!
to be a full-fledged type equivalent to anenum
with no variants.Rendered
[edited to link to final version]