-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Why can consts not refer to statics? #11
Comments
I think the root issue is static FOO: AtomicUsize = AtomicUsize::new(42);
// hides the static behind a regular reference
const BAR: &AtomicUsize = &FOO;
const fn bar() -> usize {
BAR.swap(99, Ordering::Relaxed)
} Multiple calls to |
Additionally const WAT: usize = BAR.load(Ordering::Relaxed); seems very weird, because |
Okay, so this is actually again the same problem as with promoteds: It's about statics that could have been mutated. Once again a rule for promoteds is actually a rule for statics. Maybe we should make the CTFE engine error out on such accesses, just to make sure the static checks didn't miss anything? Not sure how to do that, though. |
Oh, and this does not explain why the following is not allowed: static FOO: usize = 13;
const BAR: &usize = &FOO; Is that just a limitation of the analysis? |
Maybe a desirable limitation. If that was lifted, we'd get a backwards-compatibility hazard where the lack of privates with interior mutability becomes a non-obvious part of your public API, right? (though I guess it's fine for primitives like |
@Ixrec I am talking about taking the address of a static without ever using it. That seems fine no matter the type. Of course the question is how we can be sure that it won't be used. Between ruling out |
I just hit a use case where some way for a My goal was to convert a string interner from handing out indices, to handing out direct pointers. This would have been a performance optimization for programs that frequently do more than simply compare string identities- e.g. printing them out, checking metadata. Pointers from consts to static come into this because I already had a large collection of consts for "well-known" interned strings, which I used in patterns. With indices, those consts were just (wrapped) integers, the interner constructor inserted the strings in the same order, and everything worked fine. With pointers, all the tests started failing, because separate uses of a const would point to separate copies of the string! Currently, the only way in Rust to enforce that all parts of a program share a common address in memory is to use a static. But statics don't work in patterns (nor would that really make sense, AIUI), and consts can't point to them, so I'm faced with a frustrating choice: either give up the consts (and the pattern matching), or give up the optimization. |
@rpjohnst to be clear, computing the value of your constants would not actually read from the static, they would just reference it? I think that should be soundly possible. We even already should have the const-eval engine checks in place which ensure that you do not read through those references. The main problem however is that I do not see a good way to do such a more precise check (allow references but not reads from statics) pre-monomorphization -- so this would result in more post-monomorphization errors, which is not great. |
Yes, I could write it such that computing the value of a constant does not read from a static and only takes its address (or the address of one of its fields, the way I wrote it, which should be just as sound). (Not related to this issue, but that makes it slightly uglier unless we also get unsized statics or limited type inference for statics, because their type would be To confirm my understanding of why this would need to be a post-monomorphization check, is this the sort of thing you meant by a choice between "ruling out I wonder if we couldn't lift the restriction only for non-generic consts, at least in the short term. Lifting it for generic consts would require some sort of trait-like tracking, for which I can imagine two possibilities at the moment:
|
We could even allow that, reads from immutable statics (no
Yes, that is a great example. Another way to say this is that we are looking for the property "does this read from a static" in the const-initializer, and that const-initializer is written in a turing-complete language. Thus whether or not the property holds is undecidable (in the "halting problem" sense). So, even without associated consts, basically the only thing we can do is to just run the initializer and see if it does the bad thing. That is in contrast to now where we do an overapporixmative analysis and reject referring to statics, which then also rules out any attempts to read from them.
We could, but that would be a very odd "split" to have, I think. There is no precedent for a split like that as far as I know, and I am undecided if this is better or worse than a post-monomorphization check. |
Wait, if we can allow reads from truly-immutable statics then couldn't we just always allow that, even around the generic cases? The results of "does this reference a Does that sort of rule feel any better than a post-monomorphization check or the odd "non-generic" split? It should remove the need to actually run the initializer, though it does add a bit of extra information that needs to be propagated across statics (though not consts). Edit: this would also rely on not having associated/generic statics, I think? We don't currently, but if we added them this sort of analysis would have to become post-monomorphization again, or else we'd get the weird "non-generic" split again. |
I think this is mostly legacy. And you raise a good point, but I need to explain some background to translate this proposal into something more concrete. Some backgroundWe have two layers of defense against misbehaving consts currently: we have some static checks (in the sense of static analysis, not the Over time, we added a second line of defense. When the code is "run" in the interpreter that evaluates const/static initializes, we have some dynamic checks that look at what the code actually does, and when it does something bad, we stop the code. This analysis is precise, meaning it should catch all errors and essentially make the static analysis "unnecessary" (modulo what is tracked in #17 and any other concerns that we missed...), but it runs late, after monomorphization, because it needs all the data to actually compute the initializer. Current statusThe dynamic check currently rejects all reads from statics. This is slightly conservative, we could permit reads from immutable statics, but @oli-obk wanted to start conservative here and I didn't push back. I think we can relax this. The static check rejects even mentioning any static, because that is the only way we know to be sure that a static won't be read from later. Permitting the code to create a reference to a static, but then ruling out that the reference is actually read from, is not feasible. Your proposalI think you are proposing to
I think that should be fine, and it should not introduce any new post-monomorphization errors. @rust-lang/wg-const-eval what do you think? |
This relies on the extensions to Rust's `const`s described in issue rust-lang/const-eval#11, and will not compile today.
This relies on the extensions to Rust's `const`s described in issue rust-lang/const-eval#11, and will not compile today.
The problem is that any static (even a truly immutable one) can refer to other statics (which may then have interior mutability). So we'd need to enforce transitive immutability. Of course we can do that, but the last time I tried this became a non-fun excercise. Basically we run into query cycles for cyclic statics whenever we try to do such an analysis based on the value of a static. We could do a conservative type based analysis though. Basically any static containing only |
Last attempt: rust-lang/rust#66302 |
I would love to try implementing this, even just the conservative type-based version. |
Cool, so I guess the first step is to give https://github.com/rust-lang/rust/blob/57e1da59cd0761330b4ea8d47b16340a78eeafa9/src/librustc_mir/transform/check_consts/ops.rs#L341 a feature gate that turns it on. Then you need to make it more fine grained by giving https://github.com/rust-lang/rust/blob/57e1da59cd0761330b4ea8d47b16340a78eeafa9/src/librustc_mir/transform/check_consts/ops.rs#L340 a field with the type of the static, so the I was initially considering doing the type check via an Instead we can copy what #[lang = "conservative_transitive_freeze"]
pub(crate) unsafe auto trait ConservativeTransitiveFreeze {}
impl<T: ?Sized> !Freeze for UnsafeCell<T> {}
impl<T: ?Sized> !Freeze for *const T {}
impl<T: ?Sized> !Freeze for *mut T {}
// `PhantomData` is guaranteed immutable.
unsafe impl<T: ?Sized> Freeze for PhantomData<T> {} Then we repeat all the boilerplate that exists for |
@rpjohnst so nowhere in your use-case is there interior mutability, even in the types, that could become a problem here? @oli-obk ah good point, I had not considered transitive accesses. With unsafe code, type-based analyses can probably be circumvented, but I presume having post-monomorphization errors when people write "unconst" code is acceptable? |
Yes, I don't think we can ever protect us from unconst errors without post monomorphization errors. We can't even protect us from post monomorphization integer math overflows. |
Sounds good to me, I was holding off on updating rust-lang/rust#71332 until I could better understand those anyway. |
rust-lang/rust#71655 landed and I am out of "panic mode" here -- discovering a new soundness constraint like this so late was something of a shock for me.^^ (I also somehow missed that using these consts in patterns is a key feature for you.) So here's, I think, roughly what would need to happen to allow consts-pointing-to-immutable-statics (and even reading from them) in our dynamic checks: currently we have a global invariant that pointers to (potentially mutable) statics can only exist in statics. Once we allow consts to read from statics, that could be used to leak pointers to mutable statics into a const evaluation. We could try to control that leakage, but honestly that seems like playing with fire to me. Instead I think we should ditch the global invariant, and directly check the property that we need: references in consts must point to read-only allocations. This is easy to add to reference validation, but the annoying part is that we currently stop const validation when encountering a static defined in another crate. That could miss references to mutable memory. So we have to keep going. (If this is a foreign static, we should probably just error, as we cannot check their body.) If this is too expensive for performance, we could have an analysis that remembers for each static "is mutable memory reachable from this". Then we could easily allow consts to point to statics where that is not the case. The trouble is, that is rather coarse-grained, so just going on recursively would be better. @oli-obk what do you think? |
If I understand this correctly, the "new" constraint here is that using a const in a pattern recursively dereferences all references it contains, const-ly. Or in other words, this is a bigger version of the same problem we discussed above- if a const holds a reference to mutable memory, some other unsuspecting const context may dereference it. There, we assumed references in a const could only be dereferenced in two ways:
But there is a third way:
So... do I understand this (how patterns work) correctly? That is, shouldn't the dynamic check catch any attempts to read mutable state while const-eval-ing consts in patterns? And you're instead talking about validation, and using it to ensure all consts can be used in patterns? Assuming I'm not horribly misunderstanding any of that, this new validation check sounds like what @oli-obk described in #11 (comment), where a value-based analysis must somehow deal with cycles. Given that an end user should only ever see evaluation-time or validation errors if they are using |
I believe it would be fine to go with pattern expansion time checks right now. In the future, we may allow associated constants as patterns, at that point we need to revisit this problem. Though I worry that by allowing it now, we can't fix it in the future for associated constants without monomorphization time errors. |
We need to do this anyway, as the reference may not point at the static directly, but at a field of the static. So if we point to static FOO: (AtomicUsize, usize) = ...; then we encounter a mutable allocation, but not something that will ever be mutable. So we basically just continue with the type based analysis, and if the user did some unsound stuff in order to get a |
That is a way to put it, yes.
The evaluation doesn't happen in const-eval though. The evaluation happens at run-time using
No, now you are confusing the static checks and the dynamic checks. I was only talking about dynamic checks. "Value-based" vs "type-based" refers to the static checks. See this post for some more background on those two independent checks that we have. I was solely talking about how to relax the dynamic checks to allow your use-case. Once we agree on what the dynamic checks should permit or not, we can try to design static checks that approximate the dynamic checks as good as we can. But currently relaxing the static checks would be pointless as the program would still be rejected by the dynamic checks.
Are you referring to static or dynamic checks here?
I would not permit that case. It is basically impossible to have a dynamic check that allows this. Or maybe it is possible if we adopt Stacked Borrows into CTFE, but... (a) do we really want that, (b) maybe let's do the simpler thing first. |
neither. What I mean is to relax the dynamic checks for constants referring to statics and only do the pattern checks in the pattern match logic, in order to allow as much as possible in regular constants while preventing the use of some constants in patterns. |
Hm... yes I guess that would be an alternative. However, unless there are also plans to somehow do this in the static checks (so that consts pointing to interior mutable statics could actually become stable one day), I am not sure if the discrepancy is worth it? This approach poses the risk that patterns (or const generics) could use an "unsuited" const and forget to run the stricter check on it.
Btw, this strongly reminds me of rust-lang/rust#67534. |
For const generics, we must run a stricter check anyway, as it can't refer to statics at all afaik. Unless, maybe it would be ok to print relocations into statics by the ID of the static. That won't work for pointers into the interior though, but we don't have that right now anyway. Patterns could indeed use a constant wrongly, because they fall back to
ugh... that one. Ok back to the topic at hand. Let's teach validation (and maybe interning needs some changes, too?) that constants can refer to statics by removing the check for that and adding checks that constants only refer to immutable allocations. This may require interning to overwrite the mutability of some allocations to immutable during interning, but that's fine. |
Why would const generics not be able to point to immutable statics? So far I assumed const generics and patterns (with exhaustiveness checking) have basically the same soundness concern. In particular, both require a form of "identity"/"equality" on constants that most behave properly for things to be sound.
Okay we agree then. :)
I think it already does that -- I'd be surprised if we needed any more overrides. |
I need to start with some background: In const generics, a constant of type In order for rustc to be able to handle this, that means that if you have const FOO: &i32 = &42;
const fn bar() -> &'static i32 { &42 }
const BAR: &i32 = bar(); we need If we allow constants to refer to statics, that means we can end up with a constant static MEH: i32 = 42;
const MOP: &i32 = &MEH; which most definitely does not have the same value as Now we could teach const generics about statics and specifically state that we want to be able to choose a different impl depending on which static we pass as a reference. That seems like a totally reasonable feature to me, but the const generics RFC does not mention statics at all, not even in the future work section, so this would need a second RFC. What we can do is to simply reject any constant that (directly or indirectly) refers to a static in const generics. This is similar to my suggestion to reject constants that refer to statics during match pattern checking.
As you yourself found out in rust-lang/rust#71316 (comment), we don't do this everywhere/correctly. |
That's for raw pointers, which doesn't matter here (I hope neither pattern matching nor const generics dereference raw pointers :D ). Besides, we actually do reset mutability there -- and on top of that we |
Makes sense. The same is true for pattern matching.
Oh I see, that's how you want to do this. My first thought on this is: I am not sure if it's worth trying to shoehorn Maybe that same representation can even be used by pattern matching so we don't need all these special cases in
So we seem to agree that it makes sense to treat patterns and const generics consistently? |
Oh but this reminds me of something...
Note that when you use references in patterns, from what I understood here, they will be compared with So, I am not sure if this pattern matching on references is really what you are looking for. |
Well, yes, but I'm not using references in patterns. My Anything I've said about references in patterns was not specific to the symbol-interning use case that led me here, just trying to understand what rustc has to consider in general. |
Oh I see... so we allow pattern matching on raw pointer consts? But we don't allow them in const generics? That's... messy. :/ @rpjohnst I think what would help me is a small self-contained example of the kind of thing you should be allowed. Just so that we have a testcase that demonstrates "fait accompli". |
Here's a reduced playground version: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=048389010d2a718bfbf2771c8fa64b02 |
For reference, C++ approaches this by requiring pointer/reference-typed const generic parameters to point to (its equivalent of) statics, and then respecting pointer identity. So Details for pedantsActually,&42 is syntactically invalid to start with; C++ doesn't let you directly create a pointer from a literal. But this is just one of C++'s pointless limitations, since you can create a reference from a literal, and references are just dressed-up pointers (they have pointer identity). However, a reference created from a literal can't be used as a template argument.
I wouldn't suggest copying C++ in this regard; its requirement means, among other things, you can't even use a string literal as a generic parameter. On the other hand, Rust's behavior of erasing pointer identity IMO has some surprising consequences even without involving statics. If I compile this as a dylib: #![feature(const_generics)]
pub struct Wrapper<const REF: &'static u32>;
impl<const REF: &'static u32> Wrapper<REF> {
pub const PTR: *const u32 = REF;
}
pub fn foo() {
dbg!(Wrapper::<{&42}>::PTR);
} then link another crate to it which also does If I understand correctly, in reality, |
Don't we have this "issue" already for regular generics? I think we should do whatever regular generics do for dylibs. |
I think noone is very happy with this status quo. Pattern matching pointers is super weird. It's not even deterministic across builds for function pointers, because it totally depends on codegen units, optimization levels and arbitrary decisions in the compiler around inlining and deduplication. |
I like this, but I think we need to turn it around. |
This might even suggest a solution to the weirdness of pattern matching on pointers. The behavior of function pointers is similar to the behavior of pointers in consts- it's not even stable across a single program, let alone across builds. But the behavior of pointers to statics is, since that's the entire point of statics. So a normalization step like that could offer a convenient point to warn on patterns with non-static pointers, on top of filtering them out for types/generics. |
Thanks!
I am totally open to bikeshedding the names. ;)
I would have proposed to start linting against pattern matching raw pointers and function pointers... but it seems like that would close the door for @rpjohnst's use case. :/ Raw pointers that are integers ( Maybe we need a |
It's a bit more than just bikeshedding. I was informed that if we had two representations for the same
Yes.
I think we already do that, because you can have a
It's not unsound to use them, just super suprising just like if you wrote a big But this is getting off topic imo. Pointers are a different issue. I'm worried about static S: &str = "foo";
const X: &&str = &S;
const Y: &&str = &"foo";
match some_str {
X => {},
Y => {},
_ => {}
} Because I have no idea what we're supposed to do here. |
Yeah, that seems good enough for a lint. ;) It's not super off-topic because it is also about pointer equality and pattern matching, but sure, we can let this sub-thread lie for now.
Basically we "just" have to make a decision, right? Either choice works, but we have to be consistent. |
I just had another use-case. Some data structure, which has a However, there are some places (not in the hot path) where I do need to differentiate the sentinel. The only way I can think of to do that is by comparing the pointer to its known address (if I have had a way to differentiate based on its contents, I could use static promotion without guaranteed address equality). That means |
Rust rust-for-linux team also has a usecase for promoteds that point to statics. They don't actually need to read from the static's value inside the const, just be able to point to it. This would be fairly easy to do, we just need to remove the checks that block mentioning a static. The existing checks we already have in place will ensure any attempt to read the static fails. However for some programs that currently fail during type/const-checking time, this will mean we get post-monomorphization errors instead, so someone will have to convince the lang team that this is worth it. |
Ralf, are you arguing for expanding the scope of promoteds? I need to check with a physicist, but pigs are still not flying afaict 😃 (helicopters don't count) For real though: seems an odd thing to do to get immutable generic statics, but I guess it doesn't have the issue that duplication must be prevented. Maybe we could do some shenanigans and restrict other things when you mention a static item? |
Fair question.^^ No, I only want to expand the scope of consts. ;) Inline const blocks should do for what the kernel folks need I think. They are basically using this to do user-defined vtables, if I understand correctly.
Which things would it make sense to restrict? |
Okay turns out what the kernel people are actually running into is rust-lang/rust#118447. Still, seems worth to at least have a nightly feature for "const refer to static", so that one can start experimenting with it. |
rust-lang/rust#119614 proposes that we can unstably allow consts to refer to statics. |
Is there some deep fundamental reason why that can never work, or is it a limitation of the current (or a former) implementation?
The text was updated successfully, but these errors were encountered: