-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CTFE: there is no way to compute the difference between two ptrs in the same allocation if they might be out-of-bounds #92512
Comments
One thing worth noting here again is that you can't actually use regular I'd also want to point out that the result of the function discussed in the issue you linked is mathematically correct in 100% of "runtime context where If actually called in a error[E0080]: evaluation of constant value failed
--> src/main.rs:11:19
|
11 | 0 => unsafe { ptr_offset_from(dest as *const u8, origin as *const u8) as usize },
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| |
| ptr_offset_from cannot compute offset of pointers into different allocations. Which is fine with me as all I care about with regards to it is just "this is a single function that simultaneously works properly for ZST array element pointers at runtime and also non-ZST array element pointers at compile time". The original implementation (which no longer compiles of course) served that purpose just as well, producing the exact same valid results for ZSTs at runtime: /// An internal function for calculating pointer offsets as usizes, while accounting
/// directly for possible ZSTs. This is used specifically in the iterator implementations.
#[inline(always)]
pub(crate) const fn distance_between<T>(dest: *const T, origin: *const T) -> usize {
match size_of::<T>() {
0 => (dest as usize).wrapping_sub(origin as usize),
// Safety: this function is used strictly with linear inputs
// where dest is known to come after origin.
_ => unsafe { ptr_offset_from(dest, origin) as usize },
}
} Edit: actually, I wasn't entirely correct above: my #![feature(core_intrinsics, const_ptr_offset_from, const_ptr_offset)]
use core::intrinsics::*;
#[inline(always)]
pub(crate) const fn distance_between<T>(dest: *const T, origin: *const T) -> usize {
// Safety: this function is used strictly with linear inputs where `dest` is known to come after
// `origin`.
match size_of::<T>() {
// If `T` is a ZST, we cannot use typed pointers as it would either return an incorrect value
// or in some cases segfault.
0 => unsafe { ptr_offset_from(dest as *const u8, origin as *const u8) as usize },
// For all other sizes of `T`, however typed pointers are just fine.
_ => unsafe { ptr_offset_from(dest, origin) as usize },
}
}
struct S;
const X: [S; 4] = [S, S, S, S];
const Y: *const S = X.as_ptr();
// The call to `distance_betweeen` will return 0 *unless* we set this pointer
// up as a u8. Also, const eval fails and aborts the compilation if regular
// `add` is used below instead of `wrapping_add`.
const Z: *const S = (Y as *const u8).wrapping_add(2) as *const S;
const W: usize = distance_between(Z, Y);
fn main() {
// Prints "2".
println!("{}", W);
} |
The internal intrinsic
No it is not -- it is UB. The UB just happens to give the correct result for current compilers.
Yes, that's what I would expect: when the two pointers are derived from the same pointer in a provenance-preserving way, CTFE will correctly compute their difference. However, it only does so because it fails to check for UB. Once that UB check is implemented, your last example would also error. |
It's a completely logical contexual result, at the very least. My actual code does solely consist of what you called the "provenance-preserving way" when it comes to ZSTs and I would not expect anything else to work. Don't get me wrong though, it's not like I don't want to do "the right thing" (which I think I was in the first place with my original What that did for ZST pointers in practice is exactly the same thing as what It doesn't seem like any valid future solution for this overall issue could actually be any different than the current behaviour while still producing desirable results, also. |
No, that won't work -- then one can cast an arbitrary I see two solutions to the problem laid out in the issue description:
|
That seems like the best way to go to me, although I should be clear that my own use case for this does not extend beyond "something that works properly for ZST array element pointers (either directly or indirectly) without panicking".
It seems to already be as optimized as it could be if you ask me, so you're likely right. |
FWIW, the optimizations I alluded to would be optimizations of the surrounding code that could exploit the fact that these pointers are in-bounds. But that, too, seems rather speculative, and is probably not the best way for the programmer to make such a guarantee to the compiler. |
I'm not sure there actually even are too many useful scenarios where a person would call Like, I don't know what you'd expect to do with the value returned from calling those functions on arbitrary unrelated pointers. Until this all gets figured out though, should I bother changing the function you initially linked to something more like the following? It does the same thing to the extent I really care about, while objectively not being UB (as far as I'm aware): #![feature(
const_eval_select,
core_intrinsics,
const_ptr_offset,
const_ptr_offset_from,
inline_const
)]
use core::hint::unreachable_unchecked;
use core::intrinsics::*;
#[inline(always)]
pub(crate) fn runtime_zst_handler<T>(dest: *const T, origin: *const T) -> usize {
(dest as usize).wrapping_sub(origin as usize)
}
#[inline(always)]
pub(crate) const fn compiletime_zst_handler<T>(_dest: *const T, _origin: *const T) -> usize {
match size_of::<T>() {
0 => panic!("`distance_between` is only currently usefully `const` for non-ZSTs!"),
_ => unsafe { unreachable_unchecked() },
}
}
#[inline(always)]
pub(crate) const fn distance_between<T>(dest: *const T, origin: *const T) -> usize {
// Safety: this function is used strictly with linear inputs where `dest` is known to come after
// `origin`.
match size_of::<T>() {
0 => unsafe {
const_eval_select((dest, origin), compiletime_zst_handler, runtime_zst_handler)
},
_ => unsafe { ptr_offset_from(dest, origin) as usize },
}
}
struct S;
const X: [S; 4] = [S, S, S, S];
const Y: *const S = X.as_ptr();
const Z: *const S = (Y as *const u8).wrapping_add(2) as *const S;
// The next line does the panic at compile time if uncommented.
// const W: usize = distance_between(Z, Y);
pub fn main() {
// This one works properly.
println!("{}", distance_between(Z, Y));
} |
Yeah, that would be the Technically Correct thing to do. I'd recommend to call the // Safety: this function is used strictly with linear inputs where `dest` is known to come after
// `origin`. Note that it does not have to "come after", but both do have to point into the same allocation (the same array, in your case). |
Yeah, I think the original reason I was using it is that the method-like
I think I wrote that comment moreso in reference to the fact I'm casting the |
@Gankra since you spoke up around Knowing your stance on |
Ah I see, the offsets going wonky is "fine" for miri as long as the provenance is known, because then everything is well-defined offsets? I definitely don't like encouraging wrapping_offset stuff, but I believe this would work even under CHERI, because they specifically guarantee that the address is always right, even if compression scheme falls over. (The "slice" in the metadata is effectively defined to be relative to the address of the "actual" pointer, so the slice is the thing that goes wonky, but in a way that also self-heals if you go back in-bounds. Unfortunately the "corrupt" bit is sticky, so you lose if you actually load/store with it.) |
Yeah I think for CHERI none of this matters since we're ultimately not doing any memory accesses and producing an integer. |
I ran into this when trying to implementing const @RalfJung would it be possible to somehow allow wrapping offsets only for ZST pointers for now? |
The type of the pointer doesn't really matter, so I don't see how it helps for them to be ZST. |
Well I thought since ZST pointers cant be used to access memory anyways it would not be a problem if they would be considerd invalid if they ever go out of bounds |
You can (safely) cast them back to a non-ZST pointer though. And anyway this issue here is not about accessing memory to begin with, it's about the Can you explain in more detail what kind of code you'd like to write? |
I want to implement the |
All right, that's similar to what prompted this issue in the first place. But the ZST part won't help. What this code needs is basically to remove this requirement from offset_of
We have to still keep this one
and we'll have to say something about |
Looking at the current code in libcore, I think slice iterators can handle ZST without ever doing a pointer subtraction in the ZST case, so I think the only concrete motivation we had for this issue vaporized? Only the non-ZST path uses |
offset_from
is the only way to do pointer subtraction during CTFE (since casting pointers to integers is not possible). However,offset_from
requires both pointers to be in-bounds of the same allocation, making it unsuited for code that handles possibly out-of-bounds pointers. The need for this arises, for example, in the typical way that array/slice iterators support ZSTs, see e.g. slightlyoutofphase/staticvec#48.We should provide some way for
const
code to subtract pointers to the same allocation even if they are not in-bounds. Example code:Also see this Zulip discussion.
Cc @rust-lang/wg-const-eval
The text was updated successfully, but these errors were encountered: