Tracking Issue for layout information behind pointers #69835

CAD97 · 2020-03-08T20:56:32Z

The feature gate for the issue is #![feature(layout_for_ptr)].

This tracks three functions:

core::mem::size_of_val_raw<T: ?Sized>(val: *const T) -> usize
core::mem::align_of_val_raw<T: ?Sized>(val: *const T) -> usize
core::alloc::Layout::for_value_raw<T: ?Sized>(t: *const T) -> Layout

These provide raw-pointer variants of the existing mem::size_of_val, mem::align_of_val, and Layout::for_value.

About tracking issues

Tracking issues are used to record the overall progress of implementation.
They are also uses as hubs connecting to other relevant issues, e.g., bugs or open design questions.
A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature.
Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

Unresolved Questions

What should the exact safety requirements of these functions be? It is currently possible to create raw pointers that have metadata that would make size wrap via ptr::slice_from_raw_parts. Trait vtable pointers are currently required to always be valid, but this is not guaranteed and an open question whether this is required of invalid pointers.
- [A]Rc's current implementation requires calling Layout::for_value_raw on a pointer that has been drop_in_place'd, so this likely needs to stay possible.
How should this interact with extern types? As this is a new API surface, we could potentially handle them properly whereas the _of_val cannot. Additionally, extern types may want to access the value to determine the size (e.g. a null terminated cstr).

rust-lang/lang-team#166 is tangentially related, as it serves to document what requirements currently exist on pointee types and getting a known layout from them.

Implementation history

Allow calculating the layout behind a pointer #69079 implemented these functions, including the intrinsic adjustments required to support them.

The text was updated successfully, but these errors were encountered:

…ruppe re-add Layout::for_value_raw Tracking issue: rust-lang#69835 This was accidentally removed in rust-lang#70362 56cbf2f. Originally added in rust-lang#69079.

clarfonthey · 2021-08-31T03:21:58Z

Potential bikeshed: maybe these methods should be moved to the ptr module instead of mem, where they can drop the _raw suffix. There already is some precedent for methods in these modules sharing names (e.g. ptr::swap versus mem:;swap) and I think it feels more natural.

LegionMammal978 · 2022-04-23T19:35:07Z

The safety documentation on these functions is somewhat inaccurate. It states:

This function is only safe to call if the following conditions hold: [...]

If the unsized tail of T is:

a slice, then the length of the slice tail must be an initialized integer, and the size of the entire value (dynamic tail length + statically sized prefix) must fit in isize.

But the size of a custom slice DST is not necessarily the sum of the size of its prefix and the size of its slice tail. If the alignment of the prefix is greater than the alignment of the slice type, the compiler will insert additional padding following the slice, which is counted in the full DST size.

I stumbled on this issue when investigating custom repr(Rust) slice DSTs. Except when the slice type is a generic parameter subject to an unsizing coercion, such a DST cannot be constructed at all, since there is no sound way for users to query its layout. These functions nearly allow users to determine the size of such DSTs, but the safety requirements prevent this. We cannot directly get the layout of a null pointer with the desired length, since we have no way to determine the size of the prefix. We cannot even extrapolate from the layout of a 0-length DST pointer, since the compiler could add arbitrary padding once the length is increased. It would be nice if there were a fallible way to query the layout information, since this would trivially allow such DSTs to be allocated and initialized.

CAD97 · 2022-04-23T22:27:33Z

But the size of a custom slice DST is not necessarily the sum of the size of its prefix and the size of its slice tail. If the alignment of the prefix is greater than the alignment of the slice type, the compiler will insert additional padding following the slice, which is counted in the full DST size.

Well, the padding is statically sized, so in that sense, it's part of the statically sized prefix.

I also agree that there should be a fallible way to query layout information from pointer metadata, and have an experimental PR #95832 open to determine the cost of making size_of sound to call for arbitrarily sized slice tails (by saturating).

LegionMammal978 · 2022-04-24T01:11:06Z

Well, the padding is statically sized, so in that sense, it's part of the statically sized prefix.

In what sense is it statically sized? Consider this test program (playground):

#![feature(layout_for_ptr)]

use std::{mem, ptr};

const PREFIX: usize = 4;

#[repr(C)]
struct DST {
    align: [u64; 0],
    prefix: [u8; PREFIX],
    slice: [u8],
}

fn main() {
    for len in 0..32 {
        let ptr = ptr::slice_from_raw_parts(ptr::null::<()>(), len);
        let ptr = ptr as *const DST;
        let size = unsafe { mem::size_of_val_raw(ptr) };
        print!("{} ", size - PREFIX - len);
    }
    println!();
}

You can adjust the size of the prefix. With a 4-byte prefix, this is the output:

4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5

These are the differences between the PREFIX + len and the total size for each len. Clearly, this padding is dependent on the length of the slice.

CAD97 · 2022-04-24T07:47:21Z

Ah, right, there is trailing padding to the alignment. I just read your comment wrong the first time.

And also, if it's `#[repr(Rust)], it's good to reiterate that there are formally no layout guarantees anyway, so the compiler is within its rights to do wackier things if it wanted to.

The trailing slice is always at the same offset.

#![feature(layout_for_ptr)]

use std::{mem, ptr};

const PREFIX: usize = 4;

#[repr(C)]
struct DST {
    align: [u64; 0],
    prefix: [u8; PREFIX],
    slice: [u8],
}

fn main() {
    for len in 0..32 {
        let ptr = ptr::slice_from_raw_parts(ptr::null::<()>(), len);
        let ptr = ptr as *const DST;
        let offset = unsafe {
            ptr::addr_of!((*ptr).slice)
                .cast::<u8>()
                .offset_from(ptr.cast())
        };
        print!("{offset} ");
    }
    println!();
}

prints all 4s.

You could argue that "dynamic tail length" includes the dynamic padding to alignment, or we could just add a "+ alignment padding" clause.

One way around the issue of #[repr(Rust)] with nongeneric slice tails not being usable is to always go through #[repr(C)] SliceTail<T, U> { prefix: T, tail: [U] } so you can rely on being able to determine/precalculate the layout, but this is obviously not ideal.

LegionMammal978 · 2022-04-24T12:12:06Z

That makes sense wrt. repr(Rust); I was mainly surprised to learn that this trailing padding can exist also in repr(C). The safety comments on these functions are the only real mentions of an "unsized tail", so I'd assumed that the tail is placed fully after and independently of the prefix. Even in repr(Rust), every field in the prefix must have a fixed offset, so padding is the only thing that can occur after the tail.

CAD97 · 2022-04-26T06:49:31Z

Potential bikeshed: maybe these methods should be moved to the ptr module instead of mem, where they can drop the _raw suffix. [@clarfonthey]

What about Layout::for_value_raw? The feature is layout_for_ptr, and my original pre-PR draft used Layout::for_ptr, but that has a high chance of being misread as the layout to store the pointer, rather than for the pointee. Otherwise, I think I fully agree, but without a good name for the Layout function, keeping the parallel between of_val[_raw] and for_value[_raw] seems beneficial.

zopsicle · 2022-06-14T17:57:04Z

Perhaps it should be called Layout::for_value_raw_unchecked, so that a fallible version Layout::for_value_raw can be added later. (Although it is confusing to have an unchecked version without a checked version.)

joshtriplett · 2022-07-13T17:21:07Z

We reviewed this in today's @rust-lang/lang meeting.

It seems like these fit into the general story of pointee metadata, and we should consider them together with that. They may potentially want to make use of that metadata rather than operating directly on the pointers. cc #81513, which tracks the more general question.

CAD97 · 2022-07-13T17:27:41Z

cc also rust-lang/lang-team#166 which includes an argument that it may make sense to restrict custom pointee layout to be calculable from the pointer metadata and not allow it to be address/pointee-sensitive.

Jules-Bertholet · 2022-10-21T18:40:57Z

Shouldn't these functions guarantee that they are safe to call at least when casting the pointer to a reference and calling size_of_val/align_of_val on that would be safe?

CAD97 · 2022-10-21T21:22:00Z

It's not explicitly stated, but it the case that if the pointer is valid to reborrow as a reference that these functions are safe to call. The listed conditions should be a proper subset of "pointer valid to reborrow."

Jules-Bertholet · 2022-10-21T21:36:46Z

@CAD97 they aren't really a proper subset, as they don't account for new unsized types that might be added to Rust in the future.

Jules-Bertholet · 2022-10-21T21:49:19Z

I made a PR to fix this: #103372

joshlf · 2023-10-25T15:56:44Z

Unresolved Questions

In the interest of stabilizing something, would it be possible to choose the most conservative interpretations for these questions, leaving the door open to future work to loosen the requirements? Ie:

What should the exact safety requirements of these functions be? It is currently possible to create raw pointers that have metadata that would make size wrap via ptr::slice_from_raw_parts. Trait vtable pointers are currently required to always be valid, but this is not guaranteed and an open question whether this is required of invalid pointers.

For this, add a safety precondition that the pointer not result in size wrapping. Continue to require as a safety precondition that vtable pointers are always valid.

[A]Rc's current implementation requires calling Layout::for_value_raw on a pointer that has been drop_in_place'd, so this likely needs to stay possible.

IIUC, inside the stdlib it's okay to rely on behavior that is not promised publicly, right?

How should this interact with extern types? As this is a new API surface, we could potentially handle them properly whereas the _of_val cannot. Additionally, extern types may want to access the value to determine the size (e.g. a null terminated cstr).

For this, document that size_of_val_raw may panic on any input, and that the caller must handle this possibility. Probably worth calling out explicitly that the conditions under which it panics may be narrowed in the future, and potentially eliminated entirely.

jswrenn · 2023-10-25T20:48:21Z

I've filed a PR with the above recommendations: #117185

This only required documenting the overflow preconditions, since the possibility of panics was already documented and the implementation details of Rc should expressly not be publicly documented.

CAD97 · 2023-10-26T02:52:56Z

For this, add a safety precondition that the pointer not result in size wrapping. Continue to require as a safety precondition that vtable pointers are always valid.

It's already a requirement for size_of_val_raw and friends that "the entire size must fit in isize". Unless you mean to add the requirement to ptr::slice_from_raw_parts; that's a breaking change and not happening. That function's not even unsafe.

Those conditions are listed there as the reasons the function needs to be unsafe, and the functionally minimum precondition for this functionality.

IIUC, inside the stdlib it's okay to rely on behavior that is not promised publicly, right?

Yes, but I think you somehow have something backwards. Only permitting size_of_val_raw on pointers to a valid allocation (but which may contain an invalid value) is the most conservative precondition, and is significantly more conservative than the currently documented one. Functionally the only more conservative precondition possible is that it would be sound to do size_of_val(&*ptr) instead, at which point these functions aren't even necessary. Plus, the minimal precondition of "points to a valid allocation" is the one T-lang has signed off that they're comfortable committing to always being possible; guaranteeing more would require T-lang signoff again. (Or maybe T-opsem now.)

For this, document that size_of_val_raw may panic on any input, and that the caller must handle this possibility.

If you're going for the most conservative, it would be to just forbid usage with extern types initially. It's quite common in fact for "raw" unchecked versions of functions to be UB in cases where the safe function would panic or return a safe, if potentially misleading, default result.

In the interest of stabilizing something, would it be possible to choose the most conservative [answers to] these questions

I will agree there. To do so, my proposed replacement for the safety section:

If T is Sized, this function is safe to call, and is equivalent to calling size_of::<T>(). The pointer provided is irrelevant, and may even be null.

If the unsized tail of T is a [slice] or [trait object], then the pointer must be validly dereferencable to an [allocated object] capable of containing an entire value as described by the fat pointer (statically sized prefix plus dynamically sized tail). However, no actual reads are done, meaning that the validity of other pointers/references are not impacted. The most common way to fulfill this requirement is by pointing to a T value that did exist at some point, although it may have since been dropped, overwritten, or otherwise invalidated since.

Other unsized tail kinds can exist, such as (unstable) [extern type]s. It is not permitted to call this function with unsized T with these tail kinds. This means that having &T is not a sufficient precondition to call this function. In that case, you should use [size_of_val] instead.

[allocated object]: super::ptr#allocated-object

This is very slightly more permissive than "valid pointer to T or dropped T", but in a way consistent with the preexisting reality of untyped memory and pointer reinterpret casting being valid.

I'm not able to make a PR for this right now, but I should be able to sometime tomorrow.

RalfJung · 2023-10-26T05:27:19Z

What is the motivation for requiring the pointer to point to an actual allocation (or for ptr.offset_bytes(size) to not wrap)? I don't see a good reason why we would add that precondition.

CAD97 · 2023-10-26T06:52:12Z

The motivation is merely in being conservative and only stabilizing the minimal capability which the language team has already signed off on. It's also a condition marginally simpler to explain than a requirement closer to the actual operational requirement. No other API exists (on stable yet) which has a requirement on pointers weaker than "dereferencable" but stronger than the implicit safety invariant of pointers (whatever that is¹).

I would much prefer the condition as it's currently (and I originally) specified it, though I might rephrase it a bit given post-2019 understanding of safety/validity requirements. But the underlying presupposition is that a stricter requirement would be easier to stabilize.

I'm not quite sure which teams stabilizing the functions as currently specified would fall under. If it's exclusively the domain of libs-api, I could honestly open a stabilization PR a year ago with justifications that the current choices are the correct ones. It's the presumed need to get lang signoff on extending what's possible in stable Rust (and having hacked together partially functional workarounds for code where I've wanted to use this) that's precluded me actively trying to force this through to stable.

Since this safety requirement fundamentally needs to talk about the pointer metadata in order to be weaker than "dereferencable," it's thus also entangled with feature(ptr_metadata). And that's a much bigger gate to try to stabilize (and I still need to revive a change request for it). Removing that implicit dependency is a benefit to specifying the requirement here in terms of dereferencability.

Relevant: feature(trait_upcasting) makes it necessarily a safety invariant of pointers to dyn Trait that the vtable pointer references a valid vtable. I believe this got lang signoff as a requirement at some point, and as such could be removed from the requirements for size_of_val_raw. But given the nature of being a low level pointer function which shouldn't be doing such an upcast, it's probably still good form to mention it. ↩

RalfJung · 2023-10-26T07:44:57Z

Stabilization should definitively involve t-lang, since this extends what can be expressed in Rust.

Since this safety requirement fundamentally needs to talk about the pointer metadata in order to be weaker than "dereferencable," it's thus also entangled with feature(ptr_metadata).

The current spec doesn't require ptr_metadata I think? You need to recurse over the type to find the tail field, but I don't think it's needed to talk about ptr_metadata.

Dealing with dangling pointers is kind of the point of raw pointers, so I would find it somewhat odd that we would require dereferenceability here.

joshlf · 2023-10-26T14:24:46Z

Without getting into the weeds of which other features are implicated, I will say that from a user's perspective, this feature wouldn't really be useful if the memory needs to be dereferenceable. The reason I'm advocating for stabilization is to support this design, which synthesizes a raw pointer to a slice DST type in order to determine its minimum size (ie, the size of the value with 0 trailing slice elements). If dereferenceability is a safety precondition, then that use case isn't supported. Speaking more generally, most (maybe all?) cases in which a user has a raw pointer which satisfies the dereferenceability condition, it is possible to call size_of_val(&*ptr) or something similar, and get the same effect as size_of_val_raw.

This commit implements the recommendation of [1] to make the safety preconditions of the raw pointer layout utilities more conservative, to ease the path towards stabilization. In the future, we may (if we choose) remove some of these restrictions without breaking forwards compatibility. [1]: rust-lang#69835 (comment)

LegionMammal978 · 2023-10-26T23:40:17Z

The reason I'm advocating for stabilization is to support google/zerocopy#29, which synthesizes a raw pointer to a slice DST type in order to determine its minimum size (ie, the size of the value with 0 trailing slice elements). If dereferenceability is a safety precondition, then that use case isn't supported.

One problem, as I was saying above, is that the use case still isn't supported without either a) the guarantee that a repr(Rust) slice DST with 0 elements fits within isize::MAX, or b) a fallible version of size_of_val_raw() that can indicate that the layout wouldn't fit within isize::MAX. As far as I can tell, a) is not guaranteed, since such repr(Rust) slice DSTs don't necessarily fit within isize::MAX in the general case, so they wouldn't necessarily fit in the 0-length case either.

joshlf · 2023-10-27T02:53:06Z

The reason I'm advocating for stabilization is to support google/zerocopy#29, which synthesizes a raw pointer to a slice DST type in order to determine its minimum size (ie, the size of the value with 0 trailing slice elements). If dereferenceability is a safety precondition, then that use case isn't supported.

One problem, as I was saying above, is that the use case still isn't supported without either a) the guarantee that a repr(Rust) slice DST with 0 elements fits within isize::MAX, or b) a fallible version of size_of_val_raw() that can indicate that the layout wouldn't fit within isize::MAX. As far as I can tell, a) is not guaranteed, since such repr(Rust) slice DSTs don't necessarily fit within isize::MAX in the general case, so they wouldn't necessarily fit in the 0-length case either.

Ah good point. I've asked about (a) here: rust-lang/unsafe-code-guidelines#465 (comment)

CAD97 · 2023-11-16T22:18:20Z

Recording here: the fact that valid vtable pointer is considered a safety precondition of *const dyn Trait means that it would be possible to provide a safe fn checked_size_of_pointee<T: ?Sized>(val: *const T) -> Option<usize> for all current unsized tail kinds which returns None for isize overflow. I would very much like to have this, and its viability has an impact on these "unchecked" versions. The main question then is what we think we might want to permit future/custom unsized kinds to do (e.g. the theoretical MetaSized bound).

CAD97 added the C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. label Mar 8, 2020

jonas-schievink added A-allocators Area: Custom and system allocators B-unstable Blocker: Implemented in the nightly compiler and unstable. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Mar 8, 2020

dtolnay added the T-lang Relevant to the language team, which will review and decide on the PR/issue. label Jul 18, 2020

dtolnay mentioned this issue Jul 18, 2020

Allow Weak::as_ptr and friends for unsized T #74160

Merged

CAD97 mentioned this issue Jul 18, 2020

re-add Layout::for_value_raw #74488

Merged

KodrAus added Libs-Tracked Libs issues that are tracked on the team's project board. A-raw-pointers Area: raw pointers, MaybeUninit, NonNull labels Jul 29, 2020

Wolvereness mentioned this issue Feb 7, 2021

Stable API without UB Wolvereness/dst#2

Open

repnop mentioned this issue Jun 24, 2021

Are the #![feature(layout_for_ptr)] functions sound to call on invalid data pointers? #86602

Open

joshtriplett added the S-tracking-design-concerns Status: There are blocking ❌ design concerns. label Jul 13, 2022

CAD97 mentioned this issue Jul 13, 2022

Replace feature(layout_for_ptr) with feature(layout_for_meta) rust-lang/libs-team#67

Closed

joshlf mentioned this issue Oct 7, 2022

Support KnownLayout trait and custom DSTs google/zerocopy#29

Closed

34 tasks

Jules-Bertholet mentioned this issue Oct 21, 2022

Clarify preconditions of raw size/align methods #103372

Closed

jswrenn mentioned this issue Oct 25, 2023

constrain safety preconditions of layout_for_ptr functionality #117185

Closed

joshlf mentioned this issue Oct 27, 2023

Where to document allocation size upper bound? rust-lang/unsafe-code-guidelines#465

Closed

CAD97 mentioned this issue Nov 16, 2023

Clarify safety of layout_for_ptr #117991

Open

joshlf mentioned this issue Mar 6, 2024

[slice] Document slice DSTs, including size guarantees #117474

Closed

joshlf mentioned this issue May 11, 2024

[type-layout] Document minimum size and alignment rust-lang/reference#1482

Closed

Rua mentioned this issue Jun 15, 2024

Why don't slices use dedicated metadata structs? #125517

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tracking Issue for layout information behind pointers #69835

Tracking Issue for layout information behind pointers #69835

CAD97 commented Mar 8, 2020 •

edited

Loading

clarfonthey commented Aug 31, 2021

LegionMammal978 commented Apr 23, 2022

CAD97 commented Apr 23, 2022 •

edited

Loading

LegionMammal978 commented Apr 24, 2022

CAD97 commented Apr 24, 2022

LegionMammal978 commented Apr 24, 2022

CAD97 commented Apr 26, 2022

zopsicle commented Jun 14, 2022

joshtriplett commented Jul 13, 2022 •

edited by Mark-Simulacrum

Loading

CAD97 commented Jul 13, 2022

Jules-Bertholet commented Oct 21, 2022

CAD97 commented Oct 21, 2022

Jules-Bertholet commented Oct 21, 2022

Jules-Bertholet commented Oct 21, 2022

joshlf commented Oct 25, 2023

Unresolved Questions

jswrenn commented Oct 25, 2023

CAD97 commented Oct 26, 2023

RalfJung commented Oct 26, 2023

CAD97 commented Oct 26, 2023

RalfJung commented Oct 26, 2023

joshlf commented Oct 26, 2023

LegionMammal978 commented Oct 26, 2023

joshlf commented Oct 27, 2023

CAD97 commented Nov 16, 2023

Tracking Issue for layout information behind pointers #69835

Tracking Issue for layout information behind pointers #69835

Comments

CAD97 commented Mar 8, 2020 • edited Loading

About tracking issues

Unresolved Questions

Implementation history

clarfonthey commented Aug 31, 2021

LegionMammal978 commented Apr 23, 2022

CAD97 commented Apr 23, 2022 • edited Loading

LegionMammal978 commented Apr 24, 2022

CAD97 commented Apr 24, 2022

LegionMammal978 commented Apr 24, 2022

CAD97 commented Apr 26, 2022

zopsicle commented Jun 14, 2022

joshtriplett commented Jul 13, 2022 • edited by Mark-Simulacrum Loading

CAD97 commented Jul 13, 2022

Jules-Bertholet commented Oct 21, 2022

CAD97 commented Oct 21, 2022

Jules-Bertholet commented Oct 21, 2022

Jules-Bertholet commented Oct 21, 2022

joshlf commented Oct 25, 2023

Unresolved Questions

jswrenn commented Oct 25, 2023

CAD97 commented Oct 26, 2023

RalfJung commented Oct 26, 2023

CAD97 commented Oct 26, 2023

Footnotes

RalfJung commented Oct 26, 2023

joshlf commented Oct 26, 2023

LegionMammal978 commented Oct 26, 2023

joshlf commented Oct 27, 2023

CAD97 commented Nov 16, 2023

CAD97 commented Mar 8, 2020 •

edited

Loading

CAD97 commented Apr 23, 2022 •

edited

Loading

joshtriplett commented Jul 13, 2022 •

edited by Mark-Simulacrum

Loading