Showing 7 changed files with 294 additions and 23 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,246 @@ | ||
# Type Layout | ||
|
||
The layout of a type is the size, alignment, and offsets of any fields and | ||
discriminants for the values of that type. | ||
|
||
**PR NOTE: This doesn't include valid values. E.g. `bool` and `i8` have the | ||
same layout under this definition. Nor does it include calling convention | ||
differences, so `u8` and `#[repr(C)] struct S { f: u8 }` have the same layout, | ||
as does `*T` and `&T`. I'm not sure if it should or not.** | ||
|
||
While specific releases of the compiler will have the same layout for types, | ||
there is a lot of room for new versions of the compiler to do different things. | ||
Instead of trying to document exactly what is done, we only document what is | ||
guaranteed today. | ||
|
||
## Size and Alignment | ||
|
||
All values have an alignment and size. | ||
|
||
The *alignment* of a value specifies what addresses are valid to store the value | ||
at. A value of alignment `n` must only be stored at an address that is a | ||
multiple of `n`. For example, a value with an alignment of 2 must be stored at an | ||
even address, while a value with an alignment of 1 can be stored at any address. | ||
Alignment is measured in bytes, and must be at least 1, and always a power of 2. | ||
The alignment of a value can be checked with the [`align_of_val`] function. | ||
|
||
The *size* of a value is the offset in bytes between successive elements in an | ||
array with that item type including alignment padding. The size of a value is | ||
always a multiple of its alignment. The size of a value can be checked with the | ||
[`size_of_val`] function. | ||
|
||
Types where all values have the same size and alignment known at compile time | ||
implement the [`Sized`] trait and can be checked with the [`size_of`] and | ||
[`align_of`] functions. Types that are not [`Sized`] are known as [dynamically | ||
sized types]. Since all values of a `Sized` type share the same size and | ||
alignment, we refer to those shared values as the size of the type and the | ||
alignment of the type respectively. | ||
|
||
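As a rough illustration of the definitions above, the following sketch queries sizes and alignments through the `std::mem` functions linked in this section. The helper name `layout_of_val` is hypothetical and exists only for this example.

```rust
use std::mem::{align_of, align_of_val, size_of, size_of_val};

/// Hypothetical helper: returns `(size, alignment)` of any value, sized or not.
fn layout_of_val<T: ?Sized>(value: &T) -> (usize, usize) {
    (size_of_val(value), align_of_val(value))
}

fn main() {
    // For a `Sized` type, every value shares the size and alignment of the type.
    assert_eq!(layout_of_val(&0u16), (size_of::<u16>(), align_of::<u16>()));

    // A slice is dynamically sized, so its size depends on the value.
    let three: &[u16] = &[1, 2, 3];
    let (size, align) = layout_of_val(three);
    assert_eq!(size, 3 * size_of::<u16>());
    assert_eq!(align, align_of::<u16>());

    // The size is a multiple of the alignment, and the alignment is a power of 2.
    assert_eq!(size % align, 0);
    assert!(align.is_power_of_two());
}
```

The `?Sized` bound is what lets the same helper accept both `u16` and `[u16]`.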
## Primitive Data Layout | ||
|
||
The size of most primitives is given in this table. | ||
|
||
Type | `size_of::\<Type>()` | ||
- | - | - | ||
bool | 1 | ||
u8 | 1 | ||
u16 | 2 | ||
u32 | 4 | ||
u64 | 8 | ||
i8 | 1 | ||
i16 | 2 | ||
i32 | 4 | ||
i64 | 8 | ||
f32 | 4 | ||
f64 | 8 | ||
char | 4 | ||
|
||
`usize` and `isize` have a size big enough to contain every address on the | ||
target platform. For example, on a 32-bit target, this is 4 bytes and on a 64-bit | ||
target, this is 8 bytes. | ||
|
||
Most primitives are aligned to their size, although this is | ||
platform-specific behavior. In particular, on x86 `u64` and `f64` may be only | ||
aligned to 32 bits. | ||
|
||
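The table can be spot-checked with a few assertions. This is only a sketch; `has_size` is a hypothetical helper, and alignments are printed rather than asserted because they are platform-specific.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: checks that `T` has the size listed in the table.
fn has_size<T>(expected: usize) -> bool {
    size_of::<T>() == expected
}

fn main() {
    assert!(has_size::<bool>(1));
    assert!(has_size::<u16>(2));
    assert!(has_size::<i64>(8));
    assert!(has_size::<f32>(4));
    assert!(has_size::<char>(4));

    // `usize` and `isize` always have the same size as each other.
    assert_eq!(size_of::<usize>(), size_of::<isize>());

    // Alignment is platform-specific, so it is printed rather than asserted.
    println!("align_of::<u64>() = {}", align_of::<u64>());
    println!("align_of::<f64>() = {}", align_of::<f64>());
}
```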
## Pointers and References Layout | ||
|
||
Pointers and references have the same layout. Mutability of the pointer or | ||
reference does not change the layout. | ||
|
||
Pointers to sized types have the same size and alignment as `usize`. | ||
|
||
Pointers to unsized types are sized. The size and alignment are guaranteed to be | ||
at least equal to the size and alignment of a pointer. | ||
|
||
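A minimal sketch of these guarantees follows. The helper `is_usize_like` is hypothetical; the checks on pointers to unsized types deliberately use `>=`, since only a lower bound is guaranteed.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: does `P` have exactly the size and alignment of `usize`?
fn is_usize_like<P>() -> bool {
    size_of::<P>() == size_of::<usize>() && align_of::<P>() == align_of::<usize>()
}

fn main() {
    // Pointers and references to sized types, regardless of mutability.
    assert!(is_usize_like::<*const u8>());
    assert!(is_usize_like::<*mut u8>());
    assert!(is_usize_like::<&u8>());
    assert!(is_usize_like::<&mut u8>());

    // Pointers to unsized types are themselves sized and at least pointer-sized.
    assert!(size_of::<&[u8]>() >= size_of::<usize>());
    assert!(size_of::<&str>() >= size_of::<usize>());
    assert!(size_of::<*const dyn std::fmt::Debug>() >= size_of::<usize>());
}
```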
> Note: Though you should not rely on this, all pointers to <abbr | ||
> title="Dynamically Sized Types">DSTs</abbr> are currently twice the size of | ||
> `usize` and have the same alignment. | ||
## Array Layout | ||
|
||
Arrays are laid out so that the `nth` element of the array, counting from zero, is | ||
offset from the start of the array by `n * size_of::<T>()` bytes. An array of | ||
type `[T; n]` has a size of `size_of::<T>() * n` and the same alignment as `T`. | ||
|
||
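The array rule can be observed with pointer arithmetic. This is a sketch only; `element_offset` is a hypothetical helper that measures the distance in bytes between an element and the start of the array.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: byte offset of `array[index]` from the start of the array.
fn element_offset<T>(array: &[T], index: usize) -> usize {
    let start = array.as_ptr() as usize;
    let element = &array[index] as *const T as usize;
    element - start
}

fn main() {
    let array = [10u32, 20, 30, 40];

    // The element at index `i` sits `i * size_of::<T>()` bytes from the start.
    for i in 0..array.len() {
        assert_eq!(element_offset(&array[..], i), i * size_of::<u32>());
    }

    assert_eq!(size_of::<[u32; 4]>(), 4 * size_of::<u32>());
    assert_eq!(align_of::<[u32; 4]>(), align_of::<u32>());
}
```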
## Slice Layout | ||
|
||
Slices have the same layout as the section of the array they slice. | ||
|
||
## Tuple Layout | ||
|
||
Tuples do not have any guarantees about their layout. | ||
|
||
The exception to this is the unit tuple (`()`) which is guaranteed as a | ||
zero-sized type to have a size of 0 and an alignment of 1. | ||
|
||
## Trait Object Layout | ||
|
||
Trait objects have the same layout as the value the trait object is of. | ||
|
||
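One way to observe this is to compare `size_of_val` and `align_of_val` on a trait object with the values for the concrete type behind it. The sketch below uses `std::fmt::Debug` purely as a convenient trait; `erased_layout` is a hypothetical helper.

```rust
use std::fmt::Debug;
use std::mem::{align_of, align_of_val, size_of, size_of_val};

/// Hypothetical helper: `(size, alignment)` of the value behind a trait object.
fn erased_layout(object: &dyn Debug) -> (usize, usize) {
    (size_of_val(object), align_of_val(object))
}

fn main() {
    let small: u8 = 1;
    let large: [u64; 4] = [0; 4];

    // The trait object reports the layout of the concrete value it was made from.
    assert_eq!(erased_layout(&small), (size_of::<u8>(), align_of::<u8>()));
    assert_eq!(
        erased_layout(&large),
        (size_of::<[u64; 4]>(), align_of::<[u64; 4]>())
    );
}
```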
## Closure Layout | ||
|
||
Closures have no layout guarantees. | ||
|
||
## Representations | ||
|
||
All **FIXME** types have a *representation* that specifies what the layout | ||
is for the type. | ||
|
||
Note: The representation does not depend upon the type's fields or generic | ||
parameters. | ||
|
||
The possible representations for a type are the default representation, `C`, the | ||
primitive representations, and `packed`. Multiple representations can be applied | ||
to a single type. | ||
|
||
The representation of a type can be changed by applying the [`repr` attribute] | ||
to it. The following example shows a struct with a `C` representation. | ||
|
||
``` | ||
#[repr(C)] | ||
struct ThreeInts { | ||
first: i16, | ||
second: i8, | ||
third: i32 | ||
} | ||
``` | ||
|
||
The representation of a type does not change the layout of its fields. For | ||
example, a struct with a `C` representation that contains a struct `Inner` with | ||
the default representation will not change the layout of `Inner`. | ||
|
||
### The Default Representation | ||
|
||
Nominal types without a `repr` attribute have the default representation. | ||
Informally, this representation is also called the `rust` representation. | ||
|
||
There are no guarantees of data layout made by this representation. | ||
|
||
### The `C` Representation | ||
|
||
The `C` representation is designed for creating types that are interoperable with | ||
the C language and for soundly performing operations that rely on data layout, such | ||
as reinterpreting values as a different type. | ||
|
||
This representation can be applied to structs, unions, and enums. | ||
|
||
#### \#[repr(C)] Structs | ||
|
||
The alignment of the struct is the alignment of the most-aligned field in it. | ||
|
||
The size and offset of fields are determined by the following algorithm. | ||
|
||
Start with a current offset of 0 bytes. | ||
|
||
For each field in declaration order in the struct, first determine the size and | ||
alignment of the field. If the current offset is not a multiple of the field's | ||
alignment, then add padding bytes increasing the current offset until the | ||
current offset is a multiple of the field's alignment. The offset for the field | ||
is what the current offset is now. Then increase the current offset by the size | ||
of the field. | ||
|
||
Finally, the size of the struct is the current offset rounded up to the nearest | ||
multiple of the struct's alignment. | ||
|
||
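The algorithm above can be sketched as a small function and compared against what the compiler reports for the `ThreeInts` struct from earlier. The names `round_up` and `repr_c_layout` are hypothetical and not part of any library.

```rust
use std::mem::{align_of, size_of};

/// Rounds `value` up to the nearest multiple of `multiple` (which must be nonzero).
fn round_up(value: usize, multiple: usize) -> usize {
    (value + multiple - 1) / multiple * multiple
}

/// Hypothetical helper: takes `(size, alignment)` for each field in declaration
/// order and returns `(field offsets, struct size, struct alignment)`.
fn repr_c_layout(fields: &[(usize, usize)]) -> (Vec<usize>, usize, usize) {
    // The struct is as aligned as its most-aligned field (1 if there are none).
    let align = fields.iter().map(|&(_, a)| a).max().unwrap_or(1);
    let mut offset = 0;
    let mut offsets = Vec::with_capacity(fields.len());
    for &(field_size, field_align) in fields {
        offset = round_up(offset, field_align); // insert padding if needed
        offsets.push(offset);
        offset += field_size;
    }
    (offsets, round_up(offset, align), align)
}

#[allow(dead_code)]
#[repr(C)]
struct ThreeInts {
    first: i16,
    second: i8,
    third: i32,
}

fn main() {
    let fields = [
        (size_of::<i16>(), align_of::<i16>()),
        (size_of::<i8>(), align_of::<i8>()),
        (size_of::<i32>(), align_of::<i32>()),
    ];
    let (offsets, size, align) = repr_c_layout(&fields);
    assert_eq!(size, size_of::<ThreeInts>());
    assert_eq!(align, align_of::<ThreeInts>());
    println!("offsets = {:?}, size = {}, align = {}", offsets, size, align);
}
```

Feeding the function the real `size_of` and `align_of` values of the fields, rather than hard-coded numbers, keeps the comparison meaningful on targets where primitive alignments differ.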
> Note: You can have zero-sized structs from this algorithm. This differs from | ||
> C, where structs without fields are not permitted by the standard, and from | ||
> C++, where empty structs still have a nonzero size (typically one byte). | ||
#### \#[repr(C)] Unions | ||
|
||
A union declared with `#[repr(C)]` will have the same size and alignment as an | ||
equivalent C union declaration in the C language for the target platform. | ||
Usually, a union has a size equal to the maximum size of all of its fields, | ||
and an alignment equal to the maximum alignment of all of its fields. | ||
These maximums may come from different fields. | ||
|
||
``` | ||
#[repr(C)] | ||
union Union { | ||
f1: u16, | ||
f2: [u8; 4], | ||
} | ||
assert_eq!(std::mem::size_of::<Union>(), 4); // From f2 | ||
assert_eq!(std::mem::align_of::<Union>(), 2); // From f1 | ||
``` | ||
|
||
#### \#[repr(C)] Enums | ||
|
||
For [C-like enumerations], the `C` representation has the size and alignment of | ||
the default `enum` size and alignment for the target platform's C ABI. | ||
|
||
> Note: The enum representation in C is implementation defined, so this is | ||
> really a "best guess". In particular, this may be incorrect when the C code | ||
> of interest is compiled with certain flags. | ||
|
||
> Warning: There are crucial differences between an `enum` in the C language and | ||
> Rust's C-like enumerations with this representation. An `enum` in C is | ||
> mostly a `typedef` plus some named constants; in other words, an object of an | ||
> `enum` type can hold any integer value. For example, this is often used for | ||
> bitflags in C. In contrast, Rust's C-like enumerations can only legally hold | ||
> the discriminant values; everything else is undefined behavior. Therefore, | ||
> using a C-like enumeration in FFI to model a C `enum` is often wrong. | ||
|
||
It is an error for [zero-variant enumerations] to have the `C` representation. | ||
|
||
For all other enumerations, the layout is unspecified. | ||
|
||
### Primitive representations | ||
|
||
The *primitive representations* are the representations with the same names as | ||
the primitive integer types. That is: `u8`, `u16`, `u32`, `u64`, `usize`, `i8`, | ||
`i16`, `i32`, `i64`, and `isize`. | ||
|
||
Primitive representations can only be applied to enumerations. | ||
|
||
For [C-like enumerations], they set the size and alignment to be the same as the | ||
primitive type of the same name. For example, a C-like enumeration with a `u8` | ||
representation can only have discriminants between 0 and 255 inclusive. | ||
|
||
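As a sketch of a primitive representation on a C-like enumeration, the example below uses a hypothetical `Level` enum with a `u8` representation. The hypothetical `level_from_u8` helper converts a raw byte by matching on it, so no value outside the declared discriminants is ever produced.

```rust
use std::mem::{align_of, size_of};

#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Level {
    Low = 0,
    Mid = 128,
    High = 255,
}

/// Hypothetical helper: converts a raw byte into a `Level` without ever
/// creating an invalid discriminant.
fn level_from_u8(raw: u8) -> Option<Level> {
    match raw {
        0 => Some(Level::Low),
        128 => Some(Level::Mid),
        255 => Some(Level::High),
        _ => None,
    }
}

fn main() {
    // The enum takes the size and alignment of the primitive it is named after.
    assert_eq!(size_of::<Level>(), size_of::<u8>());
    assert_eq!(align_of::<Level>(), align_of::<u8>());

    assert_eq!(Level::High as u8, 255);
    assert_eq!(level_from_u8(128), Some(Level::Mid));
    assert_eq!(level_from_u8(1), None);
}
```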
It is an error for [zero-variant enumerations] to have a primitive | ||
representation. | ||
|
||
For all other enumerations, the layout is unspecified. | ||
|
||
### The `packed` Representation | ||
|
||
The `packed` representation can only be used on `struct`s and `union`s. | ||
|
||
It modifies the representation (either the default or `C`) by removing any | ||
padding bytes and forcing the alignment of the type to `1`. | ||
|
||
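The effect of `packed` can be seen by comparing two otherwise identical structs. This is a sketch with hypothetical type names; it assumes `u32` has an alignment greater than 1 on the target, which holds on common platforms. The field is read by copying it out, so no reference to a possibly unaligned field is created.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
#[repr(C)]
struct Padded {
    tag: u8,
    value: u32,
}

#[repr(C, packed)]
struct Packed {
    tag: u8,
    value: u32,
}

/// Hypothetical helper: reads the possibly unaligned field by copying it out,
/// which avoids ever taking a reference to it.
fn read_value(packed: &Packed) -> u32 {
    packed.value
}

fn main() {
    // Without `packed`, padding is inserted (assuming `u32` has an alignment
    // greater than 1 on the target).
    assert!(size_of::<Padded>() > size_of::<Packed>());

    // With `packed`, the fields are adjacent and the alignment is forced to 1.
    assert_eq!(size_of::<Packed>(), size_of::<u8>() + size_of::<u32>());
    assert_eq!(align_of::<Packed>(), 1);

    let packed = Packed { tag: 1, value: 0xDEAD_BEEF };
    assert_eq!(read_value(&packed), 0xDEAD_BEEF);
    assert_eq!({ packed.tag }, 1);
}
```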
> Warning: Dereferencing an unaligned pointer is [undefined behavior] and it is | ||
> possible to [safely create unaligned pointers to `packed` fields][27060]. | ||
> Like all ways to create undefined behavior in safe Rust, this is a bug. | ||
|
||
[`align_of_val`]: ../std/mem/fn.align_of_val.html | ||
[`size_of_val`]: ../std/mem/fn.size_of_val.html | ||
[`align_of`]: ../std/mem/fn.align_of.html | ||
[`size_of`]: ../std/mem/fn.size_of.html | ||
[`Sized`]: ../std/marker/trait.Sized.html | ||
[dynamically sized types]: dynamically-sized-types.html | ||
[C-like enumerations]: items/enumerations.html#c-like-enumerations | ||
[zero-variant enumerations]: items/enumerations.html#zero-variant-enumerations | ||
[undefined behavior]: behavior-considered-undefined.html | ||
[27060]: https://github.com/rust-lang/rust/issues/27060 |