Gankro's suggestions plus more.
Havvy committed Dec 2, 2017
1 parent edbc964 commit 6032b43
Showing 3 changed files with 130 additions and 42 deletions.
21 changes: 17 additions & 4 deletions src/glossary.md
Expand Up @@ -7,8 +7,9 @@ the structure of the program when the compiler is compiling it.

### Alignment

The *alignment* of a value specifies what addresses are valid to store the value
at.
The alignment of a value specifies what addresses the value is preferred to
start at. It is always a power of two. References to a value must be aligned.
[More][alignment].

### Arity

Expand Down Expand Up @@ -64,8 +65,18 @@ imported into very module of every crate. The traits in the prelude are pervasiv

### Size

The *size* of a value is the offset in bytes between successive elements in an
array with that item type including alignment padding.
The size of a value has two definitions.

The first is that it is how much memory must be allocated to store that value.

The second is that it is the offset in bytes between successive elements in an
array with that item type.

It is a multiple of the alignment, including zero. The size can change
depending on compiler version (as new optimizations are made) and target
platform (as `usize` varies).

[More][alignment].
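The two definitions of size, and the rule that size is a multiple of alignment, can be checked with `std::mem::size_of` and `std::mem::align_of`. The following is a minimal sketch; the `Sample` type and the `array_stride` helper are hypothetical names introduced only for illustration.

```rust
use std::mem::{align_of, size_of};

// Hypothetical example type, not taken from the reference.
#[allow(dead_code)]
#[repr(C)]
struct Sample {
    a: u8,
    b: u32,
}

/// Byte distance between element 0 and element 1 of a two-element array.
fn array_stride<T>(pair: &[T; 2]) -> usize {
    let first = &pair[0] as *const T as usize;
    let second = &pair[1] as *const T as usize;
    second - first
}

fn main() {
    let pair = [Sample { a: 1, b: 2 }, Sample { a: 3, b: 4 }];

    // Second definition: the offset between successive array elements.
    assert_eq!(array_stride(&pair), size_of::<Sample>());

    // The size is a multiple of the alignment, and the alignment is a
    // power of two.
    assert_eq!(size_of::<Sample>() % align_of::<Sample>(), 0);
    assert!(align_of::<Sample>().is_power_of_two());

    // Zero is a valid size: the unit type occupies no memory.
    assert_eq!(size_of::<()>(), 0);
}
```

The assertions compare against `size_of` and `align_of` rather than fixed numbers because the concrete values depend on the target platform.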

### Slice

Expand Down Expand Up @@ -101,3 +112,5 @@ A trait is a language item that is used for describing the functionalities a typ
It allows a type to make certain promises about its behavior.

Generic functions and generic structs can exploit traits to constrain, or bound, the types they accept.

[alignment]: type-layout.html#size-and-alignment
61 changes: 45 additions & 16 deletions src/items/enumerations.md
Expand Up @@ -25,9 +25,9 @@
> _EnumItemDiscriminant_ :
>    `=` [_Expression_]
An _enumeration_ is a simultaneous definition of a nominal [enumerated type] as
well as a set of *constructors*, that can be used to create or pattern-match
values of the corresponding enumerated type.
An *enumeration*, also referred to as an *enum*, is a simultaneous definition
of a nominal [enumerated type] as well as a set of *constructors* that can be
used to create or pattern-match values of the corresponding enumerated type.

Enumerations are declared with the keyword `enum`.

Expand All @@ -43,7 +43,7 @@ let mut a: Animal = Animal::Dog;
a = Animal::Cat;
```

Enumeration constructors can have either named or unnamed fields:
Enum constructors can have either named or unnamed fields:

```rust
enum Animal {
Expand All @@ -58,15 +58,15 @@ a = Animal::Cat { name: "Spotty".to_string(), weight: 2.7 };
In this example, `Cat` is a _struct-like enum variant_, whereas `Dog` is simply
called an enum variant. Each enum instance has a _discriminant_ which is an
integer associated to it that is used to determine which variant it holds. An
opaque reference to this variant can be obtained with the [`mem::discriminant`]
function.
opaque reference to this discriminant can be obtained with the
[`mem::discriminant`] function.
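As a small sketch of how `mem::discriminant` can be used, the example below compares two values by variant only, ignoring their fields. The `same_variant` helper and the `Animal` definition here are illustrative assumptions that mirror the example above.

```rust
use std::mem;

// Hypothetical enum mirroring the `Animal` example.
#[allow(dead_code)]
enum Animal {
    Dog(String, f64),
    Cat { name: String, weight: f64 },
}

/// Returns true when both values hold the same variant, whatever their fields.
fn same_variant(a: &Animal, b: &Animal) -> bool {
    mem::discriminant(a) == mem::discriminant(b)
}

fn main() {
    let rex = Animal::Dog("Rex".to_string(), 30.0);
    let fido = Animal::Dog("Fido".to_string(), 12.5);
    let spotty = Animal::Cat { name: "Spotty".to_string(), weight: 2.7 };

    assert!(same_variant(&rex, &fido));
    assert!(!same_variant(&rex, &spotty));
}
```

The returned `Discriminant<T>` is opaque: it can be compared and hashed, but it does not expose the underlying integer.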

## C-like Enumerations
## C-like Enums

If there is no data attached to *any* of the variants of an enumeration and
there is at least one variant then it is called a *c-like enumeration*.
If there is no data attached to *any* of the variants of an enum and
there is at least one variant, then it is called a *C-like enum*.

C-like enumerations can be cast to integer types with the `as` operator by a
C-like enums can be cast to integer types with the `as` operator by a
[numeric cast]. The enumeration can optionally specify which integer each
discriminant gets by following the variant name with `=` and then an integer
literal. If the first variant in the declaration is unspecified, then it is set
Expand All @@ -81,21 +81,50 @@ enum Foo {
}

let baz_discriminant = Foo::Baz as u32;
assert_eq!(baz_discriminant, 123u32);
assert_eq!(baz_discriminant, 123);
```

Under the [default representation], the specified discriminant is interpreted as
an `isize` value although the compiler is allowed to use a smaller type in the
actual memory layout. The size and thus acceptable values can be changed by
using a [primitive representation] or the [`C` representation].
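To illustrate how a primitive representation fixes the discriminant type, here is a short sketch. The enum names `Small` and `Wide` are hypothetical and chosen only for this example.

```rust
use std::mem::size_of;

// Hypothetical C-like enums with explicit primitive representations.
#[allow(dead_code)]
#[repr(u8)]
enum Small {
    A = 1,
    B = 200, // Fits in a u8.
}

#[allow(dead_code)]
#[repr(u16)]
enum Wide {
    Low = 0,
    High = 60_000, // Needs more than 8 bits, fits in a u16.
}

fn main() {
    // With a primitive representation, the enum has the size of that primitive.
    assert_eq!(size_of::<Small>(), 1);
    assert_eq!(size_of::<Wide>(), 2);

    // Numeric casts recover the specified discriminants.
    assert_eq!(Small::B as u8, 200);
    assert_eq!(Wide::High as u16, 60_000);
}
```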

It is an error when either two variants share the same discriminant or for an
unspecified discriminant, the previous discriminant is the maximum value for the
size of the discriminant. <!-- Need examples here. -->
It is an error when two variants share the same discriminant.

## Zero-variant Enumerations
```rust,ignore
enum SharedDiscriminantError {
    SharedA = 1,
    SharedB = 1
}

enum SharedDiscriminantError2 {
    Zero,       // 0
    One,        // 1
    OneToo = 1  // 1 (collision with previous!)
}
```

It is also an error to have an unspecified discriminant where the previous
discriminant is the maximum value for the size of the discriminant.

```rust,ignore
#[repr(u8)]
enum OverflowingDiscriminantError {
    Max = 255,
    MaxPlusOne // Would be 256, but that overflows the enum.
}

#[repr(u8)]
enum OverflowingDiscriminantError2 {
    MaxMinusOne = 254, // 254
    Max,               // 255
    MaxPlusOne         // Would be 256, but that overflows the enum.
}
```

## Zero-variant Enums

Enums with zero variants are known as *zero-variant enumerations*. As they have
Enums with zero variants are known as *zero-variant enums*. As they have
no valid values, they cannot be instantiated.

```rust
Expand Down
90 changes: 68 additions & 22 deletions src/type-layout.md
@@ -1,12 +1,11 @@
# Type Layout

The layout of a type is the way the size, alignment, and the offsets of any
fields and discriminants for the values of that type.
The layout of a type is its size, alignment, and the relative offsets of its
fields. For enums, how the discriminant is laid out and interpreted is also part
of type layout.

While specific releases of the compiler will have the same layout for types,
there is a lot of room for new versions of the compiler to do different things.
Instead of trying to document exactly what is done, we only document what is
guaranteed today.
Type layout can be changed with each compilation. Instead of trying to document
exactly what is done, we only document what is guaranteed today.

## Size and Alignment

Expand Down Expand Up @@ -37,7 +36,6 @@ The size of most primitives is given in this table.

Type | `size_of::<Type>()`
- | -
bool | 1
u8 | 1
u16 | 2
u32 | 4
Expand All @@ -55,7 +53,7 @@ target platform. For example, on a 32 bit target, this is 4 bytes and on a 64
bit target, this is 8 bytes.

Most primitives are generally aligned to their size, although this is
platform-specific behavior. In particular, on x86 u64 and f64 may be only
platform-specific behavior. In particular, on x86 u64 and f64 are only
aligned to 32 bits.

## Pointers and References Layout
Expand All @@ -82,6 +80,9 @@ has a size of `size_of::<T>() * n` and the same alignment of `T`.

Slices have the same layout as the section of the array they slice.
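The array and slice rules can be checked with `std::mem::size_of_val`, which reports the size of the value behind a reference. This is a minimal sketch; the `slice_bytes` helper is a hypothetical name used only here.

```rust
use std::mem::{align_of, size_of, size_of_val};

/// Number of bytes occupied by the elements of a slice.
fn slice_bytes<T>(slice: &[T]) -> usize {
    size_of_val(slice)
}

fn main() {
    let array: [u16; 4] = [1, 2, 3, 4];

    // An array of n elements has size `size_of::<T>() * n` and the
    // alignment of `T`.
    assert_eq!(size_of::<[u16; 4]>(), size_of::<u16>() * 4);
    assert_eq!(align_of::<[u16; 4]>(), align_of::<u16>());

    // A slice covers exactly the section of the array it slices.
    let middle: &[u16] = &array[1..3];
    assert_eq!(slice_bytes(middle), size_of::<u16>() * 2);
    assert_eq!(middle.as_ptr(), array[1..].as_ptr());
}
```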

> Note: This is about the raw `[T]` type, not pointers (`&[T]`, `Box<[T]>`,
> etc.) to slices.

## Tuple Layout

Tuples do not have any guarantees about their layout.
Expand All @@ -93,6 +94,9 @@ zero-sized type to have a size of 0 and an alignment of 1.

Trait objects have the same layout as the value the trait object is of.

> Note: This is about the raw trait object types, not pointers (`&Trait`,
> `Box<Trait>`, etc.) to trait objects.

## Closure Layout

Closures have no layout guarantees.
Expand All @@ -102,9 +106,6 @@ Closures have no layout guarantees.
All user-defined composite types (`struct`s, `enum`s, and `union`s) have a
*representation* that specifies what the layout is for the type.

> Note: The representation does not depend upon the type's fields or generic
> parameters.

The possible representations for a type are the default representation, `C`, the
primitive representations, and `packed`. Multiple representations can be applied
to a single type.
Expand All @@ -121,6 +122,11 @@ struct ThreeInts {
}
```

> Note: As a consequence of the representation being an attribute on the item,
> the representation does not depend on generic parameters. Any two types with
> the same name have the same representation. For example, `Foo<Bar>` and
> `Foo<Baz>` both have the same representation.

The representation of a type does not change the layout of its fields. For
example, a struct with a `C` representation that contains a struct `Inner` with
the default representation will not change the layout of Inner.
Expand All @@ -134,39 +140,63 @@ There are no guarantees of data layout made by this representation.

### The `C` Representation

The `C` representation is designed for creating types that are interoptable with
the C Language and soundly performing operations that rely on data layout such
as reinterpreting values as a different type.
The `C` representation is designed for dual purposes. One purpose is for
creating types that are interoperable with the C language. The second purpose is
to create types on which you can soundly perform operations that rely on data
layout, such as reinterpreting values as a different type.

Because of this dual purpose, it is possible to create types that are not useful
for interfacing with the C programming language.

This representation can be applied to structs, unions, and enums.

#### \#[repr(C)] Structs

The alignment of the struct is the alignment of the most-aligned field in it.

The size and offset of fields is determine by the following algorithm.
The size and offset of fields are determined by the following algorithm.

Start with a current offset of 0 bytes.

For each field in declaration order in the struct, first determine the size and
alignment of the field. If the current offset is not a multiple of the field's
alignment, then add padding bytes increasing the current offset until the
current offset is a multiple of the field's alignment. The offset for the field
is what the current offset is now. Then increase the current offset by the size
of the field.
alignment, then add padding bytes to the current offset until it is a multiple
of the field's alignment. The offset for the field is what the current offset
is now. Then increase the current offset by the size of the field.

Finally, the size of the struct is the current offset rounded up to the nearest
multiple of the struct's alignment.

Here is this algorithm described in pseudocode.

```rust,ignore
struct.alignment = struct.fields().map(|field| field.alignment).max();

let mut current_offset = 0;

for field in struct.fields_in_declaration_order() {
    // Increase the current offset so that it's a multiple of the alignment
    // of this field. For the first field, the padding is always zero.
    // The skipped bytes are called padding bytes.
    current_offset +=
        (field.alignment - current_offset % field.alignment) % field.alignment;

    struct[field].offset = current_offset;

    current_offset += field.size;
}

// Round the final offset up to a multiple of the struct's alignment.
struct.size = current_offset
    + (struct.alignment - current_offset % struct.alignment) % struct.alignment;
```
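The same algorithm can be written as runnable Rust and compared against what the compiler reports for a `#[repr(C)]` struct. This is a sketch, not the compiler's implementation: the `Field` type and the `round_up`, `repr_c_layout`, and `field_of` helpers are hypothetical names introduced for illustration.

```rust
use std::mem::{align_of, size_of};

/// Size and alignment of one field, in bytes (illustrative type).
#[derive(Clone, Copy)]
struct Field {
    size: usize,
    align: usize,
}

/// Rounds `offset` up to the next multiple of `align` (assumed non-zero).
fn round_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) / align * align
}

/// Returns (field offsets, struct size, struct alignment) for fields given
/// in declaration order.
fn repr_c_layout(fields: &[Field]) -> (Vec<usize>, usize, usize) {
    // A struct without fields has an alignment of 1.
    let align = fields.iter().map(|f| f.align).max().unwrap_or(1);
    let mut offset = 0;
    let mut offsets = Vec::with_capacity(fields.len());
    for f in fields {
        offset = round_up(offset, f.align); // Skip padding bytes.
        offsets.push(offset);
        offset += f.size;
    }
    (offsets, round_up(offset, align), align)
}

/// Describes the primitive type `T` as a `Field`.
fn field_of<T>() -> Field {
    Field { size: size_of::<T>(), align: align_of::<T>() }
}

#[allow(dead_code)]
#[repr(C)]
struct Example {
    a: u8,
    b: u32,
    c: u16,
}

fn main() {
    let fields = [field_of::<u8>(), field_of::<u32>(), field_of::<u16>()];
    let (offsets, size, align) = repr_c_layout(&fields);

    // The computed layout agrees with the compiler's layout for `Example`.
    assert_eq!(size, size_of::<Example>());
    assert_eq!(align, align_of::<Example>());
    println!("offsets = {:?}, size = {}, align = {}", offsets, size, align);
}
```

Rounding up with integer division is equivalent to adding the padding computed in the pseudocode; it was chosen here only because it keeps the helper to a single expression.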

> Note: This algorithm can produce zero-sized structs. This differs from C++,
> where structs without data still have a size of one byte. Standard C does not
> allow structs without fields.

#### \#[repr(C)] Unions

A union declared with `#[repr(C)]` will have the same size and alignment as an
equivalent C union declaration in the C language for the target platform.
Usually, a union would have the maximum size of the maximum size of all of its
fields, and the maximum alignment of the maximum alignment of all of its fields.
The union will have a size of the maximum size of all of its fields, rounded up
to a multiple of its alignment, and an alignment of the maximum alignment of all
of its fields. These maximums may come from different fields.

```
Expand All @@ -178,6 +208,17 @@ union Union {
assert_eq!(std::mem::size_of::<Union>(), 4); // From f2
assert_eq!(std::mem::align_of::<Union>(), 2); // From f1
#[repr(C)]
union SizeRoundedUp {
    a: u32,
    b: [u16; 3],
}
assert_eq!(std::mem::size_of::<SizeRoundedUp>(), 8); // Size of 6 from b,
// rounded up to 8 from
// alignment of a.
assert_eq!(std::mem::align_of::<SizeRoundedUp>(), 4); // From a
```

#### \#[repr(C)] Enums
Expand All @@ -201,6 +242,9 @@ It is an error for [zero-variant enumerations] to have the `C` representation.

For all other enumerations, the layout is unspecified.

Likewise, the layout is unspecified when the `C` representation is combined
with a primitive representation.

### Primitive representations

The *primitive representations* are the representations with the same names as
Expand All @@ -218,14 +262,16 @@ representation.

For all other enumerations, the layout is unspecified.

Likewise, the layout is unspecified when two primitive representations are
combined.

### The `packed` Representation

The `packed` representation can only be used on `struct`s and `union`s.

It modifies the representation (either the default or `C`) by removing any
padding bytes and forcing the alignment of the type to `1`.
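The effect of `packed` can be observed by comparing a `#[repr(C)]` struct with its packed counterpart. This is a minimal sketch; the `Padded` and `Packed` types are hypothetical.

```rust
use std::mem::{align_of, size_of};

// Hypothetical struct with the `C` representation: padding follows `a`.
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    a: u8,
    b: u32,
}

// The same fields with padding removed and alignment forced to 1.
#[allow(dead_code)]
#[repr(C, packed)]
struct Packed {
    a: u8,
    b: u32,
}

fn main() {
    assert_eq!(align_of::<Packed>(), 1);
    assert_eq!(size_of::<Packed>(), size_of::<u8>() + size_of::<u32>());
    assert!(size_of::<Padded>() >= size_of::<Packed>());

    let p = Packed { a: 1, b: 2 };
    // Copy the field out by value instead of taking a reference to it,
    // because `b` may not be aligned for `u32`.
    let b = p.b;
    assert_eq!(b, 2);
}
```

Copying the field into a local before use avoids creating a reference to a possibly unaligned location.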

> Warning: Dereferencing an unaligned pointer is [undefined behaviour] and is
> Warning: Dereferencing an unaligned pointer is [undefined behaviour] and it is
> possible to [safely create unaligned pointers to `packed` fields][27060].
> Like all ways to create undefined behavior in safe Rust, this is a bug.
Expand Down
