add a structs-and-tuples chapter #31

nikomatsakis · 2018-09-25T21:49:52Z

Writes up what I believe to be the consensus around layout of structs and tuples. Suggestions very welcome!

Here are some of the highlights:

- Tuples are defined to be laid out as if there were a "fresh generic struct" for their arity. So (T1...Tn) is laid out the same as TupleN<T1..Tn> where struct TupleN<P0..Pn>(P0..Pn).
  - There is an exception for the case where all the types are the same (after lifetime erasure), in which case the layout is guaranteed to be compatible with [T; N].
- Tuple structs are laid out the same as "named fields".
- #[repr(Rust)] structs (the default) have no particular layout guarantees. In particular, even if two #[repr(Rust)] structs have fields of the same types, they are not guaranteed to be laid out in a compatible way. This is a conservative position that could be strengthened, though I personally now think it'd be better to "opt-in" to any such guarantees. I cover the pros and cons in the doc.
- #[repr(C)], #[repr(align)], #[repr(packed)], and #[repr(transparent)] are all discussed.
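To make the attributes in the last highlight concrete, here is a minimal sketch of how each one shows up in `size_of` and `align_of`. The type names are hypothetical and chosen only for this illustration, and the numbers in the comments assume a common platform where `u32` has alignment 4.

```rust
use std::mem::{align_of, size_of};

// All type names below are hypothetical, for illustration only.

#[allow(dead_code)]
#[repr(C)]
struct CLayout {
    a: u8,  // offset 0
    b: u32, // offset 4 on platforms where u32 is 4-byte aligned
}

#[allow(dead_code)]
#[repr(C, packed)]
struct Packed {
    a: u8,  // offset 0
    b: u32, // offset 1: packed removes inter-field padding
}

#[allow(dead_code)]
#[repr(C, align(16))]
struct Aligned {
    a: u8, // the struct as a whole is raised to 16-byte alignment
}

#[allow(dead_code)]
#[repr(transparent)]
struct Wrapper(u64); // same layout (and ABI) as its single field

fn main() {
    println!("CLayout: size {} align {}", size_of::<CLayout>(), align_of::<CLayout>());
    println!("Packed:  size {} align {}", size_of::<Packed>(), align_of::<Packed>());
    println!("Aligned: size {} align {}", size_of::<Aligned>(), align_of::<Aligned>());
    println!("Wrapper: size {} align {}", size_of::<Wrapper>(), align_of::<Wrapper>());
}
```

Note that `packed` lowers the alignment of the struct as a whole, not just of its fields, which is why `Packed` ends up with alignment 1.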

Fixes #11
Fixes #12
Fixes #17

reference/src/representation/structs-and-tuples.md

nikomatsakis · 2018-09-26T14:17:37Z

Pushed an update containing the various suggestions from @eddyb.

gnzlbg · 2018-09-26T15:24:18Z

reference/src/representation/structs-and-tuples.md

+[#42877]: https://github.com/rust-lang/rust/issues/42877
+[pg-unsized-tuple]: https://play.rust-lang.org/?gist=46399bb68ac685f23beffefc014203ce&version=nightly&mode=debug&edition=2015
+
+There are also benefits also to having fewer guarantees. For example:


typo: two "also"s

alercah · 2018-09-26T22:31:06Z

reference/src/representation/structs-and-tuples.md

+### Default layout ("repr rust")
+
+The default layout of structs is undefined and subject to change
+between compiler revisions. We further do not guarantee that two


Between individual compilations, no? I think that is what we had determined was the line.

Actually, I would like to push back on this slightly: there is a general desire to ensure that the compiler output is deterministic. This is not 100% true, but it is very nearly true, and we would like it to be true. This seems to imply that, so long as the input does not change, the layout cannot change. I am not sure why we would need to lose that guarantee.

hanna-kruppe · 2018-10-03T17:44:29Z

Long ago I proposed that we might want to guarantee (some subset of) newtype unpacking for repr(Rust) structs. @nikomatsakis carried this over into #11 as a discussion point but it received no further discussion. I like to think that means it's uncontroversial 😄 I've also never heard of any reason why one might not want that to be true.

To make a specific proposal, let's restrict it to structs [1] that contain a single field: such a struct would have the same memory layout as the type of its sole field. So struct Foo<T>(T); and struct Foo<T> { x: T } would be laid out like T in memory, though possibly still passed differently in function calls.

[1] The same guarantee for (T,) is already covered by the special case of homogeneous tuples being laid out like arrays that is already in this PR.
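As a rough illustration of the proposal, the sketch below compares a single-field newtype against its field type. The names `PlainNewtype`, `TransparentNewtype`, and `same_size_and_align` are hypothetical. Only the `#[repr(transparent)]` comparison is guaranteed by the language; the default-repr result is just an observation of what a given compiler happens to do, which is exactly the gap the proposal would close.

```rust
use std::mem::{align_of, size_of};

// Hypothetical newtypes, for illustration only.

// Default repr(Rust): the case the proposal would cover.
#[allow(dead_code)]
struct PlainNewtype<T>(T);

// repr(transparent): already guaranteed to match T in layout and ABI.
#[allow(dead_code)]
#[repr(transparent)]
struct TransparentNewtype<T>(T);

/// Returns true when A and B agree in both size and alignment.
/// This is a necessary condition for "same memory layout", not a full proof.
fn same_size_and_align<A, B>() -> bool {
    size_of::<A>() == size_of::<B>() && align_of::<A>() == align_of::<B>()
}

fn main() {
    // Guaranteed by repr(transparent).
    println!(
        "transparent newtype matches u32: {}",
        same_size_and_align::<TransparentNewtype<u32>, u32>()
    );
    // Not guaranteed for repr(Rust); printed as an observation only.
    println!(
        "plain newtype matches u32 (observed, not guaranteed): {}",
        same_size_and_align::<PlainNewtype<u32>, u32>()
    );
}
```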

the8472 · 2018-10-03T19:43:51Z

@rkruppe for (T,) the .0 field is public; in Foo<T> it is not. The members of the tuple are guaranteed never to change for that type, but since Foo has private fields, adding new fields is a non-breaking change, and doing so would remove the layout guarantee.

So shouldn't that guarantee be restricted to structs with a single public field? Or at least there should be a lint if such transmutations are used by a crate that does not own Foo.

hanna-kruppe · 2018-10-03T19:51:17Z

Visibility should not affect layout, only who can rely on the layout.

the8472 · 2018-10-03T20:00:41Z

That might be worth documenting.

Still, it might be better to document the intent of such structs with repr(transparent).

hanna-kruppe · 2018-10-03T20:23:14Z

repr(transparent) has other, potentially undesirable effects (on ABI). That attribute is also shown in docs and thus sometimes used as documentation to the outside world, which one may not want to do even while relying on the newtype's layout internally within the library.

I agree that such subtleties should be documented, but "punishing" people who do not do that in a certain prescribed way (a repr attribute, instead of e.g. a comment) by making their code de jure undefined behavior (especially if it will de facto never misbehave because the layout cannot sensibly be any different) is simply user-hostile with no benefit.

nikomatsakis · 2018-10-11T17:43:40Z

Here is the set of unresolved questions:

Zero-sized structs (#37). If you have a struct which --
transitively -- contains no data of non-zero size, then the size of
that struct will be zero as well. These zero-sized structs appear
frequently as exceptions in other layout considerations (e.g.,
single-field structs). An example of such a struct is
std::marker::PhantomData.

Single-field structs (#34). If you have a struct with a single field
(struct Foo { x: T }), should we guarantee that the memory layout of
Foo is identical to the memory layout of T? (Note that ABI details
around function calls may still draw a distinction, which is why
#[repr(transparent)] is needed.) What about zero-sized types like
PhantomData?

Homogeneous structs (#36). If you have homogeneous structs, where all
the N fields are of a single type T, can we guarantee a mapping to
the memory layout of [T; N]? How do we map between the field names
and the indices? What about zero-sized types?

Deterministic layout (#35). Can we say that layout is some deterministic
function of a certain, fixed set of inputs? This would allow you to be
sure that if you do not alter those inputs, your struct layout would
not change, even if it meant that you can't predict precisely what it
will be. For example, we might say that struct layout is a function of
the struct's generic types and its substitutions, full stop -- this
would imply that any two structs with the same definition are laid out
the same. This might interfere with our ability to do profile-guided
layout or to analyze how a struct is used and optimize based on
that. Some would call that a feature.
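For the zero-sized case in the list above, a small sketch may help. `Marker` and `Tagged` are hypothetical types invented for this example. The zero size of `PhantomData` is documented; the sizes of the two default-repr structs are what current compilers produce and what the text above describes, not a separate guarantee.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical types, for illustration only.

// Contains, transitively, no data of non-zero size.
#[allow(dead_code)]
struct Marker<T> {
    _t: PhantomData<T>,
    _unit: (),
}

// One real field plus a zero-sized marker.
#[allow(dead_code)]
struct Tagged<T> {
    value: u32,
    _t: PhantomData<T>,
}

fn main() {
    println!("PhantomData<u64>: {}", size_of::<PhantomData<u64>>());
    println!("Marker<String>:   {}", size_of::<Marker<String>>());
    println!("Tagged<String>:   {}", size_of::<Tagged<String>>());
}
```

`Tagged` is the shape that makes the single-field question interesting: it has two fields syntactically, but only one of them occupies space.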

hanna-kruppe · 2018-10-11T20:10:34Z

reference/src/representation/structs-and-tuples.md

+The default layout of structs is not specified. Effectively, the
+compiler provdes a deterministic function per struct definition that
+defines its layout. This function may as principle take as input the
+entire input program. Therefore:


I am not sure this wording reserves the freedoms it wants to reserve, and even if someone argued it does I would like it to be clarified. Since we did not get consensus to rule out profile-guided layout, the program source code is not all that informs layout. Not even if that includes build scripts, input data files, etc. that are used during the profiling run, since the program may not be deterministic w.r.t. these (e.g. it might have race conditions or depend on the system time or ...). So I don't think we can guarantee anything involving the words "deterministic function" and have to stick to something like "every time you invoke the compiler you may get a completely different layout".

> Since we did not get consensus to rule out profile-guided layout

I might have missed this, but did you manage to write your thoughts about this down?

No, not yet.

Hmm. I agree with the gist of your comment @rkruppe but I think I don't quite agree with this part:

> Not even if that includes build scripts, input data files, etc. that are used during the profiling run, since the program may not be deterministic w.r.t. these (e.g. it might have race conditions or depend on the system time or ...).

In particular, it seems like we would basically want to say that layout is determined by the program source code + other auxiliary inputs (e.g., compiler settings, output from PGO, etc).

I don't know that we need to give the freedom for rustc to choose arbitrary orderings on two consecutive runs where nothing at all changed (in particular, we've been shooting for deterministic compilation, and this would sort of contravene that). Of course it's ok to start there for now, since it doesn't say that we have to change layout...just seems looser than is needed.

Am I missing something?

To be clear: it is certainly possible to define PGO traces as part of the compiler input (beyond just sources and compiler flags), though it's a bit of a stretch IMO. That should be called out explicitly in the text, though. What's currently written here could reasonably be read as saying the layout depends just on the source code and flags such as -C opt-level.

Ah, the updated version is good in that regard, thanks!

gnzlbg · 2018-10-11T21:49:48Z

reference/src/representation/structs-and-tuples.md

-a detailed write-up):
+layout scheme. See section 6.7.2.1 of the [C17 specification][C17] for
+a detailed write-up of what such rules entail. For most platforms,
+however, this means the following:


Maybe we should go one step further here and just state that the "most recent" C specification applies. It would be a pain to have to update this every N years. History has shown that the newer C specs are backwards compatible with the old ones (e.g. the C99, C11, and C17 specs have never introduced breaking changes here; they have only defined behavior that was undefined before), and we probably want to be as compatible as possible with C anyway, which means we have to follow the latest spec.

This kind of ties Rust to the C spec, but this is already pretty much the case, not only for platform support: some newer C proposals like N2289 - Zero overhead failure should definitely allow using Rust's Option and Result properly as error handling mechanisms in C FFI. So whether we like it or not, Rust is already a stakeholder in the C standardization process, and we should be sending someone to their meetings to represent Rust's interests in the ISO C standard evolution, and that would include layout guarantees for repr(C) types. We don't want any changes to the C language to make it impossible for Rust to target C via FFI.
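For readers without the C specification at hand, the "for most platforms" rules that the quoted hunk refers to are usually understood as: place each field at the next offset that is a multiple of its alignment, then round the total size up to the largest field alignment. The sketch below implements that common rule and checks it against a `#[repr(C)]` struct. The function `c_layout` and the struct `Example` are hypothetical names for this illustration, and this is an approximation of typical platform behavior, not a restatement of the standard.

```rust
use std::mem::{align_of, size_of};

/// Computes (field offsets, total size, alignment) for fields given as
/// (size, align) pairs, using the usual "round up to alignment" rule.
/// Hypothetical helper: a sketch of typical C layout, not the C standard.
fn c_layout(fields: &[(usize, usize)]) -> (Vec<usize>, usize, usize) {
    let mut offset = 0usize;
    let mut max_align = 1usize;
    let mut offsets = Vec::with_capacity(fields.len());
    for &(size, align) in fields {
        // Round the running offset up to this field's alignment.
        offset = (offset + align - 1) / align * align;
        offsets.push(offset);
        offset += size;
        max_align = max_align.max(align);
    }
    // Round the total size up to the struct's overall alignment.
    let size = (offset + max_align - 1) / max_align * max_align;
    (offsets, size, max_align)
}

// Hypothetical struct used to compare the sketch with the compiler.
#[allow(dead_code)]
#[repr(C)]
struct Example {
    a: u8,
    b: u32,
    c: u16,
}

fn main() {
    let (offsets, size, align) = c_layout(&[
        (size_of::<u8>(), align_of::<u8>()),
        (size_of::<u32>(), align_of::<u32>()),
        (size_of::<u16>(), align_of::<u16>()),
    ]);
    println!("computed: offsets {:?}, size {}, align {}", offsets, size, align);
    println!("compiler: size {}, align {}", size_of::<Example>(), align_of::<Example>());
}
```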

gnzlbg

I honestly think this is a great start. As other issues progress, and unresolved issues get resolved, this will probably be amended many times, but it will be easier to discuss the details of those changes in PRs that build on top of this one.

RalfJung · 2018-10-13T10:10:36Z

reference/src/representation/structs-and-tuples.md

+  ahead of
+  time](https://github.com/rust-rfcs/unsafe-code-guidelines/issues/11#issuecomment-420659840),
+  so the user cannot do it manually.
+- If layout is defined, then it becomes part of your API, such taht


Typo: s/taht/that/

nikomatsakis · 2018-10-23T17:04:13Z

I pushed some updates. The most interesting thing has to do with zero-sized structs. I added the following text to the #[repr(C)] section:

One deviation from C comes about with "empty structs". In Rust, a struct that contains (transitively) no data members is considered to have size zero, which is not something that exists in C. This includes a struct like #[repr(C)] struct Foo { }. Further, when a #[repr(C)] struct has a field whose type has zero-size, that field may induce padding due to its alignment, but will not otherwise affect the offsets of subsequent fields (as it takes up zero space).
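The behavior described in that added text can be observed directly. In the sketch below, `WithZst` and `offset_of_b` are hypothetical names; the zero-length array `[u32; 0]` is used as a zero-sized field that still carries the alignment of `u32`.

```rust
use std::mem::{align_of, size_of};

// An empty repr(C) struct, as in the quoted text.
#[repr(C)]
struct Foo {}

// Hypothetical struct with a zero-sized but aligned field in the middle.
#[allow(dead_code)]
#[repr(C)]
struct WithZst {
    a: u8,
    z: [u32; 0], // size 0, alignment of u32
    b: u8,
}

/// Byte offset of `b` inside `WithZst`, measured with pointer arithmetic.
fn offset_of_b() -> usize {
    let s = WithZst { a: 0, z: [], b: 0 };
    let base = &s as *const WithZst as usize;
    let field = &s.b as *const u8 as usize;
    field - base
}

fn main() {
    println!("size_of::<Foo>() = {}", size_of::<Foo>());
    // `z` forces padding after `a` because of its alignment,
    // but takes no space itself, so `b` starts where `z` does.
    println!("offset of b = {}", offset_of_b());
    println!(
        "WithZst: size {} align {}",
        size_of::<WithZst>(),
        align_of::<WithZst>()
    );
}
```

Without the `z` field, `b` would sit at offset 1 and the struct would have size 2, so the zero-sized field changes the layout purely through its alignment.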

RalfJung · 2018-10-24T14:17:41Z

reference/src/representation/structs-and-tuples.md

+a struct like `#[repr(C)] struct Foo { }`. Further, when a
+`#[repr(C)]` struct has a field whose type has zero-size, that field
+may induce padding due to its alignment, but will not otherwise affect
+the offsets of subsequent fields (as it takes up zero space).


As @rkruppe noted on Zulip, this seems like a problem. It means that "copy-pasting struct definitions and adding repr(C) everywhere" does not give you C compatibility, because your Foo would actually take space when put in a larger struct in C.

This seems like a bug, TBH. I am not sure if it is a bug that we can still fix. Might be worth having at least a warning.

Yes — I was thinking the same in that conversation. That is, "bug and not clearly a bug we can fix", which does suggest that at least a lint is warranted.

For the benefit of others, the Zulip conversation was here.

That's not actually how C works - C does not allow empty structs. The godbolt link was from C++, which does allow empty structs, and in which empty structs have size one. Basically, this would only be an issue for somebody like Mozilla, who talk to C++ through a non-C ABI.

Notably, also, gcc and clang's C extension for empty structs has sizeof(struct { }) = 0: https://godbolt.org/z/K3_5fJ

Transcribing another thing @ubsan noted on Zulip: empty structs are accepted as an extension by some C compilers, but (at least) GCC and Clang make them have size zero, unlike C++. Example: https://godbolt.org/z/AS2gdC

Allow-by-default at best - 0 size structures are weird in C++, and usually you'd use either EBO or [[no_unique_address]] with them.

Isn't them being weird another argument for making this a warn-by-default lint? I expect many people will not know this. I am not a C++ expert, but I have programmed in C++ for many years and never heard about this; I don't think we can expect everybody doing C++ FFI to know about these issues.

So to summarize: C does not allow empty structs; some C language extensions allow empty structs with sizeof == 0; C++ does allow empty structs, but these have sizeof == 1 unless they are inherited from or they are fields that have the [[no_unique_address]] attribute (in both cases, they don't increase the size of the struct; I'm unsure what role the alignment of the type plays, though).

I think #[repr(C)] should warn-by-default on this when it makes a difference, that is, when the ZST would change the layout. We could have an opt-in warning that always warns on ZST being used in #[repr(C)] but I fear that would be extremely noisy for little win.

About the situation with the C-language extension and C++: it appears that #[repr(C)] != #[repr(Cxx)], so maybe we just need to add new reprs to deal with those. In the meantime it might be worth it to just ignore C++ while specifying #[repr(C)] here (maybe add a note so that we don't forget).

I had to google EBO: Empty base optimization.

@RalfJung note that EBO is guaranteed by the C++>=11 standard for types with standard layout, like empty structs: struct T {};, so it is a required layout optimization.

avadacatavra · 2018-10-24T23:22:43Z

Is this almost ready to merge?

nikomatsakis · 2018-10-25T12:20:15Z

@avadacatavra my hope was that we will merge it today

nikomatsakis · 2018-10-25T12:22:41Z

That said, I think that some part of this conversation about zero-sized structs deserves to be added. I think we ought to also add "lint for #[repr(C)] structs of zero-size" to some sort of list (do we have a place for such a list?) for recommendations -- clearly there are details to be worked out, but there are definitely footguns there.

nikomatsakis · 2018-10-25T15:03:04Z

OK, I attempted to summarize the conversation from here in the latest commit and added some appropriate warning language.

strega-nil · 2018-10-25T16:15:42Z

It looks like gcc (the only compiler which has implemented [[no_unique_address]]) has chosen:

- to put [[no_unique_address]] members at the top of the class holding them (see https://gcc.godbolt.org/z/Q4oFIR - note that this is technically UB).
  - Alignment does still matter tho: https://gcc.godbolt.org/z/Pn1eqo
- Interestingly, it does have a bug: https://gcc.godbolt.org/z/ZkMpnS
  - sizeof(Bar) should equal sizeof(Baz) should equal 1

avadacatavra · 2018-10-25T22:08:49Z

I'm going to merge this, and any additions can be filed as followup issues/PRs (cc @nikomatsakis)

add a structs-and-tuples chapter (c3a54f9)

This was referenced Sep 25, 2018: Representation of tuples #12 (Closed); Representation of structs #11 (Closed)

eddyb reviewed Sep 26, 2018 (eight review threads on reference/src/representation/structs-and-tuples.md)

nikomatsakis mentioned this pull request Sep 26, 2018: Representation of unions #13 (Closed)

add nits from eddyb (08062bd)

gnzlbg reviewed Sep 26, 2018 (three reviews)

alercah approved these changes Sep 26, 2018

hanna-kruppe reviewed Sep 27, 2018

scottmcm mentioned this pull request Oct 5, 2018: RevSlice requires repr(C) or repr(transparent) scottmcm/rev_slice#1 (Closed)

RalfJung reviewed Oct 5, 2018 (two reviews)

nikomatsakis added 3 commits October 11, 2018 08:44

correct typo (eb951c2)

strengthen to "subject to change between compilations" (d5a144b)

add note that packed adjusts alignment of the struct as a whole (569338b)

assign issues (fbf35bf)

nikomatsakis mentioned this pull request Oct 11, 2018: Effect of packed and align on representation #17 (Closed)

hanna-kruppe reviewed Oct 11, 2018

gnzlbg reviewed Oct 11, 2018

gnzlbg approved these changes Oct 11, 2018

RalfJung reviewed Oct 13, 2018 (three reviews)

RalfJung reviewed Oct 23, 2018

nikomatsakis added 5 commits October 23, 2018 12:29

generalize "input program" to "input to compiler" (97622bc). Also, fix the dangling "therefore".

be clearer about when reordering fields would be a breaking change (b0c86ea)

rephrase "any" struct for clarity (f146fc4)

rework "default layout" to guarantee nothing for now (72f5db3), per rkruppe's points

document zero-sized struct behavior (17dab33)

RalfJung reviewed Oct 24, 2018

pnkfelix mentioned this pull request Oct 25, 2018: run-pass/extern-pass-empty is probably a bogus thing to test rust-lang/rust#53859 (Open)

say more about zero-sized things (a9223e2)

nikomatsakis force-pushed the structs-and-tuples branch from 5318410 to a9223e2 October 25, 2018 15:02

gnzlbg reviewed Oct 25, 2018

fix typo (ece91f5)

avadacatavra merged commit 351bb96 into rust-lang:master Oct 25, 2018
