Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Effect of packed and align on representation #17

Closed
nikomatsakis opened this issue Aug 30, 2018 · 12 comments
Closed

Effect of packed and align on representation #17

nikomatsakis opened this issue Aug 30, 2018 · 12 comments
Labels
A-layout Topic: Related to data structure layout (`#[repr]`)

Comments

@nikomatsakis
Copy link
Contributor

Discussion topic for the effect of #[repr(packed)] and #[repr(align)] on memory layout.

Both of these attributes have RFCs. We should document what current behavior is, what gotchas to watch out for (e.g., &x.foo where foo is a field of a packed struct may not be aligned), and what we can guarantee going forward.

@nikomatsakis nikomatsakis added the A-layout Topic: Related to data structure layout (`#[repr]`) label Aug 30, 2018
@RalfJung
Copy link
Member

&x.foo where foo is a field of a packed struct may not be aligned

To note for the future validity discussion: This creates an invalid reference, which briefly exists even if we cast to raw pointer immediately. How to handle that is not part of the current discussion area, but we should remember adding it for the next :)

@avadacatavra avadacatavra added A-layout Topic: Related to data structure layout (`#[repr]`) active discussion topic and removed A-layout Topic: Related to data structure layout (`#[repr]`) labels Aug 31, 2018
@nikomatsakis
Copy link
Contributor Author

@RalfJung yes, I was reminded of that while writing this comment. I remember @arielb1 proposing that we have some kind of MIR operation for "borrowing as a raw pointer" -- this would perhaps be written in Rust syntax as &x as *const T. Such a syntactic form would be interpreted as this new operation, rather than the combination of a borrow and a cast. A bit...subtle perhaps. =)

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 31, 2018

A bit...subtle perhaps.

I worry about how would we teach this, and whether we can lint against people splitting this into two operations (e.g. let a = &x; let b = a as *const T;) where depending on many things, the split operation could be UB while the "joint" operation is not.

I still think that this is the best solution we've got, andI don't think it is worth it to add a new sigil for it. But if we introduce this subtlety doing the wrong thing should ideally produce a hard error (e.g. can't do &x.foo on a packed struct, but &x.foo as *const T is ok - maybe similar for uninitialized memory).

@RalfJung
Copy link
Member

RalfJung commented Sep 1, 2018

can't do &x.foo on a packed struct, but &x.foo as *const T is ok

Yes, please -- a new lint, first warn-by-default and then when crater permits, err-by-default.

For uninitailized memory, we will likely not have immediate UB when creating a reference, but that is something the next discussion area will show.

@alercah
Copy link

alercah commented Sep 8, 2018

#[repr(align = "n")] doesn't support under-alignment (packed must be used for that), so I don't think it has any effect on memory layout on its own. As far as I can tell from its RFC, its only defined effect is to increase the alignment of the type. This necessitates more padding, but in theory the compiler could also move fields around.

I cannot think of a good reason for the compiler to do so. The possibility of trying to align individual fields at some alignment higher than their natural alignment but lower than the struct's alignment seems extremely esoteric. That said, I do not know in what circumstances PGO might actually make this valuable.

The one thing I do think that we should guarantee is that a newtype with repr(align) lays out its field at index 0, so that you can safely cast a reference to an overaligned newtype to a reference to its field.

As for #[repr(packed)], the meaning is specified from this RFC. It is quite clear from this text that the primary effect on layout is simply in that it leads the compiler to consider fields to have lower layout requirements. With that in mind, I think that there is no other discussion of layout required here. If there are special cases for struct layout where the layout is different for different types of the same alignment, then those special cases should have to account for packing, rather than the other way round, I think.

For newtypes, a similar guarantee should hold as above: a packed newtype is represented exactly as the field type, except for its alignment. The effect of such a type is only to reduce alignment requirements.

In fact, I think that if we go with that definition, we could define packed's representation as follows:

  • If packed is applied to a newtype, its representation is identical except the alignment is lower.
  • If packed is applied to any struct, its representation is identical to a non-packed struct where each field were actually a newtype of itself with alignment lowered as in the first bullet.

@nikomatsakis
Copy link
Contributor Author

@alercah

The one thing I do think that we should guarantee is that a newtype with repr(align) lays out its field at index 0, so that you can safely cast a reference to an overaligned newtype to a reference to its field.

It seems surprising to me that #[repr(align)] would have an effect here! I guess the reasoning is sort of "why else would you have put that repr there"? I think this may be an argument for guaranteeing that a struct with 1 field always places that field at offset 0, full stop.

@nikomatsakis
Copy link
Contributor Author

That said, I feel like there is probably more work to be done than your comment might suggest. In particular, I think that the interaction with #[repr(C)] leads to stricter requirements here. It seems like we should be able to specify that #[repr(align)] has "the same effect as this clang annotation" or what have you.

@alercah
Copy link

alercah commented Sep 10, 2018

I meant that, theoretically, we could have #[repr(align(16))] struct WideU64(u64); laid out with the field at byte 9, putting the padding before the field.

But in general I think that's kinda silly and pointless.

@avadacatavra
Copy link
Contributor

rust-lang/rust#54148

@avadacatavra avadacatavra added the S-writeup-needed Status: Ready for a writeup and no one is assigned label Sep 13, 2018
@briansmith
Copy link

See https://oroboro.com/short-enum/, in particular its mention of the use of __attribute__ ((__packed__)) in GNU C to force the use of short enums on platforms where short enums aren't the default. See also #10 (comment).

@nikomatsakis
Copy link
Contributor Author

I believe this is basically addressed by #31, so I've marked this as final-comment-period as well.

@nikomatsakis
Copy link
Contributor Author

@briansmith I'm not entirely sure what points you meant to convey with those links, though I didn't read them yet. The purpose of this issue was to kind of describe how align/transparent work; are those links talking about details of things we should document? Perhaps take a look at #31, which has some (fairly short and not super detailed, admittedly) text on this topic, and make some suggested edits?

@nikomatsakis nikomatsakis added final-comment-period and removed S-writeup-needed Status: Ready for a writeup and no one is assigned labels Oct 11, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-layout Topic: Related to data structure layout (`#[repr]`)
Projects
None yet
Development

No branches or pull requests

6 participants