RFC for structs with unspecified layouts. #79
A fixed layout can be selected with the `#[repr]` attribute

```rust
#[repr(C)]
```
I think the whole idea is a good one; +1. Bikeshed: I'm not sure if repr(C)
is the right notation - you might want a defined layout for other reasons - maybe repr(fixed)
?
If we're going to bikeshed, maybe repr(declaration)
or repr(as_written)
would be more descriptive?
I think repr(C)
is actually OK, because it's specifying that C layout rules should be used (i.e. declaration order), although it could easily be interpreted as "struct for C FFI". There are other possible "fixed" layouts (e.g. sorting by field size, or even alphabetically).
In any case, I don't particularly care about the name.
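To make the "declaration order" reading of `repr(C)` concrete, here is a minimal sketch. The struct names `DeclOrder` and `Unspecified` are hypothetical and not taken from the RFC. It assumes a target where `u32` has 4-byte alignment, which holds on mainstream platforms.

```rust
use std::mem::{align_of, size_of};

// Hypothetical example type: C layout rules, fields placed in declaration order.
#[allow(dead_code)]
#[repr(C)]
struct DeclOrder {
    a: u8,  // offset 0, followed by 3 bytes of padding
    b: u32, // offset 4
    c: u8,  // offset 8, followed by 3 bytes of tail padding
}

// Same fields without an attribute: the compiler may pick any layout.
#[allow(dead_code)]
struct Unspecified {
    a: u8,
    b: u32,
    c: u8,
}

fn main() {
    println!(
        "repr(C): size {}, align {}",
        size_of::<DeclOrder>(),
        align_of::<DeclOrder>()
    );
    // The size printed here is whatever the compiler chose; it is not guaranteed.
    println!(
        "default: size {}, align {}",
        size_of::<Unspecified>(),
        align_of::<Unspecified>()
    );
}
```

Only the `repr(C)` numbers can be relied on; the second line of output may differ between compiler versions.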
I had the same thought, but then it occurred to me that the only people who will insist on such control over representation would be coming from a C background anyway. I do like the analogy to repr(C)
on enums, and it would be a shame to have repr(C)
and repr(fixed)
be aliases of each other.
I think #[repr(C)]
is the only thing that makes sense for producing a C-compatible struct. If #[repr(fixed)]
would produce something other than #[repr(C)]
then it might be worth having, but if it produces the exact same layout as #[repr(C)]
then it seems unnecessarily redundant.
Good points. repr(C)
sounds like the best bet.
I think the precedent of I misunderstood what the current If we had such an attribute, would it make sense to make it the default under this proposal? If not, what do you suspect the default would become? |
struct layouts, for example,
[the Grsecurity suite](http://grsecurity.net/) of security
enhancements to the Linux kernel provides
[`GRKERNSEC_RANDSTRUCT`](http://en.wikibooks.org/wiki/Grsecurity/Appendix/Grsecurity_and_PaX_Configuration_Options#Randomize_layout_of_sensitive_kernel_structures)
That's completely unnecessary if you're confident that you are memory safe, which (modulo compiler bugs) Rust can give you (except unsafe blocks).
IMO C-struct-compatibility is a major selling point of Rust, it should be the default.
If you're writing a kernel in Rust, you presumably aren't guaranteeing that all the programs your kernel runs are also written in Rust. To that end, being able to randomize fields sounds plausibly useful.
However, I would imagine it's probably better done by writing a custom item decorator that randomizes the field order (and places whatever #[repr()]
attribute is necessary to tell the compiler to use the declaration order). Which is to say, the kernel author can write the necessary extension, Rust doesn't need to provide it.
@o11c, it is also a major selling point of Rust to be highly efficient. IMO, this selling point is more important than C interop.
Of course, easy C interop will still be a selling point, but I suspect it is the wrong default. If we stick to our current default, every struct that is not used for C interop will pay the price of unoptimized representation, unless we add an annotation to the struct definition. This violates the notion of "pay for what you use". And given that the number of structs intended for C interop is relatively few, requiring this annotation for the majority of structs would be comparable to the burden of const-correctness in C++.
@kballard that is a good point. It's a trivial syntax extension: https://gist.github.com/huonw/be05427dc80e44f1a594
I'll remove randomisation as a reason for this RFC.
For reference, our current A problem with packing without reordering is some types & architectures strongly prefer certain alignments, meaning unaligning/packing things naively (by just removing the padding without reordering) can lead to unexpectedly poor performance.
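The concern about packing without reordering can be illustrated with a small sketch. It uses the present-day `#[repr(C, packed)]` spelling rather than the attribute syntax of the time, and the type and function names (`Padded`, `Packed`, `read_value`) are made up for illustration. Removing the padding leaves `value` at offset 1, so it has to be read with an unaligned load.

```rust
use std::mem::{align_of, size_of};
use std::ptr::addr_of;

// Hypothetical type with ordinary C layout: 3 bytes of padding after `tag`.
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    tag: u8,
    value: u32,
}

// Same fields with the padding removed; `value` now sits at offset 1.
#[allow(dead_code)]
#[repr(C, packed)]
struct Packed {
    tag: u8,
    value: u32,
}

// Reads the possibly misaligned field through a raw pointer.
fn read_value(p: &Packed) -> u32 {
    // Taking `&p.value` is rejected by the compiler, so go through `addr_of!`.
    unsafe { addr_of!(p.value).read_unaligned() }
}

fn main() {
    let p = Packed { tag: 1, value: 0xDEAD_BEEF };
    println!("padded: size {}, align {}", size_of::<Padded>(), align_of::<Padded>());
    println!("packed: size {}, align {}", size_of::<Packed>(), align_of::<Packed>());
    println!("value = {:#x}", read_value(&p));
}
```

Whether the unaligned load is slow depends on the architecture, which is the point made in the comment above.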
This is exactly what this RFC is discussing. :)
👍
I expect that the overwhelming majority of structures need not be FFI compatible, but it might be enlightening to canvass rust and servo to see how many structures would need I would also add discussion of the lint to the detailed design section, I would expect that to be mandatory with a change such as this.
While I like this idea in general, I'm worried about the third-party library corner case: I use a struct from a third-party lib and want to pass it to some C code. The struct isn't annotated with
@Valloric Seems like you'd have the same issue with using a third-party enum. What third-party library are you envisioning that provides a non-C-compatible struct that you find just happens to match the definition of a C struct? And if you are relying on this, what guarantee do you have that the third-party library won't reorder its struct fields (or add new fields)?
The way I'm reading the RFC, one of the reasons why not forcing a C-compatible layout is a good idea is security. Which I'm guessing means the struct layout can change from compilation to compilation (or rustc versions). So if I want to pass this to a different lib written in C++ that exposes a C interface for interacting with my Rust code, that just won't work. In other words, a non-C-compatible layout that possibly randomizes for security means I can't send a third-party Rust struct to a C interface I've exposed in a different lib. I think you're assuming the user might be trying to match some pre-existing C struct; that is not necessarily the case.
To elaborate on my example a bit more, imagine the following (quite likely) scenario: I'm part of a large team/org and I'm writing my code in Rust. There's a different team with a C++ legacy codebase that accepts inputs from other libraries. They'd like to provide me with a C interface I can call, but the main data struct I want to give them comes from a third-party Rust lib I've integrated and the struct isn't marked as
The problem I'm mentioning could be somewhat mitigated if rustc could have a flag that forces C-compatible layout even for structs that aren't marked with I can't see an easy way out of this "library composability" problem; this bothers me because I really like the proposal for both the security and perf benefits.
Isn't that a problem with traits too? You want to store a value in a
While an issue, that sounds more easily work-aroundable than "I just plain can't pass this to a non-Rust library".
Not really. Only in the context of sensitive data structures in the kernel. The real benefit is the compiler can reorder your struct to minimize the amount of padding necessary. As long as you don't have a dependency on the explicit layout of the struct (which for us means using it with C FFI), such a change should not affect the behavior of Rust code, except in that it will make the struct use less memory.
If the different lib is interacting with your Rust code, then your Rust code must be exposing an If you really need to vend a third-party library's struct through FFI, then you can write your own compatible struct, marked as
Nope, the workaround described here by kballard is pretty much the same process in each case. :)
Though @Valloric raises a similar point to what I was about to come in here and say, which is that I'm curious how this would potentially complicate efforts of other languages to call Rust code natively (that is, via a dedicated Rust FFI rather than routing through C).
@bstrie For a Rust FFI, I would assume the actual layout of the struct has to be defined somewhere, otherwise how can any Rust code use a struct defined in another library? Which is to say, either the rule for reordering fields has to be fixed, or the chosen field order has to be stored in the crate metadata (which seems like the more sensible approach). Any third-party language attempting to interact with Rust via a Rust FFI would naturally need to be able to read crate metadata, so it can find the field order.
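The workaround of writing your own compatible struct might look roughly like the sketch below. Everything here is hypothetical: `Measurement` stands in for a struct owned by a third-party crate, `CMeasurement` is the local mirror with a fixed layout, and `measurement_value` stands in for whatever function crosses the FFI boundary.

```rust
// Stand-in for a struct defined in a third-party crate (hypothetical).
// It has no `repr` attribute, so its layout is unspecified.
pub struct Measurement {
    pub id: u8,
    pub value: f64,
    pub flags: u16,
}

// Locally defined mirror with C layout rules; this is what crosses the FFI boundary.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct CMeasurement {
    pub id: u8,
    pub value: f64,
    pub flags: u16,
}

impl From<&Measurement> for CMeasurement {
    // Field-by-field copy, so no assumption is made about the source layout.
    fn from(m: &Measurement) -> Self {
        CMeasurement { id: m.id, value: m.value, flags: m.flags }
    }
}

// Hypothetical function with the C ABI that only ever sees the mirror type.
pub extern "C" fn measurement_value(m: CMeasurement) -> f64 {
    m.value
}

fn main() {
    let original = Measurement { id: 7, value: 2.5, flags: 0x10 };
    let mirror = CMeasurement::from(&original);
    println!("{:?} -> {}", mirror, measurement_value(mirror));
}
```

The cost is one copy per conversion, plus keeping the mirror in sync if the upstream struct gains or loses fields.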
The only real benefit is optimisation,

```rust
#![feature(simd)]
#[simd]
struct Vec2f(f32, f32);
struct Box {
    bone: int,
    offset: f32x4,
    size: f32x4,
    texture_offset: Vec2f,
    children: uint,
}
```
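The size saving that reordering can buy is easiest to see by doing the reordering by hand. The sketch below uses two hypothetical `repr(C)` structs with the same fields in different orders, so both sizes are predictable. It assumes the usual alignments of 4 bytes for `u32` and 2 bytes for `u16`.

```rust
use std::mem::size_of;

// Fields in an order that forces padding (hypothetical example).
#[allow(dead_code)]
#[repr(C)]
struct AsWritten {
    a: u8,  // offset 0, then 3 bytes of padding
    b: u32, // offset 4
    c: u8,  // offset 8, then 1 byte of padding
    d: u16, // offset 10
}

// Same fields sorted by decreasing alignment: no padding is needed.
#[allow(dead_code)]
#[repr(C)]
struct Sorted {
    b: u32, // offset 0
    d: u16, // offset 4
    a: u8,  // offset 6
    c: u8,  // offset 7
}

fn main() {
    println!("as written: {} bytes", size_of::<AsWritten>());
    println!("sorted:     {} bytes", size_of::<Sorted>());
}
```

With an unspecified default layout, the compiler is free to perform this kind of rearrangement on its own.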
I've removed security as a motivating factor (in light of field randomisation being possible with a simple syntax extension), mentioned the lint in more detail and written up the upstream-FFI struct problem under Drawbacks. |
how about changing #[packed] to mean aligned,reordered packing; add #[unpadded] to do what 'packed' does now, and keep the default as it is - |
What |
When I first started messing with such things, I always thought it silly that in GCC,
Besides, since Adding an I would like to see:
|
# Unresolved questions

- How does this interact with binary compatibility of dynamic libraries?
- Should the lint apply to C-compatible functions defined in Rust like
Yes, it should.
+1. This is a way better idea than a pervasive default. I care a lot about the compiler maintaining the order I declare my structs in, especially when I'm touching multithreaded code--which is most of my code these days. #[repr(fluid)]
might be a better alternative, allowing the compiler to optimize how it sees fit, e.g., alignment when dealing with alignment sensitive platforms. And don't get me started on serialization with this...
I agree with @o11c, there's enough precedent that |
@bstrie An acceptable compromise. |
DST probably gets in the way of some of these optimisations, since a type like |
Yes, I kept forgetting to mention the DST restriction, but that doesn't mean that optimizations would be wasted on (what I presume to be) the majority of structs that don't contain a DST, nor would it mean that DST-containing structs would be completely unable to benefit from reordering. |
Accepted as RFC 18. |
Is it acceptable for an RFC to be as vague as this one? Even if I support the sentiment, there's no indication of any concrete design, so whose job is it to come up with one? Where does that discussion happen? |
@bstrie: The important part is making it unspecified by default, and then fiddling with optimizations based on it being unspecified is just internal performance-related work where an RFC isn't usually required. |
I gathered some information about structs defined in Rust's crates.
edit: stopped counting tuples and enums, but my instrumentation still seems unrealistic. |
@pczarn, I'd be curious to know more about how your numbers are gathered. |
@bstrie, total padding is the difference between the size of a struct and the sum of the sizes of its fields. I couldn't get these numbers from (possibly generic) struct definitions in

```diff
diff --git a/src/librustc/middle/trans/adt.rs b/src/librustc/middle/trans/adt.rs
index 9cea6d0..f75c298 100644
--- a/src/librustc/middle/trans/adt.rs
+++ b/src/librustc/middle/trans/adt.rs
@@ -161,6 +161,13 @@ fn represent_type_uncached(cx: &CrateContext, t: ty::t) -> Repr {
             let dtor = ty::ty_dtor(cx.tcx(), def_id).has_drop_flag();
             if dtor { ftys.push(ty::mk_bool()); }
+            let lltys = ftys.iter().map(|&ty| type_of::sizing_type_of(cx, ty)).collect::<Vec<_>>();
+            let llty_rec = Type::struct_(cx, lltys.as_slice(), packed);
+            println!("struct has size {}: {}",
+                machine::llsize_of_alloc(cx, llty_rec),
+                lltys.iter().map(|&typ| machine::llsize_of_alloc(cx, typ)).collect::<Vec<u64>>()
+            );
+
             return Univariant(mk_struct(cx, ftys.as_slice(), packed), dtor)
         }
         ty::ty_enum(def_id, ref substs) => {
```

(a single line taken from rustc's output:
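The padding measure described in this comment (struct size minus the sum of the field sizes) can be reproduced outside the compiler. This is a minimal sketch, not the instrumentation used for the gathered data. `Sample` and `total_padding` are hypothetical names, and the trailing `bool` merely imitates the drop flag pushed in the patch.

```rust
use std::mem::size_of;

// Hypothetical struct; the trailing bool imitates the drop flag from the patch.
#[allow(dead_code)]
#[repr(C)]
struct Sample {
    flag: bool,
    count: u32,
    drop_flag: bool,
}

// Total padding = size of the struct minus the sum of the sizes of its fields.
fn total_padding(struct_size: usize, field_sizes: &[usize]) -> usize {
    struct_size - field_sizes.iter().sum::<usize>()
}

fn main() {
    let fields = [size_of::<bool>(), size_of::<u32>(), size_of::<bool>()];
    let size = size_of::<Sample>();
    println!(
        "struct has size {}: {:?}, padding {}",
        size,
        fields,
        total_padding(size, &fields)
    );
}
```

This counts only the padding of the outer struct; padding inside the fields themselves would need a recursive walk.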
Is this transitive? That is, does it count the padding that the fields of
@cmr, no, it doesn't care whether any of the fields themselves are packed or not if that's what you mean. However, it could be modified to count the deep padding easily. Should it count the drop flag? The numbers are quite different without that additional bool.
Yeah, the drop flag should definitely be counted.
(And the "deep" padding too, since that will matter)
The data is here: https://gist.github.com/pczarn/532a692f105208fcb428 I think I found a way to get rid of duplicates. It's transitive and probably accurate. It reports a few large structs. They are defined in The next step is calculating the benefits of layout optimization. |
The following structs need C compatibility. |
The RFC talks about the order of fields. Is it safe to assume that it therefore does not affect structs with a single field? In particular, is it safe to transmute between |
Yes, I think that is a sane thing to guarantee. Should clarify that in the RFC. |
@cmr I am almost certain that some of the other developers have asserted the opposite. That is, that it wouldn't be guaranteed. |
It must be guaranteed, since one can take references to the contained struct and all such references need to have the same layout. |
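For readers on current Rust, the single-field question has an explicit answer in the form of `#[repr(transparent)]`, an attribute added to the language well after this discussion and not part of this RFC. The sketch below uses a hypothetical wrapper `Meters`. It shows the transmute being discussed, made sound by that attribute rather than by any guarantee about the default layout.

```rust
use std::mem::{align_of, size_of, transmute};

// Hypothetical single-field wrapper. `repr(transparent)` guarantees that it has
// the same layout as its only field.
#[repr(transparent)]
#[derive(Debug, PartialEq)]
struct Meters(f64);

// Converts by transmute; sound here only because of `repr(transparent)`.
fn to_meters(raw: f64) -> Meters {
    unsafe { transmute::<f64, Meters>(raw) }
}

// A reference to the inner field, as mentioned in the comment above.
fn as_inner(m: &Meters) -> &f64 {
    &m.0
}

fn main() {
    let m = to_meters(3.5);
    println!("{:?}, inner = {}", m, as_inner(&m));
    println!(
        "size {} vs {}, align {} vs {}",
        size_of::<Meters>(),
        size_of::<f64>(),
        align_of::<Meters>(),
        align_of::<f64>()
    );
}
```

Without the attribute, the sizes usually still match in practice, but that is an observation about current compilers and not a documented guarantee.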