FFI and union #5492

Closed
sanxiyn opened this issue Mar 22, 2013 · 28 comments
Labels
A-FFI Area: Foreign function interface (FFI) P-low Low priority

Comments

@sanxiyn
Member

sanxiyn commented Mar 22, 2013

How would one call C functions involving union with Rust FFI?

SpiderMonkey's jsval is one example.

@thestinger
Contributor

There could be unsafe enum with the layout defined to be the same as C for interoperability. The only other way to deal with it would be finding the alignof and sizeof of the union in C for each platform and then translating that to Rust.

@sanxiyn
Member Author

sanxiyn commented Apr 24, 2013

Referencing Aatch/rust-xcb#2.

@yichoi
Contributor

yichoi commented Apr 25, 2013

referencing servo/servo#398

referencing servo/rust-mozjs#9

@Aatch
Contributor

Aatch commented May 9, 2013

The unsafe enum idea appeals to me, since I thought about it as an option when trying to solve the union issue in rust-xcb, but decided that relying on the representation of enums was too "hacky" and fragile.

@pnkfelix
Member

pnkfelix commented Jul 1, 2013

brson mentions in the description for #6346 that a "macro based solution" would be appropriate here, though I do not currently know what that would entail. (It sounds to me like a potential alternative to the grammar changes for adding unsafe enum that have been discussed here.)

@pnkfelix
Member

pnkfelix commented Jul 1, 2013

Nominating for milestone 3, feature complete.

@emberian
Member

I don't think a "macro-based solution" would be appropriate, as you need to restrict the valid range of values at the site of usage, which macros cannot do.

@graydon
Contributor

graydon commented Aug 8, 2013

An attribute on an enum that makes it have no discriminant and makes any match on the variant-part succeed, should be sufficient. Not pretty but neither are C union semantics.

@graydon
Contributor

graydon commented Aug 8, 2013

accepted for feature-complete milestone

@Skrylar

Skrylar commented Dec 13, 2013

I ran into this problem recently as well; Allegro uses unions for passing events around in C, which turns out to be a pain to deal with in Rust.

@pnkfelix
Member

We do want to solve this problem eventually, but it need not block 1.0. Assigning P-low.

@alxkolm

alxkolm commented Nov 3, 2014

What's the status of this?

@alexchandel

What's the recommended way to do FFI-compatible unions?

@jdm
Contributor

jdm commented Jan 21, 2015

I believe structs containing a field which is at least as big as the largest type the union can represent and manual transmutes is the state of the art right now.

@mzabaluev
Contributor

I believe structs containing a field which is at least as big as the largest type the union can represent and manual transmutes is the state of the art right now.

Make sure you get the alignments right. The struct should have #[repr(C)], and the field posing as the union (or the inner type, in case the newtype struct emulates the union itself) must have the alignment of the most-aligned variant.

@alexchandel

@jdm Even when variants are different sizes? transmute fails to compile when T and U have different sizes, and transmute_copy is just as dangerous since it copies size_of::<U>() bytes, triggering undefined behavior when U is larger than T.
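One way around the size mismatch is to read the variant through a pointer cast instead of transmuting the whole storage. The sketch below is only an illustration: `read_variant` is a hypothetical helper name, not something from std, and it assumes the caller knows which variant is currently stored.

```rust
use std::mem::{align_of, size_of};
use std::ptr;

/// Reads a `T` from the first bytes of `storage`.
/// `read_variant` is a hypothetical helper, not a std function.
///
/// # Safety
/// The leading `size_of::<T>()` bytes of `storage` must be initialized
/// and must form a valid value of type `T`.
pub unsafe fn read_variant<S, T: Copy>(storage: &S) -> T {
    // A variant can never be larger or more aligned than its storage.
    assert!(size_of::<T>() <= size_of::<S>(), "variant larger than storage");
    assert!(align_of::<T>() <= align_of::<S>(), "variant more aligned than storage");
    // Only size_of::<T>() bytes are read, so a smaller variant is fine.
    unsafe { ptr::read(storage as *const S as *const T) }
}
```

Unlike transmute_copy in the "too large" direction, this refuses to read past the end of the storage, at the cost of two runtime checks.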

@mzabaluev
Contributor

Also, the overall size of the union is a multiple of the alignment of its most-aligned variant. This union has a size of 8:

union A {
    int32_t intval;
    char chars[5];
};

Which would require a Rust representation like:

#[repr(C)]
struct A {
    union_data: [i32; 2]
}

So yes, representing unions is not for the unwary.
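A rough sketch of how the [i32; 2] representation above could be checked and accessed without any transmute. The accessor names and the choice of u8 for C's char are assumptions made for illustration, and the expected 8/4 layout assumes a platform where int32_t is 4-byte aligned.

```rust
use std::mem::{align_of, size_of};

/// Stand-in for `union A { int32_t intval; char chars[5]; }`.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct A {
    union_data: [i32; 2],
}

impl A {
    /// Builds the storage from the `intval` view; the rest is zeroed.
    pub fn from_intval(v: i32) -> A {
        A { union_data: [v, 0] }
    }

    /// The `intval` view: the first four bytes.
    pub fn intval(&self) -> i32 {
        self.union_data[0]
    }

    /// The `chars` view: the first five bytes, in memory order.
    pub fn chars(&self) -> [u8; 5] {
        let lo = self.union_data[0].to_ne_bytes();
        let hi = self.union_data[1].to_ne_bytes();
        [lo[0], lo[1], lo[2], lo[3], hi[0]]
    }
}

/// Returns (size, alignment) so the layout can be compared against C.
pub fn layout_of_a() -> (usize, usize) {
    (size_of::<A>(), align_of::<A>())
}
```

Comparing `layout_of_a()` against sizeof and alignof from the C side on each target is the part that still has to be done by hand.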

@alexchandel

@mzabaluev For a C union like this:

struct INPUT {
  DWORD type;
  union {
    MOUSEINPUT    mi;
    KEYBDINPUT    ki;
    HARDWAREINPUT hi;
  };
};

I use a struct field rather than bytes. It's easier because the size and alignment change between platforms, and you can't write [u8; size_of::<MOUSEINPUT>()].

#[repr(C)]
pub struct MOUSEINPUT { ... }
#[repr(C)]
pub struct KEYBDINPUT { ... }
#[repr(C)]
pub struct HARDWAREINPUT { ... }

#[repr(C)]
pub struct INPUT {
    pub tag_: DWORD,
    pub union_: MOUSEINPUT, // MOUSEINPUT largest and most aligned
}

@mzabaluev
Contributor

@alexchandel Good when it works, but sometimes the largest variant is not the most aligned, like in my example above.
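For the case where the largest variant is not the most aligned one, a commonly used workaround is a byte array for the size plus a zero-length array that only contributes alignment. The sketch below assumes a hypothetical C union `union { double d; char name[12]; }` and a target where double is 8-byte aligned; the struct and field names are made up for illustration.

```rust
use std::mem::{align_of, size_of};

/// Stand-in for the hypothetical `union { double d; char name[12]; }`.
/// `name` is the largest variant (12 bytes), `d` the most aligned one.
#[repr(C)]
pub struct NameOrDouble {
    // Zero bytes of storage, but forces the struct to f64 alignment.
    _align: [f64; 0],
    // 12 bytes rounded up to a multiple of 8 (assumes 8-byte f64 alignment).
    data: [u8; 16],
}

/// Returns (size, alignment) for comparison against the C side.
pub fn layout() -> (usize, usize) {
    (size_of::<NameOrDouble>(), align_of::<NameOrDouble>())
}
```

The rounded-up size of 16 is still hard-coded here, so it has to be revisited for any target where the alignment of double differs.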

@niconiconico9

Is there a reason why this bug is tagged as "P-low"?
The alternatives proposed here, and I guess currently used, require great care to handle alignment properly.
The last example of how this can be worked around without any language addition is a perfect example of how the language encourages writing incorrect code because it doesn't provide a proper solution.

@ghost

ghost commented Aug 24, 2015

I don't know how feasible it would be to implement, but an example usage could be:

#[repr(union)]
pub struct XEvent {
  pub type_: c_int,
  pub xany: XAnyEvent,
  // ...
  pub pad: [c_long; 24],
}

Like C unions, each field would start at the beginning of the struct, and the size of the struct would be that of its longest field. This wouldn't require adding union as a language keyword. The only limitation I can think of would be that accessing a field in the union would require unsafe, which is already used often when interfacing with C libraries.

A macro based solution could look something like:

union! {
  pub union XEvent {
    pub type_: c_int,
    pub xany: XAnyEvent,
    // ...
    pub pad: [c_long; 24],
  }
}

// functions generated by macro:
impl XEvent {
  pub unsafe fn type_<'a> (&'a self) -> &'a c_int { ::std::mem::transmute(self) }
  pub unsafe fn type__mut<'a> (&'a mut self) -> &'a mut c_int { ::std::mem::transmute(self) }
  pub unsafe fn xany<'a> (&'a self) -> &'a XAnyEvent { ::std::mem::transmute(self) }
  pub unsafe fn xany_mut<'a> (&'a mut self) -> &'a mut XAnyEvent { ::std::mem::transmute(self) }
  // ...
  pub unsafe fn pad<'a> (&'a self) -> &'a [c_long; 24] { ::std::mem::transmute(self) }
  pub unsafe fn pad_mut<'a> (&'a mut self) -> &'a mut [c_long; 24] { ::std::mem::transmute(self) }
}

The only thing that prevented me from writing this macro is the inability to determine the size of the union at compile time. The best workaround I could come up with is providing a guess of the size of the largest field and making the macro generate tests to verify this.

union! {
  pub union XEvent : [c_long; 24] {
    pub type_: c_int,
    pub xany: XAnyEvent,
    // ...
    pub pad: [c_long; 24],
  }
}

// test generated by macro:
#[test]
fn test_union_size_XEvent () {
  use std::cmp::max;
  use std::mem::size_of;
  let sizes = [
    size_of::<c_int>(),
    size_of::<XAnyEvent>(),
    // ...
    size_of::<[c_long; 24]>(),
  ];
  assert!(sizes.iter().fold(0, |a, b| max(a, *b)) == size_of::<[c_long; 24]>());
}

Of course, it would be much easier on developers of language bindings to have unions available as a language feature.
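On newer compilers the size check does not have to wait for a test run: size_of can be evaluated in a constant, and a failing assert! in a const item is a compile error. The sketch below assumes a compiler recent enough for const assertions; `XAnyEventStub` is a simplified stand-in invented for illustration, not the real Xlib layout.

```rust
use std::mem::size_of;
use std::os::raw::{c_int, c_long};

/// Hypothetical, simplified stand-in for XAnyEvent (not the real layout).
#[repr(C)]
#[derive(Clone, Copy)]
pub struct XAnyEventStub {
    pub type_: c_int,
    pub serial: c_long,
}

/// The guessed storage type, as in the macro invocation above.
pub type Storage = [c_long; 24];

/// Compile-time maximum of two sizes.
const fn max_usize(a: usize, b: usize) -> usize {
    if a > b { a } else { b }
}

/// Size of the largest variant, computed at compile time.
pub const LARGEST: usize = max_usize(
    size_of::<c_int>(),
    max_usize(size_of::<XAnyEventStub>(), size_of::<Storage>()),
);

// Compilation fails here if the guess is not the largest variant.
const _: () = assert!(LARGEST == size_of::<Storage>());
```

This only checks size; alignment would need a matching assertion with align_of.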

@retep998
Member

winapi would benefit massively from unions as part of the core language. I currently use a macro to make do, but it's just not the same.

@joshtriplett
Member

I'm interested in unions as well, for several Linux kernel APIs. The proposal of having an "unsafe union", guaranteed to match the C layout, would work perfectly; almost any non-trivial instance of such a C union only makes sense to access in an unsafe block, given its trivial equivalence to the unsafe std::mem::transmute.

@serprex

serprex commented Oct 25, 2015

Most unions in C have a descriptor field, therefore there's a need for 2 cases (has-descriptor & has-no-descriptor). Being able to specify a struct-unique enum with a custom type descriptor & the fields' corresponding values would allow Rust to use the union in a type-safe manner while being able to interoperate with C APIs.

Essentially something like

#[enum_explicit_descriptor(t)]
#[enum_explicit_values = "I: 0, N: 1"]
unsafe struct TValue{
  t: u8,
  val: unsafe enum IntOrFloat{
    I(i32),
    N(f32),
  },
}

Using unsafe struct to handle cases where the type descriptor isn't adjacent to the union. Even then, something could be done like

#[enum_explicit_descriptor_type(u8)]
#[enum_explicit_descriptor_typeoffset(-1)] // This could be behind-the-struct by default
#[enum_explicit_values = "I: 0, N: 1"]
enum IntOrFloat{
  I(i32),
  N(f32),
}

Then there'd need to be compile-time machinery that makes sure there's a valid u8 behind the enum in definitions, though user code would access a struct TValue{ t:u8, val: IntOrFloat }

The issue of having typeoffset could be resolved by requiring explicit enums only be contained in structs & have enum_explicit_layout_typeoffset be specified by the struct. Would require a bit more strictness though since one wouldn't be able to know how to find the descriptor of an &IntOrFloat parameter

@mzabaluev
Contributor

@serprex: I don't think it's worthwhile to add language support for external descriptors of unions, even in cases where there is a 1:1 match between a single descriptor field value and a union variant. The code using unions is expected to be close to FFI, where unsafe is the norm; so variant matching can be always unsafe, and the burden of ensuring the correct variant would be completely on the programmer, as it is in C.

@joshtriplett
Member

@mzabaluev I agree. For a first pass, at least, we just need an unsafe construct to access fields of a C union in a C-compatible, interoperable way. We can always produce a safe wrapper around that, and even produce macros to generate such wrappers for common cases.
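A minimal sketch of what such a safe wrapper might look like for the TValue example above, assuming a `#[repr(C)] union` item like the one Rust stabilized after this discussion. The `Value` enum, the constructors and `get` are hypothetical names; the tag values 0 and 1 follow the "I: 0, N: 1" mapping from the earlier comment.

```rust
/// The untagged part: both fields share the same four bytes.
#[repr(C)]
#[derive(Clone, Copy)]
pub union IntOrFloat {
    i: i32,
    n: f32,
}

/// C-compatible pair of descriptor and union.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct TValue {
    t: u8,
    val: IntOrFloat,
}

/// Safe view handed to Rust callers (hypothetical name).
#[derive(Debug, PartialEq)]
pub enum Value {
    I(i32),
    N(f32),
}

impl TValue {
    pub fn from_int(i: i32) -> TValue {
        TValue { t: 0, val: IntOrFloat { i } }
    }

    pub fn from_float(n: f32) -> TValue {
        TValue { t: 1, val: IntOrFloat { n } }
    }

    /// Picks the union field named by the descriptor; None for unknown tags.
    pub fn get(&self) -> Option<Value> {
        match self.t {
            // SAFETY: both fields are four fully initialized bytes, and
            // every bit pattern is a valid i32 and a valid f32.
            0 => Some(Value::I(unsafe { self.val.i })),
            1 => Some(Value::N(unsafe { self.val.n })),
            _ => None,
        }
    }
}
```

The unsafe reads stay inside `get`, so callers only ever see the `Value` enum.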

@joshtriplett
Member

I posted a preliminary proposal using #[repr(C,union)] struct { ... } (requiring unsafe blocks for field accesses, assignments, or initializations) to https://internals.rust-lang.org/t/pre-rfc-unsafe-enums/2873/23.
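For readers arriving later: the syntax that eventually shipped differs from this proposal. Rust gained a `union` item, and with #[repr(C)] it follows the C layout. A small sketch using the union A example from earlier in the thread, with u8 standing in for C's char (an assumption) and `first_four_as_int` as a hypothetical helper name:

```rust
/// Rust counterpart of `union A { int32_t intval; char chars[5]; }`.
#[repr(C)]
#[derive(Clone, Copy)]
pub union A {
    pub intval: i32,
    pub chars: [u8; 5],
}

/// Writes the `chars` variant, then reads the bytes back as `intval`.
pub fn first_four_as_int(bytes: [u8; 5]) -> i32 {
    // Initializing a union through one field is safe.
    let a = A { chars: bytes };
    // SAFETY: the first four bytes were just initialized through `chars`,
    // and any four initialized bytes form a valid i32.
    unsafe { a.intval }
}
```

Reading a field is the unsafe part, since the compiler cannot know which variant was last written.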

@huonw
Member

huonw commented Jan 5, 2016

Closing in favour of rust-lang/rfcs#877.

@huonw huonw closed this as completed Jan 5, 2016
flip1995 pushed a commit to flip1995/rust that referenced this issue May 17, 2020
…lint, r=phansch

Improve `option_and_then_some` lint

fixed rust-lang#5492

changelog: Improve and generalize `option_and_then_some` and rename it to `bind_instead_of_map`.