Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement untagged unions (RFC 1444) #36016

Merged
merged 22 commits into from
Sep 3, 2016
Merged

Conversation

petrochenkov
Copy link
Contributor

@petrochenkov petrochenkov commented Aug 26, 2016

cc #32836

Notes:

  • The RFC doesn't talk about #[packed] unions, this implementation supports them, packing changes union's alignment to 1 and removes trailing padding.

  • The RFC doesn't talk about dynamically sized unions, this implementation doesn't support them and rejects them during wf-checking (similarly, dynamically sized enums are not supported as well).

  • The lint for drop fields in unions can't work precisely before monomorphization, so it works pessimistically - non-Copy generic fields are reported, types not implementing Drop directly, but having non-trivial drop code are reported.

    struct S(String); // Doesn't implement `Drop`
    union U<T> {
        a: S, // Reported
        b: T, // Reported
    }
    
  • Remove the old AST-based backend from rustc_trans. #35764 was indeed helpful and landed timely, I didn't have to implement internal drop flags for unions.

  • Unions are not permitted in constant patterns, because matching on union fields is unsafe, I didn't want unsafety checker to dig into all constants to uncover this possible unsafety.

  • The RFC doesn't talk about #[derive], generally trait impls cannot be derived for unions, but some of them can. I implemented only #[derive(Copy)] so far. In theory shallow #[derive(Clone)] can be derived as well if all union fields are Copy, I left it for later though, it requires changing how Clone impls are generated.

  • Moving union fields is implemented as per Untagged unions (tracking issue for RFC 1444) #32836 (comment).

  • Testing strategy: union specific behavior is tested, sometimes very basically (e.g. debuginfo), behavior common for all ADTs (e.g. something like coherence
    checks) is not generally tested.

r? @eddyb

@petrochenkov
Copy link
Contributor Author

petrochenkov commented Aug 26, 2016

Actually, I'd like to make the implementation a bit more conservative, than written in the RFC.
Namely, to replace the unions_with_drop_fields lint with a hard error and require user-defined Drop implementation for unions with potentially non-trivially destructible fields (i.e. the same thing C++ does).

Effectively, #[allow(unions_with_drop_fields)] is replaced with impl Drop for U { fn drop(&mut self) {} } in code wanting to use such unions.

@@ -181,6 +181,9 @@ impl<'a, 'gcx, 'tcx> Builder<'a, 'gcx, 'tcx> {
ExprKind::Adt {
adt_def, variant_index, substs, fields, base
} => { // see (*) above
let is_union = adt_def.adt_kind() == ty::AdtKind::Union;
let active_field_index = if is_union { Some(fields[0].name.index()) } else { None };

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be a better idea to use direct assignment here instead of an Rvalue - that would remove the need to change AggregateKind::Adt - alternatively, an Union AggregateKind could be added.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alternatively, an Union AggregateKind could be added

IIRC, I tried this and didn't like the result (too much common boilerplate code for Adt and Union)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The field assignment approach might be best then.

Copy link
Contributor Author

@petrochenkov petrochenkov Aug 26, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what it means exactly. Any example?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of lvalue = union { f: operand } you'd have lvalue.f = operand.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realized that could make { let u: Union; u.f = x; u } pass a future MIR move checker - do we want that?

Copy link
Contributor Author

@petrochenkov petrochenkov Aug 27, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Future move checker with full support for fragments should accept both { let u: Union; u.f = x; u } and { let s: StructWith1Field; s.f = x; s } I guess, see the last item in #36016 (comment) and "# Future work" in https://github.com/rust-lang/rust/blob/master/src/librustc_borrowck/borrowck/README.md.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, then the field approach is definitely correct - except for custom Drop impls, which require whole-value initialization "to be armed", what's the plan there?

Copy link
Contributor

@nikomatsakis nikomatsakis Aug 31, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, then the field approach is definitely correct - except for custom Drop impls, which require whole-value initialization "to be armed", what's the plan there?

There is needs to be special treatment for types that implement drop, yes. The idea is that you can't have such a type in a "partially initialized" state, so you can't move out from its fields, nor can you write to a field if the type itself is not initialized.

@eddyb
Copy link
Member

eddyb commented Aug 26, 2016

LGTM, great job!
I'm a bit sad about adding TyUnion and supporting trans::adt but I think it's easier to land this PR as-is and punt on those long-standing duplication issues.

r? @nikomatsakis for the borrowck bits & miscellaneous other details.

@rust-highfive rust-highfive assigned nikomatsakis and unassigned eddyb Aug 26, 2016
@petrochenkov
Copy link
Contributor Author

petrochenkov commented Aug 26, 2016

I think it's easier to land this PR as-is and punt on those long-standing duplication issues.

That would be nice, this patch set bitrots relatively quickly.
Improvements/refactorings in trans can be made together with refactoring remains of the old trans (also, I don't have enough expertise in the area to do polishing).

@@ -178,11 +177,28 @@ impl<'a, 'tcx, 'v> Visitor<'v> for EffectCheckVisitor<'a, 'tcx> {
self.require_unsafe(expr.span, "use of mutable static");
}
}
hir::ExprField(ref base_expr, field) => {
if let ty::TyUnion(..) = self.tcx.expr_ty_adjusted(base_expr).sty {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I feel like it'd be nice to match the sty (exhaustively) and have a bug! call for the unexpected cases. This makes me mildly nervous that some bug or addition might slip through.

// Unions
//=-----------------------------------------------------------------------------

struct UnionMemberDescriptionFactory<'tcx> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @michaelwoerister -- does this file look roughly right? :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good to me!

@nikomatsakis
Copy link
Contributor

OK, did a review. Here are my thoughts:

  • I was skeptical at first, but adding the Option<usize> to the MIR ADT does seem to work well. And I remember now that I was thinking of union as more of a repr attribute on struct, and that approach seems to fit with that interpretation (i.e., a struct where all fields are at offset 0). So I am happy with it.
  • The borrowck integration is not the way I imagined doing it. I had imagined that instead of adding a loan for every field whenever one is borrowed, we would instead change the equality metric so that if you accessed field.x and field.y is borrowed where x and y potentially overlap, we would error out.
    • That said, this way is not bad.
    • But I am unhappy about the dummy types and changing the def'n of Eq and so forth. I always think that a custom Eq is a sign of subtle logic and this seems to be no exception. Now you have to track which LoanPath instances have "real" types and which don't.
    • But I do plan to be rewriting this to operate over MIR and maybe we can address it then.
    • Have to think about this just a bit.
  • The test cases are looking pretty good. As has been my practice for big PRs like this, I am going to go through the RFC and try to make a list of important points in there. I'll concatenate that with a list of tests I wrote down while skimming through the code and post it to make sure we hit all the important points. Many of them you already covered in any case.
  • I agree with @eddyb that it'd be better to have TyAdt but also that this would be better done in a follow-up PR.

@petrochenkov
Copy link
Contributor Author

@nikomatsakis

And I remember now that I was thinking of union as more of a repr attribute on struct

This is exactly how it turned out to be from implementation point of view. Unions tend to stay close to their syntactic surface representation and therefore treated in the same way as structs by most of compiler passes except for layout calculation and borrow/move checking.
I can imagine if union's syntax were modeled after enums, all the implementation would tend to enums instead.

@nikomatsakis
Copy link
Contributor

OK here is my master list of tests. Feel free to check-off ones that you feel you already have (I have already done so in some cases, but I know you've got a lot of these covered already).

  • test defining and calling an inherent method on a union type
  • test suggestions when accessing a union field but different field with similar name was intended (e.g., user did x.principle instead of x.principal).
  • test suggestions when accessing a union field but method was intended (e.g., user wrote x.calculate instead of x.calculate()
  • test semantics of assigning directly to a field whose type implements drop
  • test semantics of assigning to an enum which implements drop
    • enum dtor runs, fields should not be dropped
  • test dropping an enum which has a custom dtor by doing out of scope
    • fields should not be dropped
  • test dropping an enum which does not have a custom dtor by doing out of scope
    • fields should not be dropped
  • test pattern matching using 1 field, as intended
  • test pattern matching using 0 fields is error
  • test pattern matching using >1 field is error
  • test pattern matching using .. is error
  • test union containing unsized types, both in tail and not in tail position
  • test repr(C) applied to union and then passed to C code
  • test that "c type" lint warns if union does not have repr(C)
  • test that you can implement Copy even if all fields do not implement Copy
  • test borrowck interactions
    • borrow a, write to b
    • borrow a mutably, borrow b shared
    • borrow a mutably, borrow b mutably
    • move from a etc
  • test rustdoc integration
  • test that writing field is unsafe
  • test that reading field is unsafe
  • test that pattern matching union is unsafe
  • test that constructing a union is not unsafe (though honestly...well, wev)
  • test that union fields can be declared pub and then used outside of declaring module
  • test that private union fields cannot be used outside of declaring module
    • extra credit: test pub(restricted) :)
  • test that feature gate is required to declare a union type
  • test generic unions:
    • run-pass with generic type parameters being used in various ways
    • run-pass test generic union w/ field containing a projection (union Foo<T: Iterator> { elem: T::Item })
    • compile-fail with a where-clause that is not supposed (e.g., union Foo<T: Copy>, check that Foo<Rc<u32>> is an error to use)
  • test unions cannot be used in constant patterns (this is not in RFC, but seems good)
  • packed union size/alignment tests
  • test instantiating a union with 0 or >1 fields yields error
  • test lint warning if union has a Drop field
  • test implementing some non-special trait for a union type and calling methods on it
  • test size/alignment is as expected
  • test we can implement Copy on unions that have fields that are not Copy
    • we may want to lint on this too though? Seems good.
  • test C interop (i.e., declare a #[repr(C)] union with a non-trivial layout and pass it to some C code)

Here are some tests we could add, except that I think the impl will not work quite the way I would want (and I'm ok with that):

  • test that derive(Copy, Clone) works (or open a FIXME and leave a note on the tracking issue)
  • test deferred union initialization
    • the rfc says let x: Union; x.f = 22; use(x); should work; however, this doesn't work right for structs afaik, so I think it's ok if it doesn't for now

@nikomatsakis
Copy link
Contributor

One thing I thought of while doing this: we probably want to consider the borrowck error messages. i.e., I suspect they could give better advice (in general we try to point out when a restriction on path x is due to a borrow of some other path y). But we can leave this for later.

@petrochenkov
Copy link
Contributor Author

petrochenkov commented Aug 26, 2016

The borrowck integration is not the way I imagined doing it.

What I wrote is just the simplest way to implement it :D
But it's also the simplest way to explain/document the behavior - "if one union's field is borrowed/moved/assigned then all its other fields are borrowed/moved/assigned as well".
(It also feels like the right model in general.)

But I am unhappy about the dummy types and changing the def'n of Eq and so forth.

That assert in PartialEq had to be removed anyway (see #36004), types can be slightly different even without unions.

@nikomatsakis
Copy link
Contributor

Having thought about the borrowck -- I think I'm inclined to land it with this approach and consider patching it up and refactoring it later. This PR is too good to stall for long. :)

@nikomatsakis
Copy link
Contributor

r=me once tests are set.

@petrochenkov
Copy link
Contributor Author

@bors r=nikomatsakis

@bors
Copy link
Contributor

bors commented Sep 3, 2016

📌 Commit 436cfe5 has been approved by nikomatsakis

@bors
Copy link
Contributor

bors commented Sep 3, 2016

⌛ Testing commit 436cfe5 with merge d748fa6...

bors added a commit that referenced this pull request Sep 3, 2016
Implement untagged unions (RFC 1444)

cc #32836

Notes:
- The RFC doesn't talk about `#[packed]` unions, this implementation supports them, packing changes union's alignment to 1 and removes trailing padding.
- The RFC doesn't talk about dynamically sized unions, this implementation doesn't support them and rejects them during wf-checking (similarly, dynamically sized enums are not supported as well).
- The lint for drop fields in unions can't work precisely before monomorphization, so it works pessimistically - non-`Copy` generic fields are reported, types not implementing `Drop` directly, but having non-trivial drop code are reported.

    ```
    struct S(String); // Doesn't implement `Drop`
    union U<T> {
        a: S, // Reported
        b: T, // Reported
    }
    ```

- #35764 was indeed helpful and landed timely, I didn't have to implement internal drop flags for unions.
- Unions are not permitted in constant patterns, because matching on union fields is unsafe, I didn't want unsafety checker to dig into all constants to uncover this possible unsafety.
- The RFC doesn't talk about `#[derive]`, generally trait impls cannot be derived for unions, but some of them can. I implemented only `#[derive(Copy)]` so far. In theory shallow `#[derive(Clone)]` can be derived as well if all union fields are `Copy`, I left it for later though, it requires changing how `Clone` impls are generated.
- Moving union fields is implemented as per #32836 (comment).
- Testing strategy: union specific behavior is tested, sometimes very basically (e.g. debuginfo), behavior common for all ADTs (e.g. something like coherence
checks) is not generally tested.

r? @eddyb
@joshtriplett
Copy link
Member

Awesome work, @petrochenkov!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants