-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
For structs with all isbits or isbitsunion fields, allow them to be s… #32448
Conversation
@test addr(b) == addr(b2) | ||
@test addr(b) == addr(b3) | ||
@test addr(b2) == addr(b3) | ||
# @test addr(b) == addr(b2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vtjnash, these tests are failing on this branch; is that expected? I'm not sure why the Ref b2
case would be affected by anything I've changed, but I think I understand that for b3
, because it's an array, when we store it in an array (inline now), then extract it back again, it's technically a different jl_value_ptr
, but still the same bits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so it looks like in the Ref case, the alignment of B23367
is now different, which is used by datatype_pointerfree
, to determine how to unsafe_convert
to a Ptr. Again, it seems like we're more correct now by saying that B23367
is indeed datatype_pointerfree
. I'm not sure how much people really run into this kind of thing because doing ccall w/ structs w/ Union fields probably isn't very common.
@@ -5863,7 +5863,7 @@ let | |||
@test b.x === Int8(91) | |||
@test b.z === Int8(23) | |||
@test b.y === A23367((Int8(1), Int8(2), Int8(3), Int8(4), Int8(5), Int8(6), Int8(7))) | |||
@test sizeof(b) == sizeof(Int) * 3 | |||
@test sizeof(b) == 12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I think this is more correct now? The change is via the call to jl_islayout_inline
in jl_f_sizeof
here.
Dumb question: what does this mean for the data layout of structs with union fields? Would it be possible/make sense to always recursively strip the selector bytes out of such structs and store them separately? That is, make the non-selector parts of arrays of unions and arrays of structs of unions ABI compatible with the C version of the same thing? |
@c42f, so I'm actually not familiar with the C memory layout of structs with union fields, is it just the max size of the largest union type? I guess it doesn't track which element was put in the field, but relies on the user pulling out whatever they want? |
I think it's ultimately defined by the platform ABI just as the struct layout is. But yes, I'd guess the union bits would just have the largest size and strictest alignment of types within the union.
Correct; in C you need to store the type tag info explicitly, or infer it somehow from other state in the system. |
Bump. Probably best if @JeffBezanson, @vtjnash, or @Keno review this. No urgency really, but when you have a free moment. |
Bump again. |
Let's look at JuliaCon |
Bump :) |
This seems pretty good so far. It's good to see how small it is as I wasn't sure how much of the rest of the system knew to check these flags, and not assume that The one aspect to finish this up that I think may need to be preserved is to ensure that |
Hmmm......do we really though? That's kind of a big crux of the change here is that struct Foo
x::Int
y::Union{Missing, Float64}
Foo() = new()
end
[Foo()] == Vector{Foo}(undef, 1) So IMO, this almost feels more like a "bugfix" than change in behavior. How would we still do an UndefRefException? Special case these structs in |
Bump, this would be nice to have for 1.3 |
src/datatype.c
Outdated
@@ -248,7 +248,8 @@ static unsigned union_isbits(jl_value_t *ty, size_t *nbytes, size_t *align) JL_N | |||
return 0; | |||
return na + nb; | |||
} | |||
if (jl_isbits(ty)) { | |||
if (jl_is_datatype(ty) && jl_datatype_isinlinealloc(ty)) | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{
on previous line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
…tored inline in arrays
1995295
to
d98544d
Compare
I see this was marked for triage; was anything concluded? |
There was a heated discussion, I'm not sure what we concluded. @JeffBezanson, you seemed in favor. |
@vtjnash, do you have any follow up here from previous discussion/questions? Anything else we need to change here? |
I'll plan on merging this in 12 hours or so unless @vtjnash or anyone else has concerns. |
If I define struct WrappedUnion
v::Union{Missing,Float64}
end how does the storage layout now compare to I'd like to suggest that the storage layout for these two |
That's pretty difficult to do, since it means the layout of the same struct is different depending on where/how that struct is stored. It's also hard to handle things like |
I still think this PR was a breaking change in its current state and needed updates to the documentation. |
I feel like with systematic hoisting of selectors you can get consistent layout when each such struct comes in two separable but consistently laid out parts: the data and the type selector bits. I've tried to write it out in more detail over at #33120. |
@vtjnash, it would be helpful if you could be more specific about what exactly you consider breaking and what documentation needs updating. As I've posted a few times here, I'm happy to push forward on things, but the lack of responses/direction has left this in its current state. Personally, I still feel this is more "bugfix" than breaking given the work and docs that went in to general isbits-Union optimizations. |
I still keep running into problems that this causes internally because it was a breaking change that changes assumptions made by some users and inference/codegen. I know you wanted to push ahead faster on this then I could review, and were rejecting my requests for changes to make it non-breaking. But it seems pretty yucky now that we implement
I generally intended the isbits optimizations to be exactly that: optimizations for unions of isbits types. The documentation was intended to reflect that. Regardless of what you feel like calling it, the fact is it changed that memory model without fixing all of the places where we assume the prior memory model, both in the documentation and in the code. |
…tored inline in arrays
This allows more structures to be stored inline in arrays, like:
We already had an
isinlinealloc
flag onDataType
, so we switchjl_islayout_inline
to use that instead of justisbitstype
. Then, when we're setting theisinlinealloc
flag for a DataType, we just check that each field isisbitstype
orisbitsunion
.