violation of GC safety #27
Comments
Thank you very much Jameson! I was quite worried about this, and was hoping someone more knowledgeable/cleverer than I would take a look someday. (You might be able to tell from the comment). Can you point me somewhere that explains the correct procedure, or an example of where the correct procedure is implemented? Otherwise I'll have to make that branch an error for now (it's probably not used in the wild). |
I don't think it's possible to implement correctly, maybe @yuyichao can say for certain? |
In addition to the GC safety issue, this also violates immutability, since the tuple is not inlined if the type is not a bits type, so this is not really fixable. The GC safety issue can be worked around (does not apply here due to the immutability issue) by writing llvm code using |
I don't understand this. I'm not sure what you mean by inlined. I think I've found the correct bit of memory to write the pointers to.
Or calling an appropriate C function, I assume? Maybe there is something I'm missing. |
AFAICT
Sure, if you write your own. |
Inlined means the memory of the field is, .... well ...., inlined and the field is not a reference to another piece of memory. |
Right. From playing it seems that the pointer to my Is the problem with GC that some of my code might be executed in-between this GC scan and the actual deletion? During this function a reference to the new data should exist on the stack, so at all times there should be at least one or at least two references to the to-be copied value. But if the function exits and my MVector is not rescanned is there a moment the GC thinks no reference exists? Sorry, I'd just like to understand better... I don't even know what the write barrier is used for (what exactly do we not want to be written to?). My understanding of GC is pretty simplistic - it scans the stack and its references and all their references and so-on, and deletes all the stuff it doesn't find - and obviously execution has to be paused somehow while this occurs. |
There are two problems: immutability and GC safety. The inlining of the field has nothing to do with the GC, only with immutability.
|
Definitely true on the stack. It is possible to mutate global But I have to disagree when the immutable thing descends from a mutable type somewhere. An obvious case: I can always reinterpret a This idea must generalize to mutables with immutable fields, because (a) it would be super weird if The idea of reinterpreting a length-1 |
What do you mean? On the stack or not is totally irrelevant. Whether it is inlined into mutable object is.
What do you mean by global consts? And again, global consts have nothing to do with this.
Correct.
No this is NOT what I'm saying. For
Not in general. Again, only when the immutable field is inlined.
Therefore you can't do that when the immutable object is not inlined into the array either.
For non-inline object the memory holding the immutable object is never mutated. Only the reference to the memory (the pointer stored in the field).
FWIW, this is a really bad way to tell if it's okay to mutate the memory. You can generally pass pointers to immutable memory using
Only when inlined.
As an unrelated piece of information that you might need to know before hitting it yourself: it is undefined if you create an actual array from memory that belongs to a non-array object (i.e. call |
First, thank you @yuyichao for a thorough response. Sorry if I've seemed a little slow, or perhaps assertive, but I prefer my misconceptions to be corrected :) I think I finally figured out where we are getting our wires crossed. My mental model of I guess I think of non-inline immutables as anomalies that will go away in Julia v0.6 (fingers crossed). I'm happy not to support them for now. Some responses:
Right. Inlined ==
I think here I'm talking about isbits immutables and mutables, and you're talking about non-isbits immutables. :) Does that seem right?
I assume using
OK now you have really piqued my interest. I was considering this kind of thing for the future. What is going on here? Thanks again for all the information! :) |
Also, to be clear, what I'm not trying to do is update the memory references in |
The field is mutable in that you can reassign the field. The object referenced by/stored in the field is not. It is possible to guess what you mean, but it's confusing.
There will still be cases where an immutable can't be inlined.
No. It is not necessarily unboxed on the stack, and it's not unboxed on the heap as a const global. It's NOT OK to overwrite the memory of a global const immutable, since it's no different from any other immutable object.
You get a pointer to both with
I'm talking about immutable objects in general. You can't overwrite the memory owned by an immutable object. An immutable object inlined into another (possibly mutable) object (or array, if it matters) doesn't own the memory.
I'm talking about function argument, not return value.
FastAnonymous.jl is reasonably defined. Direct memory access bypassing julia's runtime is totally different.
Arrays are not allowed to alias memory of normal object, that's all. |
FWIW, I never talked anything about your |
So to summarize, this isn't fixable, except by replacing the Presumably JuliaLang/julia#17115 or something equivalent will then come along and make all this just work nicely. |
For reference, a safe way to update individual elements in an immutable is to replace the entire immutable. One might hope that LLVM might generate optimized code that does the equivalent of your unsafe manipulation when it can prove that it will cause no harm to do so. Example: For obvious reasons, this has non-optimal performance, but that might not affect you here. |
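The "replace the entire immutable" approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `TupleBox` and `replace_element!` are hypothetical names that do not appear in StaticArrays, and `Base.setindex` on a tuple builds a new tuple rather than mutating the old one.

```julia
# Hypothetical container for illustration; this is not StaticArrays' MVector.
mutable struct TupleBox{N,T}
    data::NTuple{N,T}
end

# Build a new tuple with element `i` replaced, then reassign the whole field.
# The field assignment goes through the normal runtime path, so no raw
# pointer writes are involved.
function replace_element!(box::TupleBox, value, i::Integer)
    box.data = Base.setindex(box.data, value, i)
    return box
end

box = TupleBox(("a", "b", "c"))   # String is not a bits type
replace_element!(box, "z", 2)
```

The old tuple is left untouched and the box simply refers to the new one, which is why this stays within the language's rules at the cost of constructing a fresh tuple on each update.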
OK, all this is rather interesting and I confess I need some more time to digest it all. But thank you everyone for all the responses. I'm a little confused why, if I could replace an entire tuple, I couldn't replace just one item as an optimization, except for the possibility that the GC isn't built to expect that to ever happen (or that codegen might possibly make an optimization that is inconsistent with the change, but as far as I've seen codegen doesn't make many assumptions about things on the heap). A
Bugger. In the case all types/fields were concrete, I was expecting them to work like C-structs (immutables) or pointers to C-structs (mutables). |
What's the confusion? The GC is irrelevant here. The single most important property is the type layout, i.e. whether the tuple is inlined (in other words, which heap box owns the memory).
This is essentially meaningless. Any object is a reference to some memory, even isbits immutables. They can be optimized to not be if the compiler can prove that's fine to do, and the optimization is irrelevant here.
It does all the time.
You can't have a recursive inlined struct in C either. |
In other words, for a non-inlined tuple, when you replace the tuple you replace the pointer that references the tuple. When you replace an element you mutate the tuple itself. These are completely different operations, and neither is an optimization of the other. |
Hmm... I think I've had an insight: the first thing in your mind is the mutable/immutable reference model, while the optimizations to the stack, allocations to the heap, and so on are just optimizations. Whereas I'm still thinking like a C/C++ programmer, and pretending as if Julia is using pointers and stack allocated objects as its way of implementing mutable and immutable things, because I spend a lot of time staring at Still, aren't the suggestions in JuliaLang/julia#17115 and JuliaLang/julia#11902 all about mutating fields in a chain of things where something is immutable and something is mutable? Like changing one element of a ref of a tuple? |
Yes, and those are impossible with non-inlined immutables |
That's fine. But please pay attention to where I said inlining, and discuss based on that. An inlined field is just like a struct in C; a non-inlined one is just a pointer. |
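As a rough way to see the distinction drawn above, one can ask Julia whether a type is a plain-data ("bits") type. This is a sketch with a caveat: `isbitstype` only identifies types that can always be stored inline, and the exact rules for which other immutables get inlined depend on the Julia version, so it is not a complete test. `Point` is a hypothetical type introduced for the example.

```julia
# Hypothetical immutable struct with only plain-data fields.
struct Point
    x::Float64
    y::Float64
end

bits_point  = isbitstype(Point)               # plain data
bits_ints   = isbitstype(NTuple{3,Int})       # tuple of plain data
bits_strs   = isbitstype(NTuple{3,String})    # holds references to Strings
bits_mixed  = isbitstype(Tuple{Int,Any})      # abstractly typed element

# An array of a bits type stores the elements themselves, back to back,
# much like an array of C structs.
pts = [Point(1.0, 2.0), Point(3.0, 4.0)]
inline_bytes = sizeof(pts)
```

Here `bits_point` and `bits_ints` are `true`, while `bits_strs` and `bits_mixed` are `false`, and `inline_bytes` equals `2 * sizeof(Point)`.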
Okay, now I'm wondering what is a good way forward. I can think of several options. Let's say for the moment I want to support
Any opinions on which ones are possible/preferable?
Can we do that in Julia? In C/C++ the recursion is achieved via a pointer/reference field, right? I assumed I would need a |
Only 1 and 5 are options. None of the others are.
No, that's why there's always something that's not inlineable.
No that's totally irrelevant. |
While a little unadventurous, it may have to do for now.
If you consider a multiple-year timescale as Julia evolves I suspect we can do a good |
I've made a commit that disables this completely, for now. Does anyone here want to use the non-isbits case, even if it is a bit slow? It would just be another One thing I do plan on doing is make |
FWIW, I wanted to do this today. What’s the status, six years on? It seems innocent enough to expect…

m = MVector(1, 2, missing, 4)
m[3] = 3

to work. Alas, it doesn’t, and I followed the comment in the source code to this issue. P.S. I was able to get around this and move on by doing:

m = MVector(1, 2, missing, 4)
m = setindex(m, 3, 3) # makes a copy

since I wasn’t too worried about making a new copy. |
Without an ABI for unions, GC-managed references, or similar, it's difficult to support this in a package without possibly invoking undefined behavior, resulting in bugs. One thing that might be useful is if Julia exposed "mutable tuples" and we could build on those. @Jollywatt note that if you are using static arrays in a specific & known context (where you know the length of the arrays) you can create your own mutable |
@yuyichao would you please be so kind to explain why the third option does not work? Or was that a limitation of an older Julia version and is not relevant anymore? |
It's the opposite. When the Julia compiler was very simple (Julia 0.4 maybe? Or 0.3?) you could get it to create new types during the code generation phase. This was never officially supported, but was a really fun hack :) Since then, the compiler does more, especially in terms of caching work that is already done for fast recompilation while keeping everything neatly up-to-date when a definition changes, in a series of self-consistent "worlds". However, the system doesn't allow injecting a new type when compiling in an existing world (because it doesn't make sense, since new types more-or-less define the new world), so the "hack" won't (and can't) work. These days I favour adding a mutable tuple (like a tuple but with mutable fields and invariant type parameters). It would be a fair bit of work to introduce a new fundamental datatype to Julia though (like the recent |
Thanks for the answer, @andyferris.
Indeed, mutable inline structs and mutable tuples seem to be the two missing basic data structures.
But does it need to be in an existing world? At least for a static type, this seems to set the bar too high, as other types are typically defined in global space as well. While the new memory type would be the cleanest solution, I am currently experimenting with something along these lines:

using StaticArrays
abstract type WritableVector{N, T} <: FieldVector{N, T} end
macro define_wvector(N)
:(mutable struct $(Symbol("WVector", N)){T} <: WritableVector{$N, T}
$(Expr(:block, (:($(Symbol("_", i))::T) for i in 1:N)...))
end) |> esc
end
@define_wvector 16
a = 101:116 .|> Int8 |> WVector16
julia> a[13]
113
julia> a[13] = Int8(2); a[13]
2

Granted, this does not avoid using the reference for non-

b = 101:116 .|> Int8 .|> Ref{Int8} |> WVector16
julia> b[13]
Base.RefValue{Int8}(113)
julia> b[13] = Ref{Int8}(2); b[13]
Base.RefValue{Int8}(2)

If this works, maybe we do not need a mutable tuple, but could benefit from an Do you have benchmarks which can be used to compare a |
Yes - that is a perfectly cromulent approach. I believe older packages like ImmutableArrays.jl and FixedSizeArrays.jl did stuff like this. Honestly it probably has fewer downsides than the current We'd really have to confer with the various communities (including autodiff) to see whether this is a realistic approach for them. |
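The field-per-element idea discussed above can also be sketched without any package dependency. This is a minimal illustration, not how StaticArrays implements anything: `MutableVec4` is a hypothetical fixed-length type, and it relies only on ordinary `getfield`/`setfield!`, so the Julia runtime handles the reference bookkeeping that raw pointer writes would bypass.

```julia
# Hypothetical fixed-length mutable vector with one field per element.
mutable struct MutableVec4{T} <: AbstractVector{T}
    _1::T
    _2::T
    _3::T
    _4::T
end

Base.size(::MutableVec4) = (4,)

# Integer field indices are accepted by getfield/setfield!, and both
# throw an error for an out-of-range index.
Base.getindex(v::MutableVec4, i::Int) = getfield(v, i)

function Base.setindex!(v::MutableVec4, x, i::Int)
    setfield!(v, i, convert(eltype(v), x))
    return v
end

# The element type here is a Union, i.e. not a bits type.
m = MutableVec4{Union{Int,Missing}}(1, 2, missing, 4)
m[3] = 3
```

A macro such as the one shown earlier in the thread would generate a type like this for each length; the trade-off is one type definition per length rather than a single parametric type.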
The MVector violates GC safety when it contains non-bitstypes (it lacks a write barrier), so the GC may lose track of the object and free the objects in the array while they are still in use:
StaticArrays.jl/src/MVector.jl
Lines 49 to 50 in 1e18069