RFC: Get rid of #undef and replace it with null in Array{Union{Null, T}} #23721
Comments
@vtjnash and I discussed this at JuliaCon (getting rid of #undef). I'm not sure about the While it would certainly be nice to reduce language complexity w/ one less form of "undef"/"missinginess", we mainly discussed this in the context of #18632, which has been a long-desired optimization. While that PR has been deferred due to being "an optimization", we should still consider any deprecations that might be needed in order to facilitate those optimizations.
I think "getting rid of the uninitialized Array constructors" is essentially equivalent to this proposal, since then you have no way to create an "empty"
Oh, one other thing I'd mention is that with nice syntax for
Since
What would be breaking would be removing uninitialized array constructors, which is mostly what this proposal is about. So I don't think the milestone is appropriate.
You're absolutely right - and the representation of the eltype would need to be carefully considered, too.
If we remove My interpretation of the "billion dollar mistake" is the introduction of first-class null references --- i.e. null references you can pass to a function. The problem is that null references propagate, so that a problem can appear far from the source. Immediately raising an error on reading a null reference helps localize the problem. So whatever the merit of replacing Also,
The implementation could be changed to implicitly use My suspicion is that the really problematic part of the billion dollar mistake is not that
I should add that I think we're getting a bit late in the 1.0 timeline to be considering such a fundamental change. However, since it's hard to write code that relies on
Yes, being able to express the exclusion of null mitigates the issue to a large extent, but the point is that raising an error on reading an undefined location is not the billion dollar mistake. In fact it's almost as far as you can get from it --- not only can you express that something is non-null, you're forced to express it in many contexts (i.e. it implicitly puts a
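To make the distinction in the comments above concrete, here is a minimal sketch contrasting an error raised at the read with a placeholder that propagates. It assumes Julia 1.x syntax, which postdates this thread: the `undef` constructor argument, and `nothing` standing in for the thread's `null`. The variable names are illustrative only.

```julia
# Assumption: Julia 1.x. `nothing` stands in for the `null` discussed in the thread.

# A non-isbits element type: both slots start out undefined (#undef).
names = Vector{String}(undef, 2)
names[1] = "a"

# Reading the unset slot fails at the read itself, so the error is localized.
err = try
    names[2]
catch e
    e
end
println(err isa UndefRefError)

# A first-class placeholder can be read and passed around without complaint;
# the failure only surfaces later, wherever the value is finally used.
maybe = Union{Nothing, String}["a", nothing]
x = maybe[2]                 # no error at the read
later = try
    uppercase(x)             # fails here, away from the source of the `nothing`
catch e
    e
end
println(later isa MethodError)
```

The second half still fails loudly in Julia because `uppercase` has no method for `Nothing`; the point is only that the failure moves from the read to the eventual use.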
Deprecating uninitialized array constructors would clearly be breaking; fortunately it's not terribly hard to do (marking them as
I'm definitely interested in changing the array constructors for #16029. Keeping our current semantics but making it "harder" to get uninitialized arrays is something I could be on board with.
I agree with everything Jeff has said. The typical "legitimate" use case for
That's more or less what this proposal suggests, by requiring a call to
@nalimilan I wonder if I'd prefer something... completely orthogonal to the type system, I suppose. For example, I want my
Yeah, but that system doesn't work for
I recall reading early threads which wanted the uninitialized array constructors because

```julia
@benchmark zeros(1000)
BenchmarkTools.Trial:
  memory estimate:  7.94 KiB
  allocs estimate:  1
  --------------
  minimum time:     455.375 ns (0.00% GC)
  median time:      1.205 μs (0.00% GC)
  mean time:        1.926 μs (25.12% GC)
  maximum time:     24.584 μs (90.79% GC)
  --------------
  samples:          10000
  evals/sample:     144
```

```julia
@benchmark Vector{Float64}(1000)
BenchmarkTools.Trial:
  memory estimate:  7.94 KiB
  allocs estimate:  1
  --------------
  minimum time:     101.663 ns (0.00% GC)
  median time:      183.116 ns (0.00% GC)
  mean time:        381.337 ns (44.79% GC)
  maximum time:     2.417 μs (68.70% GC)
  --------------
  samples:          10000
  evals/sample:     956
```

Yes, we'd still need a way to get uninitialized arrays. But the syntax for it can change.
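For readers on a current release: the sketch below shows the two allocation styles being compared, written with the explicit `undef` argument that Julia 1.0 and later require for uninitialized construction. That syntax is not part of this thread (the benchmark above uses the older `Vector{Float64}(1000)` form), so treat it as an assumption about the reader's Julia version.

```julia
# Assumption: Julia 1.0+, where uninitialized construction takes an explicit `undef`.
n = 1000

a = zeros(n)                    # allocated and zero-filled
b = Vector{Float64}(undef, n)   # allocated only; contents are arbitrary until written

# An uninitialized isbits array must be written before it is read.
fill!(b, 0.0)

println(a == b)
```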
Conclusion from triage: radical changes to
Array constructors are indeed the main source of BTW, regarding the more general discussion, I've bumped into this comment from Eric Lippert (one of the C# creators), which was posted during discussions on
Union{T,Nothing} does this now |
I mentioned this idea at #16029 (comment) but I thought this proposal entails large enough changes that it deserves its own issue. The idea is to get rid of `#undef` entries in non-`isbits` arrays, and replace them with `null` (from #23642). This would have several advantages:

- `#undef` is kind of weird since you can get this value, but never set it again, and having multiple notions of missing/uninitialized value increases the complexity of the language.
- `Array{T}` always contains valid values of type `T`: currently there's always the possibility that indexing into a non-`isbits` array throws an `UndefRefError`, which is akin to the "billion dollar mistake" (though in a less severe form since it only affects arrays).

Concretely, in order to be able to set uninitialized values to `null`, we must force all uninitialized array constructors to create `Array{Union{Null, T}}` rather than `Array{T}` objects. For non-`isbits` arrays, this is not a problem since there is room to store the type tag already, or a `NULL` pointer can be translated to `null` as a built-in special case. For `isbits` arrays, with @quinnj's recent work to optimize `Union`s (#22441), the performance impact is limited: it is equivalent to allocating an `Array{T}` plus an `Array{UInt8}` (the latter storing the type, i.e. whether the value is initialized or not). Moreover, with `Union{Null, T}` the type tag for `Null` is equal to `0`, which means the `Array{UInt8}` part can be allocated directly and efficiently using `calloc`. But of course for `isbits` arrays it would be more common to create an initialized `Array{T}` directly using `zeros` or `fill` (which would be the recommended constructors).

Once all entries in an uninitialized `Array{Union{Null, T}}` have been replaced with valid objects, the array can be turned into an `Array{T}` via a simple call to `convert`. This never requires making a copy, so it's very efficient. Then the `Array{T}` can safely be passed to functions which expect it to contain only valid `T` objects, which is now always guaranteed.

If it turns out that creating the `Array{UInt8}` type tag part is too costly in some very particular cases, an `unsafe_array` function could be provided, which would allow creating uninitialized `Array{T}` objects for `isbits` types. This should probably not be allowed for non-`isbits` types (though if it were, the current behavior of `#undef` could possibly be retained, with the expectation that most users would never see it).

Since it forces filling arrays with valid values on construction, this proposal effectively amounts to forbidding uninitialized arrays (#9147).
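The allocate, fill, then narrow workflow described in the proposal can be approximated in released Julia. The sketch below is an illustration under stated assumptions, not the proposal's implementation: it uses Julia 1.x, with `Nothing`/`nothing` standing in for the proposal's `Null`/`null`, and the `"item"` values are made up. The proposal describes the final `convert` as copy-free; this sketch makes no such claim about `convert` in released Julia.

```julia
# Assumption: Julia 1.x. `Nothing`/`nothing` stand in for the proposal's `Null`/`null`.

# Every slot holds a valid value (`nothing`) from the start, so no #undef entries exist.
slots = Vector{Union{Nothing, String}}(nothing, 3)

# Replace each placeholder with a real element.
for i in eachindex(slots)
    slots[i] = string("item", i)
end

# Once no placeholder remains, narrow the element type to plain `String`.
@assert all(x -> x !== nothing, slots)
filled = convert(Vector{String}, slots)

println(eltype(filled))
```

A function that accepts `Vector{String}` can then rely on every element being a real `String`, which is the guarantee the proposal is after.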