The fate of Nullable #22682
Comments
In my opinion, we should do the following:
The current uses of There's something to be said for having both representations in Base, as they serve fundamentally different purposes. The container-based approach is ideal for things like Keno's suggested revision to the iteration protocol, where a |
Regarding |
Say you're parsing a text file, and you know that it contains data that is either an integer or |
The use case you give is typical, but I am not sure if this should be handled by Of course My feeling is that in general the discussion about leaving or removing |
Am I correct in reading that the optimizations for unions (and Arrays of unions) will only be possible for isbits types? In which case Nullable would still be useful for more complex types. |
No, |
Why does |
The problem with keeping the |
For the use cases where Nullable has been used as a result-or-flag that something went wrong (roughly as a deferred exception), I'm increasingly finding that it isn't only useful to know that something went wrong, but also a bit of detail about what specifically went wrong. So for the optional meaning we may get more mileage out of a result-or-error-code type than the current result-or-null type (assuming API's were migrated to start using it uniformly). |
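A minimal sketch of what such a result-or-error-code type could look like, written against released Julia 1.x. The names Ok, Err, Result, parse_port and describe are hypothetical and exist neither in Base nor in this thread; the only Base behavior relied on is that tryparse returns nothing on failure.

```julia
# Hypothetical result-or-error-code type; none of these names exist in Base.
struct Ok{T}
    value::T
end

struct Err{E}
    code::E
end

# A sum type emulated as a Union of the two wrappers.
const Result{T,E} = Union{Ok{T},Err{E}}

# Parse a port number, reporting *why* it failed rather than a bare null.
function parse_port(s::AbstractString)::Result{Int,Symbol}
    n = tryparse(Int, s)              # Base: `nothing` when parsing fails
    n === nothing && return Err(:not_a_number)
    1 <= n <= 65535 || return Err(:out_of_range)
    return Ok(n)
end

# Callers dispatch on the variant, so the error case has to be handled somewhere.
describe(r::Ok) = "port $(r.value)"
describe(r::Err) = "invalid port ($(r.code))"
```

The caller gets the reason for the failure (here a Symbol code) without an exception being thrown, which is the extra detail a result-or-null type cannot carry.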
Such a result-or-error-code type should probably be a sum type, so should we reconsider adding a facility for general sum types to Base and make the result-or-error-code type a special case of it? (Sum types are not |
@tkelman This is exactly why I am asking See e.g. https://www.schoolofhaskell.com/school/starting-with-haskell/basics-of-haskell/10_Error_Handling There are two things to consider in my opinion:
|
IMO exception objects should stay out of control flow. If something goes wrong and emits an error, the error should be thrown. If you really need to deal with an exception object, What you're describing, @tkelman, sounds like result types. IMO they have their place, but generally speaking, if you have |
Of course, careful judgement is needed for deciding throwing an exception versus using a result-or-error-code, but |
Of course, I completely forgot to mention the main interest of Result types would also be a nice extension of the |
Forgot to mention I've updated the description to cover these points. |
My preference would be to just leave
    struct Nullable{T}
        value::Union{T,Null}
    end
I think the data-missing-values story should just stick with the names that are already used in DataFrames/DataArrays today, i.e. That strategy would avoid a lot of deprecation work and breaking of old code: essentially all the code depending on the current I don't think there are particularly super strong arguments pro/con various naming schemes per se, for example I can see lots of arguments pro/con |
A separate point: if we do want to use different concepts for the software engineering and data science case of missing values (which I strongly support) and want to use a |
Arguments about naming are often quite close to bikeshedding, but in this case there are a few reasons to change the names:
|
Also a reason to replace |
I would keep the complete terminology separate for these two cases if the decision is to have two separate concepts of missingness. For example use
Pandas also uses
Isn't that the kind of result type that @ararslan described above? I agree with him, that seems a different thing than what |
That could also be changed to say |
Pandas calls missing values |
They use |
My preference is
I think that |
So, how about the following:
|
I don't think any changes should affect
That doesn't really make sense because I'd be fine w/
|
I don't think we should give another meaning to |
Whoops. I meant |
👎 to |
Lint, in particular, heavily relies on Nullable{Any} where the element can be |
Sure, that's fine @TotalVerb. @ararslan, it's not really another meaning to |
"Not |
OK, if the CS null is not going to be
as proposed by DavidAnthoff. That would leave 4 different kinds of nullish values:
... then there would be just |
What you're calling |
I'll make a PR retaining my favorite solution to illustrate what this implies. |
See PR at #23642. |
Can someone clarify if a shorthand like It seems two would be needed, one for Without a shorthand this union approach is less intuitive, harder to read, and only mildly less verbose in net than the old |
No, |
It's actually been deprecated quite a while; I did it during JuliaCon last year. 🙂 It's also unclear still whether |
I think elsewhere we concluded that the right meaning for the |
All this stuff goes into v1. Great! One minor issue remains: the verbosity we have to deal with when we type and retype So we have
And I propose to alleviate the verbosity while typing by adding
PS: The issue is closed; I don't know if this is the best place to discuss that... |
I had proposed |
My proposal was made especially to not change any semantics (consensus was hardly reached), only to reduce retyping. The whole is homogeneous, easy to remember, and already in use today. I must confess I have not discerned if Anyway, there are some choices today. |
With the prospect of representing missing data as Union{T, Null} in Julia 0.7, we should decide what will happen to Nullable. I think the consensus is that being container-like, Nullable is appropriate to represent "the software engineer's null", as opposed to "the data analyst's null", a.k.a. missing values. In other words, Nullable offers three properties which Union{T, Null} does not:

- [...] Union{T, Null}, where code may work when a value is of type T but not when it is of type Null, which might not have been properly anticipated/tested).
- [...] Nullable{Nullable{T}} from Nullable{T} (contrary to Union{Union{T, Null}, Null} == Union{T, Null}), which is useful when you need to tell the difference between "no value" and "null value". Such situations arise e.g. when doing a dictionary lookup (tryget, cf. Get dict value as a nullable #13055 and Add get(coll, key) for Associative returning Nullable #18211); when parsing a string via tryparse to a value which could either be of type T, null, or invalid; or when wrapping a value which could either be of type T or null in a Nullable before returning it from a function.

The first two features are the ones which turned out to be annoying when working with missing data, but which can provide additional safety for general programming. A detailed discussion of the advantages and drawbacks of these approaches can be found in the Nullable Julep.

Given that, several paths can be taken for Nullable in Julia 0.7:

- [...] Nullable{T} a (deprecated) type alias for Union{T, Null}. This would have the advantage that Julia would have a single concept of null/missing values, but without the advantages of the three points above. Checks that code is correctly prepared to handle null values could still be done by a linter.
- [...] Nullable{T} a (deprecated) type alias for Union{Some{T}, Null}, with Some{T} a wrapper around a value of type T which would behave essentially like Nullable{T} currently. Applying a function on the value would require using f.(x), broadcast or pattern matching, so that missingness would never propagate without explicitly asking for it. The advantages would be those of the three points above, at the cost of two different representations of missingness (or almost two, since the Null type would be used in both cases).
- [...] Nullable in Base and move it as-is to a package. A possible variant would be to rename it to Option in order to prevent confusion with Null and be more consistent with other languages like Scala, Rust or Swift (IIRC, Nullable was originally called Option and lived in the Options.jl package). The main advantage of this approach is to have a single representation of null values in Base, in particular to avoid setting the design of Nullable in stone in 1.0. The main issue is that no code would be able to use Nullable in Base, which implies in particular changing tryparse to return Union{T, Null}, and that no correct tryget method could be implemented for dicts (Get dict value as a nullable #13055). OTOH this could help increase consistency with e.g. match, which returns Union{T, Void}, and uses of Nullable are not so widespread in Base.

EDIT: added mention of point 3. in the first series of bullets.
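The nesting argument and the "no silent propagation" argument can be sketched in released Julia 1.x. This is only an illustration under stated assumptions: Base's Nothing and Some are used as stand-ins for the Null and Some{T} names discussed here, and tryget is a hypothetical helper, not a Base function.

```julia
# Assumption: Base's `Nothing`/`Some` stand in for the `Null`/`Some{T}` of this issue.

# A plain Union flattens, so "null inside null" cannot be represented.
flat_same = Union{Union{Int,Nothing},Nothing} == Union{Int,Nothing}

# A wrapper keeps the nesting level visible in the type.
wrapped_same = Union{Some{Union{Int,Nothing}},Nothing} == Union{Some{Int},Nothing}

# Hypothetical lookup helper: `nothing` means "no such key", while
# `Some(x)` means "key found with value x" (x itself may be `nothing`).
tryget(d::AbstractDict, key) = haskey(d, key) ? Some(d[key]) : nothing

d = Dict("a" => 1, "b" => nothing)

found  = tryget(d, "a")    # key present, ordinary value
null   = tryget(d, "b")    # key present, value is null
absent = tryget(d, "c")    # key absent

# The wrapper has to be opened explicitly before the value can be used...
value = something(found)

# ...and arithmetic on the wrapper itself is rejected instead of propagating.
threw = try
    Some(1) + 1
    false
catch err
    err isa MethodError
end
```

With a bare Union{T, Nothing} return type, the "b" and "c" lookups would both yield nothing and could not be told apart, which is the situation the tryget discussion refers to.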