a vector is always == to itself, even when containing missing #34744
Precisely --- I'm inclined not to do this, largely because it only makes sense for mutable types. We should change Dict comparison instead. |
Tl;Dr: we definitely should change the mental model of Just for the point @oxinabox brought up, rephrasing it to bring the precise point into focus: I first found reflexivity of variables should hold even for missings as these represent values you don't know. But no matter which value, it will be equal to itself. Otherwise we lose an important property of That brings me to the conclusion that Seems like the only formally correct way would be to suffer from the same problem that Strings and Symbols have and turn Missing into a mutable empty struct to be able to distinguish different instances. 😭 EDIT: Or change the mental model of missing to something else which properly reasons for that case (which definitely would be the better path to keep good performance I guess). The current model is flawed in some sense anyway as that model would suggest CAUTION: The following thoughts might insult you or provoke some very bad feelings, so be warned and only keep reading on your own responsibility. |
Even without
julia> NaN === NaN
true
julia> NaN == NaN
false
julia> isequal(NaN, NaN)
true
julia> x = [NaN]
1-element Array{Float64,1}:
NaN
julia> x === x
true
julia> x == x
false
julia> isequal(x, x)
true |
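For comparison, here is a small sketch of the same three predicates applied to a vector holding `missing`. The variable names are only illustrative, and the results in the comments are what released Julia 1.x returns, that is, the behaviour this issue is about and not the proposed change:

```julia
v = [missing]

r_egal    = v === v        # true: it is the very same object
r_eq      = v == v         # missing: elementwise == propagates missing
r_isequal = isequal(v, v)  # true: isequal treats missing as equal to itself
```

So `==` is the only one of the three that is not reflexive here, just as with `[NaN]`, except that the answer is `missing` rather than `false`.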
make |
Not sure what this means, but note:
julia> "foo" === "foo"
true
julia> :bar === :bar
true |
This seems like a non-useful way of thinking about the Julia
and define |
You need to overload each occurrence of missing for your own type as well then. But that's not the point. The point is that each properly defined type should be reflexive. (NaN being a float is a bad design from a consistency point of view but can't be changed). So this is not only about missing but about default equality implementations in general. Default implementation should follow the lines of |
This issue comes up for any sort of computing with domains. Intervals are the classic example. Giving And yes, there is a point in julia where abstraction stops, and it's |
I agree with @JeffBezanson that the correct fix is to change @rapus95 Please don't rehash all discussions that happened before |
I meant that:
julia> Base.isimmutable(:a)
false
julia> fieldnames(Symbol)
()
an empty mutable struct.
Definitively
@JeffBezanson that sounds a bit harsh but to be fair, I claimed it to be the only correct way for the given mental model @oxinabox, me and probably some others are carrying. And for that model I'd still say the claim to be true.
I definitely support that as well because without it my custom equals So, let me rephrase it once again (that's why I added the EDIT in the first place): Tl;Dr: That mental model is flawed, not the implementation (which is good & very performant 👍); I don't want to change the implementation but am calling for a better model. Regarding |
Any documentation on this should not use the word "currently", since that implies it might change to being known at some point. Of course, a mutable container can do that, but |
Here's another one:
julia> isequal(missing, missing)
true
julia> Set([missing]) == Set([missing])
true
Personally, I actually think (set1 == set2) == (set1 ⊆ set2 && set2 ⊆ set1) == (length(set1) == length(set2) && set1 ⊆ set2) Since But a set is just a collection; I feel we should have been consistent and made arrays and dictionaries and tuples to use have a "reflexive" (I also think there is something here to preserve along the lines of "the interface of arrays should be independent to the interface of the elements", but I don't have good enough language to express that idea. Basically, it would seem more composable if container implementers didn't have to know that Basically, I think |
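The three formulations of set equality mentioned above can be checked side by side. This is only a sketch with made-up variable names; it relies on `Set` membership using `isequal` and hashing, which is why `missing` behaves like an ordinary element:

```julia
set1 = Set([1, missing])
set2 = Set([missing, 1])

by_eq      = set1 == set2
by_subsets = set1 ⊆ set2 && set2 ⊆ set1
by_length  = length(set1) == length(set2) && set1 ⊆ set2

# All three are plain Bools here, never `missing`.
all_agree = by_eq === by_subsets === by_length
```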
Yes, things would certainly be cleaner if containers didn't have to have a special case for |
I think What do we mean by "bitten" here? For the longest time I assumed I am frequently not sure what the purpose of Out of curiosity - if I really wanted " (Note: the other path to efficient recursive |
It's not in
julia> Hour(1) + Minute(10) == Minute(70)
true
julia> isequal(Hour(1) + Minute(10), Minute(70))
false
Reading #18485, IIUC, the motivation behind |
I just found this great write-up by @StefanKarpinski explaining (It would be nice if design documentation/explanations like this were in a predictable place like Julep... #33239) |
I think, either way, we should add exactly those contracted properties to the docs of the comparison operators. I suggest to tag each ordering system's operators with a keyword (like "hashing" and "intuitive/IEEE" for now) and to define the following properties on those different "universes":
Now, for the given categories we have required contracts and suggestions
✔️ required/contracted Further improvements & additions would be very welcome! Edit: How about adding a 3rd universe by splitting up intuitive & IEEE? That'd at least help to define the actions that may happen in @fastmath. |
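One of the properties such a table would record is reflexivity. A rough way to probe it for the three predicates is sketched below; `samples` and `is_reflexive` are hypothetical names, and the sample list is just a handful of awkward values, not an exhaustive test:

```julia
# Values chosen because they are known to be awkward for equality.
samples = Any[1, 1.0, NaN, -0.0, missing, "a", :a, [NaN], [missing]]

# A predicate counts as reflexive on the samples only if pred(x, x) is
# literally `true` for every x (so `missing` and `false` both fail).
is_reflexive(pred) = all(x -> pred(x, x) === true, samples)

refl_isequal = is_reflexive(isequal)  # true
refl_egal    = is_reflexive(===)      # true
refl_eq      = is_reflexive(==)       # false: NaN == NaN is false
```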
Thanks @tkf, that is very helpful. (If I were to critique Stefan's rationale I would say that jumbled-up ordering of One thing I would like to get out of this (as a library writer implementing things like sets and ordered collections, or algorithms for Can one safely assume that EDIT: Damn, I just remembered that unordered sets (and dictionaries) can't satisfy |
What about |
It's important to distinguish keys and values here. Sets don't contain their elements (and dicts their keys) in the same way that arrays do. For sets and dicts, the container determines what equality predicate to use.
Yes,
The motivation for this is memoization. A numerical function can easily give a very different answer for |
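A minimal sketch of the memoization argument, assuming a reciprocal as the numerical function (`memo_recip` and `cache` are made-up names): because `Dict` keys are compared with `isequal`, `0.0` and `-0.0` get separate cache entries, while `NaN` finds its own entry again.

```julia
# Hypothetical memoized function: caches 1/x keyed on x.
memo_recip(cache, x) = get!(() -> 1 / x, cache, x)

cache = Dict{Float64,Float64}()

a = memo_recip(cache, 0.0)    # Inf
b = memo_recip(cache, -0.0)   # -Inf, not the cached Inf
memo_recip(cache, NaN)
memo_recip(cache, NaN)        # second call hits the same entry

n_entries = length(cache)     # 3: keys 0.0, -0.0 and NaN
```

If keys were compared with `==` instead, `-0.0` would return the cached result for `0.0`, and every `NaN` lookup would miss.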
Thanks for the explanation. I feel this is much stronger motivation than
I think I understand this, and, because of this, I think it's important to document the behavior like
julia> Dict(1 => missing) == Dict(1 => missing)
missing
julia> Dict(missing => 1) == Dict(missing => 1)
true
and also
julia> Dict(0.0 => true, -0.0 => false)[0.0]
true
as I don't think it is immediately obvious that this is how |
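The key/value asymmetry can be spelled out in a few lines. The function below is a hypothetical helper written for illustration, not Base's actual implementation: keys are looked up through the dict (hash plus `isequal`), values are compared with `==` using three-valued logic.

```julia
# Hypothetical sketch of dict equality; not the code in Base.
function dict_eq_sketch(a::AbstractDict, b::AbstractDict)
    length(a) == length(b) || return false
    saw_missing = false
    for (k, v) in a
        haskey(b, k) || return false   # keys: isequal/hash lookup
        r = (v == b[k])                # values: ==, may yield missing
        if ismissing(r)
            saw_missing = true
        elseif !r
            return false
        end
    end
    return saw_missing ? missing : true
end

r_values = dict_eq_sketch(Dict(1 => missing), Dict(1 => missing))  # missing
r_keys   = dict_eq_sketch(Dict(missing => 1), Dict(missing => 1))  # true
r_diff   = dict_eq_sketch(Dict(1 => 2), Dict(1 => 3))              # false
```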
@JeffBezanson what specifically didn't you like about adding a tabular overview of the current comparison operators/systems as well as contracting them in how they work for given circumstances? (For knowing what to do when implementing my own variant as well as knowing what I can rely on)
I'll add that to the table. |
Cool, I submitted a PR (#34798) to document that. @JeffBezanson Regarding this PR - I for one would prefer array equality to strictly depend on the elements (and |
I feel like those suggestions are a good idea for the new Dictionaries.jl package. That |
I don't have any issues with tables, but that one seems to me to confuse much more than it clarifies. I don't really understand what it's trying to convey. It seems to make the situation seem much more ad hoc and confusing than it actually is. Perhaps some table would be helpful. |
Yes I think we should add that. |
Well, after reading all those function docs again I feel like I haven't been on the newest version regarding their documentation. I guess the only thing I'm missing is a paragraph about "Comparables" in the Interfaces section of the docs. That'd guide you on which functions to overload in a similar manner to how iteration is explained. Btw why do we fall back from Well, and I find the docs of
That sounds contradictory. The second sentence should rather clarify that outside of the common domain it may not be defined. Like:
We also should note |
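To make the "which functions to overload" question concrete, here is a minimal sketch for a made-up type. `Money` and its `cents` field are hypothetical and chosen only for illustration; the sketch leans on the generic fallbacks where `isequal` calls `==` and `<` calls `isless`.

```julia
# Hypothetical value type used only to illustrate the overloads.
struct Money
    cents::Int
end

# Equality, plus a hash that agrees with it (needed for Dict/Set).
Base.:(==)(a::Money, b::Money) = a.cents == b.cents
Base.hash(m::Money, h::UInt) = hash(m.cents, hash(:Money, h))

# Ordering: defining isless is enough for <, sort, etc.
Base.isless(a::Money, b::Money) = isless(a.cents, b.cents)

same_eq   = Money(150) == Money(150)
same_iseq = isequal(Money(150), Money(150))  # falls back to ==
smaller   = Money(1) < Money(2)              # falls back to isless
sorted    = sort([Money(3), Money(1)])
in_set    = Money(1) in Set([Money(1)])
```

A type with floating-point fields would likely also want its own `isequal`, so that `NaN` and signed zeros follow the hashing rules rather than the IEEE ones.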
Agree, that does seem contradictory. Clarification would be good. I believe @JeffBezanson wrote that in the first place. Can you clarify what you meant? Perhaps it's that given
Whether two values are comparable or not depends only on their types, not specific values. |
Yes that's basically what it means. It's just to point out that while |
Could we put a stable total order on (at least concrete) |
closes #34744 use `isequal` to compare keys in `ImmutableDict`
If the philosophy behind the compatibility between hashing and sorting is to support consistent behavior of hash-based and comparison-based containers, I think making |
I don't think that follows; having an equivalence relation is weaker than having a total order. |
(I opened #34815 for discussing totality of |
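As a reference point for the ordering side of the discussion, this sketch shows how `isless` (and therefore `sort`) already orders the values that `==` struggles with. The variable names are illustrative only.

```julia
xs = [3.0, NaN, -0.0, missing, 0.0, 1.0]

# sort uses isless: -0.0 comes before 0.0, NaN after all other floats,
# and missing after everything.
sorted = sort(xs)

expected = [-0.0, 0.0, 1.0, 3.0, NaN, missing]
matches  = isequal(sorted, expected)   # compare with isequal, not ==
```

Note that the check has to use `isequal`; `sorted == expected` would not return `true` because of the `NaN` and `missing` entries.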
This tries to fix the following inconsistency:
The docstring for == says
Which would suggest d == d should be missing. But another point of view would suggest that v == v should be true, as well expressed by @oxinabox:
This PR makes (v == v) == true. But this raises the question for some immutable types: what should (missing,) == (missing,) be? We don't know if both tuples have been created from the same source or not...
cc. @nalimilan
(apologies if this was already discussed to death and consciously decided upon the current behavior)
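For reference, a sketch of what the three predicates return for the tuple case in released Julia 1.x (variable names are illustrative). Since tuples are immutable, two separately written `(missing,)` values are indistinguishable under `===`, which is what makes the "same source" question hard to answer:

```julia
t = (missing,)

r_eq      = t == (missing,)          # missing
r_isequal = isequal(t, (missing,))   # true
r_egal    = t === (missing,)         # true: no identity to tell them apart
```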