Equality semantics for `-0` and `NaN` (#65)
Comments
Thanks for this clean write-up! I support @bakkot's suggestion.
I should mention that there is one other case for which equality is non-obvious: …

One more case to consider: what should `(new Set([#[+0]])).has(#[-0])` evaluate to? My inclination is … (For context, this is non-obvious because ….) I guess I should mention another possible solution, which is to say that …
Tuples' equality should be defined as the equality of their contents. So … Well, at least that was my first thought, but I see some implications of this line of reasoning. Mainly, it is easy to check whether something is NaN:

```js
// worse
if (isNaN(tuple[0]) && isNaN(tuple[1]) && isNaN(tuple[2])) { }
// better
if (tuple === #[NaN, NaN, NaN]) { }
```

Considering that one of the aims of this proposal is to improve the results of equality operations, I consider @bakkot's approach the best.
How should we decide on this question? During the discussion following the October 2019 TC39 presentation about this proposal, we heard people arguing both sides of this debate. I doubt that this is the kind of thing we'd get useful data about by implementing the various alternatives, though, as neither choice of semantics here is all that observable in typical code.
Based on the discussion above (specifically the argument that we should try not to extend the unusual semantics that -0/NaN have to more types), the champion group prefers @bakkot's suggested semantics, i.e.:

```js
assert(#[-0] !== #[+0]);
assert(#[NaN] === #[NaN]);
```

I'll likely soon update the explainer with more examples to explain this, and link back to this discussion.
I'm neutral on the suggested semantics for …. With the … (This isn't a problem for the ….) I would advocate these assertions:

```js
#[+0] == #[-0]
#[+0] === #[-0]
!Object.is(#[+0], #[-0])
#[NaN] != #[NaN]
#[NaN] !== #[NaN]
Object.is(#[NaN], #[NaN])
```

I'd also be happy with … (I've edited this comment because it originally advocated for a more conservative change than what I'd actually prefer, but I think it'd be better to have it accurately reflect my beliefs - please keep in mind that some emoji reacts might be for the earlier version that advocated for ….)
As an example, what if someone uses a record/tuple as a coordinate?

```js
const coord = #{x: 0, y: 3};
```

And then they decide to move it around a bit:

```js
const coord2 = #{x: coord.x * -4, y: coord.y - 3};
const isAtOrigin = coord2 === #{x: 0, y: 0};
```
@Zarel I think you have good points. If we're trying to reduce the chance of bugs by programmers who don't know the details of the language, what you suggest is probably the best indeed. Also, programmers who are aware of such details will either: …

So to be honest I think the decision is basically irrelevant for those of us who know these details... So why not benefit the unaware ones? :)
Where a value is stored shouldn't change how its identity is understood. If we are comparing records and tuples based on their contents, we should use the existing identity rules the language has for their contents. Also, as long as we're here: IEEE 754 defines -0 as equal to 0, and NaN as not equal to NaN, not for lulz, but because you can end up in situations like the one @Zarel points out, and NaN very purposely has no identity (NaN/NaN shouldn't be 1). These are not quirks of JavaScript but important invariants of how our chosen floating-point number system works.
JavaScript has many different identity rules, and we'd have to pick one (or rather, one for each relevant situation). The decision in this thread was to pick …
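For readers keeping track, the four equality algorithms referred to in this thread can be illustrated with plain Numbers; `-0` and `NaN` are exactly where they disagree (a sketch using only built-in operations):

```js
// Abstract (==) and Strict (===) Equality: -0 equals +0, and NaN equals nothing.
const looseZero = (-0 == 0);     // true
const strictZero = (-0 === 0);   // true
const strictNaN = (NaN === NaN); // false

// SameValue (Object.is): distinguishes -0 from +0, equates NaN with itself.
const sameValuePair = Object.is(-0, 0);   // false
const sameValueNaN = Object.is(NaN, NaN); // true

// SameValueZero (used by Set, Map, and Array.prototype.includes):
// like SameValue, except -0 and +0 are considered equal.
const svzZero = new Set([-0]).has(0); // true
const svzNaN = [NaN].includes(NaN);   // true
```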
@bakkot If I do …

This is reasonably compelling to me. I guess I would be OK with the …
My first choice is …. My second is …. (Which I think reflects the understanding of the entire rest of the spec that treating ….)

@papb says "So to be honest I think the decision is basically irrelevant for us who know these details", but I don't think this is true – I know all the details, and I still would probably get tripped up by a …
I see in the updated slides that -0 and +0 are still being considered not equal, and I want to reiterate how incorrect that is. When a negative number underflows to 0 it has to keep its sign, or further values extrapolated from that number will have the incorrect sign (this is also why normalizing the value to +0 is not correct). Due to this property, IEEE 754 punts sign normalization to equality, as @Zarel helpfully demonstrated above.

As for NaN, making it equal to itself doesn't break any calculation you might be doing (although it is against IEEE 754 to do so, and some people will argue that preventing that equality can halt forward progress, as NaN intends), but as I mentioned above, I think breaking programmers' expectations about recursive equality is far more harmful than the benefit of being able to compare NaN (I was unable to find a single person who thought that non-recursive equality was a good idea, and many were surprised that such a question would even need to be asked).
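The underflow argument above can be checked directly in today's JS (a small illustration; the specific constant is arbitrary):

```js
// A tiny negative value that underflows to zero keeps its sign as -0,
// and values extrapolated from it keep the correct sign.
const underflowed = -1e-323 / 1e10;          // too small to represent: becomes -0
const keptSign = Object.is(underflowed, -0); // true
const extrapolated = 1 / underflowed;        // -Infinity, not +Infinity

// IEEE 754 then "punts sign normalization to equality": -0 compares equal to +0.
const comparesEqual = (underflowed === 0);   // true
```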
Just a random person here, but I would really, really prefer that IEEE 754 numbers act like IEEE 754 numbers, and not like what "makes sense". NaN != NaN is a pain in the ass for everyone, but it is there for good reasons. NaN is not a value; it's a catch-all for "can't do this". It's SUPPOSED to be a pain in the ass, because it's a signal that your code screwed up somewhere. NaN is not a number; it's not really even intended to be a value; it's an error. If NaN == NaN, then you're saying 0/0 == inf/0, which doesn't seem helpful at all. You might as well assert that two uninitialized values in C have to compare equal.

Second, your computer's hardware isn't going to like you trying to tell it that NaNs are equal, and there are different encodings of NaN, so it turns every floating-point comparison into multiple ones. Please don't second-guess a standard "because it seems to make sense to me", especially when it's trivial to find out the reasons these things are the way they are. I'm all for trying to find a better approximation of real numbers than IEEE 754, for interesting values of "better", but when every computer built in the last 40 years has worked a particular way, I'd like people to please think more than twice before saying "let's just randomly change the rules in this particular use case".
Cc @erights
I tested and this is correct. But I find it extremely surprising. But good! Where in the spec is this normalized? Given this strange, surprising, and pleasant fact, I am leaning towards normalizing -0 to 0 with records and tuples and then adopting …

But first I want to understand how the spec already normalizes these for Sets and Maps. Thanks.
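The normalization being asked about here can be observed directly (the relevant spec steps are in `Set.prototype.add` and `Map.prototype.set`, which convert `-0` to `+0` before storing):

```js
// Set.prototype.add converts -0 to +0 before storing; lookups use SameValueZero.
const s = new Set();
s.add(-0);
const stored = [...s][0];
const storedIsPlusZero = Object.is(stored, 0); // true: -0 was converted on insertion
const foundWithPlusZero = s.has(0);            // true: SameValueZero lookup

// Map.prototype.set performs the same conversion on keys.
const m = new Map([[-0, 'x']]);
const mapHit = m.get(0);                       // 'x'
```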
Btw, the Agoric distributed object system used to carefully preserve the difference between 0 and -0 in its serialization format. We defined our distributed equality semantics for passable values to bottom out in …. We changed this to let JSON always normalize -0 to 0. Our distributed equality semantics now bottom out in SameValueZero, which works well with that normalization.
The whole NaN !== NaN thing, to me, is a category error: confusing thinking within the system of arithmetic that these values are about with thinking about the role these values play as distinguishable first-class values in the programming language. In E there is a distinct arithmetic equality comparison operator that is a peer to …
If the memo is built naively on Sets and Maps, it will work correctly on NaN but memoize incorrectly on -0. Some other unfortunate anomalies:

```js
['a', NaN].includes(NaN); // true, good
['a', NaN].indexOf(NaN); // -1, crazy
(_ => {
  switch (NaN) {
    case NaN: return 'x';
    default: return 'y';
  }
})(); // y, insane. Would anyone expect that?
```
SameValueZero is also surprising, but less so:

```js
['a', -0].includes(0); // true
['a', -0, 0].indexOf(0); // 1
(_ => {
  switch (-0) {
    case 0: return 'x';
    case -0: return 'z';
    default: return 'y';
  }
})(); // x
```
In …
Good, thanks. I am in favor of always normalizing -0 to 0 in records and tuples. Such immutable containers would never contain a -0. We'd then compare using …
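A hypothetical sketch of that normalize-then-compare scheme, using frozen arrays to stand in for tuples (`makeTuple` and `tupleEquals` are invented names; real Records & Tuples would do this at the language level):

```js
// Normalize -0 to +0 at construction time, then compare element-wise with
// SameValue (Object.is). Since no stored element can ever be -0, SameValue
// and SameValueZero agree, and NaN still compares equal to itself.
const makeTuple = (...elements) =>
  Object.freeze(elements.map(v => (Object.is(v, -0) ? 0 : v)));
const tupleEquals = (a, b) =>
  a.length === b.length && a.every((v, i) => Object.is(v, b[i]));

const zerosEqual = tupleEquals(makeTuple(-0), makeTuple(0));   // true: -0 normalized away
const nansEqual = tupleEquals(makeTuple(NaN), makeTuple(NaN)); // true: Object.is equates NaN
```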
Did you all just skip my comment or something?
@devsnek Your comment mostly just says that you are in favor of recursive equality, meaning presumably that each of the four equality algorithms in JS would be extended so that invoking them on tuples would invoke them recursively on their contents. This would mean that, instead of there being three values for which …
@bakkot i mean the part about zeros... Sets/Maps are a weird category because they deduplicate their keys. Most set/map implementations in other languages either don't specialize for 0 and use whichever one is inserted first (like C++) or provide no default hashing implementation for doubles (like Rust), but JS chose to normalize it to +0. I don't think you can really draw any useful conclusion about "how to store a property" from that. Aside from Maps/Sets, it has been pointed out multiple times that normalizing or doing weird equality things to -0 is mathematically incorrect with regard to how IEEE 754 works. This crusade against having a functioning implementation of numbers needs to stop.
As @acutmore alluded to, arguably the important thing is the notion of equality used in other operations such as map indexing, which in Java is:

```java
Set.of(Double.NaN).contains(Double.NaN) // true
Objects.equals(-0.0, 0.0) // false
Set.of(-0.0).contains(0.0) // false
```

and in JavaScript is:

```js
new Set([NaN]).has(NaN) // true
-0.0 === 0.0 // true
new Set([-0.0]).has(0.0) // true
```

This is as opposed to e.g. Haskell, which actually does maintain IEEE 754 semantics in collections:

```haskell
let nan = 0.0/0.0
Set.member nan (Set.fromList [nan]) -- False
Set.member (-0.0) (Set.fromList [0.0]) -- True
Set.fromList [nan, nan] -- fromList [NaN,NaN]
Set.fromList [0.0, -0.0] -- fromList [-0.0]
Set.fromList [-0.0, 0.0] -- fromList [0.0]
```

I actually prefer the Haskell semantics for consistency, even though it does have some potentially surprising cases above, though …
@Maxdamantus Java and JS use different names, but JS … And JS … do not use ….

Programming languages differ in their existence checks; that may be ok, because they could have different definitions of "key" and "existence", but I think programmers would like to have a consistent …
JavaScript's …

In JavaScript, …
Sorry, I messed that example up. I guess the point might be that there's already an inconsistency in JS, so it's not clear that one way is better than the other, which is probably why I don't feel strongly either way. As I was saying, personally I think it would have been cleaner if they had made …
@Maxdamantus Java …

The "inconsistency" we are talking about is …. JS actually has some basic "consistency", namely using SameValueZero for existence checks, so I would like us to keep that "consistency" of "inconsistency", not introduce another level of inconsistency to …
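The "consistency" being referred to can be seen in today's existence-check APIs, which all use SameValueZero, while `indexOf` alone uses Strict Equality (an illustration with plain Numbers):

```js
// Existence checks agree: includes, Set.has, and Map.has all use SameValueZero.
const arrHasNaN = [NaN].includes(NaN);          // true
const setHasNaN = new Set([NaN]).has(NaN);      // true
const mapHasNaN = new Map([[NaN, 1]]).has(NaN); // true

// indexOf predates SameValueZero and uses Strict Equality, so it disagrees.
const idxOfNaN = [NaN].indexOf(NaN);            // -1
```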
Another question: …
@hax https://tc39.es/ecma262/#sec-set.prototype.add
@rickbutton I think you missed some parentheses: the second case is …
Good point, ignore me. |
I was wondering: was the option of raising an error upon an attempt to put `NaN` into a Record or Tuple ever considered? I see that @rricard's comment says:

…

...but I was hoping for more details. Why does "being a primitive" imply "being able to contain any other primitive"? As I see it, Record & Tuple are the first kind of primitives that contain multiple values, so no precedent seems to exist for this...
I'd put it another way: …
I was talking specifically about …. By specifying that Records and Tuples with …

Whichever decision is made for …
Hi @papb, thanks for the idea. I think you're right that rejecting NaN may not have been considered before. I do think that @rricard's comment you quoted is a strong design goal for the proposal: "[R&T] should be able to contain any other primitive value". While they do reject object values, that is to help catch accidental introduction of mutability or identity or both. Rejecting …
While that may be true, there is already precedent in the language for how equality of …
Normally, the objection "but that would be weird" (however phrased) is properly taken to be a strong objection, because it indicates the feature would violate the principle of least surprise. For -0 and NaN within Records and Tuples, all of the choices anyone has yet invented are weird in this sense. The objection is not strong if it also applies to all alternatives.
@erights while i agree with that, rejecting NaN from a container is much, much weirder than giving a surprising equality answer.
Hi @ljharb, I do not disagree. Nevertheless, I felt it worth making the meta point. Thanks.
Both are weird, but rejecting them at construction time has the advantage of making the surprising behavior obvious, while having unusual equality semantics can lead to subtle bugs.
@pygy that's a good point. The principle of least surprise must be understood in terms of the dangerous consequences of a surprise: …

On these grounds, a dynamic early rejection of placing problematic values into R&T is clearly less dangerous than having weird equality semantics, where the surprise case still returns a boolean rather than throwing. Despite this, both @ljharb and I agree above to take the hit of the weird equality semantics (silent divergence) rather than the dynamic early weird-value rejection (reliable throw). It is still an overall tradeoff, but I appreciate the logic of this counter-argument. We should still take it seriously, even if we feel that other factors overrule it in this case.
@erights That's a superb way of putting it! Thank you very much!

Can you please clarify why?

What factors? Note: I know that there are already multiple arguments within this long issue, but... To me, everything you said in the last comment is a very strong argument in favor of opting for the early throw... I would love to see how exactly you're still capable of disagreeing with such a strong argument (written by yourself 😅)
@papb i can’t speak for anyone else, but i can speak for myself: as a JS developer, i would be truly shocked and perplexed if creating a tuple or record with `NaN` in it threw:

```js
const savedItemsPerRow = parseInt(localStorage.getItem('itemsPerRow'), 10);
updateStore('userPreferences', #{ itemsPerRow: savedItemsPerRow });
// elsewhere (getItemsPerRow gets userPreferences from the store, then itemsPerRow from that record):
const itemsPerRow = getItemsPerRow(store) || 3;
```

if records allow …
The surprises for …

How surprising these are depends on how acquainted coders are with those intricacies. My guess would be that the median dev is not at all familiar with any of this (e.g. I just discovered that the default JSON stringifier culls the minus sign of -0).
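The JSON observation can be verified directly: `JSON.stringify` drops the sign of `-0`, while `JSON.parse` preserves it, so a round-trip normalizes `-0` to `+0`:

```js
const stringified = JSON.stringify(-0);       // "0": the minus sign is culled
const parsed = JSON.parse('-0');
const parsePreserves = Object.is(parsed, -0); // true: parse keeps the sign
const roundTripped = JSON.parse(JSON.stringify(-0));
const normalized = Object.is(roundTripped, 0); // true: the round-trip yields +0
```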
There are many contexts in which you want to store data without any need for comparison semantics — in common programming-language value categories (example: C++ concepts), storable categories are a large superset of comparable categories. Disallowing NaNs would be far more damaging than treating records containing them as equal. It would gratuitously break the use case of using records to temporarily store some Numbers and retrieve them later without any interpretation of what Numbers they contain.
I wish we could generalize that reasoning not to stop at Numbers, but go all the way to any Values. Unfortunately, that's broken because we can't store and retrieve uninterpreted objects. 😞
Yeah, I think I'm convinced that throwing on … Maybe …

Something else that crossed my mind: it would also become a hassle for people to refactor existing code that uses objects and arrays into records and tuples. The special … I think it might even be possible for TypeScript projects to have automated refactoring to records and tuples, by statically analyzing whether or not each array/object has to be mutable. If …
Is this intentional, or has this simply not been addressed yet? I'd expect that …

Currently specified behaviour according to the comment:

```js
const a = -10;
const set = new Set();
set.add(#[a*0, a*0]);
set.has(#[0, 0]); // false, because it has `#[-0, -0]` instead; EDIT: nevermind; it will be true
```

In general, … Edit: nevermind. It already uses …
@Maxdamantus Map and Set will continue to apply SameValueZero on Records & Tuples, with no conversion on insertion.

```js
const s = new Set();
s.add(#[-0]).add(#[0]);
s.size; // 1
Object.is(-0, [...s].at(0).at(0)); // true
```
@papb re. typing: IEEE floats are effectively a … TS could keep track of the operations that can return …
What should each of the following evaluate to?

…

(For context, this is non-obvious because `+0 === -0` is `true`, `Object.is(+0, -0)` is `false`, `NaN === NaN` is `false`, and `Object.is(NaN, NaN)` is `true`.)

Personally I lean towards the `-0` cases all being `false` and the `NaN` cases all being `true`, so that the unusual equality semantics of `-0` and `NaN` do not propagate to the new kinds of objects being introduced by this proposal.