RFC: new approach to efficiently hashing 1, 1.0, big(1), the same. #6624
Conversation
Oh, I should also note that this is almost certainly really broken on 32-bit systems. I just didn't worry about that while developing this, but should be fixable with a little attention.
:100: Oh boy have I been waiting for this to land. This is a really nice piece of work. Questions about comparing collections should not get in the way of this, since those issues exist anyway. Now we have
```julia
## hashing rational values ##

#=
`decompose(x)`: non-canonical decomposition of rational values as `den*2^pow/num`.
```
Should this say `num*2^pow/den`?
Uh, yes. Duh. Fortunately, only the documentation is wrong :-)
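For illustration, here is a minimal sketch of the decomposition contract under discussion (these method definitions are hypothetical stand-ins, not the PR's actual implementation): `decompose(x)` returns integers `(num, pow, den)` such that `x == num*2^pow/den`.

```julia
# Hypothetical sketch of the decomposition contract: decompose(x) returns
# integers (num, pow, den) such that x == num * 2^pow / den.
decompose(x::Rational{Int}) = (numerator(x), 0, denominator(x))

function decompose(x::Float64)       # finite x only in this sketch
    s, e = frexp(x)                  # x == s * 2^e with |s| in [0.5, 1)
    (Int64(s * 2^53), e - 53, 1)     # scale the significand to an integer
end
```

Under this contract, mathematically equal values of different types decompose to comparable integer triples, which is what lets them hash the same.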
Would it suffice to have a single definition?
Good point. That probably saves typing overall.
So does this mean
would work?
Yes, @IainNZ, that's exactly what it means – but I have to fix
The major change is that numeric values that are mathematically equal hash the same. This is tricky to do in a way that allows the hashing of `Int64`, `Float64`, `Uint64` and such common numeric types to all be fast – nearly as fast as just applying the core hash to them as raw bits.

Although tests pass, this is an inherently half-baked state since `isequal` and `hash` are now badly out of sync. Many decisions still need to be made about how to hash collections: does the collection type matter or just the element type of the collection? Or neither?

This also does away with the `bitmix` function, instead favoring a Merkle-Damgård style of combining arbitrary amounts of data into a single, fixed-size hash value output. In a sense the hash function with two arguments replaces the `bitmix` function – you can give the result of hashing previous values as the second argument to `hash` and the result will depend on both in a difficult-to-predict way.
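The chaining scheme described above can be sketched as follows (the helper name `hash_all` is hypothetical; the combining step is the two-argument `hash(x, h)`):

```julia
# Merkle–Damgård-style chaining: feed the previous digest back in as the
# second argument of the two-argument hash.
function hash_all(xs, h::UInt = zero(UInt))
    for x in xs
        h = hash(x, h)   # result depends on x and on everything hashed before
    end
    return h
end
```

Each step's output becomes the next step's seed, so the final digest depends on every element and on their order, without any separate `bitmix` primitive.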
@JeffBezanson, this code could really benefit from specialization on default arguments since many performance-critical cases have the zero default second argument. How hard would that be to do?
I suppose it is really just inlining; if the function called by
but I doubt we want to do that in all cases.
In this case, I suspect we might really want to do it in all cases – the hash function is expected to be called with just one argument – the second argument is for Merkle-Damgård chaining.
@JeffBezanson, I can't for the life of me figure out why
This may be due to LLVM's Demonic Constant Folding™.
0 for a constant NaN, 0x8000000000000000 for non-constant. A good fix might be to move the
Go home, LLVM, you're drunk.
This fucking LLVM undefined bullshit completely trashes the performance of float hashing. Pardon the language, but this is just so infuriating. I had this down to almost matching integer hashing performance and this COMPLETELY FUCKING POINTLESS UNDEFINED VALUES CRAP IS SABOTAGING IT.
LLVM's fptosi intrinsic is undefined for NaN, so LLVM obnoxiously and pointlessly does different things when it gets NaN as a run-time value than as a compile-time value. To avoid this shitty, pointless trap, we have to avoid calling fptosi on NaN by introducing a branch into the hashing function – even though for hashing, we don't care *what* value is produced, just as long as it's consistent. Unfortunately, this affects the performance of Float64 hashing pretty badly. I was not able to figure out any way to recover this lost performance. LLVM really needs to stop doing this.
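The workaround described above can be sketched like this (the function name is a hypothetical stand-in; `unsafe_trunc` is what lowers to LLVM's `fptosi`):

```julia
# Guard NaN before the float-to-int truncation: fptosi (unsafe_trunc) is
# undefined for NaN, so return any fixed value for it -- for hashing, all
# that matters is that the result is consistent.
function int_key(x::Float64)
    isnan(x) && return typemin(Int64)   # arbitrary but consistent stand-in
    # Safe here: x is not NaN (out-of-range overflow is out of scope for
    # this sketch).
    unsafe_trunc(Int64, x)
end
```

The extra branch is exactly the cost being complained about: it exists only to keep NaN away from an intrinsic whose result we would otherwise be happy to accept as-is.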
Apparently the type assertion after the `decompose` call was sabotaging the ability to inline the call to `decompose` for `Rational{Int}`. Removing the type assert gives a 10x boost, bringing the speed of generic `hash(Rational)` to within 10x of `hash(Int)`.
But hey, this way LLVM lets you get the wrong answer really fast! You can save one whole instruction!
I don't think this is actually a performance optimization on their part – I think this is intentionally inconsistent behavior for your own good, so you don't rely on the undefined behaviors, gasp!
Time to write a LuaJIT bytecode backend? :p
No, it's to make it easier and faster to map languages like C that have undefined behavior. The problem is that the behavior is undefined in the first place. If you're going to have undefined behavior then ok, better make uses of it as obvious as possible. IMO the fix would be to give everything defined behavior by default, and use intrinsics or attributes of some kind to mark that e.g. a denominator can't be zero.
This actually provides less gain over the generic real hashing function than you would think, but it is slightly faster.
While messing around with generic equality checking based on the new `decompose` function introduced in the hashing work, I discovered that LLVM seems to be much better able to analyze expressions that use signbit when it's boolean and explicitly defined as `x < 0` for integer values. Since `true == 1` and `false == 0` this is a pretty benign change, although technically it is breaking. I've wanted to do this for a while and this seems like as good a time as any.
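The change can be sketched like this (using a standalone name to avoid clashing with `Base.signbit`):

```julia
# Boolean signbit for integers: since true == 1 and false == 0, code that
# relied on the old 0/1 integer result keeps working, while the return
# type is now Bool.
signbit_int(x::Integer) = x < 0
```

Arithmetic uses survive the change, e.g. `1 - 2*signbit_int(x)` still yields ±1, which is why the breakage is only technical.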
Defining isfinite in terms of decompose is simple. It seems good to insist that only floating-point reals can be NaN – NaN in hardware is simply an unavoidable reality. For any user-defined type that is not implemented in hardware, NaN should not exist since operations that would produce NaNs should raise immediate errors instead.
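One plausible shape for that definition, assuming (as an illustration, not the PR's actual convention) that non-finite values decompose with a zero denominator; the `decomp` helper here is a made-up stand-in so the sketch is self-contained:

```julia
# Assumed convention for this sketch: non-finite values (Inf, NaN) get a
# zero denominator in the (num, pow, den) decomposition.
decomp(x::Float64) =
    isnan(x) ? (0, 0, 0) :
    isinf(x) ? (ifelse(x > 0, 1, -1), 0, 0) :
    (Int64(frexp(x)[1] * 2^53), frexp(x)[2] - 53, 1)

# isfinite then falls out of the decomposition directly.
myisfinite(x::Float64) = decomp(x)[3] != 0
```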
I don't think stashing the new key on assignment is going to have a big performance impact. The biggest implication that I could think of was that it might be slightly worse for gc, but that's highly speculative.
This patch works around the error, but I suspect it is more of a bandaid:

```diff
diff --git a/src/gf.c b/src/gf.c
index 6d3a168..9fd0bae 100644
--- a/src/gf.c
+++ b/src/gf.c
@@ -1262,7 +1262,10 @@ static jl_tuple_t *arg_type_tuple(jl_value_t **args, size_t nar
     size_t i;
     for(i=0; i < nargs; i++) {
         jl_value_t *a;
-        if (jl_is_type(args[i])) {
+        if (args[i] == (jl_value_t*)jl_null) {
+            a = args[i];
+        }
+        else if (jl_is_type(args[i])) {
             a = (jl_value_t*)jl_wrap_Type(args[i]);
         }
         else if (!jl_is_tuple(args[i])) {
```
Ok – want to do the merge? There's an obvious conflict (you probably resolved it already), and then just apply your bandaid.
Conflicts: base/profile.jl
Ok, the bandaid turns out to cause problems. I'll keep working, and I'll merge when I have something decent.
I'm worried about
Maybe we can do something using
We don't actually need
I don't understand ---
Yes, but none of
I was thinking of
ex:
The sorted order is, unsurprisingly, not consistent with the partial order.
Yay!
In LLVM (inherited from C), fptosi has undefined behavior if the result does not fit the integer size after rounding down. But by using the same strategy as generic hashing of Real values, we actually can end up with a situation that is faster for the CPU to deal with and avoids the UB. Refs #6624 (3696968) Fixes #37800