Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Floating point hash bug #37800

Closed
jmkuhn opened this issue Sep 29, 2020 · 6 comments · Fixed by #38031
Closed

Floating point hash bug #37800

jmkuhn opened this issue Sep 29, 2020 · 6 comments · Fixed by #38031
Labels
bug Indicates an unexpected problem or unintended behavior hashing
Milestone

Comments

@jmkuhn
Copy link
Contributor

jmkuhn commented Sep 29, 2020

using InteractiveUtils
using Printf

function myhash(x)
    # the following is valid for x == Inf
    h = UInt(0)
    num, pow, den = Base.decompose(x)
    den == 0 && return hash(ifelse(num > 0, Inf, -Inf), h)
    return nothing
end

versioninfo()
h = hash(Inf)
@printf("%#x\n", h)
h = myhash(Inf)
@printf("%#x\n", h)

With LLVM 9 I get the expected output:

Julia Version 1.6.0-DEV.1014
Commit 2a36f83ce9 (2020-09-22 20:52 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.7.0)
  CPU: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)
0x807bb202c9cbfc6f
0x807bb202c9cbfc6f

With LLVM 10 I get:

Julia Version 1.6.0-DEV.1015
Commit 9adcdd42a0 (2020-09-23 03:04 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.7.0)
  CPU: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-10.0.1 (ORCJIT, skylake)
0x807bb202c9cbfc6f
0x15d7d083d04ecb90

Edit:
0x15d7d083d04ecb90 is the correct hash for NaN

@yuyichao
Copy link
Contributor

yuyichao commented Sep 29, 2020

julia> function myhash(b)
           return hash(ifelse(b, Inf, -Inf), UInt(0))
       end
myhash (generic function with 1 method)

julia> myhash(false)
0x15d7d083d04ecb90

julia> myhash(true)
0x15d7d083d04ecb90

julia> hash(-Inf, UInt(0))
0x96e782fc7639e1bc

julia> hash(Inf, UInt(0))
0x807bb202c9cbfc6f

@yuyichao
Copy link
Contributor

Seems as if it evaluated Inf == Inf to false and used the NaN hash value instead...

@yuyichao
Copy link
Contributor

And it seems like a julia bug. The implementation of hash relies on fptoui of the floating point value which produces a poison value if the fp value couldn't fit. It is therefore perfectly legal for LLVM to produce any result for hash(Inf). The implementation was previously relying on fptoui to return a fixed value which is wrong. In fact, it doesn't even return the same result on, say AArch64.

@yuyichao yuyichao added bug Indicates an unexpected problem or unintended behavior hashing labels Sep 29, 2020
@yuyichao yuyichao changed the title LLVM 10 codegen bug Floating point hash bug Sep 29, 2020
@yuyichao
Copy link
Contributor

Ref 3696968 which was an incomplete fix with an incorrect explaination.

@KristofferC
Copy link
Member

Did some archeology and found 3696968. Kinda funny how this comes back 6 years later.

@KristofferC
Copy link
Member

Sniped...

@KristofferC KristofferC added this to the 1.6 features milestone Sep 29, 2020
vtjnash added a commit that referenced this issue Oct 14, 2020
In LLVM (inherited from C), fptosi has undefined behavior if the result
does not fit the integer size after rounding down. But by using the same
strategy as generic hashing of Real values, we actually can end up with
a sitatuion that is faster for the CPU to deal with and avoids the UB.

Refs #6624 (3696968)
Fixes #37800
vtjnash added a commit that referenced this issue Oct 15, 2020
In LLVM (inherited from C), fptosi has undefined behavior if the result
does not fit the integer size after rounding down. But by using the same
strategy as generic hashing of Real values, we actually can end up with
a sitatuion that is faster for the CPU to deal with and avoids the UB.

Refs #6624 (3696968)
Fixes #37800
vtjnash added a commit that referenced this issue Oct 26, 2020
In LLVM (inherited from C), fptosi has undefined behavior if the result
does not fit the integer size after rounding down. But by using the same
strategy as generic hashing of Real values, we actually can end up with
a sitatuion that is faster for the CPU to deal with and avoids the UB.

Refs #6624 (3696968)
Fixes #37800
vtjnash added a commit that referenced this issue Oct 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates an unexpected problem or unintended behavior hashing
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants