hash128 not matching python #8

Closed
maneeshsahu opened this issue Jan 10, 2020 · 1 comment

@maneeshsahu

Hi @karanlyons
I can't get this library's hash128 output to match Python's mmh3.

Python mmh3:
hex(mmh3.hash128("I will not buy this tobacconist's, it is scratched."))

Yields: 0x67d73523f0079673d30654abbd8227e3

But in your readme:
murmurHash3.x64.hash128("I will not buy this tobacconist's, it is scratched.");

Yields: d30654abbd8227e367d73523f0079673

Why is there a mismatch?

@karanlyons
Owner

karanlyons commented Jan 10, 2020

$ g++ ./mmh3_reference_repl.cpp; and ./a.out
> 
00000000000000000000000000000000
> I will not buy this tobacconist's, it is scratched.
d30654abbd8227e367d73523f0079673

There is a mismatch because the mmh3 library for Python is incorrect. Specifically, it swaps the order of the two final uint64_ts relative to the reference implementation.

I could not tell you why it does this, but if you can’t find a Python implementation or binding that matches the reference implementation, you can simply swap the two 64-bit halves of the result:

>>> o = mmh3.hash128("I will not buy this tobacconist's, it is scratched.")
>>> hex(((o & 0xffffffffffffffff) << 64) + (o >> 64))
'0xd30654abbd8227e367d73523f0079673'
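
If you need this in more than one place, the same swap can be wrapped in a small helper. This is only a sketch: `swap_halves_128` and `to_hex128` are hypothetical names, not part of mmh3 or this library, and the input below is the value reported for `mmh3.hash128` earlier in this thread rather than a live call. Formatting with a fixed width of 32 digits also keeps leading zeros, which `hex()` drops (relevant for hashes that start with `0`).

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # selects the low 64 bits of a 128-bit value


def swap_halves_128(value: int) -> int:
    """Exchange the upper and lower 64-bit halves of an unsigned 128-bit integer."""
    if not 0 <= value < (1 << 128):
        raise ValueError("expected an unsigned 128-bit integer")
    return ((value & MASK64) << 64) | (value >> 64)


def to_hex128(value: int) -> str:
    """Render a 128-bit value as exactly 32 lowercase hex digits."""
    return format(value, "032x")


# Value reported for Python's mmh3.hash128 earlier in this thread
# (hard-coded here so the sketch does not depend on mmh3 being installed).
python_result = 0x67d73523f0079673d30654abbd8227e3

print(to_hex128(swap_halves_128(python_result)))
# prints d30654abbd8227e367d73523f0079673
```

Applying the swap twice returns the original value, so the same helper converts in either direction.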

Taking the (unwitting) test vectors from aappleby/smhasher#73 (comment):

 N |                            Bytes | MM3-128 (x64) Reference          | murmurHash3.x64.hash128
---|----------------------------------|----------------------------------|---------------------------------
 1 |                               00 | 4610abe56eff5cb551622daa78f83583 | 4610abe56eff5cb551622daa78f83583
 2 |                             0000 | 3044b81a706c5de818f96bcc37e8a35b | 3044b81a706c5de818f96bcc37e8a35b
 3 |                           000000 | 79d54dd1bf7137480af5e7f1b766291d | 79d54dd1bf7137480af5e7f1b766291d
 4 |                         00000000 | cfa0f7ddd84c76bc589623161cf526f1 | cfa0f7ddd84c76bc589623161cf526f1
 5 |                       0000000000 | 3df460ff3e17b53a17874fba56e69767 | 3df460ff3e17b53a17874fba56e69767
 6 |                     000000000000 | 7d480f9fa80ec469719af4070b74d89d | 7d480f9fa80ec469719af4070b74d89d
 7 |                   00000000000000 | f402c55ac5dec98f2de586f681711c02 | f402c55ac5dec98f2de586f681711c02
 8 |                 0000000000000000 | 28df63b7cc57c3cbf2557dfcc4e8fe52 | 28df63b7cc57c3cbf2557dfcc4e8fe52
 9 |               000000000000000000 | 73269217e5476f20f1fa3fc86728ca0c | 73269217e5476f20f1fa3fc86728ca0c
10 |             00000000000000000000 | 5b3d684f8c57ce161ba63bef94931146 | 5b3d684f8c57ce161ba63bef94931146
11 |           0000000000000000000000 | 056e0d6c8921404673c2da0104c39955 | 056e0d6c8921404673c2da0104c39955
12 |         000000000000000000000000 | a4d8ece9d7c0dfe3803bbf8eb6f0853f | a4d8ece9d7c0dfe3803bbf8eb6f0853f
13 |       00000000000000000000000000 | a10ea8b22762995abb1575409cfb7dc6 | a10ea8b22762995abb1575409cfb7dc6
14 |     0000000000000000000000000000 | 028b7708fcbbed1e8393f0698afe46ea | 028b7708fcbbed1e8393f0698afe46ea
15 |   000000000000000000000000000000 | 6ce113b115a56871195953c2230f8db2 | 6ce113b115a56871195953c2230f8db2
16 | 00000000000000000000000000000000 | 4bbd1bf27da918d6b465a9eccd791cb6 | 4bbd1bf27da918d6b465a9eccd791cb6

 N |                            Bytes | MM3-128 (x86) Reference          | murmurHash3.x86.hash128
---|----------------------------------|----------------------------------|---------------------------------
 1 |                               00 | 88c4adec54d201b954d201b954d201b9 | 88c4adec54d201b954d201b954d201b9
 2 |                             0000 | 04a872bbedcd774bedcd774bedcd774b | 04a872bbedcd774bedcd774bedcd774b
 3 |                           000000 | e0d93642acf40e87acf40e87acf40e87 | e0d93642acf40e87acf40e87acf40e87
 4 |                         00000000 | cc066f1f9e5178409e5178409e517840 | cc066f1f9e5178409e5178409e517840
 5 |                       0000000000 | 50a68ecfd01a6609d01a6609d01a6609 | 50a68ecfd01a6609d01a6609d01a6609
 6 |                     000000000000 | 777fa95660bde92360bde92360bde923 | 777fa95660bde92360bde92360bde923
 7 |                   00000000000000 | 0d45d85efb848988fb848988fb848988 | 0d45d85efb848988fb848988fb848988
 8 |                 0000000000000000 | e028ae414772b0844772b0844772b084 | e028ae414772b0844772b0844772b084
 9 |               000000000000000000 | 5ad58a7e543371085433710854337108 | 5ad58a7e543371085433710854337108
10 |             00000000000000000000 | 64010da262e8bc1762e8bc1762e8bc17 | 64010da262e8bc1762e8bc1762e8bc17
11 |           0000000000000000000000 | 2f35ebd169f8166569f8166569f81665 | 2f35ebd169f8166569f8166569f81665
12 |         000000000000000000000000 | 332d18d156b5986456b5986456b59864 | 332d18d156b5986456b5986456b59864
13 |       00000000000000000000000000 | 583cbe60ca53c80fca53c80fca53c80f | 583cbe60ca53c80fca53c80fca53c80f
14 |     0000000000000000000000000000 | a8e046b5855ca909855ca909855ca909 | a8e046b5855ca909855ca909855ca909
15 |   000000000000000000000000000000 | 3553d0af909796639097966390979663 | 3553d0af909796639097966390979663
16 | 00000000000000000000000000000000 | 5a4075d66b2d3d27d3926c2feb228a07 | 5a4075d66b2d3d27d3926c2feb228a07
