| 1 | +# Hash Function Prospector |
| 2 | + |
| 3 | +This is a little tool for automated [integer hash function][wang] |
| 4 | +discovery. It generates billions of [integer hash functions][jenkins] at |
| 5 | +random from a selection of [nine reversible operations][rev]. The |
| 6 | +generated functions are JIT compiled and their avalanche behavior is |
| 7 | +evaluated. The current best function is printed out in C syntax. |
| 8 | + |
| 9 | +The *avalanche score* is the number of output bits that remain "fixed" |
| 10 | +on average when a single input bit is flipped. Lower scores are better. |
| 11 | +Ideally the score is 0, i.e. every output bit flips with a 50% chance |
| 12 | +when a single input bit is flipped. |
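
As a rough illustration of the idea, the sketch below estimates how far each (input bit, output bit) pair is from the ideal 50% flip rate. The metric, the names `avalanche_estimate`, `identity32`, and `rng32`, and the sampling scheme are all assumptions made for this example; this is not necessarily the formula prospector uses, so the numbers it produces should not be compared with the scores quoted in this README.

```c
#include <assert.h>
#include <stdint.h>

/* Hash under test: mosquito32, copied from the Discovered Functions section. */
uint32_t
mosquito32(uint32_t x)
{
    x = ~x;
    x ^= x >> 16;
    x *= UINT32_C(0xb03a22b3);
    x ^= x >> 10;
    return x;
}

/* Worst-case baseline: flipping input bit i flips only output bit i. */
uint32_t
identity32(uint32_t x)
{
    return x;
}

/* xorshift32 PRNG, used only to pick sample inputs (illustrative). */
static uint32_t
rng32(uint32_t *s)
{
    *s ^= *s << 13;
    *s ^= *s >> 17;
    *s ^= *s << 5;
    return *s;
}

/* Hypothetical metric: for each input bit i and output bit j, measure the
 * flip frequency p over n random inputs, then sum |2p - 1| over j and
 * average over i. The result is 0 when every output bit flips half the
 * time and 32 when every output bit behaves deterministically. */
double
avalanche_estimate(uint32_t (*hash)(uint32_t), long n)
{
    long flips[32][32] = {{0}};
    uint32_t s = 1; /* fixed nonzero seed so runs are repeatable */
    assert(n > 0);
    for (long k = 0; k < n; k++) {
        uint32_t x = rng32(&s);
        uint32_t h = hash(x);
        for (int i = 0; i < 32; i++) {
            /* Bits set in d are the output bits that changed. */
            uint32_t d = h ^ hash(x ^ (UINT32_C(1) << i));
            for (int j = 0; j < 32; j++)
                flips[i][j] += (d >> j) & 1;
        }
    }
    double total = 0.0;
    for (int i = 0; i < 32; i++) {
        for (int j = 0; j < 32; j++) {
            double bias = 2.0 * flips[i][j] / n - 1.0;
            total += bias < 0 ? -bias : bias;
        }
    }
    return total / 32.0;
}
```

Calling `avalanche_estimate(identity32, 1000)` returns exactly 32, since every output bit is fully predictable, while a well-mixed hash should land much closer to 0. With a finite sample count the estimate never reaches 0 exactly because of sampling noise.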
| 13 | + |
| 14 | +Prospector can generate both 32-bit and 64-bit integer hash functions. |
| 15 | +Check the usage (`-h`) for the full selection of options. Due to the JIT |
| 16 | +compiler, only x86-64 is supported, though the functions it discovers |
| 17 | +can, of course, be used anywhere. |
| 18 | + |
| 19 | +## Discovered Functions |
| 20 | + |
| 21 | +So far I've used prospector to discover these two high-quality 32-bit |
| 22 | +integer hash functions: |
| 23 | + |
| 24 | +```c |
| 25 | +/* Avalanche score = 1.83 |
| 26 | + * Compiles to only 23 bytes on x86-64 |
| 27 | + * High avalanche |
| 28 | + * 6 billion hashes / second (Haswell) |
| 29 | + */ |
| 30 | +uint32_t |
| 31 | +mosquito32(uint32_t x) |
| 32 | +{ |
| 33 | + x = ~x; |
| 34 | + x ^= x >> 16; |
| 35 | + x *= UINT32_C(0xb03a22b3); |
| 36 | + x ^= x >> 10; |
| 37 | + return x; |
| 38 | +} |
| 39 | + |
| 40 | +/* Avalanche score = 1.51 |
| 41 | + * Very effective avalanche |
| 42 | + * 3.3 billion hashes / second (Haswell) |
| 43 | + */ |
| 44 | +uint32_t |
| 45 | +skeeto32(uint32_t x) |
| 46 | +{ |
| 47 | + x = ~x; |
| 48 | + x ^= x >> 2; |
| 49 | + x += x << 21; |
| 50 | + x ^= x >> 15; |
| 51 | + x ^= x << 5; |
| 52 | + x ^= x >> 9; |
| 53 | + x ^= x << 13; |
| 54 | + return x; |
| 55 | +} |
| 56 | +``` |
| 57 | + |
| 58 | +## Reversible Operation Selection |
| 59 | + |
| 60 | +```c |
| 61 | +x = ~x; |
| 62 | +x ^= constant; |
| 63 | +x *= constant; // only odd constants |
| 64 | +x += constant; |
| 65 | +x ^= x >> constant; |
| 66 | +x ^= x << constant; |
| 67 | +x += x << constant; |
| 68 | +x -= x << constant; |
| 69 | +x = (x << constant) | (x >> (nbits - constant)); |
| 70 | +``` |
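
To make "reversible" concrete, here is a minimal sketch that undoes two of the operations above and then composes them into an inverse for `mosquito32`. The helper names `unxorshift_right`, `modinv32`, and `mosquito32_inverse` are hypothetical and not part of prospector; the multiplicative inverse is computed with a standard Newton iteration modulo 2^32, which only works for odd constants, and that is why the multiply operation is restricted to them.

```c
#include <assert.h>
#include <stdint.h>

/* mosquito32, copied from the Discovered Functions section. */
uint32_t
mosquito32(uint32_t x)
{
    x = ~x;
    x ^= x >> 16;
    x *= UINT32_C(0xb03a22b3);
    x ^= x >> 10;
    return x;
}

/* Undo y = x ^ (x >> s) by XORing together y >> 0, y >> s, y >> 2s, ...
 * The terms telescope, leaving only x. */
uint32_t
unxorshift_right(uint32_t y, int s)
{
    assert(s > 0 && s < 32);
    uint32_t x = y;
    for (int sh = s; sh < 32; sh += s)
        x ^= y >> sh;
    return x;
}

/* Multiplicative inverse of an odd constant modulo 2^32. Starting from
 * inv = c (correct to 3 bits), each Newton step doubles the number of
 * correct low bits, so five steps are more than enough for 32 bits. */
uint32_t
modinv32(uint32_t c)
{
    assert(c & 1);
    uint32_t inv = c;
    for (int i = 0; i < 5; i++)
        inv *= UINT32_C(2) - c * inv;
    return inv;
}

/* Hypothetical inverse of mosquito32: undo each step in reverse order. */
uint32_t
mosquito32_inverse(uint32_t x)
{
    x = unxorshift_right(x, 10);
    x *= modinv32(UINT32_C(0xb03a22b3));
    x = unxorshift_right(x, 16);
    x = ~x;
    return x;
}
```

Because every step has an inverse, the composed function is a bijection on 32-bit integers: no two distinct inputs can produce the same output.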
| 71 | + |
| 72 | +Technically `x = ~x` is covered by `x ^= constant`. However, `~x` is |
| 73 | +particularly useful, and the generator is very unlikely to produce |
| 74 | +the one constant for the XOR operator (all bits set) that achieves |
| 75 | +the same effect. |
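
The equivalence is easy to check for the 32-bit case. The helper name `not_matches_xor_all_ones` below is made up for this illustration. If constants were drawn uniformly from all 2^32 values (an assumption about the generator, not something stated here), a single draw would hit the all-ones constant with probability 1 in 2^32.

```c
#include <stdint.h>

/* Returns 1 if ~x and x ^ 0xffffffff agree for the given 32-bit input. */
int
not_matches_xor_all_ones(uint32_t x)
{
    uint32_t a = ~x;
    uint32_t b = x ^ UINT32_C(0xffffffff);
    return a == b;
}
```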
| 76 | + |
| 77 | + |
| 78 | +[rev]: http://papa.bretmulvey.com/post/124027987928/hash-functions |
| 79 | +[wang]: https://gist.github.com/badboy/6267743 |
| 80 | +[jenkins]: http://burtleburtle.net/bob/hash/integer.html |