Skip to content

Commit

Permalink
Auto merge of rust-lang#127226 - mat-1:optimize-siphash-round, r=nnet…
Browse files Browse the repository at this point in the history
…hercote

Optimize SipHash by reordering compress instructions

This PR optimizes hashing by changing the order of instructions in the sip.rs `compress` macro so the CPU can parallelize it better. The new order is taken directly from Fig 2.1 in [the SipHash paper](https://eprint.iacr.org/2012/351.pdf) (but with the xors moved which makes it a little faster). I attempted to optimize it some more after this, but I think this might be the optimal instruction order. Note that this shouldn't change the behavior of hashing at all, only statements that don't depend on each other were reordered.

It appears like the current order hasn't changed since its [original implementation from 2012](rust-lang@fada46c#diff-b751133c229259d7099bbbc7835324e5504b91ab1aded9464f0c48cd22e5e420R35) which doesn't look like it was written with data dependencies in mind.

Running `./x bench library/core --stage 0 --test-args hash` before and after this change shows the following results:

Before:
```
benchmarks:
    hash::sip::bench_bytes_4             7.20/iter +/- 0.70
    hash::sip::bench_bytes_7             9.01/iter +/- 0.35
    hash::sip::bench_bytes_8             8.12/iter +/- 0.10
    hash::sip::bench_bytes_a_16         10.07/iter +/- 0.44
    hash::sip::bench_bytes_b_32         13.46/iter +/- 0.71
    hash::sip::bench_bytes_c_128        37.75/iter +/- 0.48
    hash::sip::bench_long_str          121.18/iter +/- 3.01
    hash::sip::bench_str_of_8_bytes     11.20/iter +/- 0.25
    hash::sip::bench_str_over_8_bytes   11.20/iter +/- 0.26
    hash::sip::bench_str_under_8_bytes   9.89/iter +/- 0.59
    hash::sip::bench_u32                 9.57/iter +/- 0.44
    hash::sip::bench_u32_keyed           6.97/iter +/- 0.10
    hash::sip::bench_u64                 8.63/iter +/- 0.07
```
After:
```
benchmarks:
    hash::sip::bench_bytes_4             6.64/iter +/- 0.14
    hash::sip::bench_bytes_7             8.19/iter +/- 0.07
    hash::sip::bench_bytes_8             8.59/iter +/- 0.68
    hash::sip::bench_bytes_a_16          9.73/iter +/- 0.49
    hash::sip::bench_bytes_b_32         12.70/iter +/- 0.06
    hash::sip::bench_bytes_c_128        32.38/iter +/- 0.20
    hash::sip::bench_long_str          102.99/iter +/- 0.82
    hash::sip::bench_str_of_8_bytes     10.71/iter +/- 0.21
    hash::sip::bench_str_over_8_bytes   11.73/iter +/- 0.17
    hash::sip::bench_str_under_8_bytes  10.33/iter +/- 0.41
    hash::sip::bench_u32                10.41/iter +/- 0.29
    hash::sip::bench_u32_keyed           9.50/iter +/- 0.30
    hash::sip::bench_u64                 8.44/iter +/- 1.09
```
I ran this on my computer so there's some noise, but you can tell at least `bench_long_str` is significantly faster (~18%).

Also, I noticed the same compress function from the library is used in the compiler as well, so I took the liberty of copy-pasting this change to there as well.

Thanks `@semisol` for porting SipHash for another project which led me to notice this issue in Rust, and for helping investigate. <3
  • Loading branch information
bors committed Jul 4, 2024
2 parents 66b4f00 + 16fc41c commit f6fa358
Show file tree
Hide file tree
Showing 2 changed files with 12 additions and 10 deletions.
11 changes: 6 additions & 5 deletions compiler/rustc_data_structures/src/sip128.rs
Original file line number Diff line number Diff line change
Expand Up @@ -70,18 +70,19 @@ macro_rules! compress {
($state:expr) => {{ compress!($state.v0, $state.v1, $state.v2, $state.v3) }};
($v0:expr, $v1:expr, $v2:expr, $v3:expr) => {{
$v0 = $v0.wrapping_add($v1);
$v2 = $v2.wrapping_add($v3);
$v1 = $v1.rotate_left(13);
$v1 ^= $v0;
$v0 = $v0.rotate_left(32);
$v2 = $v2.wrapping_add($v3);
$v3 = $v3.rotate_left(16);
$v3 ^= $v2;
$v0 = $v0.wrapping_add($v3);
$v3 = $v3.rotate_left(21);
$v3 ^= $v0;
$v0 = $v0.rotate_left(32);

$v2 = $v2.wrapping_add($v1);
$v0 = $v0.wrapping_add($v3);
$v1 = $v1.rotate_left(17);
$v1 ^= $v2;
$v3 = $v3.rotate_left(21);
$v3 ^= $v0;
$v2 = $v2.rotate_left(32);
}};
}
Expand Down
11 changes: 6 additions & 5 deletions library/core/src/hash/sip.rs
Original file line number Diff line number Diff line change
Expand Up @@ -76,18 +76,19 @@ macro_rules! compress {
($state:expr) => {{ compress!($state.v0, $state.v1, $state.v2, $state.v3) }};
($v0:expr, $v1:expr, $v2:expr, $v3:expr) => {{
$v0 = $v0.wrapping_add($v1);
$v2 = $v2.wrapping_add($v3);
$v1 = $v1.rotate_left(13);
$v1 ^= $v0;
$v0 = $v0.rotate_left(32);
$v2 = $v2.wrapping_add($v3);
$v3 = $v3.rotate_left(16);
$v3 ^= $v2;
$v0 = $v0.wrapping_add($v3);
$v3 = $v3.rotate_left(21);
$v3 ^= $v0;
$v0 = $v0.rotate_left(32);

$v2 = $v2.wrapping_add($v1);
$v0 = $v0.wrapping_add($v3);
$v1 = $v1.rotate_left(17);
$v1 ^= $v2;
$v3 = $v3.rotate_left(21);
$v3 ^= $v0;
$v2 = $v2.rotate_left(32);
}};
}
Expand Down

0 comments on commit f6fa358

Please sign in to comment.