Canon uniform benches #1286
This is based on @TheIronBorn's work (#1154, #1172), with some changes.
This beats other uniform_int_i128 results
Rationale: with multiple RNGs we need to reduce the number of tests; random is the most useful for single samples.
CPU capped to 2GHz; some fluctuation still observed
Previously, nrmr was potentially incorrect (depending on size). thresh32_or_uty is the same as the old z.
…nbiased Running only these benches, results are vaguely similar to before (10% differences aren't uncommon).
Many results for 'sample' methods are similar but several show >10% deviation. Concerning.
Finally, with these changes deviations are usually under 1%. But not always: I still see a couple of large (10%+) deviations (not present on a re-run).
Thanks to increased sample size these offer much more detailed plots.
This is the same as the previous bench except (a) it doesn't pin CPU frequency and (b) it runs 'cargo bench' first in an attempt to remove unusually slow results still sometimes observed.
I.e. just use 64-bit sampling for 32-bit output. This is significantly faster with 64-bit RNGs and similar with 32-bit RNGs.
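To make "64-bit sampling for 32-bit output" concrete, here is a minimal sketch of Canon's method with one round of bias reduction, drawing 64-bit samples to produce a u32 result. It is an illustration under assumptions, not the code from this branch: `SplitMix64` is a stand-in generator (not one of the benchmarked RNGs) and `canon_u32_via_u64` is a hypothetical helper name.

```rust
/// Stand-in 64-bit generator (SplitMix64). Assumption: used only so the
/// sketch is self-contained; it is not one of the RNGs benchmarked here.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical helper: sample from low..=high using 64-bit samples.
/// The result may retain a very small bias (this is the biased variant).
fn canon_u32_via_u64(rng: &mut SplitMix64, low: u32, high: u32) -> u32 {
    assert!(low <= high);
    // Number of values in the range; at most 2^32, so it fits in u64.
    let range = (high - low) as u64 + 1;

    // Widening multiply: the high 64 bits are the candidate result,
    // the low 64 bits tell us whether more random bits could change it.
    let m = (rng.next_u64() as u128) * (range as u128);
    let mut result = (m >> 64) as u64;
    let lo_order = m as u64;

    // Only if lo_order > 2^64 - range can a further sample carry into result.
    if lo_order > range.wrapping_neg() {
        // One extra sample extends the fraction by 64 more bits.
        let m2 = (rng.next_u64() as u128) * (range as u128);
        let new_hi_order = (m2 >> 64) as u64;
        // A carry out of the low word increments the result by one.
        let is_overflow = lo_order.checked_add(new_hi_order).is_none();
        result += is_overflow as u64;
    }

    // result < range, so this cannot exceed `high`.
    low + result as u32
}
```

With a 64-bit sample and a range of at most 2^32, the bias-reduction branch is taken with probability below 2^-32, so a second sample is almost never drawn, which helps explain the speed advantage on 64-bit RNGs. Calling `canon_u32_via_u64(&mut SplitMix64(1), 10, 15)` returns a value in 10..=15.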
These include too many files for GitHub's web interface!
Result run 5

(This is the combined plot only since full output page is large enough to crash Firefox.)

Note: "sample" means Canon with 32-bit sampling in i8, i16, 64-bit sampling on i32, i64, and 128-bit sampling on i128. Canon32 means 32-bit sampling. Canon32-2 is Canon32 with an extra round of bias reduction (max three samples). Lemire uses

Further note: all i8, i16 results are so fast it's probably not worth differentiating (much).

Also: with everything I tried, still sometimes a benchmark would run ~15% slower than normal. There may be the odd result that's slower than it should be.

single

i8: reject Biased64 due to poor perf with 32-bit RNGs. sample is a bit ahead, sample-unbiased a bit behind, but not much difference.
i16: similar to i8, but all Canon variants significantly worse with Pcg32, making ONeill look better. Weird.
i32: just pick sample or sample-unbiased (latter wins with Pcg32; weird).
i64: sample is best biased algorithm. Best unbiased is sample-unbiased or ONeill (similar profile aside from different bumps).
i128: Canon-red is best biased algorithm; best unbiased is either sample-unbiased (shortest tail) or canon-red-un (slimmer profile). ONeill is not competitive.

Use sample (Canon's method). Maybe add sample-unbiased as an option.

distribution

(I.e. sampling from the same range repeatedly, ignoring set-up costs.)

i8: reject Biased64; pick anything else.
i16: sample, but sample-unbiased also has notable poor perf.
i32: Canon (sample) is best. Lemire64 is also good.
i64: Lemire is the best on average, but only a little ahead of sample (Canon).
i128: Lemire is fastest, Canon-red next best.

Result: Canon (sample) is generally the fastest method (Canon-red wins for i128, but Canon is still good). Lemire is the fastest unbiased method.

Conclusion

Canon's method ( Further note: all methods are marked |
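For contrast with the biased Canon variants, the results above name Lemire as the fastest unbiased method. The following is a minimal sketch of the commonly published form of Lemire's widening-multiply rejection method for u32, not necessarily the exact variant benchmarked in this PR. `DemoRng32` is a stand-in generator and `lemire_u32` is a hypothetical helper name; both are assumptions for illustration.

```rust
/// Stand-in 32-bit generator: a SplitMix64 step keeping the upper 32 bits.
/// Assumption: used only so the sketch is self-contained.
struct DemoRng32(u64);

impl DemoRng32 {
    fn next_u32(&mut self) -> u32 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        ((z ^ (z >> 31)) >> 32) as u32
    }
}

/// Hypothetical helper: unbiased sample from low..=high via rejection.
fn lemire_u32(rng: &mut DemoRng32, low: u32, high: u32) -> u32 {
    assert!(low <= high);
    // Range size; wraps to 0 when the range covers all of u32.
    let range = (high - low).wrapping_add(1);
    if range == 0 {
        return rng.next_u32();
    }

    // Widening multiply: high 32 bits are the candidate, low 32 bits
    // decide whether the candidate must be rejected.
    let mut m = (rng.next_u32() as u64) * (range as u64);
    if (m as u32) < range {
        // threshold = 2^32 mod range; computed only on this rare path
        // because the modulo is comparatively expensive.
        let threshold = range.wrapping_neg() % range;
        while (m as u32) < threshold {
            m = (rng.next_u32() as u64) * (range as u64);
        }
    }

    // (m >> 32) < range, so this cannot exceed `high`.
    low + (m >> 32) as u32
}
```

Unlike the Canon sketch style of "draw one more sample and carry", this method loops until a sample is accepted, so the number of samples is unbounded in principle, but the output is exactly uniform when the underlying generator is.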
Benchmark / development branch for Canon's method of uniform sampling. This PR is for reference purposes only and will not be merged.
Related: #1196