MersenneTwister: more efficient Float64 scalar generation with caching #25197

rfourquet · 2017-12-19T22:54:08Z

This is branched off from and inspired by #25058. There seems to be a sweet spot when randomizing arrays of size 8016 bytes for dSFMT: beyond this threshold, the generation rate increases, and for Float64, this shows a speed-up of about 30% (compared to the current Float64 cache size of 3056 bytes, the minimum possible for dSFMT). This again would change the generated streams, so it is breaking. Let's see if Nanosoldier agrees this time:
@nanosoldier runbenchmarks("random", vs=":master")

ararslan · 2017-12-19T23:12:01Z

@nanosoldier runbenchmarks("random", vs=":master")

nanosoldier · 2017-12-20T00:18:42Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

rfourquet · 2017-12-20T10:42:52Z

Nanosoldier is still not seeing improvements... let's see with the "problem" benchmarks:

@nanosoldier runbenchmarks("problem", vs=":master")

rfourquet · 2017-12-20T10:56:15Z

Oops, forgot to quote: @nanosoldier runbenchmarks("problem", vs=":master")

nanosoldier · 2017-12-20T11:56:21Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

rfourquet · 2017-12-20T19:45:57Z

The regressions in Nanosoldier's report are expected: those two problems use the samerand function from "BaseBenchmarks.jl/utils", which for some reason deepcopies a MersenneTwister object at each invocation, and this totally dominates the benchmarks. In this PR, it's more expensive to copy an RNG because we make it bigger, so no mystery here. The mystery is why the benchmarks need to deepcopy an RNG... I think this is wrong, as it represents 99% of the time spent in the benchmark.
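To make the pattern described above concrete, here is a minimal sketch in Python, not the actual BaseBenchmarks.jl code (which is not shown in this thread). The names `SEED_RNG` and `samerand_sketch` are hypothetical. It only illustrates a helper that deep-copies a fixed generator on every call: each call then pays for copying the full generator state, so a larger state (for example, a bigger cache) makes every call more expensive.

```python
import copy
import random

# Hypothetical shared generator that is never advanced directly.
SEED_RNG = random.Random(1)

def samerand_sketch(n):
    """Return n floats drawn from a fresh deep copy of SEED_RNG.

    The deep copy happens on every call, so its cost is paid each time
    and grows with the size of the generator's internal state.
    """
    rng = copy.deepcopy(SEED_RNG)
    return [rng.random() for _ in range(n)]
```

Because the shared generator is never advanced, repeated calls return identical values. When `n` is small, the copy can easily cost more than the generation itself.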

ViralBShah · 2017-12-21T10:59:37Z

I don't think we guarantee the same stream across releases - so it should be perfectly ok to do this if it gives better performance.

rfourquet · 2017-12-23T10:10:18Z

The performance improvements are confirmed on at least 4 different computers, so I will merge tomorrow unless there are objections.

Like for integers, a cache of size 8016 bytes seems to be optimal.
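For readers unfamiliar with the idea, the following is a rough sketch of scalar generation backed by a cache, written in Python purely as an illustration. It is not the code from this PR: the class `CachedFloatRNG` and its fields are hypothetical, and Python's `random.Random` stands in for dSFMT. The only figures taken from the discussion are the cache sizes: 8016 bytes holds 1002 values of 8 bytes each, versus 382 values for 3056 bytes.

```python
import random

FLOAT_BYTES = 8                          # size of one Float64
CACHE_BYTES = 8016                       # cache size discussed in this PR
CACHE_LEN = CACHE_BYTES // FLOAT_BYTES   # 1002 values
OLD_CACHE_LEN = 3056 // FLOAT_BYTES      # 382 values (previous cache size)

class CachedFloatRNG:
    """Serve scalar floats from a cache that is refilled in bulk."""

    def __init__(self, seed, cache_len=CACHE_LEN):
        self._rng = random.Random(seed)  # stand-in for the bulk generator
        self._cache_len = cache_len
        self._cache = []
        self._idx = 0
        self.refills = 0                 # counts bulk refills, for illustration

    def _refill(self):
        # Generate a whole batch at once; bulk generation is the fast path.
        self._cache = [self._rng.random() for _ in range(self._cache_len)]
        self._idx = 0
        self.refills += 1

    def rand(self):
        # Refill only when the cache is exhausted, then hand out one value.
        if self._idx >= len(self._cache):
            self._refill()
        x = self._cache[self._idx]
        self._idx += 1
        return x
```

With a cache of 1002 values, drawing 1003 scalars triggers two refills. A larger cache means fewer refills per scalar, at the price of a bigger generator state.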

StefanKarpinski · 2017-12-23T20:48:10Z

> I don't think we guarantee the same stream across releases

We should not change the behavior of the same RNG type across releases, but we are allowed to change default RNGs with minor releases. If we change the stream for MersenneTwister in the future, it would require a major version bump of the package that provides it.

rfourquet · 2017-12-24T11:24:20Z

> We should not change the behavior of the same RNG type across releases,

Do you mean minor releases? (as opposed to "major version bump")

StefanKarpinski · 2017-12-24T20:38:23Z

I guess it depends and it's a choice we have to make. We need to make sure that code can request a specific behavior of an RNG over time, which means that RNGs must be in packages and that one can continue to ask for MersenneTwister v"1.*" no matter what version of Julia one is using. Perhaps applying SemVer to RNG package behavior is also sensible. In that case, we can make breaking changes to the behavior of an RNG, but we can only do so if we bump the major version.

bjarthur · 2018-08-26T22:04:48Z

w.r.t. the comment above about SemVer of RNG, is it possible to recreate the julia 0.6 sequence of random numbers in 0.7/1.0? thanks.

rfourquet · 2018-08-27T07:54:15Z

There is the (registered, I think) RandomV06 package for that.

bjarthur · 2018-08-27T10:54:35Z

thanks!

rfourquet added the performance and randomness labels Dec 19, 2017

rfourquet mentioned this pull request Dec 20, 2017

MersenneTwister: more efficient integer generation with caching #25058

Merged

rfourquet force-pushed the rf/rand/cache-float branch from 105b486 to a786746 Compare December 20, 2017 10:41

rfourquet force-pushed the rf/rand/cache-float branch from a786746 to f42ac06 Compare December 21, 2017 14:51

rfourquet force-pushed the rf/rand/cache-float branch from f42ac06 to 2839de5 Compare December 23, 2017 16:52

MersenneTwister: more efficient Float64 scalar generation with caching

4e42436

Like for integers, a cache of size 8016 bytes seems to be optimal.

rfourquet force-pushed the rf/rand/cache-float branch from 2839de5 to 4e42436 Compare December 23, 2017 16:56

rfourquet merged commit 99b8dc3 into master Dec 24, 2017

fredrikekre deleted the rf/rand/cache-float branch December 24, 2017 11:25
