
Random: always use SamplerRangeFast for MersenneTwister #27560

Merged: 1 commit merged into master from rf/rand/MT-rangeint-fast on Jun 25, 2018

Conversation

rfourquet
Member

This is a "breaking" change concerning the numbers generated by a call like rand(1:3, 10). It may be too late for 0.7, but on the other hand, in a later release the resistance to changing the generated streams may be too high for a non-essential efficiency improvement.

To generate a random value in 1:3, there are two distinct steps:

  1. create a Sampler object, which involves one-time computations
  2. use this sampler to generate any number of values

As the number of generated random values increases, the cost of step 1) becomes negligible (amortization).
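
As an illustration, here is a minimal sketch of the two steps using the documented Random API (the seed and the small range are just placeholders):

using Random

rng = MersenneTwister(0)
sp  = Random.Sampler(rng, 1:3)        # step 1: one-time set-up computations
x   = rand(rng, sp)                   # step 2: draw a single value
xs  = [rand(rng, sp) for _ in 1:10]   # step 2 repeated: the set-up cost is amortized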

We currently have two Sampler types with different compromises on the costs of both steps:

  • SamplerRangeInt (SRI), which is more costly at step 1) but more frugal with entropy bits in step 2)
  • SamplerRangeFast (SRF), which is cheap at step 1) but wastes more entropy bits, depending on the length of the range (see the sketch below)
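
For concreteness, both samplers can be constructed and used directly; note that SamplerRangeInt and SamplerRangeFast are internal, unexported types of the Random stdlib, so this is only a sketch of the current implementation detail:

using Random

rng = MersenneTwister(0)
sri = Random.SamplerRangeInt(1:3)   # costlier to build, frugal with entropy bits per draw
srf = Random.SamplerRangeFast(1:3)  # cheap to build, may discard more entropy bits per draw
rand(rng, sri), rand(rng, srf)      # both return values in 1:3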

By default, an RNG will use SRI. MersenneTwister uses:

  • SRF for scalar calls, like rand(1:3): MersenneTwister is fast enough at generating entropy that wasting some bits is preferable in this case
  • SRI for array calls, like rand(1:3, 10): this was the original method, and it was not updated when SRF was introduced because the status quo was (and still is) faster in some cases

I now propose to use SRF in all cases for MersenneTwister, both for uniformity (e.g. srand(0); [rand(1:10), rand(1:10)] will give the same result as srand(0); rand(1:10, 2)) and for efficiency, as "most (e.g. 90%) of the time" this gives improved speed.
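
The uniformity part can be checked directly; a small sketch (srand is the 0.7-era name, renamed Random.seed! in later versions):

using Random

srand(0); a = [rand(1:10), rand(1:10)]   # scalar path
srand(0); b = rand(1:10, 2)              # array path
a == b   # true once both paths use SamplerRangeFast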

For a given array length, the speed of the SRI method doesn't vary much with the length L of the range, unlike that of SRF:

  • if L <= 2^n with L close to 2^n, SRF can be between 2 and 3 times as fast as SRI
  • if L = 2^n + k with k > 0 "small", SRF is slower than SRI by a small margin, e.g. 10%. As k grows, SRF speeds up and overtakes SRI again around k ≈ (2^n)/10, which means SRF is slightly slower than SRI for roughly 10 percent of range lengths between 2^n+1 and 2^(n+1) (the sketch after this list makes the arithmetic explicit)
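
The dependence on L comes from the mask-and-reject scheme used by SRF; here is a back-of-the-envelope sketch (a simplified model, ignoring how the raw bits are actually produced) of the expected number of raw draws per accepted value:

# SRF draws just enough bits to cover the range and rejects out-of-range values,
# so on average it needs P/L raw draws per output, where P is the smallest
# power of two >= L.
expected_draws(L) = 2^ceil(Int, log2(L)) / L

expected_draws(2^30)         # 1.0   -> best case, no rejection
expected_draws(2^30 + 1)     # ~2.0  -> worst case, about half the draws are rejected
expected_draws(2^30 + 2^29)  # ~1.33 -> rejection rate already much lower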

I lack the time right now for a more thorough performance analysis and graphics, but here is a representative benchmark session (assume the range arguments are also $-escaped for @btime):

julia> using BenchmarkTools, Random

julia> a = zeros(Int, 1000);

# master: SamplerRangeInt

julia> @btime rand!($a, 1:2^30)
  15.184 μs (0 allocations: 0 bytes)

julia> @btime rand!($a, 1:2^30+1)
  15.165 μs (0 allocations: 0 bytes)

julia> @btime rand!($a, 1:2^30+2^26)
  15.176 μs (0 allocations: 0 bytes)

julia> @btime rand!($a, 1:2^30+2^27)
  15.166 μs (0 allocations: 0 bytes)

julia> @btime rand!($a, 1:2^30+2^29)
  15.687 μs (0 allocations: 0 bytes)

# PR: SamplerRangeFast

julia> @btime rand!($a, 1:2^30)
  6.117 μs (0 allocations: 0 bytes)

julia> @btime rand!($a, 1:2^30+1)
  16.535 μs (0 allocations: 0 bytes)

julia> @btime rand!($a, 1:2^30+2^26)
  15.699 μs (0 allocations: 0 bytes) # still slower than SRI

julia> @btime rand!($a, 1:2^30+2^27)
  14.334 μs (0 allocations: 0 bytes) # faster than SRI again

julia> @btime rand!($a, 1:2^30+2^29)
  9.599 μs (0 allocations: 0 bytes) # significantly faster than SRI

@rfourquet added the performance (Must go faster) and randomness (Random number generation and the Random stdlib) labels on Jun 13, 2018
@rfourquet
Member Author

cc. @mschauer

@StefanKarpinski
Sponsor Member

We can break some things between 0.7.0-alpha and 0.7.0-beta so this would be ok.

@mschauer
Contributor

I am still fascinated that a single division is so expensive compared to our MersenneTwister entropy generation. Anyway I think this is a sane change.

@rfourquet
Member Author

Thanks for your answers! So I will mark this for triage.

> I am still fascinated that a single division is so expensive compared to our MersenneTwister entropy generation

Me too, indeed! SIMD is certainly crucial for that...

If someone would like to test the performance on their machine, it should be rather straightforward with this eval, rather than compiling this branch:

using Random

# Make MersenneTwister use SamplerRangeFast for array generation too,
# as this PR does, without having to compile the branch:
for T in Base.BitInteger_types
    @eval Random Sampler(rng::MersenneTwister, r::UnitRange{$T}, ::Val{Inf}) = SamplerRangeFast(r)
end
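
After evaluating that in a fresh session, the benchmarks from the first comment can be rerun, e.g. (assuming BenchmarkTools is installed):

using BenchmarkTools

a = zeros(Int, 1000)
@btime rand!($a, $(1:2^30+1))   # compare with the SamplerRangeInt numbers above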

@rfourquet added the triage (This should be discussed on a triage call) label on Jun 14, 2018
@StefanKarpinski
Sponsor Member

@rfourquet: triage agrees that you're likely the only person in a position to really make a call on this. It's fine to make a breaking change now if you deem it worthwhile. Merge if you see fit.

@StefanKarpinski removed the triage label on Jun 21, 2018
@rfourquet
Member Author

Thanks for having discussed this.

I did a few more benchmarks to try to get an idea of the expected speed-up for a range of "random" length (not so trivial to measure in a meaningful way; I did what I could). Benchmarking only the step 2) mentioned in the OP (which is equivalent to benchmarking rand! on asymptotically large arrays, and favors SamplerRangeInt), I observe speed-ups typically between 40% and 60% (with a fair majority over 50%), so it seems safe to assume this will be an improvement on most machines. The main drawback I see is that the time it takes is less predictable (it depends a lot on the length of the range).
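
For reference, one way to time only step 2) is to construct each sampler once and reuse it; a sketch, again relying on the internal, unexported sampler types:

using Random, BenchmarkTools

rng = MersenneTwister(0)
r   = 1:2^30+1                    # worst case for SamplerRangeFast
sri = Random.SamplerRangeInt(r)   # step 1) done once, outside the timed region
srf = Random.SamplerRangeFast(r)
@btime rand($rng, $sri)
@btime rand($rng, $srf)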

If no objection comes, I will therefore merge this weekend.

@mschauer
Contributor

The improvement even on asymptotically large arrays settles it.

@rfourquet
Member Author

The Travis failure is unrelated to this change and happens in other PRs too (it concerns FileWatching).

@rfourquet merged commit 8eba7c4 into master on Jun 25, 2018
@rfourquet deleted the rf/rand/MT-rangeint-fast branch on June 25, 2018