Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is there a L'Ecuyer or MRG32k3a implementation? #997

Closed
CGMossa opened this issue Jul 8, 2020 · 13 comments
Closed

Is there a L'Ecuyer or MRG32k3a implementation? #997

CGMossa opened this issue Jul 8, 2020 · 13 comments

Comments

@CGMossa
Copy link
Contributor

CGMossa commented Jul 8, 2020

Background

What is your motivation?

I've looked for an implementation in rand_core and in the docs, plus searched the repository and found no mention of L'Ecuyer.

What type of application is this? (E.g. cryptography, game, numerical simulation)

I believe that L'Ecuyer PRNG algorithm yields numbers that are suited concurrent sampling.

Feature request

I think this should be implemented if not already. I've been given these
resources on it:

http://www.iro.umontreal.ca/~lecuyer/myftp/papers/streams00s.pdf
https://github.com/vigna/MRG32k3a
https://pubsonline.informs.org/doi/abs/10.1287/opre.47.1.159

I'm going to attempt a re-write.

@vks
Copy link
Collaborator

vks commented Jul 8, 2020

Why are you looking for this RNG in particular? What does it offer over the other ones already implemented in Rand (i.e. chacha, xoshiro)?

@CGMossa
Copy link
Contributor Author

CGMossa commented Jul 8, 2020

I'm not an expert on this, I just have seen in the R community, that this prng is also in consideration when evaluating different prngs. I believe xoshiro is also a parallel-safe prng, so l'ecuyer doesn't offer something new.
Are there a TestU01-test/benchmark written anywhere?

@dhardy
Copy link
Member

dhardy commented Jul 8, 2020

parallel-safe

Does this term mean anything?

You need independent RNG state for each that to get good performance.

Perhaps you mean that one can safely seed multiple independent streams. For this purpose we specifically chose RNGs supporting ~128-bit seeds or larger. There are some potential issues with PCG streams being too similar, so not all of those RNGs are suitable (but the ones with 64-bit output already use 128-bit internal state).

You also need to ensure the seeding mechanism does not create similarities in state — some people recommend using a different type of RNG for seeding each thread's generator; I'd recommend using a (near) crypto-grade generator such as ChaCha since any such similarities would violate the requirements on a crypto generator. Seeding one RNG from another is easy; see the book.

@vks
Copy link
Collaborator

vks commented Jul 8, 2020

Maybe we should add a paragraph to the book about initializing RNGs for parallel computations?

@CGMossa
Copy link
Contributor Author

CGMossa commented Jul 8, 2020

Yeah, that's what I was thinking of.

I don't know why the different bits length matters. I'll close this, as I can see this "multiple independent streams"-business has been addressed.

PS: @vks yeah, because I looked at the book first, and thought: This is not addressed here, maybe there's something missing.

@dhardy
Copy link
Member

dhardy commented Jul 8, 2020

Bit lengths: just because if the PRNG state is too small (e.g. 64 bits) and you have many parallel instances each drawing many values, it is conceivable that some will overlap. With 128 bits that isn't really a problem.

Yes, the book may need improving. Leave this open as a reminder until it does. Any volunteers to address it?

@CGMossa
Copy link
Contributor Author

CGMossa commented Jul 8, 2020

I'd like to help :D

@dhardy
Copy link
Member

dhardy commented Jul 8, 2020

Cheers! Hopefully this will get you started.

@CGMossa
Copy link
Contributor Author

CGMossa commented Jul 29, 2020 via email

@fabienlefloch
Copy link

fabienlefloch commented Oct 5, 2020

I could see several arguments for MRG32k3a and "parallelization". With L'Ecuyer RNG, you may skip ahead fast. Use case is you want to split a Monte-Carlo simulation over N processors. Two kinds of skip-ahead are possible: fixed size (skip 2^40 numbers for example), or variable size: I have an MC simulation over M paths, I want to skip k*M/N for k=0,...,N-1.

Now this is also possible with Chacha (depending on how it is actually used - did not check the details of the implementation in this lib).

Another argument is that perhaps one is moving some existing stuff to Rust and want to keep the same numbers. The existing stuff is likely to use a popular RNG, such as some Mersenne-Twister variant or MRG32k3a. PCG64xyz and xoroshiro++** are fairly new to the scene.

Personally I find it surprising that so much care seems to be taken of in the design of this lib, and yet the author does not seem all too familiar with the scientific community needs.

@vks
Copy link
Collaborator

vks commented Oct 5, 2020

Now this is also possible with Chacha (depending on how it is actually used - did not check the details of the implementation in this lib).

With Chacha you could just generate random seeds, iterate through all possible seeds, or use the set_stream method. You can also skip ahead a variable number of words with set_word_pos.

Another argument is that perhaps one is moving some existing stuff to Rust and want to keep the same numbers. The existing stuff is likely to use a popular RNG, such as some Mersenne-Twister variant or MRG32k3a.

The design of Rand is extensible, you can use a crate (or implement it yourself) if you want to use a legacy RNG.

Personally I find it surprising that so much care seems to be taken of in the design of this lib, and yet the author does not seem all too familiar with the scientific community needs.

What needs of the scientific community are not addressed?

@dhardy
Copy link
Member

dhardy commented Oct 6, 2020

The design of Rand is extensible, you can use a crate (or implement it yourself) if you want to use a legacy RNG.

We have a rngs repository for additional generators and would be happy to accept implementations of well-known generators there. That said, we have no funding source and cannot commit to implementing much ourselves.

There are reasons to move on from Mersenne Twister, but in case you need to use it to reproduce results, an implementation exists (not from us but compatible). I'm not sure if there are currently MRG32k3a or L'Ecuyer implementations for Rust, but creating and publishing one yourself (or as a pull request on our repository if preferred) shouldn't be so hard.

Use case is you want to split a Monte-Carlo simulation over N processors.

This can be done with skip-ahead, e.g. with ChaCha, but typically we recommend using a cryptographic master generator to seed each parallel generator. Our ChaCha implementation (and most I believe) uses a 64-bit counter, which likely isn't enough for parallel usage, however it uses a 256-bit seed. The parallel generator (ChaCha or other) still needs to support at least 128-bits of state and independent streams (so not recommended to use PCG), but a 64-bit period is acceptable.

@vks
Copy link
Collaborator

vks commented Aug 24, 2021

I'm closing this in favor of rust-random/book#45. Please feel free to open a new issue if something else needs to be addressed as well.

@vks vks closed this as completed Aug 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants