Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relevant prior art: dSFMT #1

Closed
pao opened this issue Apr 10, 2015 · 6 comments
Closed

Relevant prior art: dSFMT #1

pao opened this issue Apr 10, 2015 · 6 comments

Comments

@pao
Copy link

pao commented Apr 10, 2015

(I tried to post this as a comment to your blog post, but it didn't post, and there was no message telling me it was going into a moderation queue. So I apologize if this is redundant.)

You might want to take a look at dSFMT, which directly generates double-precision random numbers, including uniform on [0, 1): http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/

This algorithm still has undesirable statistical properties, however, as discussed at JuliaLang/julia#6464.

@art4711
Copy link
Owner

art4711 commented Apr 10, 2015

I've actually seen a paper about it. It spent a lot of time talking about the PRNG which is a quite solved problem (mersenne twister isn't the answer other than some very specialized applications where you care about speed before quality) and then did some hand-waving and suddenly we magically had floating points. It's the hand-wavy bit that I'm actually interested in.

Looking at the code it's completely ignoring the problem of mapping between sets with different number of elements or with different precision. For example, generating perfectly random numbers [1,2) is trivial, it's one of the first things I did in rd.c. Mapping that to [0,1) without losing information can not be done by simply subtracting 1.0, but that's exactly what that code does.

I just committed a file called "some-more-tests.c" that shows why that doesn't work.

@pao
Copy link
Author

pao commented Apr 10, 2015

mersenne twister isn't the answer other than some very specialized applications where you care about speed before quality

Off topic a bit, but those applications aren't that specialized. At any rate, as you identify it's not the bit that's relevant to what you're exploring.

Looking at the code it's completely ignoring the problem of mapping between sets with different number of elements or with different precision. For example, generating perfectly random numbers [1,2) is trivial, it's one of the first things I did in rd.c. Mapping that to [0,1) without losing information can not be done by simply subtracting 1.0, but that's exactly what that code does.

Whee! That is indeed a problem.

If you're looking for help, you may find some sympathetic bit-twiddlers hanging around the Julia community. We've been looking at alternatives to MT as default random number generator (JuliaLang/julia#8786) but that discussion hasn't addressed correctly extending such a generator to floating point.

@art4711
Copy link
Owner

art4711 commented Apr 10, 2015

We've been looking at alternatives to MT as default random number generator

Please, please, please. Whatever you do, pick a cryptographically secure generator as the default and then maybe make it overridable to something worse if someone really wants it. People screw it up so often.

Look at the recent talks and rants from openbsd about the topic:
http://www.openbsd.org/papers/hackfest2014-arc4random/index.html
http://www.tedunangst.com/flak/post/random-in-the-wild
http://marc.info/?l=openbsd-tech&m=141776286105814&w=2
http://marc.info/?l=openbsd-cvs&m=141807513728073&w=2
http://www.openbsd.org/papers/eurobsdcon2014_arc4random/index.html

@pao
Copy link
Author

pao commented Apr 10, 2015

Choosing a CSPRNG as the default is unlikely--it serves to make choosing random variates (which is what technical computer users are generally doing) slow, to get strength that the majority of the users don't need. (I note that of the list of things that Ron Rivest complained about on our mailing list, this was somewhat surprisingly not one of them.) The trouble is that for the applications that Julia is targeted at, your users (i.e., me) don't need security, they need thousands of Monte Carlo replications completed quickly. Statistical randomness is sufficient for that.

So while your concern is appreciated--and despite my sympathy for "safe by default" choices--this one's hard because it is such a pain point for exactly the group of users Julia attracts.

Feel free to state a case in that issue, though; you'll certainly be more effective than I would at it (since I have decidedly mixed feelings on the matter)!

Anyways, we've strayed quite far from the base topic...rather than use this as my personal message board to you, I'll let this lie in peace as you have in fact seen dSFMT and found it (for good reason) wanting.

@pao pao closed this as completed Apr 10, 2015
@art4711
Copy link
Owner

art4711 commented Apr 12, 2016

I just remembered this conversation. You probably won't get this (I have not idea how github notifies people), but in case you do, I did a thing: https://github.com/art4711/randbench (regarding the "CSPRNG is slow" meme) and https://github.com/art4711/unpredictable. With the new compiler that will be in go 1.7, I'm down to 25ns per 64 bit number which is just 2.5x slower than the default PRNG in Go.

@pao
Copy link
Author

pao commented Apr 12, 2016

Github does notifications quite well, as it happens. On the other hand, I'm not really strongly coupled to Julia development at the moment. But I'll glom your comment onto JuliaLang/julia#8786. Thanks for letting me know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants