Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement sampling from the unit sphere and circle #567

Merged
merged 3 commits into from
Jul 27, 2018

Conversation

vks
Copy link
Collaborator

@vks vks commented Jul 23, 2018

This uses a method by Marsaglia (1972), which was found to be the fastest of the methods given by MathWorld. (See my benchmarks.)

There are some open questions:

  • Technically this is a uniform distribution. However, for more than one dimension, the uniform unit distribution is not unique, so I implemented it as a separate distribution.
  • The name UnitSphere is slightly incorrect. We are not sampling from the the unit sphere, but rather its surface. However, UnitSphereSurface is a bit long and sampling from the unit sphere is trivial given a sample from the surface, so I'm not sure we should add it.
  • I did not bother to port the benchmarks to Rand. I think it is not really necessary, but it could be easily done.
  • The tests don't check for uniformity.
  • assert_almost_eq could probably be used in all tests.

Copy link
Member

@dhardy dhardy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice addition.

loop {
let (x1, x2) = (self.uniform.sample(rng), self.uniform.sample(rng));
let (x1_sq, x2_sq) = (x1*x1, x2*x2);
let sum = x1_sq + x2_sq;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could just use a single sum_sq = ... since the individual squares are not used.

continue;
}
let factor = (1.0_f64 - x1_sq - x2_sq).sqrt();
return [2.*x1 * factor, 2.*x2 * factor, 1. - 2.*sum];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-multiply factor by 2?

@@ -190,6 +193,7 @@ pub use self::bernoulli::Bernoulli;
pub mod uniform;
mod bernoulli;
#[cfg(feature="alloc")] mod weighted;
#[cfg(feature="std")]mod unit_sphere;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing a space

@vks
Copy link
Collaborator Author

vks commented Jul 24, 2018

Thanks for the feedback, I addressed your comments.

Do we want to add sampling from the unit sphere volume as well? The implementation is fairly straightforward (not very surprisingly, rejection sampling is the most efficient). If yes, I think UnitSphere should be renamed to UnitSphereSurface.

@dhardy
Copy link
Member

dhardy commented Jul 24, 2018

To be honest I'm out of my depth regarding which sampling algorithms we should choose to implement. But I read the paper for this one and it looks like a nice approach to an apparently common problem.

I don't think there's much use for sampling from the volume of a sphere (this is about the only thing I can think of), and it's not exactly difficult to implement rejection sampling, so I don't think we would want it.

One can try to generalise sampling from the surface to other dimensions, but for 2D there's a choice of (cos(θ), sin(θ)) or (x, sign*sqrt(1-x^2)) (where sign is extracted from a spare bit), and Marsaglia's approach in 4D (from the same paper) is significantly different. Besides which, the 2D and 3D algorithms are applicable to different problems and I'm not sure if there are any common uses for 4D sampling (representing random spin using Quarternions, maybe? Too long since I looked at that problem).

So the only important questions are, I think,

@vks
Copy link
Collaborator Author

vks commented Jul 24, 2018

To be honest I'm out of my depth regarding which sampling algorithms we should choose to implement.

I implemented all the common ones and picked the fastest one. I think speed is the only concern, they should all produce correct uniform distributions.

do we also want to include UnitCircle

FWIW, I already implemented it. I think we should include it, because the efficient implementation is non-trivial.

and UnitHypersphere

Sampling from the volume or the surface? I can't spontaneously think of any use cases. For large dimensions, rejection sampling becomes inefficient, so it might be useful to provide an implementation.

@dhardy
Copy link
Member

dhardy commented Jul 24, 2018

because the efficient implementation is non-trivial.

Only slightly surprising. Trig is slow. Do you want to try the method I suggested? Problem is getting the sign bit without an extra RNG call requires re-implementing the conversion to FP (as in our Ziggurat function).

@vks
Copy link
Collaborator Author

vks commented Jul 25, 2018

Do you want to try the method I suggested?

It is fast, but it is unfortunately not uniform.

Problem is getting the sign bit without an extra RNG call requires re-implementing the conversion to FP (as in our Ziggurat function).

Maybe we should implement Distribution<(f64, u8)> for the float distributions.

@vks
Copy link
Collaborator Author

vks commented Jul 25, 2018

I have a uniformity test using histograms. Should I use an external crate for the histogram implementation, or should we ship our own (private) implementation in rand?

@dhardy
Copy link
Member

dhardy commented Jul 25, 2018

It is fast, but it is unfortunately not uniform.

Oh, of course not, sorry. Replacing x by sin(x) should fix it, but will probably be quite slow.

Should I use an external crate for the histogram implementation

Not really sure what you mean here. An extra lib for testing using histograms? That sounds fairly specific to testing random number distributions so I suppose it could be a Rand sub-lib, but it could work just as well as a separate crate too. But I should probably leave it to your judgement.

@vks
Copy link
Collaborator Author

vks commented Jul 26, 2018

Replacing x by sin(x) should fix it, but will probably be quite slow.

Yes, that works. It is twice as fast as the naive trigonometric approach, but still slower than the current implementation.

Not really sure what you mean here. An extra lib for testing using histograms?

No, I was using an extra lib just for histograms, because I already implemented them there. In principle I could copy-and-paste the implementation into Rand's test directory.

@dhardy
Copy link
Member

dhardy commented Jul 26, 2018

I don't see a problem with adding a dev-dependency on average for testing. We already did that with bincode.

@vks vks force-pushed the sample-sphere branch from 41aaa3c to f553739 Compare July 26, 2018 15:43
Copy link
Member

@dhardy dhardy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good. The only thing I might add is that we should add a test to detect value-breaking changes. (Just recording a few outputs from the current implementations with a fixed random stream is enough.)

src/distributions/unit_circle.rs Outdated Show resolved Hide resolved
@clarfonthey
Copy link
Contributor

Is there a reason why this is always f64 and not parameterised to accept both f32 and f64? That way in the future an implementation for say f80 or f128 could be added in a not breaking way.

@dhardy
Copy link
Member

dhardy commented Jul 27, 2018

Perhaps I should have been clearer: we're sampling from the edge of the unit circle. (Since we return a pair of coordinates, it would be easy to mis-understand this as sampling from the disc.)

In this case, it also seems there are people who want to sample from the disc itself, though I don't know if this has an application or is just a toy problem. But if we ever want to implement that, we can use the name UnitDisc (or Disk; I'm not actually sure if that's the correct US spelling or mostly applicable to bit-storage devices; Disc is however correct English).

@vks can you squash please? I don't think this needs 6 commits.

Is there a reason why this is always f64

Status quo. See #100. I'd rather get this right once and break stuff as necessary than use a piecemeal solution. (Also, this won't actually be released until 0.6, so there's time to do so before it becomes a "breaking" change.)

@dhardy
Copy link
Member

dhardy commented Jul 27, 2018

Oh, and CI is fixed now... but is tripping up on average. Could you please fix that too @vks?

@vks
Copy link
Collaborator Author

vks commented Jul 27, 2018

In this case, it also seems there are people who want to sample from the disc itself, though I don't know if this has an application or is just a toy problem.

The Unity engine offers sampling from the inside of the unit circle and sphere, so I guess it is used in 3D games.

Perhaps I should have been clearer: we're sampling from the edge of the unit circle. (Since we return a pair of coordinates, it would be easy to mis-understand this as sampling from the disc.)

The language that I was using is the following:

  • "circle" = { (x, y) | x^2 + y^2 = R }
  • "disk" = { (x, y) | x^2 + y^2 <= R }
  • "sphere surface" = { (x, y, z) | x^2 + y^2 + z^2 = R }
  • "sphere volume" = { (x, y, z) | x^2 + y^2 + z^2 <= R }

I thought that would be fairly clear, but since we don't have UnitDisk, maybe I should be more explicit?

I'll look into the remaining failure and rebasing/squashing again.

vks added 3 commits July 27, 2018 15:22
This uses a method by Marsaglia (1972), which was found to be the
fastest of the methods given by [MathWorld].

[MathWorld]: http://mathworld.wolfram.com/SpherePointPicking.html
This introduces an external dev-dependency on the `average` crate.
@vks vks force-pushed the sample-sphere branch from 5da69c6 to f8767d0 Compare July 27, 2018 13:31
@vks
Copy link
Collaborator Author

vks commented Jul 27, 2018

I clarified we sample from the edge of the circle, fixed the type inference failure and rebased/squashed.

@vks vks changed the title Implement sampling from the unit sphere Implement sampling from the unit sphere and circle Jul 27, 2018
@dhardy
Copy link
Member

dhardy commented Jul 27, 2018

Great.

We could rename UnitSphereSurface to UnitShell, but this is not very descriptive of the shape, so probably not the best idea.

@dhardy dhardy merged commit c4cbb55 into rust-random:master Jul 27, 2018
@vks vks deleted the sample-sphere branch July 27, 2018 14:16
@dhardy dhardy mentioned this pull request Jul 31, 2018
28 tasks
@dhardy dhardy mentioned this pull request Dec 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants