Specialized vector rand! for many distributions #1879
Conversation
This pull request will probably cause a repeat of #1687 on a larger scale, but I don't think it's really an issue, as this is consistent with Julia behavior. Weird trivia example regarding Random.jl:
Codecov Report: All modified and coverable lines are covered by tests ✅

@@            Coverage Diff             @@
##           master    #1879      +/-   ##
==========================================
+ Coverage   85.99%   86.08%   +0.08%
==========================================
  Files         144      144
  Lines        8666     8693      +27
==========================================
+ Hits         7452     7483      +31
+ Misses       1214     1210       -4

☔ View full report in Codecov by Sentry.
@devmotion Could I get this reviewed if the scope of the pull request is not too wide?
Co-authored-by: David Widmann <[email protected]>
@quildtide can you fix the conflict? I'm busy until Wednesday this week, but please ping me on Friday in case I haven't reviewed the PR by then.
Can you please add newline characters at the end of the newly added (and modified) pages? Thank you!
Co-authored-by: David Widmann <[email protected]>
@devmotion Requested changes have been made.
Thank you, @quildtide!
The much-larger sequel to #1269 and #1680.

All claims tested on Julia 1.11.0-rc1, Windows 10, AMD Ryzen 9 5900X, RTX 3080.

I was originally planning on making the normal dispatch path faster for all univariate distributions on `rand!`, but it turns out that the current dispatch path is probably about as fast as we will ever get with a generic `rand` -> `rand!` fallback path. I tested a few alternatives, and they were all similar or slower.

The root cause of why this dispatch path is slower than specialized vector methods is that Julia's base `rand`/`randn`/`randexp` functions have `rand!` specializations for both Xoshiro and MersenneTwister that perform much faster than calling scalar `rand` many times. Xoshiro actually has a separate SIMD implementation in Random.jl, and both MersenneTwister and Xoshiro branch on vector length to choose optimal algorithms.

Creating specialized `rand!` methods yields some pretty impressive performance gains as a result. I plan on continuing further with this, but this pull request contains vectorized `rand!` specializations for:

- Laplace (~185% faster)
- InverseGaussian (~102% faster)
- PGeneralizedGaussian (~60% faster)

Benchmarking was done in a very ad hoc way; I grabbed 3 values from `@time`, excluding any runs that included compilation or GC time. This is a small sample size, but the results were pretty consistent, and I don't feel a need to prove significance when they are all consistently >40% faster.

Positive side-effects of vectorized specializations
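As a concrete illustration of the pattern these specializations follow, here is a minimal sketch using Laplace as the example. This is not the PR's actual code, and the function and argument names are illustrative; the point is that one bulk `randexp!` call hits Random.jl's fast vector path, after which only a cheap in-place transform remains.

```julia
using Random

# Illustrative sketch (not the PR's actual code): a Laplace(mu, theta) draw
# is mu + s * theta * E, where E ~ Exp(1) and s is a random sign.
# One bulk randexp! call uses Random.jl's fast vector path; the affine
# transform then runs in place over the buffer.
function laplace_rand!(rng::AbstractRNG, mu::Float64, theta::Float64,
                       x::AbstractArray{Float64})
    randexp!(rng, x)                           # bulk Exp(1) draws
    @inbounds for i in eachindex(x)
        s = rand(rng, Bool) ? theta : -theta   # random sign, drawn as a Bool
        x[i] = muladd(s, x[i], mu)             # mu + s * x[i], fused
    end
    return x
end
```

This sketch also happens to use two of the scalar-level tweaks mentioned later: `muladd` for the affine transform and `rand(rng, Bool)` for the sign flip.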
Improved GPU compatibility

I don't have any non-CUDA GPUs to test with, so testing has been CUDA-only, but these things are probably true for other GPU types too.

CUDA.jl does not like the for-loop in the generic fallback `rand!` dispatch path in Distributions.jl. However, specialized dispatch paths can help alleviate this issue. These distributions now have functional out-of-the-box CUDA.jl support for `rand!`:

These distributions would probably work with the addition of a `randexp` implementation in CUDA.jl:

- Laplace

`rand!(::PGeneralizedGaussian)` is dependent on `rand(::Gamma)`, and CUDA.jl does not like functions that call `rand(::Gamma)`. I'm not sure why `rand(::InverseGaussian)` isn't working; I'll have to revisit that later.

Improved Dagger.jl compatibility
The current default fallback `rand!` methods perform so poorly on Dagger.jl that they cannot even really be benchmarked. This is mostly due to issues on the Dagger.jl side that are currently being addressed. I only did a tiny bit of testing with these new methods, but they do perform better than the generic fallbacks.

Minor adjustments to scalar rand methods

While I was implementing the specialized `rand!` methods, I noticed opportunities to make small changes to some scalar `rand` methods that would, in theory, yield small performance or accuracy benefits. In practice, none of these changes seem to have significant observable effects on speed:

- uses `muladd` now
- uses `muladd` now
- uses `muladd` now
- uses `rand(rng, Bool)` instead of `rand(rng) < 0.5` now

Additional bloating for the test suite
One negative drawback of increased specialization of methods is that there is a larger surface for bugs to emerge. This is doubly so when I'm also making small adjustments to the scalar `rand` methods.

I compensated by adding new functionality to `test_distr` and `test_samples` (enabled by keyword) that allows these functions to call the scalar `rand` functions multiple times. I also added `test_distr` entries for all of the distributions that I've added new methods for.

Unfortunately, I'm pretty sure a lot of the testing I added also duplicates work done by existing tests. 🤷 I didn't remove any existing tests because I was unsure exactly which parts were redundant. I think code coverage probably went up a bit as a side effect.

I also had to make minor tolerance adjustments to 2 tests.
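The repeated-scalar-draw idea can be sketched as follows. `sample_scalar_many` is a hypothetical helper for illustration, not the actual `test_distr`/`test_samples` keyword machinery, which is not shown here: the scalar path is exercised many times so its output statistics can be compared against a vectorized fill.

```julia
using Random, Statistics

# Hypothetical helper (not Distributions.jl's actual test code): draw n
# samples by calling a scalar sampler n times, so the scalar path runs
# repeatedly and can be compared against a bulk fill on summary statistics.
function sample_scalar_many(rng::AbstractRNG, sampler, n::Int)
    return [sampler(rng) for _ in 1:n]
end

# Example comparison: n scalar randn calls vs one bulk randn! fill.
function compare_paths(n::Int)
    scalar = sample_scalar_many(MersenneTwister(1), randn, n)
    bulk   = randn!(MersenneTwister(2), zeros(n))
    return abs(mean(scalar) - mean(bulk))
end
```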
Future Plans
A lot of distributions are dependent on `rand(::Gamma)`, which is non-trivial to translate to a vectorized form. Accomplishing this would probably enable out-of-the-box GPU support and large performance gains for all dependent distributions.

Most of the distributions addressed in this pull request were picked because I thought they would be trivial or easy to implement, but there's still a ton of others remaining in this category.