faster randn with ifelse #9126

rfourquet · 2014-11-23T14:48:25Z

This small change makes randn (all versions) about 40-50% faster (or more) on my machine. A week ago I wanted to ask on the users list what the point of ifelse was; now I know!

StefanKarpinski · 2014-11-23T15:14:18Z

branch-free code for the win!

ViralBShah · 2014-11-23T15:22:59Z

This is awesome!

I have another suggestion; it would be great if you could try it out. If we separate the else part of if rabs < ki[idx+1], which is reached very rarely, into its own function, then randn becomes much shorter and @inline may yield a speedup.
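The suggestion above can be sketched roughly as follows. This is only an illustration in current Julia syntax, not the code from this PR: hot_path, rare_path and the arithmetic inside them are hypothetical stand-ins. The idea is that the common case stays in a tiny function that the compiler can inline cheaply, while the rarely taken fallback lives out of line.

```julia
# Hypothetical slow path: kept out of line so the hot function stays small.
@noinline function rare_path(x::Float64)
    return sqrt(abs(x)) + 1.0
end

# Hypothetical hot path: one comparison and an early return in the common case.
@inline function hot_path(x::Float64, threshold::Float64)
    x < threshold && return 2x   # common case, returns immediately
    return rare_path(x)          # rare fallback, not inlined
end

hot_path(1.0, 10.0)    # takes the common branch
hot_path(16.0, 10.0)   # falls through to rare_path
```

Whether this actually helps depends on the compiler's inlining heuristics, so it is worth benchmarking both variants.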

ViralBShah · 2014-11-23T15:41:46Z

Cc: @JuliaBackports

ivarne · 2014-11-23T16:14:43Z

It seems we have the same code on the release-0.3 branch (but a cherry-pick creates a merge conflict). What do you think about a backport?

Edit: it seems Viral had the same idea, but I didn't see it before posting.

ViralBShah · 2014-11-23T16:36:30Z

This change is interesting enough to be a blog post about ifelse!

StefanKarpinski · 2014-11-23T16:38:23Z

This is the poster child for ifelse because the branch is inherently unpredictable, so you're going to get pipeline stalls 50% of the time – which is awful for performance.
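As a rough illustration of the point about unpredictable branches, here is a minimal sketch in current Julia syntax. The function names are made up for this example and it is not the randn code itself. The first version branches on a flag; the second computes both candidates and selects one with ifelse, which evaluates all of its arguments and so gives the compiler a straightforward select instead of a jump.

```julia
# Branching version: the CPU has to predict which way `negative` goes.
function signed_branch(x::Float64, negative::Bool)
    if negative
        return -x
    else
        return x
    end
end

# Branch-free version: both -x and x are computed, then one is selected.
signed_select(x::Float64, negative::Bool) = ifelse(negative, -x, x)

signed_branch(2.0, true)
signed_select(2.0, true)
```

For a function this small the optimizer may already turn the branch into a select on its own, so the benefit of writing ifelse explicitly should be measured rather than assumed.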

tkelman · 2014-11-23T16:46:08Z

-1 on backporting this, especially right as we're trying to get a tag out the door. r % Bool is 0.4-only syntax, isn't it?

ViralBShah · 2014-11-23T16:47:09Z

This also paves the way for @simd. It will need a bit more restructuring, but the algorithm is inherently data parallel.

ivarne · 2014-11-23T16:53:17Z

I'm also negative on holding back 0.3.3 for this. We should rather do 0.3.4 in two weeks, if we think it is that important.

r % Bool is 0.4 syntax, but is just pretty syntax for (r&1)!=0. Backporting is getting harder all the time 😄
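A quick check of that equivalence, written for current Julia (low_bit is a helper name invented for this sketch): r % Bool keeps only the lowest bit of an integer, which is the same test as (r & 1) != 0.

```julia
# Hypothetical helper spelling out the 0.3-compatible form.
low_bit(r::Integer) = (r & 1) != 0

# Compare both spellings on a few 64-bit values.
for r in (UInt64(0), UInt64(1), UInt64(2), typemax(UInt64), 0x0123456789abcdef)
    @assert (r % Bool) == low_bit(r)
end
```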

ViralBShah · 2014-11-23T16:55:24Z

Ok, let's not have this block the 0.3.3 release. Agree this can wait for 0.3.4, or even 0.4.

ViralBShah · 2014-11-23T17:52:25Z

I can confirm the 50% speedup on my Mac as well.

@alanedelman You might love a 50% performance bump in randn.

tkelman · 2014-11-24T16:28:32Z

0.3.4-pre is open so we could consider backporting a 0.3-syntax version of this, as long as the same performance comparison applies there.

There is also a broader issue: the large number of recent RNG-related changes on 0.4, which has been diverging further and further from 0.3. I have no idea which, if any, would be safe/appropriate to backport at this time, and I don't think Ivar or Elliot know either. We should either:

- make a decision to leave the 0.3 RNGs as they are and not backport any of the more complicated changes, or
- have @ViralBShah and @rfourquet go through all the recent work, preferably early in a backport window such as the next couple of weeks, to decide which pieces of it are okay to backport, resolve any conflicts or syntax differences, test thoroughly, etc.

ivarne · 2014-11-24T17:52:29Z

This seemed pretty easy to backport, so I took the chance in 9f76ed3. Unless someone steps up and wants to redo some part of the recent work on release-0.4, I think we should leave the APIs as they are.

ViralBShah · 2014-11-25T03:17:04Z

There are a couple of things: the segfault fixed recently, the faster array fill, and the refactoring of randn. There were a few other perf improvements too, for random integers and such.

But we can't do any of the API changes, and the work is all mixed up. So pretty much all of this would have to be done again on 0.3.

rfourquet · 2014-11-25T05:45:09Z

I could do some backporting, but I am not sure yet how much work this implies and whether it's worth it. I have a few questions:

1. Is it OK that the "stream" of random numbers changes between two 0.3 sub-releases? (This is what happens with the fill_array stuff.)
2. Is it preferred that each commit in the release corresponds to a specific commit/PR from master?
3. No API changes: besides not breaking existing 0.3 code, does this mean not adding new API, so that a program valid in 0.3.4 is guaranteed to work in 0.3.3? Or is it enough to not document new possibilities?

I wonder whether the simplest approach would be to start from master's file and then disable things so that the API matches that of 0.3.

ViralBShah · 2014-11-25T05:57:47Z

1. I think the stream changing is a bad idea in a point release. As you point out, that will happen with fill_array and we should avoid it.
2. Not necessarily.
3. Yes, we cannot add APIs in a point release, since code on any 0.3.x release should work with any other 0.3.x release.
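To make the "stream" concern concrete, here is a minimal sketch in current Julia syntax (the seed 1234 and the length 5 are arbitrary choices for illustration). Within one Julia build, two generators seeded identically produce identical values; what a point release can break is the expectation that the same seed keeps producing the same values after an upgrade.

```julia
using Random

# Two independent generators with the same (arbitrary) seed.
a = randn(MersenneTwister(1234), 5)
b = randn(MersenneTwister(1234), 5)

# Same build, same seed: the streams match element for element.
@assert a == b
```

Code that stores reference values generated from a fixed seed would see them change if the generation algorithm changed between releases, even though each release is individually correct.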

ViralBShah · 2014-11-25T05:58:11Z

Perhaps all this suggests that it is best to leave the 0.3 branch as is.

rfourquet · 2014-11-25T06:47:29Z

I wouldn't be against that!

ivarne · 2014-11-25T07:43:03Z

1. Changing previously correct results means that we should have very good reasons for the change.
2. Only if it makes sense. If the diffs have to look significantly different to preserve APIs, it doesn't make sense. You can still link to issues, PRs and commits from master in the new commits.
3. The premise that code written with 0.3.3 should work with 0.3.0 is fundamentally flawed, as long as we fix anything other than performance problems. If a piece of 0.3.3 code depends on a bugfix that avoids a segfault, the code is guaranteed to segfault every time on 0.3.0. If you want your code to run on a platform with known bugs, you have to test it on that platform. What we should guarantee is that code written for 0.3.0 works on later releases, unless it depends on the wrong result from a library function (e.g. itrunc(::BigFloat) rounding instead of truncating).
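One way code can protect itself when it relies on a fix from a particular point release is an explicit version check. The sketch below uses Julia's built-in VERSION constant and version-number literals; MIN_VERSION and supports_fix are names invented for this illustration, and the version chosen is only an example.

```julia
# Hypothetical minimum version that contains a fix this code relies on.
const MIN_VERSION = v"0.3.3"

# True when the given Julia version is new enough to include that fix.
supports_fix(v::VersionNumber) = v >= MIN_VERSION

@assert supports_fix(v"0.3.4")
@assert !supports_fix(v"0.3.0")

# Check the running Julia itself.
supports_fix(VERSION)
```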

nalimilan · 2014-11-25T08:06:25Z

> Yes, we cannot add APIs in a point release, since code on any 0.3.x release should work with any other 0.3.x release.

On the contrary, as @ivarne says, I think adding new APIs from 0.4 to 0.3.x minor releases should be considered a good thing, as it allows writing code that works on both versions. That's how GTK works, for example: the last GTK2 version included almost all APIs of 3.0, except for breaking changes. This makes porting much smoother, and means you also benefit from new stuff in the last release. All that's needed is to note in the documentation when the function was introduced.

tkelman · 2014-11-25T17:48:42Z

Agreed with @ivarne and @nalimilan, API additions in 0.3.x should not be completely off the table but care needs to be taken, and it should only be done for very good bugfix reasons. The ease-of-porting issue can hopefully be addressed mostly by Compat.jl, but we'll see. For example we did sneak in backporting a Julia implementation of chmod last minute into 0.3.3 (it had been on master for 3 months just fine though), because it fixed a bug of not being able to delete read-only files. But as @nalimilan said we do probably need to document this kind of thing as clearly as we can.

ViralBShah · 2014-11-26T03:43:09Z

I am ok with bugfixes, but wary about introducing new APIs. If I have julia on two computers, one that is on 0.3.1 in a lab, and my laptop, say at 0.3.3, it would not be nice if code written in 0.3.3 did not work on 0.3.1. In many of the new RNG APIs, order of arguments is changed, AbstractRNG is allowed as an argument now in almost all cases, and so on. Performance updates, bugfixes and internal APIs are all ok to update. I am just wary about changing the behaviour of published APIs.

@rfourquet The recent segfault that you fixed, does it happen on 0.3 also? Since we did not have the array fill generators in 0.3, it is perhaps safe from the segfault?

rfourquet · 2014-11-26T04:04:33Z

No, the segfault was caused exclusively by the transition to the fill_array functions, so 0.3 is safe from it (as long as Array(Int32, 770), in dSFMT.jl, is 16-byte aligned).
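The alignment condition mentioned above can be checked directly. This is a small sketch in current Julia syntax, where Vector{Int32}(undef, 770) replaces the 0.3-era Array(Int32, 770); is_aligned is a helper name invented here, not something from dSFMT.jl.

```julia
# Hypothetical helper: is the array's data pointer a multiple of n bytes?
is_aligned(a::Array, n::Integer) = UInt(pointer(a)) % n == 0

# Same element type and length as the state buffer discussed above.
buf = Vector{Int32}(undef, 770)

is_aligned(buf, 16)
```

The result depends on the allocator of the Julia build in use, so this is a check to run on the target platform rather than a guarantee.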

ViralBShah · 2014-11-26T04:27:37Z

I just verified that, and 0.3 is indeed safe.

faster randn with ifelse

376afcf

rfourquet added the "randomness" label (Random number generation and the Random stdlib) Nov 23, 2014

ViralBShah added a commit that referenced this pull request Nov 23, 2014

Merge pull request #9126 from JuliaLang/rf/randn-ifelse

a331fec

faster randn with ifelse

ViralBShah merged commit a331fec into master Nov 23, 2014

ivarne added the backport pending label Nov 23, 2014

ViralBShah mentioned this pull request Nov 23, 2014

Final tweaks before 0.3.3 tagging #9120

Merged

ViralBShah mentioned this pull request Nov 23, 2014

attempt faster randn implementation #5105

Closed

rfourquet deleted the rf/randn-ifelse branch November 24, 2014 03:37

rfourquet added a commit that referenced this pull request Nov 24, 2014

faster randn by separating out unlikely branch in a function

616e1d7

All credits to @ViralBShah (cf. #8941 and #9126). This change probably allows better inlining.

rfourquet mentioned this pull request Nov 24, 2014

faster randn by separating out unlikely branch in a function #9132

Merged

rfourquet added a commit that referenced this pull request Nov 24, 2014

faster randn by separating out unlikely branch in a function

b99ea92

All credits to @ViralBShah (cf. #8941 and #9126). This change probably allows better inlining.

ivarne added a commit that referenced this pull request Nov 24, 2014

faster randn with ifelse

9f76ed3

Thanks to @rfourquet Backport of #9126 (from 376afcf)

ivarne removed the backport pending label Nov 24, 2014

garborg mentioned this pull request Nov 26, 2014

allow all bits values as type parameters. closes #6081 #9161

Merged

rfourquet added the "performance" label (Must go faster) Oct 6, 2020

rfourquet mentioned this pull request Oct 7, 2020

quite faster rand(::MersenneTwister, ::Type{Float64}) etc... #37916

Merged
