Colon and Range indexing performance regressions #14651

mbauman · 2016-01-12T01:43:53Z

Thanks to @KristofferC, I was able to chase down quite a few of the indexing performance regressions since 0.4.2. There are still a few perf tests remaining with large regressions, however:

Perf test name	Min time relative to 0.4.2
"sumcolonFb ArrayLS"	2.54728
"sumcolonFb ArrayLF"	2.50111
"sumcolonFb ArrayLSLS"	2.36786
"sumrangeFb ArrayLSLS"	1.94905
"sumrangeFb ArrayLS"	1.55122
"sumrangeFb ArrayLF"	1.53517
"sumcolonIb ArrayLSLS"	1.3634
"sumrangeIb ArrayLSLS"	1.31532
"sumcolonIb ArrayLF"	1.25546
"sumrangeIb ArrayLS"	1.23067
"sumcolonIb ArrayLS"	1.22831
"sumcolonFb Array"	1.22608

(This is using LLVM 3.3, with #14650 applied)

I poked at these briefly, but I wasn't able to spot the problem immediately. I'm afraid I won't be able to spend much more time on this.

Quite a few of the array tests are significantly faster than 0.4.2; here's the whole distribution for the array tests:

mbauman · 2016-01-15T01:19:22Z

There's something odd going on with inlining and codegen. Here's a somewhat reduced testcase:

using Benchmarks
function h1(A)
    dest = similar(A, (300,))
    h2(dest, A)
end
@inline function h2(dest, src)
    for i_2 = 1:size(dest,2)
        for i_1 = 1:size(dest,1)
            unsafe_setindex!(dest,i_1*i_2,i_1)
        end
    end
    dest
end

A = Array{Float32}(300,500)
dest = similar(A, (300,))

@show @benchmark similar(A, (300,))
@show @benchmark h2(dest, A)
@show @benchmark h1(A)

On 0.4.2, I see 250ns, 350ns, and 510ns, respectively. On master, I get 210ns, 350ns, and 660ns. Removing the inlining annotation on _unsafe_getindex! seems to be a sensible workaround (and may be the right thing to do on its own merits, too)… but it'd be nice to figure out what's happening here.

blakejohnson · 2016-01-15T03:25:38Z

The timings for that script on my machine tell a different story. On 0.4.2 I see 145ns, 465ns, and 655ns... On master I see 150ns, 470ns, and 475ns.

mbauman · 2016-02-06T16:54:49Z

These are all different on LLVM 3.7, and I'm not seeing anything too terribly drastic anymore. I'm closing this guy as being stale.

mbauman mentioned this issue Jan 12, 2016

Performance regression in unsafe_getindex #14594

Closed

mbauman added performance Must go faster regression Regression in behavior compared to a previous version labels Jan 12, 2016

mbauman closed this as completed Feb 6, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Colon and Range indexing performance regressions #14651

Colon and Range indexing performance regressions #14651

mbauman commented Jan 12, 2016

mbauman commented Jan 15, 2016

blakejohnson commented Jan 15, 2016

mbauman commented Feb 6, 2016

Colon and Range indexing performance regressions #14651

Colon and Range indexing performance regressions #14651

Comments

mbauman commented Jan 12, 2016

mbauman commented Jan 15, 2016

blakejohnson commented Jan 15, 2016

mbauman commented Feb 6, 2016