Thanks to @KristofferC, I was able to chase down quite a few of the indexing performance regressions since 0.4.2. There are still a few perf tests remaining with large regressions, however:
There's something odd going on with inlining and codegen. Here's a somewhat reduced test case:
using Benchmarks

function h1(A)
    dest = similar(A, (300,))
    h2(dest, A)
end

@inline function h2(dest, src)
    for i_2 = 1:size(dest, 2)
        for i_1 = 1:size(dest, 1)
            unsafe_setindex!(dest, i_1*i_2, i_1)
        end
    end
    dest
end

A = Array{Float32}(300, 500)
dest = similar(A, (300,))
@show @benchmark similar(A, (300,))
@show @benchmark h2(dest, A)
@show @benchmark h1(A)
On 0.4.2, I see 250ns, 350ns, and 510ns, respectively. On master, I get 210ns, 350ns, and 660ns. Removing the inlining annotation on _unsafe_getindex! seems to be a sensible workaround (and may be the right thing to do on its own merits, too)… but it'd be nice to figure out what's happening here.
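To make the size of the discrepancy concrete, here is a quick arithmetic check on the timings quoted above. It is only a sketch: the names `t_042`, `t_master`, and `overhead` are made up for illustration, and it compares `h1` against the sum of its two parts (`similar` plus `h2`) on each version.

```julia
# Timings in ns as quoted above, in the order (similar, h2, h1).
# These names are hypothetical helpers, not part of the benchmark code.
t_042    = (250, 350, 510)
t_master = (210, 350, 660)

# How much slower h1 is than the sum of its two parts (negative means faster).
overhead(t) = t[3] - (t[1] + t[2])

@show overhead(t_042)     # -90
@show overhead(t_master)  # 100
```

So on 0.4.2 the combined `h1` comes in about 90ns under the sum of its parts, while on master it is about 100ns over, even though the parts themselves are no slower.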
(This is using LLVM 3.3, with #14650 applied)
I poked at these briefly, but I wasn't able to spot the problem immediately. I'm afraid I won't be able to spend much more time on this.
Quite a few of the array tests are significantly faster than 0.4.2; here's the whole distribution for the array tests: