Refactor API for unconventionally-indexed arrays #17137

timholy · 2016-06-27T01:59:42Z

This is effectively "cleanup" from #16260. It gets rid of shape and renames allocate_for to be similar. It achieves these by introducing a new Range type, OneTo(n), which is equivalent (in a value-sense) to 1:n. These are changes recommended by @JeffBezanson. I'm hoping to avoid having OneTo "leak" out into user space, so it's not exported, but I suspect we may need to export it at some point (let's first see how far we can get without doing that).

Despite the name of this branch, it does not make unconventional indices safer (ref @arraysafe in #16973); it seems better to do that in a separate PR.

A key thing here is going to be to find out whether I've done something nasty for performance, so @nanosoldier runbenchmarks(ALL, vs=:master).

pabloferz · 2016-06-27T06:33:26Z

base/abstractarray.jl

-_similar(::IndicesBehavior, a::AbstractArray, T::Type)   = similar(a, T, indices(a))
+to_shape(::Tuple{}) = ()
+to_shape(dims::Dims) = dims
+to_shape(dims::DimsOrInds) = map(to_shape, dims)


Another idea (I don't know how convenient or cumbersome it'd be) would be to have a type Shape of Shp. This way this could changed to convert methods so one can write Shape(dims) and convert(Shape, dim).

to_shape is basically a compatibility call, packaging AbstractUnitRange ~~objects~~ tuples as Dims-tuples when possible (when the range can be guaranteed, via the type system, to start at 1). In that sense we already have the type(alias).

timholy · 2016-06-27T09:59:28Z

I seem to have messed up the nanosoldier call, so let's try again after adding quotes: @nanosoldier runbenchmarks(ALL, vs=":master").

nanosoldier · 2016-06-27T12:31:37Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

timholy · 2016-06-29T20:09:08Z

I've canceled the CI here; while the runtime looks good, 1b684d5 ("eliminate the indicesbehavior trait") absolutely hammers compile time.

Master:

julia> using Base.Test

shell> cd test/linalg/
/home/tim/src/julia-0.5b/test/linalg

julia> @time include("schur.jl")
 17.479364 seconds (13.13 M allocations: 516.591 MB, 1.47% gc time)

julia> @time include("schur.jl")
  1.011176 seconds (445.60 k allocations: 23.307 MB, 1.33% gc time)

This PR:

julia> using Base.Test

shell> cd test/linalg/
/home/tim/src/julia-0.5/test/linalg

julia> @time include("schur.jl")
 44.836752 seconds (72.31 M allocations: 3.040 GB, 2.34% gc time)

julia> @time include("schur.jl")
  0.902519 seconds (447.55 k allocations: 23.480 MB, 1.02% gc time)

I'll see what I can figure out. Meanwhile, it's a nice test case for folks who like to look at compiler performance 😉 .

timholy · 2016-06-29T20:32:03Z

Wow, the ProfileView looks like cathedrals of colorful inference.jl sand, reaching towards the heavens. So pretty and spiritual, I think I'll frame it.

timholy · 2016-06-29T20:51:55Z

Interesting observation: if I comment out these loops and just manually set the values, then master and this PR have similar compile times. But set eltya = Int and leave the atype loop intact, and there's a nearly 3x difference in compile times. Also possibly-relevant, changing the for loop to a while loop,

diff --git a/test/linalg/schur.jl b/test/linalg/schur.jl
index 5370afd..02a07dc 100644
--- a/test/linalg/schur.jl
+++ b/test/linalg/schur.jl
@@ -16,11 +16,17 @@ srand(1234321)
 areal = randn(n,n)/2
 aimg  = randn(n,n)/2

-for eltya in (Float32, Float64, Complex64, Complex128, Int)
+# for eltya in (Float32, Float64, Complex64, Complex128, Int)
+eltya = Int
     a = eltya == Int ? rand(1:7, n, n) : convert(Matrix{eltya}, eltya <: Complex ? complex(areal, aimg) : areal)
     asym = a'+a                  # symmetric indefinite
     apd  = a'*a                 # symmetric positive-definite
-    for atype in ("Array", "SubArray")
+    indx = 1
+    atypes = ["Array", "SubArray"]
+    while indx <= length(atypes)
+#    for atype in ("Array", "SubArray")
+        atype = atypes[indx]
+        indx += 1
         if atype == "Array"
             a = a
         else
@@ -96,4 +102,4 @@ for eltya in (Float32, Float64, Complex64, Complex128, Int)
         @test NS[:S] ≈ sS
         @test NS[:Z] ≈ sZ
     end
-end
+# end

does not fix the performance, so this seems different from #16122.

timholy · 2016-06-29T21:05:21Z

Here's a minimal reproducer:

using Base.Test

a = rand(1:7, 10, 10)
for atype in ("Array", "SubArray")
    d,v = eig(a)
end

Master:

julia> @time include("/tmp/tim/schur.jl")
  1.740690 seconds (1.93 M allocations: 84.396 MB, 2.91% gc time)

This PR:

julia> @time include("/tmp/tim/schur.jl")
 13.267818 seconds (46.09 M allocations: 1.985 GB, 5.30% gc time)

But now comment out the for loop, and they are the same.

JeffBezanson · 2016-06-29T21:49:45Z

Wow, interesting. What happens if you use an array instead of a tuple: for atype in ["Array", "SubArray"]?

timholy · 2016-06-29T22:09:56Z

Wow, interesting.

In the Chinese curse-sense, yes. 😄

What happens if you use an array instead of a tuple: for atype in ["Array", "SubArray"]?

Still slow.

I've debugged this a little further; one relevant point is that I can call

a = rand(1:7, 10, 10)
eig(a)

before includeing the file with the loop (or copy/pasting into the REPL), and it's still dirt-slow. But I can't take the eig call out of the loop without making it fast. Finally, it's all one giant toplevel-inference call: this diff

+
+type InfRef
+    val::Bool
+end
+const debug = InfRef(false)
 function typeinf_ext(linfo::LambdaInfo)
     if isdefined(linfo, :def)
+        local tstart
+        if debug.val
+            print(linfo.def.name, " start: ", ccall(:jl_clock_now, Float64, ()))
+        end
         # method lambda - infer this specialization via the method cache
         (code, _t, inferred) = typeinf_edge(linfo.def, linfo.specTypes, linfo.sparam_vals, true, true, true, linfo)
         if inferred && code.inferred && linfo !== code
@@ -1567,13 +1576,22 @@ function typeinf_ext(linfo::LambdaInfo)
             linfo.inferred = true
             linfo.inInference = false
         end
+        if debug.val
+            print(linfo.def.name, " stop: ", ccall(:jl_clock_now, Float64, ()))
+        end
         return code
     else
         # toplevel lambda - infer directly
+        if debug.val
+            println("toplevel inference")
+        end
         linfo.inInference = true
         frame = InferenceState(linfo, true, true)
         typeinf_loop(frame)
         @assert frame.inferred # TODO: deal with this better
+        if debug.val
+            println("toplevel inference done")
+        end
         return linfo
     end
 end

gives this output (having already called eig once):

julia> for atype in ("Array", "SubArray")
           d,v = eig(a)
       end
toplevel inference
toplevel inference done
:put! start: 1.46724e+09:put! stop: 1.46724e+09:start start: 1.46724e+09:start stop: 1.46724e+09:indexed_next start: 1.46724e+09:indexed_next stop: 1.46724e+09

and all the waiting is before the "toplevel inference done".

StirlingNewberry · 2016-06-29T22:29:59Z

宁為太平犬，莫做亂离人

Though the actual curse is English.

On Wed, Jun 29, 2016 at 6:10 PM, Tim Holy [email protected] wrote:

Wow, interesting.

In the Chinese curse-sense, yes. 😄

What happens if you use an array instead of a tuple: for atype in
["Array", "SubArray"]?

Still slow.

I've debugged this a little further; one relevant point is that I can call

a = rand(1:7, 10, 10)
eig(a)

before includeing the file with the loop (or copy/pasting into the REPL),
and it's still dirt-slow. But I can't take the eig call out of the loop
without making it fast. Finally, it's all one giant toplevel-inference
call: this diff

++type InfRef+ val::Bool+end+const debug = InfRef(false)
function typeinf_ext(linfo::LambdaInfo)
if isdefined(linfo, :def)+ local tstart+ if debug.val+ print(linfo.def.name, " start: ", ccall(:jl_clock_now, Float64, ()))+ end
# method lambda - infer this specialization via the method cache
(code, _t, inferred) = typeinf_edge(linfo.def, linfo.specTypes, linfo.sparam_vals, true, true, true, linfo)
if inferred && code.inferred && linfo !== code@@ -1567,13 +1576,22 @@ function typeinf_ext(linfo::LambdaInfo)
linfo.inferred = true
linfo.inInference = false
end+ if debug.val+ print(linfo.def.name, " stop: ", ccall(:jl_clock_now, Float64, ()))+ end
return code
else
# toplevel lambda - infer directly+ if debug.val+ println("toplevel inference")+ end
linfo.inInference = true
frame = InferenceState(linfo, true, true)
typeinf_loop(frame)
@Assert frame.inferred # TODO: deal with this better+ if debug.val+ println("toplevel inference done")+ end
return linfo
end
end

gives this output (having already called eig once):

julia> for atype in ("Array", "SubArray")
d,v = eig(a)
end
toplevel inference
toplevel inference done
:put! start: 1.46724e+09:put! stop: 1.46724e+09:start start: 1.46724e+09:start stop: 1.46724e+09:indexed_next start: 1.46724e+09:indexed_next stop: 1.46724e+09

and all the waiting is before the "toplevel inference done".

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#17137 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/ATKyInTNYpCYEmrmPTZtDE8iCzFUBMtXks5qQu1FgaJpZM4I-tDS
.

JeffBezanson · 2016-06-29T23:00:51Z

Inference seems to be encountering a large number of different signatures for _indices, all similar to

Tuple{Base.#_indices, Tuple{Base.OneTo{Int64}, Base.OneTo, Base.OneTo, Base.OneTo, Base.OneTo, Base.OneTo, Base.OneTo, Base.OneTo, Base.OneTo}, Int64, Base.SubArray{Base.Complex{Float32}, 2, A<:Union{Base.ReshapedArray{T<:Any, N<:Any, A<:DenseArray, MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}, N<:Any}}}, DenseArray}, Tuple{Union{Base.UnitRange{Int64}, Base.Colon}, Vararg{Union{Base.AbstractCartesianIndex, Base.Range{Int64}, Int64, Base.Colon}, N<:Any}}, true}, Union{Base.AbstractCartesianIndex{N<:Any}, Base.Range{Int64}, Int64, Base.Colon}, Vararg{Union{Base.AbstractCartesianIndex{N<:Any}, Base.Range{Int64}, Int64, Base.Colon}, N<:Any}}

timholy · 2016-06-30T00:50:41Z

Surprising, because there are exactly 6 definitions in this PR, fewer than the 8 there are in current Base. But of course they're called differently. The example you showed suggests something is calling it with 9 indices...given that we're talking about matrix algebra (2 dimensions), what the heck is that from?

JeffBezanson · 2016-06-30T01:23:19Z

It's from recursion in the inference process itself.

timholy · 2016-07-06T11:33:25Z

@nanosoldier runbenchmarks(ALL, vs = ":master").

timholy · 2016-07-06T15:41:59Z

@jrevels, OK to start nanosoldier now? (It's this PR I'm most interested in right now.)

jrevels · 2016-07-06T16:55:07Z

The daily build ran successfully, so I think it's good now:

@nanosoldier runbenchmarks(ALL, vs = ":master")

nanosoldier · 2016-07-06T19:26:22Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

timholy · 2016-07-07T02:51:28Z

@nanosoldier runbenchmarks(ALL, vs = ":master")

nanosoldier · 2016-07-07T05:22:56Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

Better to have this logic in the main functions rather than in specific packages

The reason to move earlier is to test whether the methods corrupt other operations.

…king

Splatting forces dynamic method lookup in places where that's a major cost

This leads to noticeable performance improvements for several benchmarks

timholy · 2016-07-07T21:41:23Z

I have a good feeling (based on local benchmarking) about this one: @nanosoldier runbenchmarks(ALL, vs = ":master").

nanosoldier · 2016-07-08T00:12:54Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

timholy · 2016-07-08T01:12:58Z

Fewer exports, cleaner API, fewer lines of code, and more performant---what's not to like? Advance warning, I'll be merging as soon as CI finishes---I don't have any reason to suspect that the one test failure (so far) has anything to do with this PR.

tkelman · 2016-07-09T13:58:52Z

It looks like this broke several packages with a MethodError: no method matching DataArrays.DataArray{T,N}(::Type{Bool}, ::Tuple{Base.OneTo{Int64}}) or MethodError: no method matching Array{T,N}(::Type{Any}, ::Tuple{Base.OneTo{Int64}}) etc, see PkgEval results for DataArrays and others

tkelman · 2016-07-10T01:17:40Z

Looks like the issue in DataArrays stems from it using broadcast_shape. That now returns OneTo objects that packages don't know what to do with. Tying into the unexported broadcast internals isn't ideal, but I suspect DataArrays is not the only package that may have needed to do that. I'm leaning in the direction of walking back on trying to support unconventionally indexed arrays for 0.5, since we really would like to have a package ecosystem that at least half works as it does on 0.4 by the time we have the final 0.5.0 done.

timholy · 2016-07-10T13:57:21Z

If they need a quick-fix, they should be able to just call Base.to_shape on the output of broadcast_shape. That said, just waiting for some consensus might be the better option.

tkelman · 2016-07-11T16:53:29Z

I'm stuck at JuliaStats/DataArrays.jl#205. Help appreciated.

pabloferz reviewed Jun 27, 2016
View reviewed changes

timholy force-pushed the teh/safer_indices branch from 454e43f to 3f7f844 Compare June 27, 2016 09:54

timholy force-pushed the teh/safer_indices branch from 3f7f844 to e1ef74a Compare June 29, 2016 18:47

timholy mentioned this pull request Jun 29, 2016

Allow passing a type as first argument to similar #17201

Closed

JeffBezanson mentioned this pull request Jun 30, 2016

compiler performance #14743

Closed

This was referenced Jul 1, 2016

Make unvetted size throw an error for arrays with non-1 indexing #17228

Merged

Inference (?) hang on recursive methods #17278

Closed

timholy force-pushed the teh/safer_indices branch from e1ef74a to 9a8941e Compare July 6, 2016 11:31

jrevels mentioned this pull request Jul 6, 2016

Error loading immutables when using Julia master JuliaIO/JLD.jl#77

Closed

timholy force-pushed the teh/safer_indices branch from 9a8941e to dfb8cfb Compare July 6, 2016 15:03

timholy force-pushed the teh/safer_indices branch from dfb8cfb to 88359dc Compare July 6, 2016 17:25

timholy force-pushed the teh/safer_indices branch from 88359dc to 2dcef54 Compare July 7, 2016 02:50

timholy added 14 commits July 7, 2016 16:41

Iteration improvements for arrays with non-1 indices

ec07812

Better to have this logic in the main functions rather than in specific packages

Replace shape with indices, and redesign similar

c318a9f

Update broadcast and its callers to the new indices behavior

15a522d

Updates for subarray for new indices

cac487d

Rename allocate_for -> similar. Fixes #17124.

1741ced

Eliminate the indicesbehavior trait

24f3534

Simpler sub2ind/ind2sub for Tuple{OneTo}

513af18

Fix offsetarray tests and move them earlier

66b6a61

The reason to move earlier is to test whether the methods corrupt other operations.

subarray indices: work around an inference hang (#17278)

5ebb702

Performance improvements for indices

90fc0b6

Elim. indicesperformance and use only a single path for bounds-chec…

cf9dd6b

…king

Reduce use of splatting in sparse matrix size operations

18f7bf5

Splatting forces dynamic method lookup in places where that's a major cost

Fixes for indices1(::Subarray) and indices1 specializations

5a52ff3

This leads to noticeable performance improvements for several benchmarks

Update the docs

1b2f11f

timholy merged commit efa23ad into master Jul 8, 2016

timholy deleted the teh/safer_indices branch July 8, 2016 09:18

This was referenced Jul 8, 2016

Use broadcast for array operators #17313

Merged

Display broken for SimpleVector (svec) #17338

Closed

andreasnoack mentioned this pull request Jul 9, 2016

Fix setindex! with SubDArray source JuliaParallel/DistributedArrays.jl#74

Closed

timholy mentioned this pull request Jul 9, 2016

Revise checkbounds again #17355

Merged

timholy mentioned this pull request Jul 11, 2016

Safe non-traditional array indexing #16973

Closed

ararslan mentioned this pull request Jul 12, 2016

Fix broadcast_shape when Base.OneTo is defined JuliaLang/Compat.jl#250

Closed

ranjanan mentioned this pull request Jul 27, 2016

WIP: Get things working on v0.5 JuliaCollections/DataStructures.jl#212

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor API for unconventionally-indexed arrays #17137

Refactor API for unconventionally-indexed arrays #17137

timholy commented Jun 27, 2016

pabloferz Jun 27, 2016

timholy Jun 27, 2016 •

edited

Loading

timholy commented Jun 27, 2016

nanosoldier commented Jun 27, 2016

timholy commented Jun 29, 2016

timholy commented Jun 29, 2016 •

edited

Loading

timholy commented Jun 29, 2016

timholy commented Jun 29, 2016

JeffBezanson commented Jun 29, 2016

timholy commented Jun 29, 2016

StirlingNewberry commented Jun 29, 2016

JeffBezanson commented Jun 29, 2016

timholy commented Jun 30, 2016

JeffBezanson commented Jun 30, 2016

timholy commented Jul 6, 2016

timholy commented Jul 6, 2016

jrevels commented Jul 6, 2016

nanosoldier commented Jul 6, 2016

timholy commented Jul 7, 2016

nanosoldier commented Jul 7, 2016

timholy commented Jul 7, 2016

nanosoldier commented Jul 8, 2016

timholy commented Jul 8, 2016

tkelman commented Jul 9, 2016 •

edited

Loading

tkelman commented Jul 10, 2016

timholy commented Jul 10, 2016

tkelman commented Jul 11, 2016

Refactor API for unconventionally-indexed arrays #17137

Refactor API for unconventionally-indexed arrays #17137

Conversation

timholy commented Jun 27, 2016

pabloferz Jun 27, 2016

Choose a reason for hiding this comment

timholy Jun 27, 2016 • edited Loading

Choose a reason for hiding this comment

timholy commented Jun 27, 2016

nanosoldier commented Jun 27, 2016

timholy commented Jun 29, 2016

timholy commented Jun 29, 2016 • edited Loading

timholy commented Jun 29, 2016

timholy commented Jun 29, 2016

JeffBezanson commented Jun 29, 2016

timholy commented Jun 29, 2016

StirlingNewberry commented Jun 29, 2016

JeffBezanson commented Jun 29, 2016

timholy commented Jun 30, 2016

JeffBezanson commented Jun 30, 2016

timholy commented Jul 6, 2016

timholy commented Jul 6, 2016

jrevels commented Jul 6, 2016

nanosoldier commented Jul 6, 2016

timholy commented Jul 7, 2016

nanosoldier commented Jul 7, 2016

timholy commented Jul 7, 2016

nanosoldier commented Jul 8, 2016

timholy commented Jul 8, 2016

tkelman commented Jul 9, 2016 • edited Loading

tkelman commented Jul 10, 2016

timholy commented Jul 10, 2016

tkelman commented Jul 11, 2016

timholy Jun 27, 2016 •

edited

Loading

timholy commented Jun 29, 2016 •

edited

Loading

tkelman commented Jul 9, 2016 •

edited

Loading