Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Speedier, simpler and more systematic index conversions #19730

Merged
merged 15 commits into from
Jan 13, 2017

Conversation

mbauman
Copy link
Member

@mbauman mbauman commented Dec 27, 2016

This patch dramatically simplifies the fallback indexing algorithms for arrays
by standardizing the way in which the passed indices are converted to_index
and by rigorously defining what a supported index type is.

The very first thing that occurs within the fallback indexing methods is that
the passed indices are converted to a supported indexing type by
to_indices(...). A supported index type is either Int or an AbstractArray
of indices. This means that it is at this point that both Colon and arrays of
booleans get converted to collections of indices (as specialized array types:
Slice and LogicalIndex, respectively). This drastically simplifies much
of the internal logic. Whereas we had previously encoded the special behaviors
of : indexing in about four places, this patch does it once and then stores
the result in a Slice AbstractVector. Similarly, CartesianIndex is now
converted within to_indices, as well.

In addition to simplifying the internal definitions, this patch also makes it
easier for both external packages and Base to add new behaviors in a systematic
fashion. Simple extensions would include things like Not indices and
EndpointRanges.jl.

Additionally, logical indexing is now represented by a specialized
LogicalIndex type that enables fast iteration over the boolean mask without
allocations. This allows index conversions to occur before checking bounds,
and should enable further optimizations for LinearSlow arrays.

Breaking Behaviors:

  • The stored type of indices in SubArrays change from Colon to Slice.

Non-breaking behavior changes:

  • Fully deprecates using Colon as pseudo-collections of integers.
  • Logical indices are converted to a lazy, iterative collection of indices that doesn't support indexing. A deprecation provides indexing with O(n) search.

TODO:

  • CI pass
  • @nanosoldier runbenchmarks(ALL, vs = ":master")
  • Decide: should we export and formally document to_indices? YES. I think I'd be inclined to continue to not export to_index, particularly now that it's not the entire story. You must call to_indices to get the complete conversion, but you can extend to_index to provide simpler customizations. Exported to_indices.

@mbauman
Copy link
Member Author

mbauman commented Dec 27, 2016

I'd just like to note that this PR is at a net-negative LOC… and that includes ~30 lines of deprecations and ~40 lines of docstrings. :)

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@mbauman mbauman force-pushed the mb/too_many_indices branch 3 times, most recently from de5c8b4 to 477f28e Compare December 29, 2016 00:00
@mbauman
Copy link
Member Author

mbauman commented Dec 29, 2016

Let's see if Nanosoldier is any happier with the latest iteration. @nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@mbauman
Copy link
Member Author

mbauman commented Dec 29, 2016

Third time's the charm? @nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@mbauman mbauman changed the title WIP: Simpler and more systematic index conversions RFC: Simpler and more systematic index conversions Dec 29, 2016
@mbauman
Copy link
Member Author

mbauman commented Dec 29, 2016

Alright, this is looking pretty good. I think I should have mitigated the remaining regressions. One unfortunate thing is that this increases the impact of a type-instability within indexing code — hence the cat regressions.

@andreasnoack
Copy link
Member

So the broadcast+fusion regression is gone with the latest changes?

@mbauman
Copy link
Member Author

mbauman commented Dec 30, 2016

Yes, I believe so. I'm feeling a little bad about co-opting nanosoldier time, but we should run it again. I can't reproduce the fusion regression locally on my old hardware so it's hard for me to debug it.

I should sell the case for the refactor a little better, but I strongly believe it's a much better and saner design.

@tkelman
Copy link
Contributor

tkelman commented Dec 30, 2016

I'm feeling a little bad about co-opting nanosoldier time

Please don't, this is exactly what it's for and why we're going a little out of our way to try not to break it right now. If there are a lot of people using it, it queues jobs, which is fine. I don't think the nanosoldier queue has ever been especially deep.

@KristofferC
Copy link
Member

@nanosoldier runbenchmarks(ALL, vs = ":master")

@jrevels
Copy link
Member

jrevels commented Dec 30, 2016

retriggering Nanosoldier, ref my comment here

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Something went wrong when running your job:

NanosoldierError: failed to run benchmarks against primary commit: ErrorException("failed process: Process(`sudo cset shield -e su nanosoldier -- -c ./benchscript.sh`, ProcessExited(139)) [139]")

Logs and partial data can be found here
cc @jrevels

@jrevels
Copy link
Member

jrevels commented Dec 30, 2016

Looks like Nanosoldier's benchmark script segfaulted. The script it generates can be found here, we can see from the STDOUT log that it at least executed the cset command before it died.

r
end
function _setindex!(::LinearFast, A::AbstractVector, v, I1::Real, I::Real...)
function _setindex!(::LinearFast, A::AbstractVector, v, I1::Int, I::Int...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I right that this signature is for allowing extra trailing 1's?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup.

@@ -1143,23 +1140,29 @@ function cat_t(dims, T::Type, X...)
return _cat(A, shape, catdims, X...)
end

_cat_inds(i, x, ::Tuple{}, offsets, concat) = ()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could do with some comments, what these recursions are going for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is more than is necessary here… and probably has compile-time performance implications. I reverted it in favor of a simpler work-around.

"""
to_index(A, i)

Convert index `i` for use into indexing into array `A`. Custom array types
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"for use in indexing" ?

index `i`. More complicated index types, however, require more context about
the dimension into which they index. To support those cases, `to_indices(A, I)`
calls `to_indices(A, indices(A), I)`, which then recursively walks through both
the given tuple of indices and the indices in `A` in tandem.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"indices in A" or "indices of A" ?

@tkelman
Copy link
Contributor

tkelman commented Jan 2, 2017

This is pretty neat. Hasn't gotten a whole lot of review, but should we be trying to get it in right now as we're finalizing feature freeze for 0.6?

@tkelman tkelman added the arrays [a, r, r, a, y, s] label Jan 2, 2017
@mbauman
Copy link
Member Author

mbauman commented Jan 2, 2017

Thanks, Tony. I think the standardization here is really valuable. It'd be nice if this could make the cut.

That said, I'd like some feedback on the API of to_indices and to_index. This PR makes to_indices great to use as a caller. Note how it will automatically make view work with any custom index type implements its conversion within to_indices. This will make it much easier to add custom index types and have them just work everywhere. It's also really nice internally; we only need to worry about indices of type Int and AbstractArray within the all internal functions.

The trouble is that implementing a custom to_indices specialization is rather messy because it has to fit into a complicated recursion setup. For example, here's a simplistic take on Not indexing:

julia> immutable Not{T}
           vals::T
       end

julia> import Base: to_indices, uncolon, tail, _maybetail
       @inline to_indices(A, inds, I::Tuple{Not, Vararg{Any}}) =
           (setdiff(uncolon(inds, (:, tail(I)...)), I[1].vals), to_indices(A, _maybetail(inds), tail(I))...)
to_indices (generic function with 9 methods)

julia> A = reshape(1:12, 3, 4);
       A[Not(2), Not(2:3)]
2×2 Array{Int64,2}:
 1  10
 3  12

julia> view(A, Not(2), Not(2:3))
2×2 SubArray{Int64,2,Base.ReshapedArray{Int64,2,UnitRange{Int64},Tuple{}},Tuple{Array{Int64,1},Array{Int64,1}},false}:
 1  10
 3  12

Of course, some of this is nasty because of generalized linear indexing; that's why I'm leaning on the internal uncolon. But the recursion alone is tricky. A simpler example that's still tough to get right is if a custom array wishes to support scalar indexing with non-integers. For example, to support the SymInt example I posited in #12567, you can "just" protect the symbols within a 0-dimensional array:

import Base: to_indices, _maybetail, tail
@inline to_indices(A::SymInt, inds, I::Tuple{Symbol, Vararg{Any}}) =
    (fill(I[1]), to_indices(A, _maybetail(inds), tail(I)))

So to make this simpler, I added fallbacks to the existing to_index function, allowing you to not need to worry about recursion or the parent's indices if you don't need to. So for the symbol example, you can simply write to_index(::SymInt, s::Symbol) = fill(s). I don't really like this, though, since not all indices will fall back here. Only those indices that don't need the extra information will hit to_index. I could see someone getting really frustrated when their custom to_index(::MyArray, ::Colon) isn't getting called.

So we can either document that extending to_index might be crazy-making if someone has intercepted your method within to_indices… or we can nix to_index entirely and just document how to extend to_indices very carefully. Thoughts?

Copy link
Member

@timholy timholy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it. A couple of questions:

  • I'm a little confused about transforming Colon->Slice, can you explain a bit why that's necessary?
  • You haven't by any chance tested the impact on AxisArrays.jl and Interpolations.jl, have you?

first(Iterators.drop(A, i-1))
end
function to_indexes(I...)
depwarn("to_indexes is deprecated; pass both the source array and tuple of indices to `to_indices(A, I)` instead.", :to_indexes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be more helpful to say "pass both the source array A and indices as to_indices(A, $(I...))"?

@@ -1144,6 +1144,39 @@ eval(Base.Dates, quote
recur{T<:TimeType}(fun::Function, start::T, stop::T; step::Period=Day(1), negate::Bool=false, limit::Int=10000) = recur(fun, start:step:stop; negate=negate)
end)

# Index conversions revamp; #19730
function getindex(A::LogicalIndex, i::Int)
depwarn("indexing into logical indices is deprecated; use iteration or collect the indices into an array instead.", :getindex)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might not be entirely clear; maybe couple it more directly to the inputs?

_to_index(i) = to_index(I)
_to_index(c::Colon) = c
function getindex(::Colon, i)
depwarn("indexing into Colons is deprecated; convert Colons to Slices with to_indices", :getindex)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise an example would be helpful, here and below.

Colons (:) are used to signify indexing entire objects or dimensions at once.
Very few operations are defined on Colons directly; instead they are converted
to `Slice`s upon indexing (within `to_indices`).
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should go into the stdlib doc index

@@ -320,7 +320,7 @@ A = copy(reshape(1:120, 3, 5, 8))
sA = view(A, 2:2, 1:5, :)
@test strides(sA) == (1, 3, 15)
@test parent(sA) == A
@test parentindexes(sA) == (2:2, 1:5, :)
@test parentindexes(sA) == (2:2, 1:5, Base.Slice(1:8))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably deprecate parentindexes, it doesn't seem that useful anymore.

@@ -4,7 +4,7 @@
module IteratorsMD
import Base: eltype, length, size, start, done, next, last, in, getindex,
setindex!, linearindexing, min, max, zero, one, isless, eachindex,
ndims, iteratorsize, to_index
ndims, iteratorsize
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, with GitHub's new behavior of not showing big diffs, I almost missed the most interesting part of this...

index_lengths() = ()
@inline index_lengths(::Real, rest...) = (1, index_lengths(rest...)...)
@inline index_lengths(A::AbstractArray, rest...) = (length(A), index_lengths(rest...)...)
@inline index_lengths(A::Slice, rest...) = (length(indices1(A)), index_lengths(rest...)...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

mask::A
sum::Int
LogicalIndex(mask::A) = new(mask, countnz(mask))
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be interesting to see whether there are any performance consequences---could potentially be an extra trip through mask for some algorithms.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes; in general I was surprised how hard I had to fight to match the allocating find algorithm. find(::BitArray) is wonderfully optimized and tough to beat.

calls `to_indices(A, indices(A), I)`, which then recursively walks through both
the given tuple of indices and the dimensional indices of `A` in tandem.
"""
to_indices(A, I::Tuple) = (@_inline_meta; to_indices(A, indices(A), I))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you really need to pass A once you've called indices(A)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but the advantage is that it allows for a simpler to_index(A, i) method for array types to override.

@inline function _getindex{N}(l::LinearIndexing, A::AbstractArray, I::Vararg{Union{Real, AbstractArray, Colon},N}) # TODO: DEPRECATE FOR #14770
@boundscheck checkbounds(A, I...)
_unsafe_getindex(l, _maybe_reshape(l, A, I), I...)
@generated function _getindex(l::LinearIndexing, A::AbstractArray, I::Union{Real, AbstractArray}...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Real or Int?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this could be tightened eventually. It's not strictly necessary, though.

@vchuravy
Copy link
Member

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@oscardssmith
Copy link
Member

Is it concerning that some of these tests took 0.00 times as long on this branch as master? That suggests to me, that something is broken. Assuming these were not a mistake, nanosoldier seems to quite like this.

@mbauman
Copy link
Member Author

mbauman commented Jan 12, 2017

The results here are definitely real. This refactor allows Julia+LLVM to drastically reduce and/or eliminate the creation of some heap-allocated temporaries inside the loops. This leads directly to improvements of several orders of magnitude since the only other "work" being done each iteration is a simple addition. And with ranges it's even better than 100x: the loop becomes simple enough that it can be entirely replaced with a constant-time algorithm.

I've added two commits: one addresses your comment @tkelman, and the other adds an optimization for indexing into linear fast arrays with multidimensional logical masks by allowing the LogicalIndex to be a collection of linear indices in that case.

@vchuravy vchuravy merged commit 54ffc51 into master Jan 13, 2017
@vchuravy vchuravy deleted the mb/too_many_indices branch January 13, 2017 02:06
@StefanKarpinski
Copy link
Member

This is lovely work, @mbauman.

@JeffBezanson
Copy link
Member

Yes, this is an awesome improvement.

I still don't really understand the purpose of the Slice type. It seems very similar to OneTo. Could we normalize colons to those instead?

@mbauman
Copy link
Member Author

mbauman commented Jan 13, 2017

Thanks!

@JeffBezanson - Slice is really only required to support the current semantics of OffsetArrays, where indexing A[:] returns a similarly offset array. We use the offset-ness of the indices to determine the shape of the result, so in those cases : represents an offset vector. I suppose we could ask offset array packages to implement this, but I wanted to keep this non-breaking.

Edit: Ah, the other reason for Slice as opposed to OneTo is that some SubArray and BitArray optimizations need to know that the range spans the entire dimension. Slice communicates that.

@tkelman
Copy link
Contributor

tkelman commented Jan 13, 2017

This has been causing an abstractarray test failure, but the exit status gets lost when tests run in parallel

mbauman added a commit to mbauman/julia that referenced this pull request Jan 13, 2017
Fixes the test failure accidentally introduced in JuliaLang#19730
@mbauman
Copy link
Member Author

mbauman commented Jan 13, 2017

That's a simple fix, but do we have an issue for make test missing failures?

@StefanKarpinski
Copy link
Member

This has been causing an abstractarray test failure, but the exit status gets lost when tests run in parallel

Oops – that's two bugs. The lost exit status one is actually worse.

@tkelman
Copy link
Contributor

tkelman commented Jan 13, 2017

not yet, but it's been happening for at least a couple weeks. first thing I intend to try is disabling the anonymous module change Yichao made

edit: opened as #20027

@Sacha0 Sacha0 added deprecation This change introduces or involves a deprecation needs news A NEWS entry is required for this change labels May 19, 2017
@Sacha0
Copy link
Member

Sacha0 commented May 19, 2017

@mbauman, this work seems worth a news entry. I'm afraid I wouldn't do it justice. Would you have time to sketch an appropriate entry? Best! (Ref. #21475)

mbauman added a commit that referenced this pull request May 31, 2017
Sacha0 pushed a commit that referenced this pull request Jun 6, 2017
tkelman pushed a commit that referenced this pull request Jun 17, 2017
* NEWS for to_indices (#19730)
(cherry picked from commit 91f08bb)
@mbauman mbauman removed the needs news A NEWS entry is required for this change label Jul 11, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] deprecation This change introduces or involves a deprecation
Projects
None yet
Development

Successfully merging this pull request may close these issues.