-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Speedier, simpler and more systematic index conversions #19730
Conversation
I'd just like to note that this PR is at a net-negative LOC… and that includes ~30 lines of deprecations and ~40 lines of docstrings. :) |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels |
de5c8b4
to
477f28e
Compare
Let's see if Nanosoldier is any happier with the latest iteration. @nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels |
Third time's the charm? @nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels |
Alright, this is looking pretty good. I think I should have mitigated the remaining regressions. One unfortunate thing is that this increases the impact of a type-instability within indexing code — hence the cat regressions. |
So the |
Yes, I believe so. I'm feeling a little bad about co-opting nanosoldier time, but we should run it again. I can't reproduce the fusion regression locally on my old hardware so it's hard for me to debug it. I should sell the case for the refactor a little better, but I strongly believe it's a much better and saner design. |
Please don't, this is exactly what it's for and why we're going a little out of our way to try not to break it right now. If there are a lot of people using it, it queues jobs, which is fine. I don't think the nanosoldier queue has ever been especially deep. |
@nanosoldier |
retriggering Nanosoldier, ref my comment here @nanosoldier |
Looks like Nanosoldier's benchmark script segfaulted. The script it generates can be found here, we can see from the STDOUT log that it at least executed the |
r | ||
end | ||
function _setindex!(::LinearFast, A::AbstractVector, v, I1::Real, I::Real...) | ||
function _setindex!(::LinearFast, A::AbstractVector, v, I1::Int, I::Int...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I right that this signature is for allowing extra trailing 1's?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup.
@@ -1143,23 +1140,29 @@ function cat_t(dims, T::Type, X...) | |||
return _cat(A, shape, catdims, X...) | |||
end | |||
|
|||
_cat_inds(i, x, ::Tuple{}, offsets, concat) = () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could do with some comments, what these recursions are going for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is more than is necessary here… and probably has compile-time performance implications. I reverted it in favor of a simpler work-around.
""" | ||
to_index(A, i) | ||
|
||
Convert index `i` for use into indexing into array `A`. Custom array types |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"for use in indexing" ?
index `i`. More complicated index types, however, require more context about | ||
the dimension into which they index. To support those cases, `to_indices(A, I)` | ||
calls `to_indices(A, indices(A), I)`, which then recursively walks through both | ||
the given tuple of indices and the indices in `A` in tandem. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"indices in A
" or "indices of A
" ?
This is pretty neat. Hasn't gotten a whole lot of review, but should we be trying to get it in right now as we're finalizing feature freeze for 0.6? |
Thanks, Tony. I think the standardization here is really valuable. It'd be nice if this could make the cut. That said, I'd like some feedback on the API of The trouble is that implementing a custom julia> immutable Not{T}
vals::T
end
julia> import Base: to_indices, uncolon, tail, _maybetail
@inline to_indices(A, inds, I::Tuple{Not, Vararg{Any}}) =
(setdiff(uncolon(inds, (:, tail(I)...)), I[1].vals), to_indices(A, _maybetail(inds), tail(I))...)
to_indices (generic function with 9 methods)
julia> A = reshape(1:12, 3, 4);
A[Not(2), Not(2:3)]
2×2 Array{Int64,2}:
1 10
3 12
julia> view(A, Not(2), Not(2:3))
2×2 SubArray{Int64,2,Base.ReshapedArray{Int64,2,UnitRange{Int64},Tuple{}},Tuple{Array{Int64,1},Array{Int64,1}},false}:
1 10
3 12 Of course, some of this is nasty because of generalized linear indexing; that's why I'm leaning on the internal import Base: to_indices, _maybetail, tail
@inline to_indices(A::SymInt, inds, I::Tuple{Symbol, Vararg{Any}}) =
(fill(I[1]), to_indices(A, _maybetail(inds), tail(I))…) So to make this simpler, I added fallbacks to the existing So we can either document that extending |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it. A couple of questions:
- I'm a little confused about transforming Colon->Slice, can you explain a bit why that's necessary?
- You haven't by any chance tested the impact on AxisArrays.jl and Interpolations.jl, have you?
first(Iterators.drop(A, i-1)) | ||
end | ||
function to_indexes(I...) | ||
depwarn("to_indexes is deprecated; pass both the source array and tuple of indices to `to_indices(A, I)` instead.", :to_indexes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be more helpful to say "pass both the source array A
and indices as to_indices(A, $(I...))
"?
@@ -1144,6 +1144,39 @@ eval(Base.Dates, quote | |||
recur{T<:TimeType}(fun::Function, start::T, stop::T; step::Period=Day(1), negate::Bool=false, limit::Int=10000) = recur(fun, start:step:stop; negate=negate) | |||
end) | |||
|
|||
# Index conversions revamp; #19730 | |||
function getindex(A::LogicalIndex, i::Int) | |||
depwarn("indexing into logical indices is deprecated; use iteration or collect the indices into an array instead.", :getindex) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might not be entirely clear; maybe couple it more directly to the inputs?
_to_index(i) = to_index(I) | ||
_to_index(c::Colon) = c | ||
function getindex(::Colon, i) | ||
depwarn("indexing into Colons is deprecated; convert Colons to Slices with to_indices", :getindex) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise an example would be helpful, here and below.
Colons (:) are used to signify indexing entire objects or dimensions at once. | ||
Very few operations are defined on Colons directly; instead they are converted | ||
to `Slice`s upon indexing (within `to_indices`). | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should go into the stdlib doc index
@@ -320,7 +320,7 @@ A = copy(reshape(1:120, 3, 5, 8)) | |||
sA = view(A, 2:2, 1:5, :) | |||
@test strides(sA) == (1, 3, 15) | |||
@test parent(sA) == A | |||
@test parentindexes(sA) == (2:2, 1:5, :) | |||
@test parentindexes(sA) == (2:2, 1:5, Base.Slice(1:8)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably deprecate parentindexes, it doesn't seem that useful anymore.
@@ -4,7 +4,7 @@ | |||
module IteratorsMD | |||
import Base: eltype, length, size, start, done, next, last, in, getindex, | |||
setindex!, linearindexing, min, max, zero, one, isless, eachindex, | |||
ndims, iteratorsize, to_index | |||
ndims, iteratorsize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, with GitHub's new behavior of not showing big diffs, I almost missed the most interesting part of this...
index_lengths() = () | ||
@inline index_lengths(::Real, rest...) = (1, index_lengths(rest...)...) | ||
@inline index_lengths(A::AbstractArray, rest...) = (length(A), index_lengths(rest...)...) | ||
@inline index_lengths(A::Slice, rest...) = (length(indices1(A)), index_lengths(rest...)...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
mask::A | ||
sum::Int | ||
LogicalIndex(mask::A) = new(mask, countnz(mask)) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be interesting to see whether there are any performance consequences---could potentially be an extra trip through mask
for some algorithms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes; in general I was surprised how hard I had to fight to match the allocating find
algorithm. find(::BitArray)
is wonderfully optimized and tough to beat.
calls `to_indices(A, indices(A), I)`, which then recursively walks through both | ||
the given tuple of indices and the dimensional indices of `A` in tandem. | ||
""" | ||
to_indices(A, I::Tuple) = (@_inline_meta; to_indices(A, indices(A), I)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you really need to pass A
once you've called indices(A)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but the advantage is that it allows for a simpler to_index(A, i)
method for array types to override.
@inline function _getindex{N}(l::LinearIndexing, A::AbstractArray, I::Vararg{Union{Real, AbstractArray, Colon},N}) # TODO: DEPRECATE FOR #14770 | ||
@boundscheck checkbounds(A, I...) | ||
_unsafe_getindex(l, _maybe_reshape(l, A, I), I...) | ||
@generated function _getindex(l::LinearIndexing, A::AbstractArray, I::Union{Real, AbstractArray}...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Real
or Int
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this could be tightened eventually. It's not strictly necessary, though.
@nanosoldier |
3bf75e6
to
1b2b738
Compare
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels |
Is it concerning that some of these tests took 0.00 times as long on this branch as master? That suggests to me, that something is broken. Assuming these were not a mistake, nanosoldier seems to quite like this. |
The results here are definitely real. This refactor allows Julia+LLVM to drastically reduce and/or eliminate the creation of some heap-allocated temporaries inside the loops. This leads directly to improvements of several orders of magnitude since the only other "work" being done each iteration is a simple addition. And with ranges it's even better than 100x: the loop becomes simple enough that it can be entirely replaced with a constant-time algorithm. I've added two commits: one addresses your comment @tkelman, and the other adds an optimization for indexing into linear fast arrays with multidimensional logical masks by allowing the LogicalIndex to be a collection of linear indices in that case. |
This is lovely work, @mbauman. |
Yes, this is an awesome improvement. I still don't really understand the purpose of the |
Thanks! @JeffBezanson - Slice is really only required to support the current semantics of OffsetArrays, where indexing Edit: Ah, the other reason for |
This has been causing an abstractarray test failure, but the exit status gets lost when tests run in parallel |
Fixes the test failure accidentally introduced in JuliaLang#19730
That's a simple fix, but do we have an issue for |
Oops – that's two bugs. The lost exit status one is actually worse. |
not yet, but it's been happening for at least a couple weeks. first thing I intend to try is disabling the anonymous module change Yichao made edit: opened as #20027 |
This patch dramatically simplifies the fallback indexing algorithms for arrays
by standardizing the way in which the passed indices are converted
to_index
and by rigorously defining what a supported index type is.
The very first thing that occurs within the fallback indexing methods is that
the passed indices are converted to a supported indexing type by
to_indices(...)
. A supported index type is eitherInt
or anAbstractArray
of indices. This means that it is at this point that both
Colon
and arrays ofbooleans get converted to collections of indices (as specialized array types:
Slice
andLogicalIndex
, respectively). This drastically simplifies muchof the internal logic. Whereas we had previously encoded the special behaviors
of
:
indexing in about four places, this patch does it once and then storesthe result in a
Slice
AbstractVector. Similarly,CartesianIndex
is nowconverted within
to_indices
, as well.In addition to simplifying the internal definitions, this patch also makes it
easier for both external packages and Base to add new behaviors in a systematic
fashion. Simple extensions would include things like
Not
indices andEndpointRanges.jl.
Additionally, logical indexing is now represented by a specialized
LogicalIndex
type that enables fast iteration over the boolean mask withoutallocations. This allows index conversions to occur before checking bounds,
and should enable further optimizations for LinearSlow arrays.
Breaking Behaviors:
Colon
toSlice
.Non-breaking behavior changes:
Colon
as pseudo-collections of integers.TODO:
runbenchmarks(ALL, vs = ":master")
to_indices
? YES. I think I'd be inclined to continue to not exportto_index
, particularly now that it's not the entire story. You must callto_indices
to get the complete conversion, but you can extendto_index
to provide simpler customizations. Exported to_indices.