Deprecate conversion to Int for "non-obvious" index types #10458

jakebolewski · 2015-03-09T20:21:08Z

Deprecate to_index(::Real) in favor of to_index(::Integer).
In the future require index values which are floating point, rationals, etc. to be explicitly converted to Int.
Delete tests that currently test for floating point indexing support.

mbauman · 2015-03-09T20:33:18Z

base/deprecated.jl

@@ -451,3 +451,27 @@ function push!(A)
    depwarn("push!(A) has been deprecated", :push!)
    A
 end
+
+typealias DepIdxTypes Union(FloatingPoint,MathConst,Rational)


It looks like you don't actually use this alias?

mbauman · 2015-03-09T20:33:44Z

👍

deprecate `to_index(::Real)` in favor of `to_index(::Integer)` remove tests that exercise floating point indexing minor doc change, add NEWS item

mbauman · 2015-03-09T21:13:15Z

Also, perhaps you should leave a note in deprecated.jl to "tighten up" the signatures of get/setindex once the Real to_index methods are removed. E.g., ack '(get|set)index.*\bReal\b' shows 52 lines to change.

timholy · 2015-03-09T22:25:49Z

I think I like this idea, but I'd ask to hold off on merging it. It may indeed be that this is the right direction to go in, but I'd like to make sure that folks have thought this through very carefully. In particular, I strongly advise against some vague notion of "cleanliness" or "minimalism" in making this decision: the much more important goal is handling ambiguities that arise for package authors when making argument 1 more specialized but argument 2 less specialized. Related: #10440.

In other words, I think we have to consider how do/will package authors overload getindex and setindex!? Can we set up base in a manner that decreases the likelihood of packages needing a half-screenful of

# Avoid ambiguity warning with blah blah
getindex(A::MyArray, ::Int) = ...
# Avoid ambiguity warning with ...
getindex(A::MyArray, ...)
# Avoid ...
...
# Now here is finally the definition I really wanted:
getindex(A::MyArray, ...)

Some of those "interception" definitions will be unavoidable, but requiring a whole heap of them gets pretty annoying.

As one example of where I think this gets complicated, for an InterpolationArray you'd want to be able to define indexing with FloatingPoint types. But you'd also like to allow DualNumbers as a really sweet way of testing your calculation of derivatives, and a DualNumber is not even a Real. So you'd want to define a method

getindex(A::InterpolationArray, x::Number...)

That works fine if Base defines a method

getindex(A::AbstractArray, x::Number...)

because argument 1 is more specialized, and the remaining arguments are "equally specialized", so this one definition suffices to control dispatch. But it will be ambiguous against

getindex(A::AbstractArray, x::Int...)

because the arguments have opposite specialization behavior. You could solve that particular ambiguity with a

getindex(A::InterpolationArray, x::Int...)

but having

getindex(A::AbstractArray, x::Real...)

is probably the worst of all worlds.

What makes this problem somewhat harder is that there is another goal that seems, in practice, to be partially competing. It would be really nice if most package authors only had to define indexing for scalar types, and that Base handles "hard stuff" like this:

getindex(A::AbstractArray, x::Union(ScalarType, AbstractVector)...)   # mixed scalar/vector indexing
getindex(A::AbstractArray, x::Union(ScalarType, CartesianIndex)...)   # fancy scalar indexing, coming soon

automatically for most array types. To disambiguate those, you need to have

getindex(A::AbstractArray, x::ScalarType...)

defined. If we make ScalarType=Int, this is likely to reduce some class of ambiguities but forces authors of an InterpolationArray, because they need to support something broader than Int, to write definitions to handle these "hard" cases. In other words, indexing over regions or lumping multiple axes into a single object are difficult tasks that, if made too narrow, cannot apply broadly to all AbstractArrays.

In other words, I think this is a complicated problem 😄. I've only just begun to think about this seriously, but perhaps someone out there has a clear analysis or argument that points the way to the fewest ambiguities for package authors, without making Base overly fussy for users? If so, please do share 😄.

Again, any change we make here is guaranteed to cause ambiguity warnings for package authors, no matter what kind of change we make. So, let's make the change just once in the 0.4 cycle, because (in my opinion) ambiguities are in practice among the more annoying problems to cope with.

timholy · 2015-03-09T22:46:03Z

I should add that my basic intuition is that we want to migrate Base to the extremes: for any methods for a specific array type, the type of the indexes should also be as narrowly-defined as is reasonable. For any method that we'd like to define as a fallback for all AbstractArrays (for example mixed scalar/vector indexing or very fancy scalar indexing), we should make the scalar type be as broad as possible (e.g., Number). That way it seems certain that package authors can overload without fear of ambiguity.

JeffBezanson · 2015-03-10T00:53:24Z

I'm not sure there should be any methods for indexing AbstractArray with all scalars.

timholy · 2015-03-10T01:25:50Z

I think it's worth considering whether we should define

getindex(A::AbstractArray, indexes::Union(ScalarType,AbstractVector)...)

because that would mean users wouldn't have to figure out how to implement indexing over a region---most users would only have to define scalar indexing and the rest would follow automatically from this fallback. But if we do that, we probably need to at least specify a "stub" method

getindex(A::AbstractArray, indexes::ScalarType...) = error("getindex not defined for ", typeof(A))

to avoid ambiguities with

getindex(A::MyArray, indexes::Union(ScalarType, SomeOtherType)...).

Having thought about this a little more: I think the design principle is to minimize the number of nodes in the method graph that possess branches.

timholy · 2015-03-10T01:54:06Z

In particular, the definition will soon be

getindex(A::AbstractArray, indexes::Union(ScalarType,AbstractVector)...) = slice(A, indexes...)

(or sub depending on people's taste).

mbauman · 2015-03-10T18:37:10Z

Very sensible and thoughtful, as usual, @timholy. Since packages (and base) have stabilized on ScalarType == Real, defining getindex(::AbstractArray, ::Union(Integer, AbstractVector, Colon)...) yields many pages of ambiguities against just about every single AbstractArray out there since we've just upped the precedence of Arg2 over Real.

Which is why I actually think the course of action here (doing the deprecation within to_index, without sharpening the method signatures) is very sensible.

I also really like the idea of giving AbstractArrays fancy indexing behavior by default, only requiring getindex(x, ::Int) for fast linear indexing arrays, or with the exact number of dimensions otherwise. Perhaps the best way to achieve this would be to only define getindex(A::AbstractArray, idxs...), which would dispatch to an internal function that we can more freely define methods for without worrying about ambiguity.

timholy · 2015-03-10T22:00:40Z

I initially assumed the scope of this PR was bigger, which is part of why I was a bit concerned. I agree that doing the deprecation via to_index is great, and that this PR is actually quite minimalistic in what it changes. Now that I've had a close look at the implementation, I have no objections to this being merged. But I also think we may want to take some additional steps.

I'm in the middle of creating a PR that changes scalar indexing to use ScalarIndex, currently typealiased to Real. We can encourage package authors to write their indexing in terms of ScalarIndex, and then during the 0.5 dev period switch it to typealias Integer. For any package that has already made the switch, they shouldn't see any ambiguity warnings.

I'm also trying to implement the fallback fancy indexing methods for a broader class of array types. This might not be very useful if 0.4 returns views from indexing, but to me that's beginning to look a bit unlikely, so perhaps we should at least fix up our current getindex behaviors.

It's an interesting idea to dispatch to functions not named getindex. We do some of that already, but indeed perhaps we should do more. EDIT: on the other hand, that's reminiscent of #10312, which in the end had a better solution.

timholy · 2015-03-11T02:25:46Z

@mbauman,

only requiring getindex(x, ::Int) for fast linear indexing arrays, or with the exact number of dimensions otherwise

definitely hit a chord with me---it seems like it might save a lot of duplicated code in trying to handle situations in which the number of indexes does not match the dimensionality of the array. I would be tempted to make the linear part optional, to be called if the user set the trait linearindexing(::MyArrayType) = LinearFast(). Otherwise, base indexing would "hand off" scalar indexing by filling out the dimensions with 1s or ind2sub as needed.

I don't think we have a good way to dispatch splats based on the number of arguments, and this little test:

f1(x...) = f2(x)

@noinline f2{N,T}(x::NTuple{N,T}) = N

function callf1(n, x...)
    s = 0
    for i = 1:n
        s += f1(x...)
    end
    s
end

callf1(1, 2, 5)
@allocated callf1(10^6, 2, 5)

does seem to suggest that life will not be all roses if we use a tuple as an intermediary. (At least, with our current tuples.)

So it almost seems we'd have to establish a new convention, for example that the new functions
getindexN and setindexN! get called only when the number of indexes matches the array dimensionality. People who write their own AbstractArray types might only have to define these two functions, plus size, ndims, and eltype, and yet have indexing that is as rich as that provided by Array, BitArray, and SubArray; they can of course more heavily overload getindex/setindex! when it makes sense.

Would introducing getindexN and setindexN! be reasonable? Or, is there another way to do this?

jakebolewski · 2015-03-11T20:29:17Z

@timholy I thought that this was the right balance between what we want (constraining the index types to integers for AbstractArray subtypes in Base) but leaves the door open to broader design decisions you talked about.

I am skeptical as well to defining one catchall fallback for AbstractArray's. Ambiguity warnings will eventually be shifted to runtime so making design decisions based on current behavior does not seem to be the best way to go. Whatever default behavior lands in Base will be hugely breaking to change at a later date so I agree that caution is warranted here.

timholy · 2015-03-11T21:15:49Z

Yep, I'm fine with merging this; if you're ready, I say go for it. My not yet submitted PR is based on this one; I'll need to think a bit more about whether that PR is a direction we want to go in.

Ambiguity warnings will eventually be shifted to runtime so making design decisions based on current behavior does not seem to be the best way to go.

The ambiguity warnings that would arise here are not the "useless" category like Warning: I have no clue how to add a DataArray{Int} and Image{Int} (not a problem unless I intended to try that very operation). In this case it's a question of precedence of the container type (the AbstractArray) and the index type, and that's a genuine ambiguity that affects dispatch on everyday operations.

There was a "baby" version of this in #10314, where we had to delete a class of conflicting methods, filter(r::Regex, container), because we decided that the container needed to have precedence. @mbauman's suggestion is basically applying the same logic here---don't put any types on the indexes whatsoever. That way getindex is almost infinitely overloadable. I think the key question is whether we can provide some convenient services to package authors and users without typing the index arguments. It would appear that this requires dispatching to internal hidden functions, or one giant stagedfunction.

Deprecate conversion to Int for "non-obvious" index types

mbauman · 2015-08-06T15:33:24Z

@ScottPJones moving the discussion here.

I know I was a pushing force for this, but I'm waffling on this a bit more. There are two reasons I liked this change.

It makes errors of the sort A[end/2] much more obvious — you should most likely be using A[end÷2].
I had been thinking it'd be nice to have a greater separation here so custom types could do something different when indexed by a float.

I still think 1 is a pretty good motivation. But I'm coming around to the fact that 42 and 42. are just too darn similar to have very different effects, even in a custom package. And if that's true, then the converse is as well.

toivoh · 2015-08-07T07:47:46Z

Regarding the difference between indexing with 42 and 42.0: I guess eg when interpolating integer data, indexing with 42 could give an integer and with 42.0 a float. So somewhat different effects at least.

…ng i/j with i÷j (see JuliaLang#10458)

…ng i/j with i÷j (see #10458) (cherry picked from commit d2ad66c) ref #14049

jakebolewski changed the title ~~Deprecate conversion of indexes not subtypes of Integer~~ Deprecate conversion to Int for "non-obvious" index types Mar 9, 2015

mbauman reviewed Mar 9, 2015
View reviewed changes

deprecate to_index(::Real), constrain indexing to Integers

173bd57

deprecate `to_index(::Real)` in favor of `to_index(::Integer)` remove tests that exercise floating point indexing minor doc change, add NEWS item

jakebolewski force-pushed the jcb/depfpidx branch from ada64ce to 173bd57 Compare March 9, 2015 20:39

jakebolewski added a commit that referenced this pull request Mar 12, 2015

Merge pull request #10458 from JuliaLang/jcb/depfpidx

59ae257

Deprecate conversion to Int for "non-obvious" index types

jakebolewski merged commit 59ae257 into master Mar 12, 2015

jakebolewski deleted the jcb/depfpidx branch March 12, 2015 01:57

mbauman mentioned this pull request Mar 13, 2015

RFC: Move Colon translation out of the parser #10331

Merged

4 tasks

timholy mentioned this pull request Mar 13, 2015

WIP: ReshapedArrays #10507

Closed

Jutho mentioned this pull request Mar 15, 2015

Generalize scalar indexing and implement reducedim without stagedfunctions #10524

Merged

mbauman mentioned this pull request Mar 16, 2015

Should we expect floating point indexing to be implemented? #10154

Closed

timholy added a commit that referenced this pull request Mar 17, 2015

Change CartesianIndex to use Integer, consistent with #10458

4cdd5bc

timholy added a commit that referenced this pull request Mar 17, 2015

Change CartesianIndex to use Integer, consistent with #10458

705efdc

timholy mentioned this pull request Aug 6, 2015

Stop preventing indexing with Numbers #12486

Closed

stevengj added a commit to stevengj/julia that referenced this pull request Nov 18, 2015

improve deprecation warning for non-integer indexing: suggest replaci…

c9e0704

…ng i/j with i÷j (see JuliaLang#10458)

stevengj mentioned this pull request Nov 18, 2015

improve deprecation warning for non-integer indexing #14049

Merged

stevengj added a commit to stevengj/julia that referenced this pull request Nov 19, 2015

improve deprecation warning for non-integer indexing: suggest replaci…

9004756

…ng i/j with i÷j (see JuliaLang#10458)

stevengj added a commit to stevengj/julia that referenced this pull request Nov 19, 2015

improve deprecation warning for non-integer indexing: suggest replaci…

8a94e6a

…ng i/j with i÷j (see JuliaLang#10458)

stevengj added a commit to stevengj/julia that referenced this pull request Nov 19, 2015

improve deprecation warning for non-integer indexing: suggest replaci…

d2ad66c

…ng i/j with i÷j (see JuliaLang#10458)

zhmz90 pushed a commit to zhmz90/julia that referenced this pull request Nov 21, 2015

improve deprecation warning for non-integer indexing: suggest replaci…

52f0cfe

…ng i/j with i÷j (see JuliaLang#10458)

stevengj added a commit that referenced this pull request Nov 29, 2015

improve deprecation warning for non-integer indexing: suggest replaci…

231dd4a

…ng i/j with i÷j (see #10458) (cherry picked from commit d2ad66c) ref #14049

mbauman mentioned this pull request May 24, 2016

Arraypocalypse Now and Then JuliaLang/LinearAlgebra.jl#255

Closed

27 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deprecate conversion to Int for "non-obvious" index types #10458

Deprecate conversion to Int for "non-obvious" index types #10458

jakebolewski commented Mar 9, 2015

mbauman Mar 9, 2015

mbauman commented Mar 9, 2015

mbauman commented Mar 9, 2015

timholy commented Mar 9, 2015

timholy commented Mar 9, 2015

JeffBezanson commented Mar 10, 2015 via email

timholy commented Mar 10, 2015

timholy commented Mar 10, 2015

mbauman commented Mar 10, 2015

timholy commented Mar 10, 2015

timholy commented Mar 11, 2015

jakebolewski commented Mar 11, 2015

timholy commented Mar 11, 2015

mbauman commented Aug 6, 2015

toivoh commented Aug 7, 2015 via email

Deprecate conversion to Int for "non-obvious" index types #10458

Deprecate conversion to Int for "non-obvious" index types #10458

Conversation

jakebolewski commented Mar 9, 2015

mbauman Mar 9, 2015

Choose a reason for hiding this comment

mbauman commented Mar 9, 2015

mbauman commented Mar 9, 2015

timholy commented Mar 9, 2015

timholy commented Mar 9, 2015

JeffBezanson commented Mar 10, 2015 via email

timholy commented Mar 10, 2015

timholy commented Mar 10, 2015

mbauman commented Mar 10, 2015

timholy commented Mar 10, 2015

timholy commented Mar 11, 2015

jakebolewski commented Mar 11, 2015

timholy commented Mar 11, 2015

mbauman commented Aug 6, 2015

toivoh commented Aug 7, 2015 via email