Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change findfirst/findlast/findnext/findprev to return the same index type as keys() #25577

Merged
merged 2 commits into from
Jan 18, 2018

Conversation

nalimilan
Copy link
Member

@nalimilan nalimilan commented Jan 15, 2018

This is consistent with find. Also add needed last(::EachStringIndex) and prevind(::AbstractArray, ::CartesianIndex) methods.

Part of #10593.

findnext/findprev are not as generic as find/findfirst/findlast since their interface assumes that indices are ordered (comparable with <) and that they are an efficient way of keeping track of the iteration state. This works for arrays, tuples and strings, but not e.g. for dicts (and not even e.g. for OrderedDict, since the ordering of keys isn't reflected by <). Only a findeach function returning an iterator would allow that (we can always introduce it later).

It would be possible to support passing linear indices to findnext/findprev, and have them return linear indices in that case. This behavior would probably have to be restricted to AbstractArray. That would limit breakage compared with 0.6 a bit (though findfirst and findlast would still break). Should I add it?

@nalimilan nalimilan added the search & find The find* family of functions label Jan 15, 2018
base/array.jl Outdated
```
"""
findfirst(A) = findnext(A, 1)
findfirst(A) = isempty(A) ? nothing : findnext(A, first(keys(A)))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling isempty seems to be necessary for full generality, since there is no guaranty that keys returns a range, and first could throw an error for non-ranges. Yet it's quite annoying to have to do that. Actually, this should be done in findnext, but I wonder about the performance impact, given that findnext isn't likely to be inlined (so the check will be performed on every call).

That's related to #25385. Maybe we should require that keys returns something on which first never fails for numeric indices?

Following that idea, I've added a first(::EachStringIndex) = 1 below for consistency with ranges.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If findfirst is going to be generalized this much, it probably makes sense to rewrite it to iterate over pairs instead of calling findnext.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, my focus on findnext/findprev made me forget the big picture. I've added a commit to use the most generic approach, similar to find.

function Base.nextind(a::AbstractArray{<:Any,N}, i::CartesianIndex{N}) where {N}
_, ni = next(CartesianIndices(axes(a)), i)
return ni
end
function Base.prevind(a::AbstractArray{<:Any,N}, i::CartesianIndex{N}) where {N}
_, ni = next(Iterators.reverse(CartesianIndices(axes(a))), i)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works but looks a bit convoluted. Is there are simpler way of doing this? @timholy

@@ -1492,26 +1492,45 @@ cat(n::Integer, x::Integer...) = reshape([x...], (ntuple(x->1, n-1)..., length(x

## find ##

_pairs(A::Union{AbstractArray, AbstractDict, AbstractString, Tuple, NamedTuple}) = pairs(A)
_pairs(iter) = _pairs(IteratorSize(iter), iter)
_pairs(::Union{HasLength, HasShape}, iter) = zip(1:length(iter), iter)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When writing this I realized we could return CartesianIndex objects for HasShape iterators. But for consistency with arrays (and for convenience), we should return Int linear indices rather than one-dimensional CartesianIndex objects for 1D HasShape iterators. That's not possible at the moment without adding dimension type parameter, i.e. HasShape{N} as in #25356 (comment). Can easily be changed in a subsequent PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filed #25655.

…type as keys()

This is consistent with find(). We need to check whether the iterator is
empty first, since calling first(keys(A)) may throw an error. Also
add needed last(::EachStringIndex) and prevind(::AbstractArray, ::CartesianIndex) methods.
Also add tests, and remove @inferred calls which did not actually check
whether functions were inferred, but only whether the equality test was.
Change the _pairs() fallback to use Iterators.countfrom() so that an error
is raised by findlast() rather than assuming the last index is typemax(Int),
and use length() for iterators which support it so that findlast() works.
@JeffBezanson JeffBezanson merged commit 87cf094 into master Jan 18, 2018
@JeffBezanson JeffBezanson deleted the nl/findfirstlast branch January 18, 2018 22:18
@nalimilan nalimilan restored the nl/findfirstlast branch January 19, 2018 19:22
@nalimilan nalimilan deleted the nl/findfirstlast branch January 19, 2018 19:33
nalimilan added a commit that referenced this pull request Sep 11, 2022
…/`argmax`

This is the sentence used for `find*` functions, introduced by #25577.

Also change "the domain of `f`" to "`domain`" as the domain of `f`
can be a superset of the passed `domain`.
giordano pushed a commit that referenced this pull request Oct 31, 2023
…/`argmax` (#46705)

This is the sentence used for `find*` functions, introduced by #25577.

Also change "the domain of `f`" to "`domain`" as the domain of `f` can
be a superset of the passed `domain`.

(Spotted at JuliaData/DataFrames.jl#3139.)

---------

Co-authored-by: Jameson Nash <[email protected]>
Co-authored-by: Lilith Orion Hafner <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
search & find The find* family of functions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants