findfirst and findnext for general iterables #15755
Looks like this passes all tests. Since this fixes a bug, any concerns about merging? Despite the bigger-picture concerns raised above (which could be addressed separately), we don't do anything very useful here now.
Master:
julia> p = Base.product(1:3,1:2)
Base.Prod2{UnitRange{Int64},UnitRange{Int64}}(1:3,1:2)
julia> findfirst(p, (2,2))
ERROR: MethodError: no method matching getindex(::Base.Prod2{UnitRange{Int64},UnitRange{Int64}}, ::Int64)
in findnext(::Base.Prod2{UnitRange{Int64},UnitRange{Int64}}, ::Tuple{Int64,Int64}, ::Int64) at ./array.jl:732
in findfirst(::Base.Prod2{UnitRange{Int64},UnitRange{Int64}}, ::Tuple{Int64,Int64}) at ./array.jl:738
in eval(::Module, ::Any) at ./boot.jl:237
julia> using Iterators
julia> findfirst(chain(1:3, ['a', 'b', 'c']), 'b')
ERROR: MethodError: no method matching length(::Iterators.Chain)
in findnext(::Iterators.Chain, ::Char, ::Int64) at ./array.jl:731
in findfirst(::Iterators.Chain, ::Char) at ./array.jl:738
in eval(::Module, ::Any) at ./boot.jl:237
julia> findfirst(Iterators.product(1:3,1:2), (2,2))
ERROR: MethodError: no method matching getindex(::Iterators.Product, ::Int64)
in findnext(::Iterators.Product, ::Tuple{Int64,Int64}, ::Int64) at ./array.jl:732
in findfirst(::Iterators.Product, ::Tuple{Int64,Int64}) at ./array.jl:738
in eval(::Module, ::Any) at ./boot.jl:237
This PR:
julia> p = Base.product(1:3,1:2)
Base.Prod2{UnitRange{Int64},UnitRange{Int64}}(1:3,1:2)
julia> state = findfirst(p, (2,2))
Nullable{Tuple{Int64,Int64,Nullable{Int64},Bool}}((2,3,Nullable{Int64}(2),false))
julia> item, newstate = next(p, get(state)); item
(2,2)
julia> using Iterators
julia> findfirst(chain(1:3, ['a', 'b', 'c']), 'b')
Nullable{Tuple{Int64,Int64}}((2,2))
julia> findfirst(Iterators.product(1:3,1:2), (2,2))
Nullable{Tuple{Array{Any,1},Array{Any,1}}}((Any[3,3],Any[2,2])) |
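For readers who want to try the idea outside this branch, here is a minimal sketch of a state-returning search over a general iterable. It assumes a current Julia with the `iterate` protocol rather than the `start`/`next`/`done` functions used in the session above, and it uses `nothing` where the PR returns an empty Nullable. The name `find_state` is hypothetical and is not part of Base or of this PR.

```julia
# Hypothetical helper (not part of Base or this PR): scan any iterable and
# return the (item, state) pair that `iterate` produced at the first match,
# or `nothing` when no element satisfies `pred`.
function find_state(pred, itr)
    step = iterate(itr)
    while step !== nothing
        item, state = step
        pred(item) && return step
        step = iterate(itr, state)
    end
    return nothing
end

p = Iterators.product(1:3, 1:2)
hit = find_state(isequal((2, 2)), p)
if hit !== nothing
    item, state = hit
    println("found ", item)
    # The returned state lets iteration resume just past the match.
    println("next item: ", first(Iterators.rest(p, state)))
end
```

Unlike the PR, which returns the state that yields the match when passed to `next`, this sketch returns the state that follows the match, because `iterate` has no explicit initial state to hand back for a match on the first element.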
My only concern here would be consistency. How many other functions in Base return a If this is the route we want to go, I'd rather see it done on a larger scale, if useful. At the least, I'd prefer the different versions of (I'm ok if the decision goes otherwise, of course.) |
-2 on using zero to indicate failure .. for the same reasons you each mention (at least)
|
Yeah. Now that we have Nullable (which we certainly didn't when this infrastructure was first developed), it seems like the right way to return values from "find"-type operations that might not find anything. Conceptually, the same problem is encountered with match:
julia> match(r"b", "abcd")
RegexMatch("b")
julia> match(r"e", "abcd")
julia> @code_warntype match(r"e", "abcd")
Variables:
#self#::Base.#match
r::Regex
s::ASCIIString
Body:
begin # regex.jl, line 180:
return (Base.match)(_2::Regex,$(Expr(:new, :((top(getfield))(Core,:UTF8String)::Type{UTF8String}), :((top(convert))(Array{UInt8,1},(top(getfield))(_3::ASCIIString,:data)::Array{UInt8,1})::Array{UInt8,1}))),1,(Base.box)(UInt32,(Base.checked_trunc_uint)(UInt32,0)))::Union{RegexMatch,Void}
end::Union{RegexMatch,Void}
But with both of these, changing the type of the return value in a non-compatible way seems to require a deprecation strategy like the (complicated) one I outlined above. Unless someone else has a bright idea? |
Returning a But the inconsistency depending on the array type is annoying. I'd be in favor of keeping the API consistent, whatever the solution we choose. An alternative deprecation path would be to add a Cc: @johnmyleswhite |
I would love to see us use Nullables more often, but I think this is a place where API design is being held back by performance: until all Nullables are stack allocated, I doubt we can move over to using them consistently (even though they should be stack allocated in this use case). Maybe open an issue for standardizing on them for 0.6? |
Given that the deprecation path is complex, we could start providing a syntax for it ASAP, so that we're ready when performance allows flipping the switch. |
Marking it as 0.5 since we need to decide now whether to add a deprecation or not, to allow for a change in 0.6. |
Adding a "decision" label here. The coding issues are trivial, the key issue is whether we do it and, if so, agreement on a deprecation strategy. I propose doing it and introducing the |
I think this needs to be worked out with the rest of the collections API, which also needs work. |
Which is to say, not yet, unless there's a real deal breaker in here. I don't think we can fix the collections API in bits and pieces – it needs to be done holistically – and we don't really have the API design bandwidth to figure this out in 0.5. |
Very reasonable; thanks for making a decision! |
As a data point, Scala's |
I think functions like this that operate on indices should require some kind of indexed collection --- it's not really useful to get the index of something in a general iterable. This question also comes up in #22907. |
If we only consider collections where indices are integers, then |
Another thought: returning an index-value pair and use |
Rather than introducing an ad-hoc approach like this, I'd rather use the |
I think instead we can add searching iterators, so this can be handled by |
This fixes #15723. While on balance I think it's an improvement, there are a few negatives:
- findnext returns an Int for AbstractArrays, and a Nullable{T} for any other iterable. This is a consequence of our choice to use 0 to indicate failure-to-find for arrays; for a general iterable, there is no type-stable construct that can safely return an Int. It's regrettable whenever the general iterable API disagrees with the one for arrays.
- [...] start - 1 and then test whether findnext(A, start) < start. But presumably everyone who uses findnext tests whether findnext(A, start) == 0, so changing the sentinel value would break a lot of code. Of course, if we change to returning a Nullable even for arrays, this issue goes away, too.
The only solution I can think of for these problems is to introduce findnext(FromFuture, A, start), deprecate findnext(A, start), and make a julia release. Then deprecate findnext(FromFuture, A, start) in favor of the new version of findnext(A, start) in the following cycle.
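The two-step deprecation above can be sketched with a marker type that selects the new return convention by dispatch. This is only an illustration under stated assumptions: `FromFuture` is the name proposed above but is defined here as a plain empty struct, `my_findnext` is a hypothetical stand-in for `findnext`, the array is assumed to be 1-based, and `nothing` stands in for an empty Nullable so the sketch runs on a current Julia.

```julia
# Hypothetical marker type; only the name comes from the proposal above.
struct FromFuture end

# Current convention: index of the next `true` at or after `start`,
# with 0 as the failure sentinel (assumes 1-based indexing).
function my_findnext(A::AbstractVector{Bool}, start::Integer)
    for i in start:lastindex(A)
        A[i] && return i
    end
    return 0
end

# Opt-in future convention, selected by passing the marker type first:
# failure is reported as `nothing` instead of the sentinel 0.
function my_findnext(::Type{FromFuture}, A::AbstractVector{Bool}, start::Integer)
    i = my_findnext(A, start)
    return i == 0 ? nothing : i
end

A = [false, true, false]
println(my_findnext(A, 3))              # old convention
println(my_findnext(FromFuture, A, 3))  # future convention
```

In the first release cycle callers would migrate to the three-argument form; in the next, the two-argument method would take over the new behavior and the marker form would be deprecated in turn.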