Add eachindex() for AbstractString and Associative #11708

nalimilan · 2015-06-14T18:26:02Z

Fixes #11649 which was only about strings, but in the end I didn't see the point in restricting this to strings: it could be useful for any other type which would use unevenly spaced or non-integer indices. Actually, it's very similar to Enumerate, but returning the actual indices which can be passed to getindex (while Enumerate returns the element number), and of course not returning the value.

The potential issue with this very general approach is that I'm using next instead of nextind, which is wasteful since we don't use the value. We could restore nextind, which is currently deprecated for non-strings, if we want to keep this general approach. But in my tests using nex or nextind doesn't make any difference for strings (see below).

Either way, it's not very efficient. Calling eachindex and indexing the string is about twice slower as iterating over it directly for a UTF8String. Indeed, it appears that next is not inlined in either case. Is that expected?

Another funny result is that with an ASCIIString, using nextind in eachindex is actually faster than iterating over the string. This is because next is inlined in that case, but not when iterating. I'm not sure why, but this seems to indicate something is wrong in the standard behavior.

julia> function f(s)
           [c for c in s]
       end
f (generic function with 1 method)

julia> function g(s)
           [s[i] for i in eachindex(s)]
       end
g (generic function with 1 method)

julia> s = "some string but quite long to make the test interesting €"
"some string but quite long to make the test interesting €"

julia> f(s); g(s);

julia> @time for i in 1:10000000; f(s); end
   7.724 seconds      (10000 k allocations: 3204 MB, 1.48% gc time)

julia> @time for i in 1:10000000; g(s); end
  15.677 seconds      (20000 k allocations: 3357 MB, 1.21% gc time)

And with nextind:

julia> Base.next(e::EachIndex, state) = (state, nextind(e.itr, state))
next (generic function with 60 methods)

julia> function g(s)
           [s[i] for i in eachindex(s)]
       end
g (generic function with 1 method)

julia> g(s);

julia> @time for i in 1:10000000; g(s); end
  16.803 seconds      (20000 k allocations: 3357 MB, 1.13% gc time)

JeffBezanson · 2015-06-14T18:50:30Z

I'm not sure we want a definition as generic as this. The second value returned by next is not necessarily an index.

nalimilan · 2015-06-14T18:54:35Z

@JeffBezanson Right. So would you prefer a definition just for strings, or extending nextind for all iterables?

nalimilan · 2015-06-18T21:16:25Z

Bump!

mbauman · 2015-06-18T21:27:39Z

I'm not convinced that this should be defined for all iterables. This is currently wrong for lots of base iterables — dict, set, task, zip, …

I think that this is something that non-array types should opt into themselves on a case-by-case basis.

nalimilan · 2015-06-18T21:33:32Z

OK, I'll change it to be specific to strings.

nalimilan · 2015-06-18T21:47:33Z

Don't you think there should also be a (different) definition for Dicts too? After all, they can be indexed too: you just need to iterate over keys.

mbauman · 2015-06-18T22:09:39Z

Don't you think there should also be a (different) definition for Dicts too

Yes, possibly. I'm trying to think of use-cases where it'd be useful to code in the general instead of just using keys. Really, it'd be more comparable to a hypothetical each_stored_index for sparse matrices.

nalimilan · 2015-06-19T21:08:47Z

I've updated the PR. Actually, eachindex for Associative can be exactly the same as keys, which is cool since it avoids duplication.

mbauman · 2015-06-20T04:07:26Z

👍 LGTM.

nalimilan · 2015-06-20T09:21:13Z

Thanks! I'll merge this today if nobody objects.

kmsquire · 2015-06-20T15:25:51Z

Are there thoughts about the fact that eachindex(d::Dict) and keys(d::Dict) are the same function? On one hand, I like the fact that eachindex is consistent across (some) indexible containers. On the other, I don't know if two functions with different names should do exactly the same thing.

ScottPJones · 2015-06-20T16:35:46Z

It doesn't bother me, really... length and sizeof do the same thing for ASCIIString, for example, but for other types, they don't.

nalimilan · 2015-06-20T17:44:21Z

Yeah. I wondered whether deprecating keys in favor of eachindex would make sense, but keys sounds a much nicer name for working with dicts. As dicts are only a corner case for eachindex, I don't think the duplication is really an issue.

nalimilan · 2015-06-21T10:50:15Z

Thanks to @mbauman's new docs about interfaces, I realized it would be useful to add eltype(::EachStringIndex) too. I've updated the PR, and will merge it if tests pass.

There's no reason not to allow this. Fixes #11649.

Add eachindex() for AbstractString and Associative

nalimilan mentioned this pull request Jun 14, 2015

Support eachindex on strings? #11649

Closed

nalimilan force-pushed the nl/eachindex branch from 9fb0dae to a0a0a84 Compare June 19, 2015 21:06

nalimilan changed the title ~~Add eachindex() fallback for arbitrary iterables~~ Add eachindex() for AbstractString and Associative Jun 19, 2015

nalimilan mentioned this pull request Jun 19, 2015

next not inlined for ASCIIStrings #11768

Closed

nalimilan force-pushed the nl/eachindex branch from a0a0a84 to d09adda Compare June 21, 2015 10:48

nalimilan force-pushed the nl/eachindex branch from d09adda to 909ef0a Compare June 21, 2015 13:00

Add eachindex() for AbstractStrings and Associative

9ef327d

There's no reason not to allow this. Fixes #11649.

nalimilan force-pushed the nl/eachindex branch from 909ef0a to 9ef327d Compare June 21, 2015 13:39

nalimilan added a commit that referenced this pull request Jun 21, 2015

Merge pull request #11708 from JuliaLang/nl/eachindex

a94da23

Add eachindex() for AbstractString and Associative

nalimilan merged commit a94da23 into master Jun 21, 2015

nalimilan deleted the nl/eachindex branch June 21, 2015 17:43

timholy mentioned this pull request Jun 26, 2015

Rename and generalize the linear indexing trait? #11879

Closed

nalimilan mentioned this pull request Aug 26, 2015

Add keys(::Array) and values(::Array) #12811

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add eachindex() for AbstractString and Associative #11708

Add eachindex() for AbstractString and Associative #11708

nalimilan commented Jun 14, 2015

JeffBezanson commented Jun 14, 2015

nalimilan commented Jun 14, 2015

nalimilan commented Jun 18, 2015

mbauman commented Jun 18, 2015

nalimilan commented Jun 18, 2015

nalimilan commented Jun 18, 2015

mbauman commented Jun 18, 2015

nalimilan commented Jun 19, 2015

mbauman commented Jun 20, 2015

nalimilan commented Jun 20, 2015

kmsquire commented Jun 20, 2015

ScottPJones commented Jun 20, 2015

nalimilan commented Jun 20, 2015

nalimilan commented Jun 21, 2015

Add eachindex() for AbstractString and Associative #11708

Add eachindex() for AbstractString and Associative #11708

Conversation

nalimilan commented Jun 14, 2015

JeffBezanson commented Jun 14, 2015

nalimilan commented Jun 14, 2015

nalimilan commented Jun 18, 2015

mbauman commented Jun 18, 2015

nalimilan commented Jun 18, 2015

nalimilan commented Jun 18, 2015

mbauman commented Jun 18, 2015

nalimilan commented Jun 19, 2015

mbauman commented Jun 20, 2015

nalimilan commented Jun 20, 2015

kmsquire commented Jun 20, 2015

ScottPJones commented Jun 20, 2015

nalimilan commented Jun 20, 2015

nalimilan commented Jun 21, 2015