This repository has been archived by the owner on May 4, 2019. It is now read-only.

map revisions #128

Merged 4 commits on Jul 12, 2016

Conversation

davidagold
Contributor

@davidagold davidagold commented Jul 6, 2016

This is an attempt to transition away from the metaprogramming-heavy approach to map we took for Julia v0.4. Admittedly, this does introduce the use of generated functions in n-arg map/map!. Does anybody have suggestions as to how best to maintain compatibility with 0.4? What I have here seems kind of crude.

There's a lot of code repetition due to the 1-arg, 2-arg and n-arg methods for map. The Base map(f, A) regime calls collect_similar(A, Generator(f, A)), but I haven't found a way to collect over a Generator{NullableArray, F} without creating a lot of unnecessary Nullable objects -- so hooking into that interface isn't really an option. I'm not sure of the best way to reduce the code footprint here.
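For readers unfamiliar with the lifting terminology used below, here is a minimal sketch of what a lifted 1-arg map over a NullableArray does, assuming Julia 0.5-era syntax and the `values`/`isnull` fields of NullableArray. `lifted_map` is a hypothetical name for illustration, not this PR's implementation:

```julia
# Illustrative sketch only (Julia 0.5 era). `lifted_map` is hypothetical;
# it shows the lifted semantics, not this PR's actual implementation.
using NullableArrays

function lifted_map{T}(f, X::NullableArray{T})
    n = length(X)
    S = Base.promote_op(f, T)      # crude element-type guess for the output
    dest = NullableArray(S, n)
    for i in 1:n
        if X.isnull[i]
            dest.isnull[i] = true             # propagate null; f is never called
        else
            dest.values[i] = f(X.values[i])   # f sees the unwrapped value
            dest.isnull[i] = false
        end
    end
    dest
end
```

The point of lifting is that f only ever receives values of type T, never Nullable{T}, and null entries short-circuit past f entirely.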

Here are the significant points for this PR:

  • Fix #130 (map is broken on v0.0.5)
  • remove as much dependence on metaprogramming as possible
  • Settle on well-defined behavior for the two "empty" cases: (i) map over an empty `NullableArray` and (ii) map over a non-empty but all-null `NullableArray`.

For both cases (i) and (ii) above, the call to map can be lifted or not. If map is not lifted, it falls back on the implementation in Base, which differs between 0.4 and 0.5 (and also between the 1-arg, 2-arg and n-arg implementations). On 0.5, we have the following four combinations of behaviors:

  1. map(f, X; lift=false), where X is empty. On 0.5, now that JuliaLang/julia#16622 (0.5-style comprehensions) has been merged, this uses type inference (Base.return_types) to select the type parameter for the returned empty NullableArray{T}. Note that, since f is not lifted, return_types is called on f, (eltype(X),). If somewhere down the line a method specialized on Nullable arguments is missing, this will return an empty NullableArray{Union{}}:
julia> X = NullableArray(Int[])
0-element NullableArrays.NullableArray{Int64,1}

julia> f(x) = 5*x
f (generic function with 1 method)

julia> map(f, X)
0-element NullableArrays.NullableArray{Union{},1}

julia> h(x::Nullable) = Nullable(5*x.value, x.isnull)
h (generic function with 1 method)

julia> map(h, X)
0-element NullableArrays.NullableArray{Int64,1}
  2. map(f, X; lift=true), where X is empty. This uses type inference (Base.return_types) to select the type parameter for the returned empty NullableArray. Since f is lifted, return_types is called on f, (inner_eltype(X),), where inner_eltype(X::NullableArray{T}) = T:
julia> map(f, X; lift=true)
0-element NullableArrays.NullableArray{Int64,1}
  3. map(f, Y; lift=false), where Y is non-empty but all null. The type parameter of the returned NullableArray{T} does not depend on type inference; it depends entirely on the behavior of f applied to null Nullable{T} objects. (Here Y is a 3-element, all-null NullableArray{Int64}.)
julia> map(f, Y)
ERROR: MethodError: no method matching *(::Int64, ::Nullable{Int64})
Closest candidates are:
  *(::Any, ::Any, ::Any, ::Any...)
  *{S}(::Nullable{Union{}}, ::Nullable{S})
  *{T<:Union{Int128,Int16,Int32,Int64,Int8,UInt128,UInt16,UInt32,UInt64,UInt8}}(::T<:Union{Int128,Int16,Int32,Int64,Int8,UInt128,UInt16,UInt32,UInt64,UInt8}, ::T<:Union{Int128,Int16,Int32,Int64,Int8,UInt128,UInt16,UInt32,UInt64,UInt8})
  ...
 in _collect(::NullableArrays.NullableArray{Int64,1}, ::Base.Generator{NullableArrays.NullableArray{Int64,1},#f}, ::Base.EltypeUnknown, ::Base.HasShape) at ./array.jl:283
 in #map#11 at /Users/David/.julia/v0.5/NullableArrays/src/map.jl:108 [inlined]
 in map(::Function, ::NullableArrays.NullableArray{Int64,1}) at /Users/David/.julia/v0.5/NullableArrays/src/map.jl:108
 in eval(::Module, ::Any) at ./boot.jl:234
 in macro expansion at ./REPL.jl:92 [inlined]
 in (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at ./event.jl:46

julia> map(h, Y)
3-element NullableArrays.NullableArray{Int64,1}:
 #NULL
 #NULL
 #NULL
  4. map(f, Y; lift=true), where Y is non-empty but all null. This does use type inference to determine the type parameter of the returned NullableArray.
julia> map(f, Y; lift=true)
3-element NullableArrays.NullableArray{Int64,1}:
 #NULL
 #NULL
 #NULL

Brief performance summary (on 0.5) (NOTE: revised as of 7/12/16):

using NullableArrays
using BenchmarkTools
n = 1_000_000
A = rand(n)
M = rand(Bool, n)
X = NullableArray(A)
Y = NullableArray(A, M)
As = [ rand(n) for i in 1:5 ]
Ms = [ rand(Bool, n) for i in 1:5 ]
Xs = [ NullableArray(A) for A in As ]
Ys = [ NullableArray(A, M) for (A, M) in zip(As, Ms)]
f(x) = 5*x
f(x::Nullable) = Nullable(5)*x
f(xs...) = prod(xs)

julia> @benchmark map(f, A)
BenchmarkTools.Trial: 
  samples:          1094
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  7.63 mb
  allocs estimate:  4
  minimum time:     2.79 ms (0.00% GC)
  median time:      3.38 ms (0.00% GC)
  mean time:        4.57 ms (21.49% GC)
  maximum time:     21.65 ms (0.00% GC)

julia> @benchmark map(f, X)
BenchmarkTools.Trial: 
  samples:          400
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  8.58 mb
  allocs estimate:  12
  minimum time:     9.47 ms (0.00% GC)
  median time:      12.01 ms (0.00% GC)
  mean time:        12.51 ms (8.02% GC)
  maximum time:     36.04 ms (0.00% GC)

julia> @benchmark map(f, X; lift=true)
BenchmarkTools.Trial: 
  samples:          895
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  8.58 mb
  allocs estimate:  9
  minimum time:     3.83 ms (0.00% GC)
  median time:      4.59 ms (0.00% GC)
  mean time:        5.58 ms (16.47% GC)
  maximum time:     65.67 ms (0.00% GC)

julia> @benchmark map(f, Y)
BenchmarkTools.Trial: 
  samples:          298
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  8.58 mb
  allocs estimate:  12
  minimum time:     14.41 ms (0.00% GC)
  median time:      16.95 ms (0.00% GC)
  mean time:        16.82 ms (5.38% GC)
  maximum time:     21.59 ms (18.36% GC)

julia> @benchmark map(f, Y; lift=true)
BenchmarkTools.Trial: 
  samples:          582
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  8.58 mb
  allocs estimate:  9
  minimum time:     6.05 ms (0.00% GC)
  median time:      7.84 ms (0.00% GC)
  mean time:        8.60 ms (12.49% GC)
  maximum time:     28.21 ms (44.18% GC)

julia> @benchmark map(f, As...)
BenchmarkTools.Trial: 
  samples:          222
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  7.63 mb
  allocs estimate:  9
  minimum time:     19.83 ms (0.00% GC)
  median time:      22.67 ms (0.00% GC)
  mean time:        22.59 ms (4.04% GC)
  maximum time:     30.93 ms (8.61% GC)

julia> @benchmark map(f, Xs...)
BenchmarkTools.Trial: 
  samples:          2
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  2.10 gb
  allocs estimate:  61995904
  minimum time:     4.87 s (9.77% GC)
  median time:      4.97 s (9.63% GC)
  mean time:        4.97 s (9.63% GC)
  maximum time:     5.07 s (9.49% GC)

julia> @benchmark map(f, Xs...; lift=true)
BenchmarkTools.Trial: 
  samples:          291
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  8.58 mb
  allocs estimate:  15
  minimum time:     12.57 ms (0.00% GC)
  median time:      15.94 ms (0.00% GC)
  mean time:        17.18 ms (5.67% GC)
  maximum time:     80.23 ms (0.00% GC)

julia> @benchmark map(f, Ys...)
BenchmarkTools.Trial: 
  samples:          1
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  2.10 gb
  allocs estimate:  61995904
  minimum time:     5.30 s (9.30% GC)
  median time:      5.30 s (9.30% GC)
  mean time:        5.30 s (9.30% GC)
  maximum time:     5.30 s (9.30% GC)

julia> @benchmark map(f, Ys...; lift=true)
BenchmarkTools.Trial: 
  samples:          350
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  8.58 mb
  allocs estimate:  15
  minimum time:     12.21 ms (0.00% GC)
  median time:      14.08 ms (0.00% GC)
  mean time:        14.31 ms (6.22% GC)
  maximum time:     22.85 ms (11.79% GC)

It seems pretty clear that, as far as performance goes, we ought to transition towards lifting by default. I think this is reasonable for most statistics/data-science use cases. To do so, would we need to deprecate the current behavior for a release cycle? [Edit 7/12/16]: Lifting will probably yield the best performance, but clearly something is sub-optimal in the non-lifted map implementation (especially the n-arg version).

@nalimilan @johnmyleswhite thoughts?

-Specifically include tests for 2 arg map/map!
-Reorganize module along 1 arg, 2 arg, n arg 
methods
-Correct mistake (n args tests tested lifted map!
twice, failed to test lifted map)
@davidagold force-pushed the lmap branch 3 times, most recently from 6478caa to 3bc6fa2 on July 6, 2016 at 18:25
Remove metaprogramming from previous regime
(though introduce use of generated functions for n
arg map)

include compatibility for versions <=
JuliaLang/julia@a2ce230
@nalimilan
Member

I wouldn't worry too much about 0.4 compatibility, keeping the existing code in a separate file is fine.

Regarding lifting or not, that's a tough call. In practice, I guess most people will use the lifting behavior, but doing this by default would mean we break the AbstractArray API by not applying the function to elements of type eltype(x). Performance-wise, the benchmarks show a lot of allocations which I wouldn't expect: are you sure there isn't a problem somewhere? (I'm not familiar with that code at all.)

Finally, I wonder about the difference between these two benchmarks:

julia> @benchmark map(f, As...)
BenchmarkTools.Trial: 
  samples:          158
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  7.63 mb
  allocs estimate:  10
  minimum time:     27.83 ms (0.00% GC)
  median time:      30.86 ms (0.00% GC)
  mean time:        31.65 ms (2.66% GC)
  maximum time:     40.85 ms (6.13% GC)

julia> @benchmark map(f, Xs...; lift=true)
BenchmarkTools.Trial: 
  samples:          354
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  8.58 mb
  allocs estimate:  32
  minimum time:     12.27 ms (0.00% GC)
  median time:      13.62 ms (0.00% GC)
  mean time:        14.18 ms (6.01% GC)
  maximum time:     23.19 ms (10.96% GC)

How is it that the NullableArray version is twice as fast? :-)

@davidagold
Contributor Author

@nalimilan

w/r/t the allocations on non-lifted map: I thought these were due to all the Nullable objects created by indexing. Or should they not show up? There may be a problem.

w/r/t beating map(f, As...): Haha, well... there are two primary differences (in my eyes) between the implementation of map(f, As...) here and in Base. The first is that the Base version collects over Generator(f, As...), which zips all of the As together in a Zip object. I don't know if this introduces overhead, but it might be less efficient than working with the As directly, as we do here. The second difference has to do with indexing. The Base implementation uses ith_all(i, As)... to access the ith element of each argument Array, which involves a fair amount of splatting. What we do here is use a macro @fcall to splice f(Xs[1][i], Xs[2][i], Xs[3][i], ...) into the _liftedmap_to! loop body. This does require that _liftedmap_to! be a generated function.
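The splicing technique described above can be sketched as follows. `_map_to_unrolled!` is an illustrative stand-in, not the PR's `_liftedmap_to!` (the lifting logic is omitted); it shows only how a generated function splices `f(Xs[1][i], ..., Xs[N][i])` into the loop body without runtime splatting:

```julia
# Hedged sketch of the generated-function splicing technique. Not the PR's
# `_liftedmap_to!` -- only the expression-unrolling idea is demonstrated.
@generated function _map_to_unrolled!(f, dest, Xs...)
    N = length(Xs)                        # number of argument arrays
    args = [ :(Xs[$j][i]) for j in 1:N ]  # Xs[1][i], Xs[2][i], ..., Xs[N][i]
    quote
        for i in 1:length(dest)
            dest[i] = f($(args...))       # spliced call, no runtime splatting
        end
        dest
    end
end
```

For plain Arrays, `_map_to_unrolled!(+, zeros(Int, 3), [1, 2, 3], [4, 5, 6])` fills dest with `[5, 7, 9]`.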

Whatever the case, if you write map for regular Arrays in the style I've written them for NullableArrays, you do see a speed up over the current Base implementation: https://gist.github.com/davidagold/d7088aae22f23d383e5bf1f26aa1a045

using BenchmarkTools
n = 1_000_000
As = [ rand(n) for i in 1:5 ]
f(xs...) = prod(xs)

julia> @benchmark mymap(f, As...)
BenchmarkTools.Trial: 
  samples:          786
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  7.63 mb
  allocs estimate:  8
  minimum time:     4.59 ms (0.00% GC)
  median time:      5.35 ms (0.00% GC)
  mean time:        6.36 ms (14.65% GC)
  maximum time:     34.15 ms (0.00% GC)

julia> mymap(f, As...) == map(f, As...)
true

This suggests that the current Base n-arg map implementation leaves some performance on the table --- around 5x in this case.

@nalimilan
Member

w/r/t the allocations on non-lifted map: I thought these are due to all the Nullable objects being created by indexing. Or should they not show up? There may be a problem.

Not sure. But since Nullable is an immutable, it should be possible to avoid the allocation in at least some scenarios -- for example, by inlining the function which creates the Nullable (the rest of the work could be wrapped in a @noinline function). You'd need to ask people with more expertise about that.
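The inlining idea above could look something like this sketch, assuming Julia 0.5's Base Nullable; all helper names here are hypothetical. A small @inline getter gives the compiler a chance to keep the immutable Nullable on the stack rather than heap-allocating it in the hot loop:

```julia
# Hypothetical sketch of the @inline / @noinline split suggested above
# (Julia 0.5-era Base Nullable; helper names are invented for illustration).
@inline nullable_get(values, isnull, i) =
    isnull[i] ? Nullable{eltype(values)}() : Nullable(values[i])

@noinline function sum_nonnull(values, isnull)
    s = zero(eltype(values))
    for i in 1:length(values)
        x = nullable_get(values, isnull, i)  # ideally never heap-allocated
        if !Base.isnull(x)
            s += get(x)
        end
    end
    s
end
```

Whether the allocation actually disappears would need to be checked with @allocated or --track-allocation, as suggested.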

As regards being faster than Base's map: could you open an issue to discuss this, or ask for feedback on Slack?

Removes metaprogramming-heavy 0.4 map regime,
which seems to be responsible for strange
intermittent Travis failures.
@codecov-io

Current coverage is 80.24%

Merging #128 into master will decrease coverage by 1.37%

@@             master       #128   diff @@
==========================================
  Files            13         14     +1   
  Lines           729        825    +96   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits            595        662    +67   
- Misses          134        163    +29   
  Partials          0          0          

Powered by Codecov. Last updated by 5ed6f7c...3a49e22

Specifications described in detail in #128
@davidagold
Contributor Author

I'm going to merge this, since it passes on 0.4 and the map tests pass on nightly; the remaining test failures on nightly are due to the introduction of the Base.OneTo range object, not to this PR. The design choices above and their implementation don't need to be the final word, but I think they're a step in the right direction. And I feel it's important to fix #130.

@davidagold davidagold changed the title WIP: map for 0.5 map for 0.5 Jul 12, 2016
@davidagold davidagold changed the title map for 0.5 map revisions Jul 12, 2016
@davidagold davidagold merged commit e350e40 into master Jul 12, 2016
@ararslan ararslan deleted the lmap branch July 12, 2016 21:31
@ararslan
Member

or ask for feedback on Slack

There's a Slack group?

@johnmyleswhite
Member

A few of us chat on a private Slack channel.

@ararslan
Member

Ah, okay. Org owners only, I presume?
