Vectorization Roadmap #16285

stevengj · 2016-05-09T21:59:41Z

Now that #15032 is merged, here are the main remaining steps discussed in #8450, roughly in order that they should be implemented:

Improve output type-computation in broadcast to be at least as good as map (broadcast picking incorrect result type #4883). The trickest part is the ongoing discussion of what to do in the empty-array case (see Coherent map and comprehension, the imminent death of type_goto #11034).
Deprecate @vectorized functions like sin(x) in favor of sin.(x). Deprecate functions vectorized via @vectorize_(1|2)arg in favor of compact broadcast syntax #17302
Parse operators like a .+ b as broadcast(+, a, b). (See also support many more Unicode infix operators #6929 (comment), Allow users to define "dot" vectorized operators. #14544, and parse more dot operators, parse .= with assignment precedence #17393.) Existing overloaded functions .+(a, b) = ... can be deprecated in favor of overloading broadcast(::typeof(+), a, b). make "dot" operations (.+ etc) fusing broadcasts #17623

More speculative proposals ~~probably for the 0.6 timeframe~~ (suggested by @yuyichao):

Define nested f.(args...) calls as "fusing" broadcast operations at the syntax level. For example, sin.(x .+ cos.(x .^ sum(x.^2))) would turn (in julia-syntax.scm) into broadcast((x, _s_) -> sin(x + cos(x^_s_)), x, sum(broacast(^, x, 2))). Notice that the sum function (or any non-dot call) would be a "fusion boundary." The caller would be responsible for not using f.(args...) in cases where fusion would screw up side effects. fusion of nested f.(args) calls into a single broadcast call #17300
Define x .= ... as "fusing" calls to broadcast!, e.g. x .= sin.(x .+ y) would act in-place with a single loop. Again, this would occur at the syntax level, so the caller would be responsible for avoiding function calls with problematic side effects. (See make .+= et al. mutating LinearAlgebra.jl#119, but the lack of loop fusion at that time made .+= much less attractive.) treat .= as syntactic sugar for broadcast! #17510

The text was updated successfully, but these errors were encountered:

andyferris · 2016-05-10T03:51:41Z

Regarding the third point, can we do .= as broadcast! without worrying overly about loop fusion?

As in A .= B as broadcast!(x->x, A, B) as an element-wise copy. Of course, more complex expressions would require more complex loop fusion, but this gives some moderate gain. .+= and so-on are very low-hanging fruit for at least removing one layer of allocation very simply.

(Hmm... that last one makes me think if .sin= is good syntax? As in A .sin= B being like Julia 0.4 A[:] = sin(B[:]) I'm being more than a bit silly here, but it works similarly to the . prefix version like .sin(x) rather than suffix sin.(x). It even kinda nests: A .exp.sin= x. Quite ugly though, and messes with scoping/field syntax)

stevengj · 2016-05-10T11:51:31Z

@andyferris, without loop fusion, an in-place .= is pretty useless. An expression like x .= 4y .+ 5y.^2 .- sin.(y) will still allocate lots of temporary arrays. If you just want element-wise copy, you can already do A[:] = B.

andyferris · 2016-05-11T00:15:47Z

@stevengj I agree that it is (almost) useless, except for an alternative copy syntax. It's just "step zero" - it's dirt simple and provides the precedence for "step one" as doing similarly for .+= and all the other op= with a prefix ., which then starts to be useful, for things like x .+= 1 or something to be non-allocating where x is an Array. In fact, the number of op= operators currently defined is relatively small... so it shouldn't be too hard to define them all.

"Step 2" would be the more complex loop fusion or similar, and perhaps generalizations to generic functions.

On the whole, I do have to support what Jeff said in #14544 (comment). Perhaps the cleaner approach is to have some neat syntax for map/broadcast (and hopefully where loop fusion is clear to the user or compiler). I dunno.

stevengj · 2016-05-11T01:11:23Z

@andyferris, that discussion is out of date. We now do have a neat syntax for broadcast, and the possibility for loop fusion is now exposed to the compiler at the syntax level. And the since that syntax is ., generalized dot operators now make a lot more sense.

andyferris · 2016-05-11T02:46:46Z

@stevengj Sorry, I missed the last week or so of #8450 (it's really hard to keep up with all the threads!).

In any case, the progress seems very cool! The parser-based loop fusion (e.g. your #8450 (comment) seems like a great option to me. Any more complex things can still be done with loops, comprehensions, map and broadcast explicitly.

diegozea · 2016-05-11T14:12:14Z

A @fuse macro can be more safe than making the loop fusion at parser level, can't it?

yuyichao · 2016-05-11T14:15:48Z

What's the advantage? We already have that Devectorize.jl and I don't think It will be more flexible or give you more control.

diegozea · 2016-05-11T14:31:17Z

I'm only worry about making loop fusion everywhere. "The caller would be responsible for avoiding function calls with problematic side effects" but... How can the user use the point notation and avoid the fusion at the same time? Also I think that @fuse will be more explicit to write and to read in the code. @fuse can do the same proposed loop fusion, but It can also check the functions to be called (i.e. annotated pure functions). The last isn't possible at syntax level.

yuyichao · 2016-05-11T14:35:15Z

The caller would be responsible for avoiding function calls with problematic side effects

I think this is not an issue for the majority of cases.

How can the user use the point notation and avoid the fusion at the same time?

Split the line

but It can also check the functions to be called (i.e. annotated pure functions).

This is impossible. Or at least it won't be better at doing that.

stevengj · 2016-05-11T14:43:32Z

@diegozea, macros operate at the syntax level, without knowing types, so they can't check pureness. Same for the proposed f.(args...) fusing. (In practice, though, vectorized operations are virtually never used in cases with side effects, much less side effects that would be effected by fusion. And if f.(args...) is defined as always fusing, then you can reliably know what to expect.)

Think of f.(args...) as a much more compact and discoverable syntax for @fuse. (Once we deprecate e.g. sin(x) in favor of sin.(x) etcetera, people doing Matlab-like computations will end up using fused loops automatically in many case due to the deprecation warning, whereas they might never learn about @fuse.)

stevengj · 2016-05-11T14:48:18Z

Another way of saying it is that fusing the loops is a good idea in the vast majority of cases. Not wanting to fuse is the rare exception. That makes it sensible to make fusing the default, and in the rare cases where you don't want to fuse you can either split the line or just call broadcast explicitly.

(Note that this whole discussion was not even possible before the . notation. If you write f(g(x)) in 0.4, there is no practical generic way to look "inside" f and g and discover that they are broadcast operations that ought to be fused, whereas f.(g.(x)) makes the user's intent clear at the syntax level, enabling a syntax-level fusing transformation.)

diegozea · 2016-05-11T15:19:08Z

Sorry, I meant parse not syntax level in my last sentence. However I believed that a macro could access a function metadata.

yuyichao · 2016-05-11T15:23:59Z

However I believed that a macro could access a function metadata.

No it can't. For a start, it doesn't even have any idea about the binding. It is allowed to do sin = cos; sin(1) (or a much more reasonable form of this). @pure is also per method and you can't check that without knowing all the type info.

stevengj · 2016-05-11T15:26:52Z

@diegozea, macros are called at the parse level, which is what we mean by the "syntax" level here. i.e. at the point the macro is called, all that is known is the abstract syntax tree (AST); there is no information about types or bindings.

martinholters · 2016-05-12T06:11:26Z

While fusion by default sounds like a good idea, I find it a bit unsettling that

y = g.(f.(x))

and

a = f.(x)
y = g.(a)

might give different results. Is that just me?

nalimilan · 2016-05-12T06:56:11Z

Not different results, just different ways of returning the same result. That's fine IMHO. In the long-term, the compiler might become smarter and detect more complex cases.

martinholters · 2016-05-12T07:00:10Z

@nalimilan You are assuming pure f and g.

johnmyleswhite · 2016-05-12T19:12:15Z

FWIW, I suspect most functions people are vectorizing are pure.

quinnj · 2016-05-12T19:17:43Z

Maybe we need to do some more exploration around function traits (purity, guaranteed return types, boolean, etc.) and how they could be incorporated into some of these murkier discussions (vectorization, return types, etc.). It seems like having some explicit declarations around function traits would avoid the need to rely on inference or the compiler.

andyferris · 2016-05-12T21:31:20Z

I'm worried that overly complex rules for defining how loop fusion works will just become too confusing for users. The suggestion that A .= B .+ C ... becomes syntactic sugar for map/broadcast means that, once we are all used to it, we will easily be able to reason about code we see and know how to write a simple fused loop expression for vectors of data.

If it is a simple parsing-level rule, then the differences in @martinholters's comment will be obvious. If it is compile-time magic, we will be spending a lot of time trying to figure out if the compiler is really doing what we want it to do.

But coming back to nested . functions/operators, would we be able to avoid allocation when matrix multiplication is in the middle, e.g. v1 .= v2 .+ m1*v3 (where m3 is a matrix, and v* are vectors)? Correct me if I'm wrong, but isn't this BLAS's daxpy? We would want that to be non-allocating, so hopefully whatever syntax we come up with would allow for this kind of thing (where the order of multiplication and additions for the matrix multiplication is cache-optimized as-in BLAS, not direct as-in map and broadcast).

yuyichao · 2016-05-12T21:45:23Z

@andyferris Yes, the hope is that the parser level transformation can cover most use case with a well defined schematics. It is in principle possible to add support for more complicated construct (I very briefly talked about this with @andreasnoack ) but I personally feel like it is hard to come up with a syntax that can cover all the cases besides what can be currently achieved with broadcast (or a mutating version of it).

Thinking loudly, maybe it can be achieved by having a lazy array type (so the computation is done on the fly with in boardcast)? That would be an orthogonal change though. It'll also be hard to recognize that and call BLAS functions.

stevengj · 2016-05-13T11:26:22Z

@andyferris, the whole point is that the proposed loop fusion becomes a simple parsing-level guarantee, not compile-time at all. If you see f.(g.(x)), then you know that the loops are always fused into a single broadcast, regardless of the bindings of f, g, or x. This allows you to reason simply about the code consistently. It is indeed just syntactic sugar.

This is very different from a compile-time optimization that may or may not occur.

andyferris · 2016-05-15T22:37:13Z

@stevengj Yes, I understand and agree completely (my first two paragraphs were directed at @quinnj).

the whole point is that the proposed loop fusion becomes a simple parsing-level guarantee,

To my taste, the keyword is "guarantee". I really do like it.

I was more thinking along the lines of what @yuyichao was thinking "loudly". If Matrix-Vector multiplication was lazy (returned an iterable over i for sum(M[i,:] .* v) or similar) then it would "just work" with no temporaries given a combination of your suggested changes to the parser and introducing the lazy multiplication type (and thus no "compile-time optimizations" are necessary beyond what already exists in Julia). It would be interesting to compare performance to daxpy.

(of course, when you say "compile-time" optimization I interpret that as changes to compilation after lowering, not changes to definitions in Base.LinAlg.)

A similar thing for Matrix-Matrix multiplication is much, much harder. Although we could have arbitrarily clever iterators, they might or might not be not be the correct thing to reimplement a somewhat efficient dgemm in native Julia, or to call out to it "automagically". But this is definitely something worth considering for the future, and for the development of syntax, because superfluous temporaries after multiplying matrices (along with adding them, etc, which hopefully will be fixed in this roadmap) is IMHO currently one of Julia's numerical performance bottlenecks when working with very large matrices/tensors/etc. (On that note, does an interface for dgemm! currently exist for the memory-constrained users that want to extract the last bit of efficiency out of Julia?)

StefanKarpinski · 2016-06-02T16:14:47Z

@stevengj: are you planning on tackling this or should we figure out who else can tackle it?

bramtayl · 2016-10-13T21:17:47Z

OK, I've gotten ChainMap into a place where it's relatively stable, and I'd like to propose an alternate method of dot vectorization based on the work there.

Here's an alternative.

A type, LazyCall, which contains both a function and its arguments (positionals in a tuple and keywords in an OrderedDict)
An @~ macro to create a LazyCall from a woven expression (this is currently @unweave in ChainMap)
New methods for map, broadcast, etc. which take LazyCalls.

Here's how the syntax would change:

map( ~ sin(~x + y) ) goes to map( LazyCall(x -> sin(x + y), x) )
broadcast( ~ sin(~x + ~y) ) goes to broadcast( LazyCall( (x, y) -> sin(x + y), x, y) )
mapreduce( ( ~ sin(~x + y) ), +) goes to mapreduce( LazyCall( x -> sin(x + y), x), +)

Splats (needs more thought, I think)
~ f(~(x...) ) goes to LazyCall( (x...) -> f(x...), x...)
~ f(~(;x...) ) goes to LazyCall( (;x...) -> f(;x...); x...)

Advantages over the current system:

No ambiguity over what counts as a scalar and what doesn't
Reduction in the amount of dots necessary for a complicated expression
Support for non-broadcast operations
Support for passing multiple LazyCalls to one function
Partial support for lazy evaluation
No need for fusion

Disadvantages:

~ is already taken as bit negation. @~ is taken for formulas. I like the symbol because it reminds me of a woven thread. But maybe another ascii character will do?
Not backwards compatible; people will need to learn a new system.
Less verbose
Not sure about performance impacts

All of this is implemented in ChainMap at the moment, aside from the new methods for map and friends. It can probably stay in a package for now if people aren't interested in adding it to base.

ChrisRackauckas · 2016-10-13T22:01:28Z

map( ~ sin(~x + y) ) goes to map( LazyCall(x -> sin(x + y), x) )
broadcast( ~ sin(~x + ~y) ) goes to broadcast( LazyCall( (x, y) -> sin(x + y), x, y) )
mapreduce( ( ~ sin(~x + y) ), +) goes to mapreduce( LazyCall( x -> sin(x + y), x), +)

I'm sorry, but I'm not seeing how this proposal would clean up code at all. What would ~x + ~y be? And

Reduction in the amount of dots necessary for a complicated expression

Would be true, but there wouldn't be less ~ then there were ..

andyferris · 2016-10-13T22:24:31Z

While we are on the topic of new ideas, yesterday @c42f and I had some fun playing with the idea of "destructuring" an array.

Sometimes you have a vector of composed types ("structs"), and you might want to refer to the all the values of a certain field. Consider this operation and potential syntax we could implement here with the dot-call idea:

v::Vector{Complex{Float64}}

# get the real components using current syntax
broadcast(x -> x.re, v) 
broadcast(x -> getfield(x, :re), v)  # alternative syntax

# Kind-of works, but need to add scalar-size methods to `Symbol` 
# Also is slow because it doesn't inline the :re properly
getfield.(v, :re) 

# Potential new convenience syntax (dot call on dot reference):
v..re

There is also a similar thing where you might have a vector of tuples or a vector of vectors, and we can currently do something like

v::Vector{Vector{Int}}

# get the first elements
getindex.(v, 1)

# potential new syntax
v.[1]

v_tuple::Vector{Tuple}

# Doesn't seem to be optimal speed (tuple indexing with Const is special)
getindex.(v_tuple, 1)

To conclude: as a performance fix we could (probably should) support faster getfield.() for vectors of structs and faster getindex.() of vectors of vectors/tuples.

As a new syntax proposal, we might consider supporting operators .. and .[] as broadcasted syntax of getfield and getindex respectively.

bramtayl · 2016-10-13T22:30:05Z

@ChrisRackauckas

Agreed, less verbose, more flexible.

~x + ~y would just be ~x + ~y (it doesn't pass through a macro)
~ ~x + ~y would be LazyCall( (x, y) -> x + y, x, y)

In some cases, the amount of dots compared to ~ is cut down on. Eg.
sin.(cos.(tan.(x .+ 1) vs broadcast( ~ sin(cos(tan( ~x + 1) ) ) )

ChrisRackauckas · 2016-10-13T22:35:32Z

But broadcast( ~ sin(cos(tan( ~x + 1) ) ) ) is still a much worse option because broadcast should not be in the mathematical expression. The syntax is to get rid of the "programming" from the "math". I like to be able to glance at the LaTeX and then glance at the code and see the correctness, while getting performance.

ChrisRackauckas · 2016-10-13T22:38:24Z

Also, I don't see the reason for the LazyCall in this case. Making the type and the tuple container for the arguments adds to the amount of allocations, when what we really want is simple syntax for allocation-free code.

bramtayl · 2016-10-13T22:46:31Z

Agreed that the pure math aspect of things gets lost.

There are allocation free ways of going about this:

map( ~ sin(~x + y) ) goes to map( x -> sin(x + y), x)
broadcast( ~ sin(~x + ~y) ) goes to broadcast( (x, y) -> sin(x + y), x, y)
mapreduce( ( ~ sin(~x + y) ), +) goes to mapreduce( x -> sin(x + y), +, x)

And in fact, ChainMap has some convenience macros for doing this:

@map sin(_ + y) goes to map( _ -> sin(_ + y), _) (_ is used as part of the chaining)
@broadcast sin(~x + ~y) goes to broadcast( (x, y) -> sin(x + y), x, y)

But to do this you need to make assumptions about argument order (put the anonymous function first, the woven arguments last).

LazyCalls do have other advantages. They can be merged, pushed to, revised, etc. This is particularly useful for making plot calls.

StefanKarpinski · 2016-10-13T23:51:06Z

@bramtayl: if you've got a counter-proposal, you should open an issue or start a julia-dev thread.

bramtayl · 2016-10-14T00:02:00Z

Ok, see #18915

wsshin · 2016-12-16T05:54:11Z

I have a couple of questions about the third goal in this vectorization roadmap, which is to "parse operators like a .+ b as broadcast(+, a, b)".

Now that broadcast() works for tuples, will we have a .+ b for tuples a and b when the third goal is achieved?
Currently broadcast(+, a, b) is ~4 times slower than a .+ b for vectors a and b. (See the benchmark result below.) Is this going to be fixed before the third goal is achieved?

julia> using BenchmarkTools

julia> @benchmark [1,2,3] .+ 1
BenchmarkTools.Trial:
  memory estimate:  224.00 bytes
  allocs estimate:  2
  --------------
  minimum time:     109.144 ns (0.00% GC)
  median time:      152.579 ns (0.00% GC)
  mean time:        203.520 ns (19.59% GC)
  maximum time:     5.679 μs (94.30% GC)
  --------------
  samples:          10000
  evals/sample:     916
  time tolerance:   5.00%
  memory tolerance: 1.00%

julia> @benchmark broadcast(+, [1,2,3], 1)
BenchmarkTools.Trial:
  memory estimate:  320.00 bytes
  allocs estimate:  6
  --------------
  minimum time:     583.749 ns (0.00% GC)
  median time:      599.648 ns (0.00% GC)
  mean time:        784.025 ns (14.38% GC)
  maximum time:     74.077 μs (98.31% GC)
  --------------
  samples:          10000
  evals/sample:     179
  time tolerance:   5.00%
  memory tolerance: 1.00%

The above benchmark results were obtained with the following Julia version:

julia> versioninfo()
Julia Version 0.6.0-dev.1565
Commit 0408aa2* (2016-12-15 03:02 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin16.0.0)
  CPU: Intel(R) Core(TM)2 Duo CPU     T9900  @ 3.06GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Penryn)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.7.1 (ORCJIT, penryn)

stevengj · 2016-12-16T14:06:41Z

@wshin, yes, .+ etcetera now (pending merge) works for tuples, but it probably needs some performance work in that case.

julia> (3,4,5) .+ 7
(10,11,12)

julia> (3,4,5) .+ (8,9,10)
(11,13,15)

As for your benchmark, currently there is an inlining failure that is slowing down broadcast (#19608) that should hopefully be fixed soon. As a result, for very small vectors, the overhead is significant.

(Of course, for very small vectors like [1,2,3] you will always get better performance from StaticArrays.jl, since in StaticArrays the length of the vector is known at compile-time too, and the entire loop can be unrolled and inlined, not to mention that the result does not need to be heap-allocated compared to [1,2,3].)

stevengj · 2016-12-20T13:58:03Z

Closing, since all items are now merged.

Further improvements are possible, such as a generic syntax like sum:(x .+ 5) for fusing any function, making x[x .> 5] fusing, and so on, and of course performance improvements (e.g. @Sacha0's planned work on improving sparseness preserving in broadcast), but they can have their own issues marked with the vectorization label.

stevengj · 2016-12-20T14:00:36Z

(By the way, the performance problems noted by @wsshin were fixed by #19639.)

stevengj added the julep Julia Enhancement Proposal label May 9, 2016

stevengj mentioned this issue May 9, 2016

Alternative syntax for map(func, x) #8450

Closed

ViralBShah added this to the 0.5.0 milestone May 10, 2016

stevengj mentioned this issue May 10, 2016

Coherent map and comprehension, the imminent death of type_goto #11034

Closed

JeffBezanson mentioned this issue May 11, 2016

compiler optimization tracker #3440

Closed

65 tasks

StefanKarpinski assigned stevengj Jun 2, 2016

stevengj mentioned this issue Oct 6, 2016

Broadcast benchmarks JuliaCI/BaseBenchmarks.jl#30

Merged

bramtayl mentioned this issue Oct 14, 2016

Woven functional programming #18915

Closed

This was referenced Nov 2, 2016

Notation for lazy map #19198

Open

Document function argument for prod #19146

Closed

Sacha0 mentioned this issue Nov 28, 2016

map/map! for sparse matrices #19438

Merged

Sacha0 mentioned this issue Dec 7, 2016

generic broadcast/broadcast! for sparse matrices, with #19372 semantics #19518

Merged

pabloferz mentioned this issue Dec 17, 2016

Reduce allocations in broadcast #19639

Merged

stevengj closed this as completed Dec 20, 2016

wsshin mentioned this issue Jan 4, 2017

Retrieving elements from an inhomogeneous tuple is slow #19850

Closed

stevengj mentioned this issue Feb 7, 2017

Nested . syntax #20502

Open

Sacha0 mentioned this issue Jul 15, 2017

multi-threaded (@threads) dotcall/broadcast? #19777

Open

tkf mentioned this issue Feb 7, 2019

Lazy broadcasting macro JuliaArrays/LazyArrays.jl#21

Merged

Vectorization Roadmap #16285

Vectorization Roadmap #16285

Comments

stevengj commented May 9, 2016 • edited Loading

andyferris commented May 10, 2016

stevengj commented May 10, 2016 • edited Loading

andyferris commented May 11, 2016

stevengj commented May 11, 2016 • edited Loading

andyferris commented May 11, 2016

diegozea commented May 11, 2016

yuyichao commented May 11, 2016

diegozea commented May 11, 2016 • edited Loading

yuyichao commented May 11, 2016 • edited Loading

stevengj commented May 11, 2016 • edited Loading

stevengj commented May 11, 2016 • edited Loading

diegozea commented May 11, 2016

yuyichao commented May 11, 2016 • edited Loading

stevengj commented May 11, 2016 • edited Loading

martinholters commented May 12, 2016

nalimilan commented May 12, 2016

martinholters commented May 12, 2016

johnmyleswhite commented May 12, 2016

quinnj commented May 12, 2016

andyferris commented May 12, 2016 • edited Loading

yuyichao commented May 12, 2016

stevengj commented May 13, 2016 • edited Loading

andyferris commented May 15, 2016

StefanKarpinski commented Jun 2, 2016

bramtayl commented Oct 13, 2016 • edited Loading

ChrisRackauckas commented Oct 13, 2016

andyferris commented Oct 13, 2016 • edited Loading

bramtayl commented Oct 13, 2016 • edited Loading

ChrisRackauckas commented Oct 13, 2016 • edited Loading

ChrisRackauckas commented Oct 13, 2016

bramtayl commented Oct 13, 2016 • edited Loading

StefanKarpinski commented Oct 13, 2016

bramtayl commented Oct 14, 2016

wsshin commented Dec 16, 2016

stevengj commented Dec 16, 2016 • edited Loading

stevengj commented Dec 20, 2016

stevengj commented Dec 20, 2016

stevengj commented May 9, 2016 •

edited

Loading

stevengj commented May 10, 2016 •

edited

Loading

stevengj commented May 11, 2016 •

edited

Loading

diegozea commented May 11, 2016 •

edited

Loading

yuyichao commented May 11, 2016 •

edited

Loading

stevengj commented May 11, 2016 •

edited

Loading

stevengj commented May 11, 2016 •

edited

Loading

yuyichao commented May 11, 2016 •

edited

Loading

stevengj commented May 11, 2016 •

edited

Loading

andyferris commented May 12, 2016 •

edited

Loading

stevengj commented May 13, 2016 •

edited

Loading

bramtayl commented Oct 13, 2016 •

edited

Loading

andyferris commented Oct 13, 2016 •

edited

Loading

bramtayl commented Oct 13, 2016 •

edited

Loading

ChrisRackauckas commented Oct 13, 2016 •

edited

Loading

bramtayl commented Oct 13, 2016 •

edited

Loading

stevengj commented Dec 16, 2016 •

edited

Loading