WIP: implement f.(args...) as a synonym for broadcast(f, args...) #15032

stevengj · 2016-02-11T22:44:03Z

As discussed in #8450, this implements f.(args...) as a synonym for ~~map(f, args...)~~ broadcast(f, args...).

(I think we need some "vectorized-function" syntax like this even if we also have some f(x) over x
syntax for map in the future. People in scientific computing are just too used to vectorized functions to give that up completely.)

Also, this PR supports f.:x as a synonym for getfield(f, :x), so that one no longer needs to do f.(:x). For example, it was common to use Base.(:+) and similar, and this can now be written Base.:+. For backwards compatibility, we can also support the former syntax via a deprecated broadcast(m::Module, s::Symbol) = getfield(m, s) method. Update: there should now be deprecation warnings for all existing usages of x.(i) for getfield or setfield!, so this PR is not breaking.

I wanted to post a prototype first to see what people think. To do:

Decide whether this is the syntax we want.
Documentation
broadcast(m::Module, s::Symbol) = getfield(m, s) deprecation for Base.(:+) etc.
Deprecate @vectorize? Blocked by broadcast picking incorrect result type #4883. (Should be a separate PR anyway.)

tkelman · 2016-02-12T01:33:49Z

Should it lower to map or broadcast ?

stevengj · 2016-02-12T01:51:14Z

@tkelman, I thought that the discussion was that it should lower to map, but it's easy to change it to broadcast if people prefer. Our current @vectorize_2arg is broadcast-like, and broadcast is a bit more flexible, so maybe that is better.

stevengj · 2016-02-12T16:26:25Z

Okay, this is weird. tests/core.jl is failing on the #9534 regression test because I'm getting:

julia> is(IOBuffer(), IOBuffer)
false

julia> typeof(IOBuffer()) == IOBuffer
true

Not sure how I could have broken this. 😿

StefanKarpinski · 2016-02-12T17:01:48Z

+1 for doing broadcast for vectorized operations.

vtjnash · 2016-02-12T17:21:43Z

@stevengj i think you meant isa(3, Int), because 3 !== Int

stevengj · 2016-02-12T18:28:34Z

Oh, right, I was just confusing is with isa. isa seems to work.

stevengj · 2016-02-12T18:30:47Z

test/core.jl

@@ -2239,10 +2240,10 @@ f9534d() = (a=(1,2,4,6,7); a[7])
 f9534d(x) = (a=(1,2,4,6,7); a[x])
 @test try; f9534d() catch ex; (ex::BoundsError).a === (1,2,4,6,7) && ex.i == 7; end
 @test try; f9534d(-1) catch ex; (ex::BoundsError).a === (1,2,4,6,7) && ex.i == -1; end
-f9534e(x) = (a=IOBuffer(); a.(x) = 3)
+f9534e(x) = (a=IOBuffer(); setfield!(a, x, 3))
 @test try; f9534e(-2) catch ex; is((ex::BoundsError).a,Base.IOBuffer) && ex.i == -2; end


How did this test work? @vtjnash, why isn't this isa, not is?

the BoundsError is from attempting to access the properties of field -2 of the IOBuffer type

f9534e accesses field -2 (a.(x) = 3) of an instance of the IOBuffer type (a=IOBuffer()), not the IOBuffer type per se, no?

Isn't ex.a the object for which the bounds error is thrown? i.e., it should be an instance of the IOBuffer type, and not the type itself, and hence you should be checking isa(ex.a, IOBuffer) rather than is.

When I change it to isa, the tests pass for me, but I'm confused about why they ever passed with is.

ah, i see. this is a difference in codegen vs. the intrinsics. previously, this assignment was going through codegen, which emits the error based on the invalidity of trying to examine the type. however, in the jl_f_getfield builtin function, the error gets thrown with the value.

toivoh · 2016-02-12T20:44:58Z

+1 for doing broadcast for vectorized operations from me too.

stevengj · 2016-02-13T01:04:46Z

Changed to broadcast.

nalimilan · 2016-02-13T11:12:27Z

Thanks for taking the lead @stevengj!

Though I'm worried by the fact that this implementation doesn't allow composing functions without copies. If we offer e.g. sqrt.(x), people will want to use it for exp.(sqrt.(x)), which doesn't fuse the loops and allocates a useless copy. I think we should find a way to combine successive f.(...) calls into a single one, or this feature is going to be a trap.

stevengj · 2016-02-13T13:47:54Z

@nalimilan, please see the discussion in #8450. The f.(x) syntax is intended as a replacement for @vectorized functions, which will be deprecated (probably quasi-permanently) — people can already do exp(sqrt(x)), so this is not new functionality. An abbreviated syntax for fused loops, like sin(exp(x)) over x, would be in addition to this. Rationale:

We want vectorized methods like sin(x::Vector) to go away (or at least be deprecated) now that map is fast, because vectorizing an arbitrarily chosen set of functions is much less clean than providing a nice syntax for vectorizing anything. See this post by @JeffBezanson.
People in scientific computing are used to the convenience of vectorized functions; eliminating them entirely from the language will be a big drawback for people coming from Matlab, Python, R, etc.
Vectorized functions have proved "good enough" for many applications for 30 years; you don't always need the extra boost of fusing the loops.
Every proposed generic syntax for fusing the loops (like over) is more verbose than just adding dots, and ends up being less flexible than vectorized functions in some cases — as discussed in Alternative syntax for map(func, x) #8450, there always seem to be cases where you have to explicitly call map or similar.
Deprecating @vectorized functions is going to cause a zillion warnings in Julia code; it will be much easier to port old Julia code to 0.5 if all you have to do is to add some dots, as opposed to re-thinking your logic to make sure that loops are fused correctly.
There is always the possibility that future compiler optimizations will be able to fuse the loops in f.(g.(x)) for you.

nalimilan · 2016-02-13T15:26:17Z

@stevengj I agree with all of these points. I was just wondering whether the implementation of this syntax would allow, now or in the future, to avoid unnecessary allocations. Clearly, creating copies and not fusing operations has been seen as "good enough" in other languages. But Julia generally offers more than its predecessors, even with simple syntax. So my point was about this:

There is always the possibility that future compiler optimizations will be able to fuse the loops in f.(g.(x)) for you.

I think we should consider whether this sounds like something the compiler might be able to do at some point without too many tricks, or whether an alternative implementation would make it more likely to work. I'm all for merging this PR, even if it still creates copies, if we know it can be improved at some point.

eschnett · 2016-02-13T15:40:26Z

Fortran compilers regularly perform such optimizations. With Fortran 90, a vectorized syntax was introduced (targeting e.g. the vector machines prominent at that time), and of course these days, this notation has to be undone to allow for loop fusion.

We might -- in the future, and automatically -- annotate the temporary arrays created from this notation to tell the optimizer that they are freshly created and thus don't alias any existing arrays. This alone should be a big step towards optimizing such expressions.

Of course, another way to go would be the equivalent of C++ expression templates. Julia's type system is certainly powerful enough for this.

timholy · 2016-02-13T16:18:29Z

Presumably this is obvious to everyone, but can't performance-sensitive users fuse them manually with

(x->sin(exp(x))).(A)

? And then to fuse automatically, it's "just" a matter of detecting f.(g.(args)) and replacing it with the form above?

eschnett · 2016-02-13T16:38:46Z

The problem is that this works with sin, but not with sin!. To do this automatically, the compiler needs to prove that the functions are side-effect free, and that the arrays don't alias other arrays.

Julia could introduce run-time assertions to catch these cases.

StefanKarpinski · 2016-02-13T17:48:23Z

I have to say that it feels a bit backwards to introduce a feature that is hard to optimize when it's just as notationally convenient to express the scalar kernel directly and then vectorize that.

tbreloff · 2016-02-13T18:54:59Z

Just a thought... What if the notation effectively creates a future, so
that "sin.(x) --> FutureBroadcast(sin, x)" and it stays unevaluated until
some later operation. Then you could compose arbitrary futures together and
it gets evaluated explicitly or implicitly depending on context.

Again this is not well thought out, but I could try to put together a rough
prototype.

On Saturday, February 13, 2016, Stefan Karpinski [email protected]
wrote:

I have to say that it feels a bit backwards to introduce a feature that is
hard to optimize when it's just as notationally convenient to express the
scalar kernel directly and then vectorize that.

—
Reply to this email directly or view it on GitHub
#15032 (comment).

stevengj · 2016-02-13T20:07:27Z

First, I should emphasize that vectorized functions are mainly for convenience, not for attaining the absolute maximum performance in critical inner loops. There is a huge base of people who have been trained to make use of vectorized operations and to find this abstraction natural, and deleting the vectorized-math abstraction from Julia would be a mistake, in my opinion. (And it's not like we are getting rid of .+ etc. anyway.)

@eschnett, when it comes to hypothetical future loop-fusing optimizations (not to be implemented in this PR!), the case of mutating functions like the hypothetical sin! is not relevant if you are mapping over a collection of immutable objects. As for other side effects — that sort of thing is what the @pure annotation (#13555) is for. @StefanKarpinski, as a result I'm not convinced this is "hard to optimize" (Fortran has already done it, after all). The key thing is to make the user explicitly indicate that an operation is a map; compiler opportunities follow.

The goal of this PR, however, is simply to have a generalized replacement for the @vectorized functions, while keeping the same spirit and a reasonably intuitive syntax for users trained in that abstraction. The fact that it exposes possibilities for future compiler optimization is a merely a bonus.

stevengj · 2016-02-13T21:33:58Z

(A corollary to this PR would be, in a future PR, to get rid of .+ etcetera, and change it so that instead any .[operator] is equivalent to a broadcast.)

StefanKarpinski · 2016-02-13T23:47:39Z

@stevengj, Fortran tends to have much more information about what's happening when it analyzes code due to its static and relatively un-polymorphic nature. I do appreciate that this PR is about removing @vectorize, but I'm just not convinced that writing vectorized code and then optimizing it is the best way to express things. I'm also really happy that you're pushing this forward with actual code and I'm happy to try it out and see how it feels.

stevengj · 2016-02-14T00:54:37Z

@StefanKarpinski, if you have f.(g.(x)), the eltypes are immutable, and g has been marked @pure, wouldn't that be enough information for Julia to fuse the loops?

Furthermore, as far as I can tell there are four options:

Keep the @vectorized functions as-is.
Deprecate the @vectorized functions in favor of manually calling map etc.
Deprecate the @vectorized functions in favor of a generalized syntax that is similar in effect, but works for any function and explicitly exposes the user's map intention to the compiler: f.(x) or similar (which may or may not benefit from compiler optimization in the future).
Deprecate the @vectorized functions in favor of some completely different syntax like f(x) over x. (So far, as you yourself pointed out in Alternative syntax for map(func, x) #8450, none of the proposed syntaxes can compactly express things like exp(log(A) .- sum(A,1)) .* A[:,1].)

Which option are you advocating? (My position: people like doing vectorized math, and are used to it. We should support it, whatever else we do for advanced users.)

ViralBShah · 2016-02-14T05:50:04Z

I love this. Not sure about the syntax but certainly want to try it out. I thought that parallel accelerator should be able to optimise this already.

ViralBShah · 2016-02-14T05:51:13Z

We not only should support vector math, but support it really well. This is a huge step in that direction.

vchuravy · 2016-02-14T13:03:39Z

I was wondering how this compares with generators:(sin(exp(x)) for x in A) vs sin.(exp.(A)) is a bit longer, but is already loop-fused without any compiler optimizations. For me one of the big questions is what does the general vectorized syntax offer over map(f, A) or generators?

For me personally the f.(x) syntax makes it harder to read and understand a statement.

nalimilan · 2016-02-14T13:25:36Z

@stevengj Another implementation (which I explored before generators existed in #8450) would be to return a form of generator, i.e. a lazy map. This would make it easy to fuse operations. It could be annoying if you want an actual array (just like the recent discussions about LinSpace have shown), but it could also be useful for efficient in-place assignments, i.e. x[:] = exp.(sqrt.(x)).

Another similar solution has just been proposed by @tbreloff. It would be more convenient but may not be possible: make .( syntax for a future/promise, i.e. a kind of generator which would be collected as soon as an assignment happens. But no such object exists AFAIK in Julia at the moment.

@vchuravy I think @stevengj is right that people coming from scientific languages expect to be able to write things like sin.(x) very concisely. They can't write it as sin(x) in Julia, which is already hurting they usual workflow, so we must provide an alternative. Generators are cool, but for simple element-wise operations they are less clear as they introduce a lot of noise. (The signal/noise ratio gets lower when you combine several operations, though.)

…, i); this PR should no longer be a breaking change

stevengj · 2016-05-07T13:25:54Z

Just the usual timeout problem on OSX, it seems.

jebej · 2016-05-09T18:11:55Z

Just by curiosity, would it be possible to know why the f.(x) syntax was chosen over .f(x)? It does seem inconsistent with all the "regular" vectorized operations like .* etc.

This also means .+([1 2],[1 2]) and +.([1 2],[1 2]) do the same, which is a bit weird to me. (What does .+.([1 2],[1 2]) do?)

yuyichao · 2016-05-09T18:14:34Z

#15032 (comment) AFAICT

papamarkou · 2016-05-13T09:49:46Z

My two cents, knowing that I am subjective about this, but I guess it is ok to simply express a personal preference. I just find the idea sin.(x) ugly :)

To reason about this, first of all, I think that it is much cleaner in a functional style of coding to express the operation as map(sin, x). If you count the excess of characters of this latter expression, it is only 3 characters longer than the former (or 4 if you want to count the space too).

Secondly, if we do really need to use additional syntax to complement map, then I am more in favor of @tbreloff's line of thinking. An infix or postfix operator could be chosen more sensibly than this "dot" which for some reason hurts my eyes... We could have sth like x ~> sin, which is reminiscent of dplyr's piping style in R or of Factor's stack allocated syntax, it just seems more intuitive. If for some reason related to Julia's idiom ~> is not good enough, I still feel nearly certain that there are nicer symbols to represent the vectorization that a cryptic postfix dot.

tkelman · 2016-05-13T10:48:40Z

Making ~> parse as arrow-like is a good idea. It's currently either a syntax error if anything follows it, or unary ~ called on the > operator which doesn't seem useful.

JeffBezanson · 2016-05-13T14:01:56Z

I don't buy the idea that sin.(x) is ugly and horrible, but x ~> sin is beautiful and wonderful. However there are other threads on piping syntax, and we can certainly continue to develop that.

StefanKarpinski · 2016-05-13T14:26:23Z

I'm surprised that I don't actually hate the f.(v) syntax as much as I thought I would. I might even like it.

ChrisRackauckas · 2016-05-20T03:47:59Z

I don't think the dots cause a problem at all. In fact, I think they "solve" the problem of unfused loops by making multiple vectorized operations together look ugly. For example, sin.(exp.(2.^x)) doesn't look very good with all the dots around, but that's basically telling you that you should define f(x) = sin(exp(2^x)) and use f.(x), or just call map.

But I don't see the problem with having a special infix operator for vectorization which is unicode. Julia already has \div for a reason: it's easy to read and you know what it does.

JeffreySarnoff · 2016-05-24T23:34:13Z

fwiw, <~ could be to ~> as decrement is to increment (rather than as left is to right)

     `sines_of_xs = xs ~> sin`
     `recovers_xs = sines_of_xs <~ sin`

I was working a bit on macro expansion - particularly `quote` (quasiquote) expansion with `$` interpolations - and I've found that it's weird and inconvenient that we parse `a.b` into `(. a (quote b))`. Specifically, the part that's weird here is that we emit `(quote b)` for the field name even though this is "not quote syntax": this should not yield a syntax literal during lowering, and is thus a semantic mismatch with actual quote syntax of the form `:(a + b)` or `quote a+b end`. * Why is this a problem? It means we need special rules to distinguish actual syntax literals from field names. * But can we really change this? Surely this AST form had a purpose? Yes! A long time ago Julia supported `a.(b)` syntax to mean `getfield(a, b)`, which would naturally have been parsed as `(. a b)`. However this was deprecated as part of adding broadcast syntax in JuliaLang/julia#15032 Here we simplify by parsing `a.b` as `(. a b)` instead, with the second argument implied to be a field name.

stevengj force-pushed the vectorize_dot branch from b112c59 to dce2cf6 Compare February 12, 2016 00:12

stevengj reviewed Feb 12, 2016
View reviewed changes

stevengj force-pushed the vectorize_dot branch from dce2cf6 to 02df6b6 Compare February 13, 2016 01:00

stevengj changed the title ~~WIP: implement f.(args...) as a synonym for map(f, args...)~~ WIP: implement f.(args...) as a synonym for broadcast(f, args...) Feb 13, 2016

stevengj force-pushed the vectorize_dot branch from 02df6b6 to 2ac9c87 Compare February 13, 2016 01:02

stevengj force-pushed the vectorize_dot branch from 2ac9c87 to 50090e6 Compare February 13, 2016 02:25

stevengj added 2 commits May 6, 2016 10:06

expand deprecation warnings to cover all uses of x.(i) for getfield(x…

fe53f21

…, i); this PR should no longer be a breaking change

tests for f.(args...) syntax

8ad0eab

stevengj force-pushed the vectorize_dot branch from 14da2ad to 8ad0eab Compare May 6, 2016 14:16

stevengj removed the needs tests Unit tests are required for this change label May 6, 2016

JeffBezanson merged commit 3531671 into JuliaLang:master May 9, 2016

stevengj deleted the vectorize_dot branch May 9, 2016 21:48

This was referenced May 9, 2016

Vectorization Roadmap #16285

Closed

Support foo.:bar JuliaLang/JuliaParser.jl#49

Closed

@compat for f.(args...) and foo.:bar JuliaLang/Compat.jl#196

Merged

This was referenced May 11, 2016

Add vectorized mod2pi #15797

Closed

Add vectorized "in" (.∈) and "notin" (.∉) #12406

Closed

martinholters mentioned this pull request May 31, 2016

Tag ACME v0.1.1 [git://github.com/HSU-ANT/ACME.jl.git] JuliaLang/METADATA.jl#5289

Merged

stevengj mentioned this pull request Jul 6, 2016

Interaction between compact broadcast syntax and splatting? #17304

Closed

timholy mentioned this pull request Jul 14, 2016

Generalize f.(A) to tuples? #17411

Closed

nalimilan mentioned this pull request Oct 19, 2016

broadcast with only one named matrix doesn't work davidavdav/NamedArrays.jl#39

Closed

davidavdav mentioned this pull request Oct 30, 2016

Deprecate $ as bitwise xor, replace by xor(), as discussed in #18696 #18977

Merged

stevengj mentioned this pull request Jan 30, 2017

support dots for unary operators #20249

Merged

ZacCranko mentioned this pull request Jun 24, 2017

Automatic broadcasting syntax f.(A) is inconsistent #22501

Closed

c42f mentioned this pull request Jul 8, 2023

Rethinking the . AST JuliaLang/JuliaSyntax.jl#324

Closed

WIP: implement f.(args...) as a synonym for broadcast(f, args...) #15032

WIP: implement f.(args...) as a synonym for broadcast(f, args...) #15032

Conversation

stevengj commented Feb 11, 2016 • edited Loading

tkelman commented Feb 12, 2016

stevengj commented Feb 12, 2016

stevengj commented Feb 12, 2016

StefanKarpinski commented Feb 12, 2016

vtjnash commented Feb 12, 2016

stevengj commented Feb 12, 2016

stevengj Feb 12, 2016

Choose a reason for hiding this comment

vtjnash Feb 12, 2016

Choose a reason for hiding this comment

stevengj Feb 12, 2016

Choose a reason for hiding this comment

vtjnash Feb 12, 2016

Choose a reason for hiding this comment

toivoh commented Feb 12, 2016

stevengj commented Feb 13, 2016

nalimilan commented Feb 13, 2016

stevengj commented Feb 13, 2016

nalimilan commented Feb 13, 2016

eschnett commented Feb 13, 2016

timholy commented Feb 13, 2016

eschnett commented Feb 13, 2016

StefanKarpinski commented Feb 13, 2016

tbreloff commented Feb 13, 2016

stevengj commented Feb 13, 2016

stevengj commented Feb 13, 2016

StefanKarpinski commented Feb 13, 2016

stevengj commented Feb 14, 2016

ViralBShah commented Feb 14, 2016

ViralBShah commented Feb 14, 2016

vchuravy commented Feb 14, 2016

nalimilan commented Feb 14, 2016

stevengj commented May 7, 2016

jebej commented May 9, 2016

yuyichao commented May 9, 2016

papamarkou commented May 13, 2016 • edited Loading

tkelman commented May 13, 2016

JeffBezanson commented May 13, 2016

StefanKarpinski commented May 13, 2016

ChrisRackauckas commented May 20, 2016

JeffreySarnoff commented May 24, 2016

stevengj commented Feb 11, 2016 •

edited

Loading

papamarkou commented May 13, 2016 •

edited

Loading