-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: implement f.(args...) as a synonym for broadcast(f, args...) #15032
Conversation
b112c59
to
dce2cf6
Compare
Should it lower to |
@tkelman, I thought that the discussion was that it should lower to |
Okay, this is weird.
Not sure how I could have broken this. 😿
|
+1 for doing broadcast for vectorized operations. |
@stevengj i think you meant |
Oh, right, I was just confusing |
@@ -2239,10 +2240,10 @@ f9534d() = (a=(1,2,4,6,7); a[7]) | |||
f9534d(x) = (a=(1,2,4,6,7); a[x]) | |||
@test try; f9534d() catch ex; (ex::BoundsError).a === (1,2,4,6,7) && ex.i == 7; end | |||
@test try; f9534d(-1) catch ex; (ex::BoundsError).a === (1,2,4,6,7) && ex.i == -1; end | |||
f9534e(x) = (a=IOBuffer(); a.(x) = 3) | |||
f9534e(x) = (a=IOBuffer(); setfield!(a, x, 3)) | |||
@test try; f9534e(-2) catch ex; is((ex::BoundsError).a,Base.IOBuffer) && ex.i == -2; end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did this test work? @vtjnash, why isn't this isa
, not is
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the BoundsError is from attempting to access the properties of field -2
of the IOBuffer type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f9534e
accesses field -2
(a.(x) = 3
) of an instance of the IOBuffer
type (a=IOBuffer()
), not the IOBuffer
type per se, no?
Isn't ex.a
the object for which the bounds error is thrown? i.e., it should be an instance of the IOBuffer
type, and not the type itself, and hence you should be checking isa(ex.a, IOBuffer)
rather than is
.
When I change it to isa
, the tests pass for me, but I'm confused about why they ever passed with is
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, i see. this is a difference in codegen vs. the intrinsics. previously, this assignment was going through codegen, which emits the error based on the invalidity of trying to examine the type. however, in the jl_f_getfield
builtin function, the error gets thrown with the value.
+1 for doing broadcast for vectorized operations from me too. |
dce2cf6
to
02df6b6
Compare
02df6b6
to
2ac9c87
Compare
Changed to |
2ac9c87
to
50090e6
Compare
Thanks for taking the lead @stevengj! Though I'm worried by the fact that this implementation doesn't allow composing functions without copies. If we offer e.g. |
@nalimilan, please see the discussion in #8450. The
|
@stevengj I agree with all of these points. I was just wondering whether the implementation of this syntax would allow, now or in the future, to avoid unnecessary allocations. Clearly, creating copies and not fusing operations has been seen as "good enough" in other languages. But Julia generally offers more than its predecessors, even with simple syntax. So my point was about this:
I think we should consider whether this sounds like something the compiler might be able to do at some point without too many tricks, or whether an alternative implementation would make it more likely to work. I'm all for merging this PR, even if it still creates copies, if we know it can be improved at some point. |
Fortran compilers regularly perform such optimizations. With Fortran 90, a vectorized syntax was introduced (targeting e.g. the vector machines prominent at that time), and of course these days, this notation has to be undone to allow for loop fusion. We might -- in the future, and automatically -- annotate the temporary arrays created from this notation to tell the optimizer that they are freshly created and thus don't alias any existing arrays. This alone should be a big step towards optimizing such expressions. Of course, another way to go would be the equivalent of C++ expression templates. Julia's type system is certainly powerful enough for this. |
Presumably this is obvious to everyone, but can't performance-sensitive users fuse them manually with (x->sin(exp(x))).(A) ? And then to fuse automatically, it's "just" a matter of detecting |
The problem is that this works with Julia could introduce run-time assertions to catch these cases. |
I have to say that it feels a bit backwards to introduce a feature that is hard to optimize when it's just as notationally convenient to express the scalar kernel directly and then vectorize that. |
Just a thought... What if the notation effectively creates a future, so Again this is not well thought out, but I could try to put together a rough On Saturday, February 13, 2016, Stefan Karpinski [email protected]
|
First, I should emphasize that vectorized functions are mainly for convenience, not for attaining the absolute maximum performance in critical inner loops. There is a huge base of people who have been trained to make use of vectorized operations and to find this abstraction natural, and deleting the vectorized-math abstraction from Julia would be a mistake, in my opinion. (And it's not like we are getting rid of @eschnett, when it comes to hypothetical future loop-fusing optimizations (not to be implemented in this PR!), the case of mutating functions like the hypothetical The goal of this PR, however, is simply to have a generalized replacement for the |
(A corollary to this PR would be, in a future PR, to get rid of |
@stevengj, Fortran tends to have much more information about what's happening when it analyzes code due to its static and relatively un-polymorphic nature. I do appreciate that this PR is about removing |
@StefanKarpinski, if you have Furthermore, as far as I can tell there are four options:
Which option are you advocating? (My position: people like doing vectorized math, and are used to it. We should support it, whatever else we do for advanced users.) |
I love this. Not sure about the syntax but certainly want to try it out. I thought that parallel accelerator should be able to optimise this already. |
We not only should support vector math, but support it really well. This is a huge step in that direction. |
I was wondering how this compares with generators: For me personally the |
@stevengj Another implementation (which I explored before generators existed in #8450) would be to return a form of generator, i.e. a lazy Another similar solution has just been proposed by @tbreloff. It would be more convenient but may not be possible: make @vchuravy I think @stevengj is right that people coming from scientific languages expect to be able to write things like |
…, i); this PR should no longer be a breaking change
Just the usual timeout problem on OSX, it seems. |
Just by curiosity, would it be possible to know why the This also means |
#15032 (comment) AFAICT |
My two cents, knowing that I am subjective about this, but I guess it is ok to simply express a personal preference. I just find the idea To reason about this, first of all, I think that it is much cleaner in a functional style of coding to express the operation as Secondly, if we do really need to use additional syntax to complement |
Making |
I don't buy the idea that |
I'm surprised that I don't actually hate the |
I don't think the dots cause a problem at all. In fact, I think they "solve" the problem of unfused loops by making multiple vectorized operations together look ugly. For example, But I don't see the problem with having a special infix operator for vectorization which is unicode. Julia already has \div for a reason: it's easy to read and you know what it does. |
fwiw,
|
I was working a bit on macro expansion - particularly `quote` (quasiquote) expansion with `$` interpolations - and I've found that it's weird and inconvenient that we parse `a.b` into `(. a (quote b))`. Specifically, the part that's weird here is that we emit `(quote b)` for the field name even though this is "not quote syntax": this should not yield a syntax literal during lowering, and is thus a semantic mismatch with actual quote syntax of the form `:(a + b)` or `quote a+b end`. * Why is this a problem? It means we need special rules to distinguish actual syntax literals from field names. * But can we really change this? Surely this AST form had a purpose? Yes! A long time ago Julia supported `a.(b)` syntax to mean `getfield(a, b)`, which would naturally have been parsed as `(. a b)`. However this was deprecated as part of adding broadcast syntax in JuliaLang/julia#15032 Here we simplify by parsing `a.b` as `(. a b)` instead, with the second argument implied to be a field name.
I was working a bit on macro expansion - particularly `quote` (quasiquote) expansion with `$` interpolations - and I've found that it's weird and inconvenient that we parse `a.b` into `(. a (quote b))`. Specifically, the part that's weird here is that we emit `(quote b)` for the field name even though this is "not quote syntax": this should not yield a syntax literal during lowering, and is thus a semantic mismatch with actual quote syntax of the form `:(a + b)` or `quote a+b end`. * Why is this a problem? It means we need special rules to distinguish actual syntax literals from field names. * But can we really change this? Surely this AST form had a purpose? Yes! A long time ago Julia supported `a.(b)` syntax to mean `getfield(a, b)`, which would naturally have been parsed as `(. a b)`. However this was deprecated as part of adding broadcast syntax in JuliaLang/julia#15032 Here we simplify by parsing `a.b` as `(. a b)` instead, with the second argument implied to be a field name.
As discussed in #8450, this implements
f.(args...)
as a synonym formap(f, args...)
broadcast(f, args...)
.(I think we need some "vectorized-function" syntax like this even if we also have some
f(x) over x
syntax for
map
in the future. People in scientific computing are just too used to vectorized functions to give that up completely.)Also, this PR supports
f.:x
as a synonym forgetfield(f, :x)
, so that one no longer needs to dof.(:x)
. For example, it was common to useBase.(:+)
and similar, and this can now be writtenBase.:+
. For backwards compatibility, we can also support the former syntax via a deprecatedbroadcast(m::Module, s::Symbol) = getfield(m, s)
method. Update: there should now be deprecation warnings for all existing usages ofx.(i)
forgetfield
orsetfield!
, so this PR is not breaking.I wanted to post a prototype first to see what people think. To do:
broadcast(m::Module, s::Symbol) = getfield(m, s)
deprecation forBase.(:+)
etc.@vectorize
? Blocked by broadcast picking incorrect result type #4883. (Should be a separate PR anyway.)