make indexing expressions participate in dot syntax fusion #22858

StefanKarpinski · 2017-07-18T19:51:25Z

Consider an expression like v[p.(v)] .*= 2 where p is a predicate (i.e. returns a boolean). Currently Julia's semantics are that p.(v) is constructed and then used to index into v. However, the optimal implementation of this would be to loop through v, testing each element x in v to see if p(x) is true and then multiplying it by 2 if that's the case. While it might be possible for a compiler to optimize the current semantics to this behavior in some circumstances, arguably, the dotted application of p in the indexing expression should participate in dot fusion of the whole expression, and this expression should simply mean the optimal implementation. Marking as 1.0 since if we're going to change the semantics of this expression, we should do so before then.
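To make the contrast concrete, here is a minimal sketch of the two behaviors described above. The predicate `iseven` and the helper name `scale_where!` are assumptions chosen for illustration; `scale_where!` is not a Base function, just a hand-written version of the single-pass loop.

```julia
v = [1, 2, 3, 4, 5, 6]
p = iseven                      # stand-in predicate (assumption)

# Current semantics: p.(v) allocates a Bool mask, which is then used
# as a logical index into the array.
v_current = copy(v)
v_current[p.(v_current)] .*= 2

# Hand-fused version: a single pass over v with no temporary mask.
# `scale_where!` is a hypothetical helper, not part of Base.
function scale_where!(p, v, c)
    for i in eachindex(v)
        if p(v[i])
            v[i] *= c
        end
    end
    return v
end

v_fused = scale_where!(p, copy(v), 2)
```

Both versions produce the same array; the difference is only in the intermediate mask that the first one allocates.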


KristofferC · 2017-07-22T17:14:16Z

Ref #19169

stevengj · 2017-07-27T18:05:24Z

I would tend to prefer v.[p.(v)] for something like this. If we lowered a.[i] to getindex.(a, i), then v.[p.(v)] would already work on the right-hand-side of assignments since getindex.(v, p.(v)) is fusing.

JeffBezanson · 2017-08-10T17:27:54Z

OK, I guess this is just a parser issue then. I can allow parsing x.[...].

JeffBezanson · 2017-08-10T20:57:08Z

Update: already parses, so just needs a case in the broadcast lowering code.

mbauman · 2017-08-10T21:12:54Z

It'll take quite a bit more than that, particularly in the logical indexing case like Stefan proposed in his example. The naive broadcast will just repeatedly pass a true or false index to the array. See my comment at #19169.

JeffBezanson · 2017-08-10T21:14:48Z

Yeah, I just realized that and was about to post again. a[i] is just not generally equivalent to getindex.(a, i).
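A small sketch of the mismatch for logical indices, assuming a current Julia 1.x release (where a scalar Bool index is rejected with an ArgumentError). `Ref(v)` is used here only to keep v itself from being broadcast over; the variable names are illustrative.

```julia
v = [10, 20, 30]
mask = [true, false, true]

# Nonscalar logical indexing selects the positions where mask is true.
by_mask = v[mask]

# Broadcasting getindex hands each Bool to getindex one at a time,
# which is not a valid scalar index on Julia 1.x.
scalar_bool_fails = try
    getindex.(Ref(v), mask)
    false
catch err
    err isa ArgumentError
end

# With integer indices the two forms do agree.
idx = findall(mask)
by_broadcast = getindex.(Ref(v), idx)
```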

StefanKarpinski · 2017-08-11T15:10:05Z

Writing this as v.[p.(v)] .*= 2 would be fine as well as long as there is a way to write it.

JeffBezanson · 2017-08-14T20:08:35Z

a.[i] would mean [ getindex(a[1], i[1]), getindex(a[2], i[2]), ... ] or [ getindex(a[1], i), getindex(a[2], i), ... ] if i is a scalar. So we'd have to change something else to make that work, but I'm not sure what.
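Since a.[i] itself does not lower to anything, the elementwise meaning described here can only be illustrated by writing the broadcast out by hand. A sketch with made-up data:

```julia
a = [[1, 2], [3, 4], [5, 6]]
i = [2, 1, 2]

# Elementwise: [getindex(a[1], i[1]), getindex(a[2], i[2]), ...]
elementwise = getindex.(a, i)

# Scalar i: [getindex(a[1], 1), getindex(a[2], 1), ...]
scalar_index = getindex.(a, 1)

# Ordinary nonscalar indexing picks whole elements of a instead.
nonscalar = a[[2, 1]]
```

The last line shows why the two readings conflict: a[[2, 1]] selects elements of a, while the broadcast form indexes into each element of a.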

StefanKarpinski · 2017-08-31T17:15:34Z

Since the v.[p.(v)] syntax doesn't mean anything and gives an invalid syntax error, this is not 1.0-blocking:

julia> v.[p.(v)]
ERROR: syntax: invalid syntax v.[p.((v,))]

mbauman · 2017-10-09T18:41:54Z

Now that I've recently thought through this, I figured I'd leave a note for posterity (and future me) on the difficulties here:

An expression like A.[B .& C, D] .* E would naively become something like broadcast((b,c,d,e)->A[b & c, d] * e, B, C, D, E). This obviously cannot work, and it would be extremely difficult to come up with a transformation that could work:

As I note above, it would repeatedly pass true or false as indexes. We need some way of transforming them into the indices where they are true. But this is hardly the biggest challenge.
Broadcast needs to know that B and C are involved in a subexpression that becomes a logical index, and then skip values for which that subexpression produces falses. That is, we cannot walk through all four arrays synchronously; D and E must only yield values when the logical subexpression is true.
Indexing with a two-dimensional logical mask yields a vector of results. Broadcast would need to treat all variables participating in the logical subexpression as a single vector.
Broadcast doesn't know the result size without running through and counting the trues in the logical subexpression ahead of time.
Broadcast would have a hard time doing the same kinds of "outer" bounds checking that non-scalar indexing currently performs.
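Two of the points above (a two-dimensional mask yields a vector, and the result size depends on counting the trues) can be seen with ordinary logical indexing. The array and mask here are made-up examples:

```julia
A = collect(reshape(1:6, 2, 3))   # 2x3 matrix, filled column-major
M = A .> 2                        # 2x3 Bool mask

# Indexing with a two-dimensional mask flattens to a vector.
selected = A[M]

# The result length is only known after counting the trues in the mask.
n_selected = count(M)
```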

Logical indexing is really the worst case here. But there are other problems. : would have to be handled specially; other fancy index types like .. intervals that some packages have defined would have similar troubles.

Were we to do this, we'd only be able to support arrays of integer indices or we'd have to disable fusion from propagating inside the .[] syntax entirely (we'd broadcast over the indices after they've been transformed into collections of integers by to_indices). If we chose the latter semantics and also transformed the indices to be orthogonal to each other, then this is effectively the status quo with @views @.… except maybe a bit worse since right now broadcast can trust that accesses into views are inbounds! The benefit, on the other hand, would be that this is semantically the same as APL indexing… and we could deprecate nonscalar getindex and setindex! entirely.
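For reference, this is roughly what to_indices does with a logical mask and a colon on a current Julia 1.x release. The call is written as `Base.to_indices(A, (mask, :))` with the indices passed as a tuple; the example data is an assumption.

```julia
A = collect(reshape(1:12, 3, 4))
mask = [true, false, true]

# to_indices converts "fancy" indices into collections of integer indices.
I = Base.to_indices(A, (mask, :))

rows = collect(I[1])   # positions where mask is true
cols = collect(I[2])   # the colon expanded to the full column range
```

After this step every index is a collection of integers, which is the point at which a broadcast over the indices would become possible.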

StefanKarpinski · 2017-10-10T02:33:28Z

Nice writeup. Seems like the best course of action is to leave this as an error until we can handle it.

mbauman · 2017-10-20T17:15:04Z

Latest thoughts:

In order to get around the difficulties I enumerated in my last post, I had been thinking I'd like to define A.[B .& C, D] .* E as:

broadcast((x,y,z)->A[x,y] * z, orthogonalize(to_indices(A, B .& C, D)...)..., E)

where orthogonalize adds leading singleton dimensions to each successive argument such that the number of leading singletons is equal to the sum of the dimensionalities of all the preceding arguments. We effectively sacrifice our ability to fuse individual index computations with the rest of the expression in order to gain APL indexing semantics and fancy index types. So B .& C allocates an array, but the accesses into A need neither a view nor an allocation — they're done on-demand.
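A rough sketch of what such an orthogonalize could look like, following the description above. This is a hypothetical helper, not something in Base, and it assumes the indices have already been turned into integer arrays (here by collecting the output of `Base.to_indices`, which takes the indices as a tuple). The data is made up.

```julia
# Hypothetical helper: reshape each index array so that it has as many
# leading singleton dimensions as the summed dimensionality of the
# index arrays before it.
function orthogonalize(inds...)
    out = Any[]
    lead = 0                      # leading singleton dimensions so far
    for I in inds
        push!(out, reshape(I, ntuple(_ -> 1, lead)..., size(I)...))
        lead += ndims(I)
    end
    return Tuple(out)
end

A = collect(reshape(1:12, 3, 4))
B = [true, true, true]
C = [true, false, true]
D = [2, 4]
E = [1 10; 100 1000]

# B .& C allocates the mask; to_indices plus collect turns it into integers.
inds = map(collect, Base.to_indices(A, (B .& C, D)))

# Scalar accesses into A happen on demand inside the broadcast kernel.
fused = broadcast((x, y, z) -> A[x, y] * z, orthogonalize(inds...)..., E)

# Status quo for comparison: nonscalar getindex, then broadcast.
reference = A[B .& C, D] .* E
```

Note that the reshape calls are exactly the wrappers the next comment is concerned about.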

Here's the unfortunate part: orthogonalize requires reshape to do its work. And reshape is a wrapper much like view (or for Array, it's a new header linked to the same data). As long as a view requires allocations, so would this syntax transform. Then the million dollar question is: why would we do all this extra work to define a new syntax that's effectively identical to @views @. A[B & C, D] * E in both functionality and performance!? Let's not do that.
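For comparison, the existing spelling mentioned above already works today. A small sketch with made-up arrays, showing that the copying form, the `@views @.` form, and an explicit view all give the same result:

```julia
A = collect(reshape(1:12, 3, 4))
B = [true, true, true]
C = [true, false, true]
D = [2, 4]
E = [1 10; 100 1000]

# Nonscalar getindex copies the selection before the multiply.
with_copy = A[B .& C, D] .* E

# @views turns the indexing into a view and @. dots the operators,
# so the multiply reads through the view without copying the selection.
with_view = @views @. A[B & C, D] * E

# Roughly what the macro form amounts to, written out by hand.
explicit = view(A, B .& C, D) .* E
```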

pabloferz · 2017-10-25T14:00:46Z

Sorry, closed it by github-mobile-interface accident.

StefanKarpinski added the broadcast Applying a function over a collection label Jul 18, 2017

StefanKarpinski added this to the 1.0 milestone Jul 18, 2017

JeffBezanson added the parser Language parsing and surface syntax label Aug 10, 2017

JeffBezanson self-assigned this Aug 10, 2017

JeffBezanson added compiler:lowering Syntax lowering (compiler front end, 2nd stage) and removed parser Language parsing and surface syntax labels Aug 10, 2017

JeffBezanson removed their assignment Aug 10, 2017

StefanKarpinski modified the milestones: 1.x, 1.0 Aug 31, 2017

mbauman mentioned this issue Oct 9, 2017

Julep: Generalize indexing with and by Associatives #24019

Open

ChrisRackauckas mentioned this issue Oct 9, 2017

AllIndices type for identity indexing #24069

Closed

pabloferz closed this as completed Oct 25, 2017

pabloferz reopened this Oct 25, 2017

mbauman mentioned this issue Mar 6, 2018

RFC: Deprecate multi-value non-scalar indexed assignment #24368

Closed

mbauman mentioned this issue Dec 19, 2018

Inconsistent dot vectorization? #30450

Closed

mbauman mentioned this issue Jan 26, 2019

What to do about nonscalar indexing? #30845

Open

DilumAluthge removed this from the 1.x milestone Mar 13, 2022
