Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scalar type for broadcasting #18379

Open
TotalVerb opened this issue Sep 6, 2016 · 21 comments
Open

Scalar type for broadcasting #18379

TotalVerb opened this issue Sep 6, 2016 · 21 comments
Labels
broadcast Applying a function over a collection

Comments

@TotalVerb
Copy link
Contributor

I found myself needing a Scalar type for broadcasting purposes, to trick broadcast into not trying to broadcast something, even though it is a matrix. An initial (bad) implementation is here.

With #16986 close to merging, I wonder if such a feature would be desirable in Base.

@stevengj
Copy link
Member

stevengj commented Sep 6, 2016

@TotalVerb, can you give an example of where you needed this?

@stevengj
Copy link
Member

stevengj commented Sep 6, 2016

Wouldn't storing something in a 0-dimensional array do the trick?

@stevengj stevengj added the broadcast Applying a function over a collection label Sep 6, 2016
@pabloferz
Copy link
Contributor

After #16986 you might also wrap the array with a tuple as pointed out here #17411 (comment).

@TotalVerb
Copy link
Contributor Author

TotalVerb commented Sep 6, 2016

That's a good point; this Scalar type is just a one-element tuple. The only difference is that I envision Scalar(1) .+ 1 to be 2, whereas (1,) .+ 1 to be (2,). That is to say, Scalar would be a zero-dimensional tuple, instead of a one-dimensional tuple with one element.

@TotalVerb
Copy link
Contributor Author

@stevengj The current implementation of Scalar is basically just a 0-dimensional array, but it's a little slower than necessary because of the extra allocation.

@TotalVerb
Copy link
Contributor Author

@stevengj An example of where I needed this is to support broadcasting for Basket, which are a collection of currencies. It is useful to support both broadcasting over the individual instruments within the collection, and to support broadcasting the collections as if they were instruments in their own right.

Another example is given in the README for Scalars.jl.

@stevengj
Copy link
Member

stevengj commented Sep 7, 2016

(An aside: I don't think you need to define show(io, "text/plain", x) if you just want it to call show(io, x), as the former calls the latter by default.)

@TotalVerb
Copy link
Contributor Author

@stevengj It's to override the method inherited from AbstractArray.

@stevengj
Copy link
Member

stevengj commented Sep 7, 2016

Oh, right.

@stevengj
Copy link
Member

stevengj commented Sep 8, 2016

A simpler solution, that wouldn't require us to define any new types, is for broadcast to treat Ref(x) as a 0-dimensional array of x. This seems like something we should do anyway, because that's effectively what Ref is, and it seems relatively easy to add once #16986 is merged.

@StefanKarpinski
Copy link
Member

Related: #18271 (comment), wherein I proposed using Ref(x) with the syntax &x as a means of not dropping a scalar dimension when slicing or reducing. Of course, that entails considering Ref to be a 1-dimensional container, whereas this entails considering it to be a 0-dimensional container.

@TotalVerb
Copy link
Contributor Author

I'm worried that the compiler may not be able to eliminate the allocation of a Ref.

@stevengj
Copy link
Member

stevengj commented Sep 8, 2016

@StefanKarpinski, since you access a ref with ref[], it doesn't make sense to me to treat it as anything other than a 0-dimensional array. (Of course, getindex can always implement whatever method it wants for a Ref argument if you want magic handling of &x.)

@TotalVerb, in the rare cases where the heap allocation of a Ref object would matter compared to the cost of a broadcast over arrays of arrays, presumably in some kind of tight inner loop, couldn't you simply allocate the Ref outside the loop and mutate it as needed?

@stevengj
Copy link
Member

stevengj commented Sep 8, 2016

Put another way, I tend to think that Ref should be treated as a 0-dimensional array in broadcast anyway, regardless of whether we want a Scalar type too. My inclination would be to add Ref support in broadcast first, and then see if the performance really is a practical annoyance before we add another type.

@TotalVerb
Copy link
Contributor Author

@stevengj I think that's a very sane approach.

@andyferris
Copy link
Member

FYI I'm working on a StaticArrays implementation over at JuliaArrays/StaticArrays.jl#50 (it made sense there since this is basically an immutable version of Array{T,0} and won't have that associated allocation penalty).

@TotalVerb
Copy link
Contributor Author

I'd really like this before 1.0; I'll try to make a PR soon. Ref and (x,) do not work for many use cases because they still take priority over Nullable. The Scalar above also implicitly arrayifies the argument.

julia> Nullable(4) .+ (5,)
ERROR: MethodError: no method matching +(::Nullable{Int64}, ::Int64)
Closest candidates are:
  +(::Any, ::Any, ::Any, ::Any...) at operators.jl:424
  +(::Complex{Bool}, ::Real) at complex.jl:239
  +(::Char, ::Integer) at char.jl:40
  ...
Stacktrace:
 [1] _ntuple at ./tuple.jl:135 [inlined]
 [2] ntuple(::Base.Broadcast.##3#4{Base.#+,Tuple{Nullable{Int64},Tuple{Int64}}}, ::Type{Val{1}}) at ./tuple.jl:128
 [3] broadcast(::Function, ::Nullable{Int64}, ::Tuple{Int64}) at ./broadcast.jl:434

Aside from being a problem with Nullable already, other broadcast types presumably also might want the ability to force arguments to not be treated as an array.

@pablosanjose
Copy link
Contributor

Also, while Ref wrapper solution from #18965 does work without Nullable, it seems to me that it is quite slower than it needs to be... Comparing two ways to do e.g. ((1,1),(2,2),(3,3)) .== &(3,3):

julia> f1(x, y) = x .== Ref(y)
f1 (generic function with 1 method)

julia> function f2(x, y)
           r = BitArray(length(x))
           for (i,e) in enumerate(x)
               r[i] = e==y
           end
           return r
       end
f2 (generic function with 1 method)

julia> @btime f1(((1,1),(2,2),(3,3)), (3,3))
  399.085 ns (4 allocations: 4.34 KiB)
3-element BitArray{1}:
 false
 false
  true

julia> @btime f2(((1,1),(2,2),(3,3)), (3,3))
  34.815 ns (2 allocations: 128 bytes)
3-element BitArray{1}:
 false
 false
  true

@pdeffebach
Copy link
Contributor

Confusion about Ref and Scalar has cropped up a bit on Discourse. See:

https://discourse.julialang.org/t/how-to-select-multiple-items-using-dataframesmeta/16231/2

https://discourse.julialang.org/t/dataframes-obtaining-the-subset-of-rows-by-a-set-of-values/15923/12

Just echoing that it would be nice if we had a Scalar function which worked like Ref with the benefit of having a more intuitive name. I understand this is a 1.x possibility.

@stevengj
Copy link
Member

Note that one option we've discussed, without having a new type, is to define &x as sugar for a ref (#27608), which would make it easier to escape from broadcasting.

@MasonProtter
Copy link
Contributor

MasonProtter commented Apr 25, 2020

Here's a minimal implementation of a Scalar type. I think there's an argument to be made that such a thing would be preferable to Ref since it's immutable and will be elided by the compiler more easily.

Is there still interest in putting such a thing Base? I could try opening a PR if people are interested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
broadcast Applying a function over a collection
Projects
None yet
8 participants