Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic conversion of AbstractArrays? #31

Closed
dbeach24 opened this issue Sep 18, 2016 · 15 comments
Closed

Generic conversion of AbstractArrays? #31

dbeach24 opened this issue Sep 18, 2016 · 15 comments

Comments

@dbeach24
Copy link

Would it make sense to define:

convert(::Type{SVector}, x::AbstractVector) = convert(SVector{length(x), eltype(x)}, x)

and perhaps similar for SMatrix or other arrays?

@andyferris
Copy link
Member

Would it make sense

Well, that depends. At the moment, I'm doing my best to make sure the package users always get fast, type-safe code. When x is a standard Vector, the statement would result in slow code and type instability that can propagate a long way. The current system means you have to try hard to get slow code, like convert(SVector{length(x)}, x).

Arguments for this definition include (a) convenience at the REPL and (b) interop with other statically-sized array packages. (a) might be resolved by a special function like static(x::AbstractVector) = SVector{length(x)}(x), similar to how we have vec() and so-on in Base. (b) could be solved by having a trait defined for fixed-size-ness that multiple packages can hook into.

@timholy
Copy link
Member

timholy commented Sep 19, 2016

I don't think we want to define this, for the reasons of type-stability you describe. In my recent code I'm becoming a fan of dropping helpful hints, like this:

convert(::Type{SVector}, x::AbstractVector) = error("convert(SVector, x) is not defined; you must specify the length, e.g., convert(SVector{$(length(x))}, x)")

There's some risk that users could find that annoying, because they'll wonder why that isn't done automatically. An alternative would be a link to the docs?

@c42f
Copy link
Member

c42f commented Sep 19, 2016

Hmm, it's context-dependent on whether this is ok for performance or not. If you immediately pass it off to a function it's ok, but in other cases it's probably a bad idea. So on balance, probably just a bad idea to define this generically. Tim's option seems pretty good to me.

@dbeach24
Copy link
Author

dbeach24 commented Sep 19, 2016

Thanks, all, for the responses. The use case I had in mind was dispatching to a function, so it would not have been horribly inefficient in my case, but I can see how other uses could cause gross inefficiencies.

Suppose I have the following function:

function sphere_to_cart(sphere::SVector{3})
    r, θ, ϕ = sphere
    sinθ = sin(θ)
    x = r * sinθ * cos(ϕ)
    y = r * sinθ * sin(ϕ)
    z = r * cos(θ)
    SVector(x, y, z)
end

I would like to make it easier to call this function generically, but I also have a need to dispatch on the size of incoming SVector (i.e. there is also a sphere_to_cart(::SVector{6}).

What is the recommended approach for writing this function so that it can be called with any AbstractVector. (I also have a suite of similar functions, and would like them all to be similarly called with any AbstractVector of the appropriate length.)

Would defining cart_to_sphere(x::AbstractVector) = cart_to_sphere(static(x)) (and similar for each transformation) be the recommended approach?

@timholy
Copy link
Member

timholy commented Sep 19, 2016

You can definitely do this, but I bet you'll be pretty unhappy with the performance---transitioning between the world of inferrable length and noninferrable length is certain to cause performance problems.

Personally, I'd write a separate cart_to_sphere implementation for generic AbstractVector that returns a Vector rather than some SVector. In other words, choose which world you want to live in, and stick with it. If you want to support both worlds, that means you need two code paths.

@dbeach24
Copy link
Author

dbeach24 commented Sep 19, 2016

@timholy Thanks. I was really hoping to write the code only once. I'm okay with it returning SVector in either case (although it would be cool if there were a good way to write this generically, too). Are you saying that the conversion from Vector to SVector and subsequent dispatch (based on SVector length) will be the bottleneck? I certainly don't expect this to perform as well as SVector code, but you seem to be suggesting that this will perform worse than classic Vector code. Is this just due to the dynamic allocation for the Vector, or is there something more?

Edit: I guess I'm pushing on this question because writing two versions feels like a losing proposition in terms of code maintenance (and also antithetical to promise of generic programming).

@dbeach24
Copy link
Author

Answering my own question. I see that the dynamic dispatch is the problem.

@timholy
Copy link
Member

timholy commented Sep 19, 2016

Yes, a big chunk is the time required for Julia to look up which method to call. You can help, considerably, with "manual-dynamic" dispatch (how's that for a confusing term?):

if length(v) == 3
    return cart_to_sphere(convert(SVector{3}, v))  # julia knows the specialization in advance
elseif length(v) == 6
    return cart_to_sphere(convert(SVector{6}, v))
...

However, the return value will also not be inferrable, so you basically have to do this in every function that uses the output of this function.

@andyferris
Copy link
Member

andyferris commented Sep 20, 2016

Yes, I would definitely recommend writing different functions with potentially different return types for AbstractVector and StaticVector (depending on the situation).

With the StaticVector case you can branch based on the size of the type of the vector, which will be a compile-time constant and activate branch elimination in the compiler, so you can in-principle write a single cart_to_sphere method for all sizes of static vectors (which works beyond SVector). I do this kind of thing:

function sphere_to_cart(sphere::StaticVector)
    # Eliminated at compile-time (** in theory **)
    size(typeof(sphere)) === (3,) || error("Input must be a 3-vector")
    @inbounds (r, θ, ϕ) = (sphere[1], sphere[2], sphere[3]) # much faster
    sinθ = sin(θ)
    x = r * sinθ * cos(ϕ)
    y = r * sinθ * sin(ϕ)
    z = r * cos(θ)
    SVector(x, y, z)
end

function sphere_to_cart(sphere::AbstractVector)
    size(sphere) === (3,) || error("Input must be a 3-vector")
    @inbounds (r, θ, ϕ) = (sphere[1], sphere[2], sphere[3]) # much faster
    sinθ = sin(θ)
    x = r * sinθ * cos(ϕ)
    y = r * sinθ * sin(ϕ)
    z = r * cos(θ)
    # In this case the return type is always length 3 
    # (but in other cases you will be better off returning `Vector` if length is unknown)
    SVector(x, y, z)
end

Of course you could factor it better with the second function calling the first. I just realized this isn't a great example, but in the general case the second function might need to return a Vector of unknown length (at compile time).

PS - if you are interested in this particular transformation, please check out CoordinateTransformations.jl which is already integrated with StaticArrays.

EDIT - fixed size checking to be faster - this is part of the "in theory" part above, where constant propagation and branch prediction are still a little touch-and-go in Julia

@c42f
Copy link
Member

c42f commented Sep 20, 2016

@andyferris why do we need two variants here? Why doesn't the second variant just work for SVector input as well?

@c42f
Copy link
Member

c42f commented Sep 20, 2016

Ok, Andy has just told me why you might want two versions - it's nothing to do with the semantics of the language, and everything to do with current compiler optimizer limitations. The generic version works perfectly well, it's just slow. In particular the lack of interaction between constant propagation and inlining will currently lead to the AbstractVector version not removing the error branch.

So depending on whether you care about that, you might get away with a single version here.

@andyferris
Copy link
Member

Right, it's just Julia implementation details at the moment.

Also, like I said in my edited post above, this example is actually rather poor for explaining why you might ever want two versions. (There's just very little difference and that one minor change makes very little difference to codegen in this case also - but other cases where the input or output size is unknown beforehand the distinction may be quite important.)

@c42f
Copy link
Member

c42f commented Sep 20, 2016

There's definitely a disconnect between mutable vs immutable AbstractArray, in particular the assumption that similar() makes sense for all AbstractArray types is a fundamental issue. If JuliaLang/julia#17115 were sorted out with magic compiler foo, we might be able to unify the interfaces.

We should start a separate issue about this, I think it'd be nice to have a place where we document the differences between mutable and immutable AbstractArrays with an eye to removing them if possible.

@dbeach24
Copy link
Author

@andyferris Thanks for all your help. I've decided that I'm better off supporting StaticArrays as the native data type for my transformations. sphere_to_cart is just one very simple example of what I'm trying to do, but thanks for pointing out CoordinateTransformations.jl.

By the way, I didn't notice any speed difference when adding @inbounds to code that indexes SVector. Does the compiler typically optimize away the bounds-check when indexing with a constant?

@dbeach24 dbeach24 reopened this Sep 20, 2016
@andyferris
Copy link
Member

Ahh, well that's embarrassing. In some types/functions I have actually deleted the bounds checking. It's quite annoying - sometimes the compiler will check the bounds of a tupple for you (sometimes not) and sometimes codegen wasn't removing the bounds checks properly (in complex code, but it usually worked fine).

In this case I suspect the compiler knows you are indexing a tupple with literals that are inbounds.

oschulz pushed a commit to oschulz/StaticArrays.jl that referenced this issue Apr 4, 2023
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants