Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consistency in conversion to AbstractArray{T,N} in LinearAlgebra and Base #34995

Closed
daanhb opened this issue Mar 4, 2020 · 5 comments
Closed

Comments

@daanhb
Copy link
Contributor

daanhb commented Mar 4, 2020

It seems that conversion to the abstract type AbstractArray{T,N} is not implemented consistently in the array types of LinearAlgebra. Some types (Hermitian, Symmetric and Diagonal) explicitly support it, but others (Transpose, Adjoint and the triangular types) do so only implicitly. This leads to different behaviour, especially when using immutable arrays.

Perhaps this conversion is not widely used, but it is frequently useful at least to me :-) Writing convert(AbstractArray{T,N}, myarray) is a convenient way to generically update the element type of a matrix, while retaining its structure. It can also be used merely to ensure that an array has a certain element type T, and that is how it is used for example in the constructor of Diagonal{T} and in many other places in LinearAlgebra.

Base defines the conversion convert(AbstractArray{T,N}, myarray) and subsequently calls the abstract constructor AbstractArray{T,N}(myarray). The fallback of this constructor does:
copyto!(similar(myarray, T), myarray).
This fallback will always return a mutable array, even when myarray was immutable. This makes sense for a fallback, otherwise the data of myarray could not be copied back into the new array. But it means the result may not be optimal.

An example of this fallback at work is the following:

julia> using StaticArrays

julia> convert(AbstractVector{Int}, SVector(1,2))
2-element SArray{Tuple{2},Int64,1,2} with indices SOneTo(2):
 1
 2

julia> convert(AbstractVector{Float64}, SVector(1,2))
2-element MArray{Tuple{2},Float64,1,2} with indices SOneTo(2):
 1.0
 2.0

Conversion to AbstractVector{Int} is fine, but changing the element type from Int to Float64 suddenly makes the result a mutable array (MArray rather than SArray), which may be less efficient later on. This can of course be fixed in StaticArrays and it soon will be (see JuliaArrays/StaticArrays.jl#747).

However, in StaticArrays one can not fix the linear algebra types in stdlib. Fixing the example above, i.e., returning immutable SVector and SArray types upon conversion, one gets the following inconsistent behaviour when using Symmetric or UpperTriangular:

julia> s = Symmetric(SA[1 2; 2 3])
2×2 Symmetric{Int64,SArray{Tuple{2,2},Int64,2,4}}:
 1  2
 2  3

julia> convert(AbstractMatrix{Float64}, s)
2×2 Symmetric{Float64,SArray{Tuple{2,2},Float64,2,4}}:
 1.0  2.0
 2.0  3.0

julia> tri = UpperTriangular(SA[1 2; 0 3])
2×2 UpperTriangular{Int64,SArray{Tuple{2,2},Int64,2,4}}:
 1  2
   3

julia> convert(AbstractMatrix{Float64}, tri)
2×2 UpperTriangular{Float64,MArray{Tuple{2,2},Float64,2,4}}:
 1.0  2.0
     3.0

The point is that the symmetric type retains the immutable SArray after conversion, but the triangular type now carries a mutable MArray. The reason is that Symmetric implements the AbstractMatrix{T}(A::Symmetric) constructor and recursively converts its data, but UpperTriangular doesn't. The result is correct, but in this case arguably not optimal.

A simpler example strictly within Base is:

julia> convert(AbstractVector{Float64}, 1:3)
3-element Array{Float64,1}:
 1.0
 2.0
 3.0

This is the same fallback at work. One could argue that in some cases a better answer would be 1.0:1.0:3.0 (which is immutable).

These issues are easily fixed and I'd be happy to do so. In fact I already tested what would happen, see JuliaArrays/StaticArrays.jl#747 (comment). However, that depends on whether the "fix" is actually desirable.

Is convert(AbstractVector{Float64}, myvector) a good way to generically change an element type?
Do developers currently assume that the output after this conversion is mutable (I found exactly one example of that assumption in SparseArrays here)?
Could this affect the promotion of arrays somewhere?

There have been several issues about the construction of array, but I found none specifically on the conversion to an abstract type.

@daanhb
Copy link
Contributor Author

daanhb commented Mar 4, 2020

Following up, I have done some more tests to illustrate what I'm thinking. Changes could be limited to something like the following:

using LinearAlgebra

AbstractMatrix{T}(A::Adjoint) where {T} = Adjoint(AbstractVector{T}(A.data))
AbstractMatrix{T}(A::Transpose) where {T} = Transpose(AbstractVector{T}(A.data))

for t in (:LowerTriangular, :UnitLowerTriangular, :UpperTriangular, :UnitUpperTriangular)
           @eval AbstractArray{T,2}(A::$t) where {T} = $t(AbstractMatrix{T}(A.data))
end

AbstractArray{T,1}(r::AbstractRange) where {T<:Real} = convert(T, first(r)):convert(T, step(r)):convert(T, last(r))
AbstractArray{T,1}(r::AbstractUnitRange) where {T<:Real} = convert(T, first(r)):convert(T, last(r))

The point is that AbstractVector/Matrix on these types invokes AbstractVector/Matrix on the underlying storage. The Symmetric, Hermitian and Diagonal types already do something equivalent.

I've included the conversion of ranges above to show an example:

julia> d = Diagonal(1:3)
3×3 Diagonal{Int64,UnitRange{Int64}}:
 1    
   2  
     3

julia> convert(AbstractMatrix{Float64}, d)
3×3 Diagonal{Float64,StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}}:
 1.0       
     2.0   
         3.0

julia> convert(AbstractMatrix{BigFloat}, d)
3×3 Diagonal{BigFloat,StepRangeLen{BigFloat,BigFloat,BigFloat}}:
 1.0       
     2.0   
         3.0

The structure of the array is preserved. In contrast, the current behaviour is:

julia> d = Diagonal(1:3)
3×3 Diagonal{Int64,UnitRange{Int64}}:
 1    
   2  
     3

julia> convert(AbstractMatrix{Float64}, d)
3×3 Diagonal{Float64,Array{Float64,1}}:
 1.0       
     2.0   
         3.0

julia> convert(AbstractMatrix{Int}, d)
3×3 Diagonal{Int64,UnitRange{Int64}}:
 1    
   2  
     3

The structure of the range object is lost in the conversion of the element type to Float64: the data is now an Array. This could be a feature - the outcome is now writable - but as a feature it is not consistent because the last conversion to AbstractMatrix{Int} actually preserves the range object, because it already had the Int element type.

@daanhb
Copy link
Contributor Author

daanhb commented Mar 5, 2020

Hmm, chasing a few resulting errors in Julia's test suite, I'm beginning to see the issues here. Knowing what to look for now, I also came across #22218.

There is an interplay between similar, copy_oftype and convert(AbstractArray{T,N}, A) that is hard to change without being more pervasive, especially for the Transpose and Adjoint types. Most notably, copy_oftype(A, T) is implemented in terms of conversion to AbstractArray, and its output is assumed to be writable in several places. Purely from the syntax convert(AbstractArray{T,N}, A) alone I don't see why this should be the case, but I guess it would be equally problematic to argue for any other specific interpretation.

@daanhb
Copy link
Contributor Author

daanhb commented Mar 11, 2020

By reviewing code related to this issue I actually found a bug in which a vector is unintentionally overwritten, for which I'll file a separate issue. The bug is arguably due to the ambiguity surrounding the copy and mutability semantics of convert(AbstractArray{T,N}, A) versus copy_oftype(A, T).

The first problem that led to this issue is that currently at least two packages in JuliaArrays (StaticArrays and LazyArrays) implement conversion to AbstractArray by returning an immutable array. That is the most efficient thing to do. However, at some places in Base this conversion is assumed to return a mutable copy and this could lead to more issues in the future. (I feel some responsibility here due to JuliaArrays/StaticArrays.jl#747.)

Meanwhile, I've learned more about the code in LinearAlgebra, and contrary to what I thought before a fairly modest solution exists which is not that intrusive (as far as I can tell). I'm preparing a pull request to illustrate it. Mainly, the idea is to consistently use copy_oftype whenever that is the intended behaviour: to copy the array and possibly change its element type. This is most often used to subsequently invoke an in-place algorithm. In addition, and this may surely be more controversial, I would suggest to change its definition from:

copy_oftype(A::AbstractArray{T}, ::Type{T}) where {T} = copy(A)
copy_oftype(A::AbstractArray{T,N}, ::Type{S}) where {T,N,S} = convert(AbstractArray{S,N}, A)

to

copy_oftype(A::AbstractArray{T}, ::Type{T}) where {T} = copy(A)
copy_oftype(A::AbstractArray{T,N}, ::Type{S}) where {T,N,S} = copyto!(similar(A, S), A)

This does look mightily scary at first, but it simply makes the new fallback of copy_oftype identical to the existing fallback of the AbstractArray conversion. At least the way things are currently. That reassuringly means that nothing actually changes for anybody, except for those specialized arrays that have implemented conversion to AbstractArray themselves differently. (Right? It looks that way to me.)

The semantics would be more orthogonal than they are currently:

  • convert(AbstractArray{T}, A) does nothing if eltype(A)==T and returns the most efficient way to represent A with eltype T otherwise, whatever that might be
  • copy_oftype(A, T) returns a mutable copy of A with eltype T

That is mostly how these two functions are already used, it is just not how they are currently implemented. With the change above, immutable arrays can specialize copy_oftype and return something mutable if they want to interoperate better with LinearAlgebra.

In order to make this work, I had to define copyto! for Hessenberg and SymTridiagonal matrices. This functionality was missing, but may be useful regardless of this issue. So, to be continued, hopefully soon.

For completeness, an alternative could be to start using T.(A) more often. That has been proposed elsewhere (#22218), but I'm not sure it can capture the two use-cases above on its own. I did not think this through though.

@andreasnoack
Copy link
Member

@daanhb Is this obsolete or did you close this because of no feedback?

@daanhb
Copy link
Contributor Author

daanhb commented Jun 23, 2021

Sorry for the confusion @andreasnoack - I closed because #40831 was merged. I thought the PR was linked to this issue, but now I see that it wasn't. There was more than enough feedback, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants