Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make AbstractArray{T}(...) constructor in LinearAlgebra more consistent #40831

Merged
merged 4 commits into from
Jun 19, 2021
Merged

make AbstractArray{T}(...) constructor in LinearAlgebra more consistent #40831

merged 4 commits into from
Jun 19, 2021

Conversation

daanhb
Copy link
Contributor

@daanhb daanhb commented May 15, 2021

This is an update to the old pull request #35165 (with some delay - apologies). The implementation has not changed compared to before, but the underlying issue hasn't changed either, hence the update.

Summary: The main change is to make sure that AbstractArray{T}(A::SomeStructuredMatrix) retains the structure of the structured matrix A for the types defined in LinearAlgebra. If possible, it will only change the eltype to T. This is the case currently for some arrays but not all. In cases where it isn't, arrays that are defined in terms of immutable arrays end up being mutable. That's not the end of the world, but it is avoidable. The constructor may be called by writing convert(AbstractArray{T}, A::SomeStructuredMatrix).

One example:

julia> using LinearAlgebra, StaticArrays

julia> v = SVector(1,2)'
1×2 adjoint(::SVector{2, Int64}) with eltype Int64 with indices SOneTo(1)×SOneTo(2):
 1  2

julia> convert(AbstractMatrix{Float64}, v)
1×2 adjoint(::MVector{2, Float64}) with eltype Float64 with indices SOneTo(1)×SOneTo(2):
 1.0  2.0

Here, the wrapped SVector has turned into an MVector (which is mutable).

Implementation: a relevant function used frequently in LinearAlgebra is copy_oftype, which was defined in terms of convert(AbstractArray{T}, A). It was assumed that this conversion returns a mutable array. I've replaced it by copyto!(similar(A, T), A) in this pull request, which is more explicitly mutable. (And it is close to what it did anyway due to the fallback in array.jl.)

I think there are three kinds of matrices similar to A that are often used in LinearAlgebra:

  • a copy of A with the same structure as A, but mutable and with eltype T: this is copy_oftype(A, T)
  • a copy of A with the same dimensions as A, but mutable and with eltype T: there was no function for this, I've added one and called it copy_similar_oftype(T, A) (because it uses similar(A, T, size(A)) to make the copy). This is often used to pass to rdiv! or ldiv!, where the result in general does not have the same structure as A.
  • a matrix like A but with eltype T, not necessarily mutable and no need to copy: this is convert(AbstractArray{T}, A)

Status: I have ported the changes from #35165 to the current master and the tests pass locally. However, last year I went through all the code in LinearAlgebra looking at all occurrences of convert(AbstractArray....) and copy_oftype syntax and I did not (yet) repeat the exercise. Also, the fallback for AbstractArray{T}(...) in Base performs an axes check before invoking copyto!, so the copyto!s of all matrices in LinearAlgebra should check axes too. Finally, copy_similar_oftype is not a good name, and it really comes down to copyto!(similar(A, T, size(A)), A) which is short and informative too, so this may not even have to be a function.

@daanhb daanhb changed the title Dh/linearalgebra make AbstractArray{T}(...) constructor in LinearAlgebra more consistent May 15, 2021
@daanhb
Copy link
Contributor Author

daanhb commented May 15, 2021

As suggested before in #35165, I've added an ImmutableArray type to the testhelpers. The tests really only need a single case of an immutable array that can be converted to a different eltype. Perhaps the new InfiniteArray in testhelpers could also be adapted for that purpose.

@dkarrasch dkarrasch added domain:arrays [a, r, r, a, y, s] domain:linear algebra Linear algebra labels May 16, 2021
@daanhb
Copy link
Contributor Author

daanhb commented May 17, 2021

Just to mention: this PR seems related to #40678 and #40746, in which the syntax map(T, myarray) retains the structure of the array while changing the eltype. The implementation has no overlap, I think, but the goals are similar.

Copy link
Member

@dkarrasch dkarrasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The distinction between "preserve structure" and "don't preserve structure", is that original or was that an implicit copy_oftype-contract? Is this by any means "breaking"?

copy_oftype(A::AbstractArray{T}, ::Type{T}) where {T} = copy(A)
copy_oftype(A::AbstractArray{T,N}, ::Type{S}) where {T,N,S} = convert(AbstractArray{S,N}, A)
copy_oftype(A::AbstractArray{T,N}, ::Type{S}) where {T,N,S} = copyto!(similar(A,S), A)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't seem to need the type parameters T and N:

Suggested change
copy_oftype(A::AbstractArray{T,N}, ::Type{S}) where {T,N,S} = copyto!(similar(A,S), A)
copy_oftype(A::AbstractArray, ::Type{S}) where {S} = copyto!(similar(A, S), A)

# copy_similar_oftype: like copy_oftype(A, T), but discarding structure of `A`,
# comparable to how the three-argument function similar(A, S, size(A)) behaves.
# Frequently useful as the second argument to ldiv! or the first argument to rdiv!
copy_similar_oftype(A::AbstractArray{T,N}, ::Type{S}) where {T,N,S} =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here:

Suggested change
copy_similar_oftype(A::AbstractArray{T,N}, ::Type{S}) where {T,N,S} =
copy_similar_oftype(A::AbstractArray, ::Type{S}) where {S} =

@daanhb
Copy link
Contributor Author

daanhb commented May 18, 2021

Thanks for looking at this! Where does "preserving structure" come from, that's a good question. I'll try to be concise but I'm not good at that... There are three things combined here. It may serve as documentation later so I'll write it out.

  1. "Preserving structure" is not original, but it wasn't explicit before. The contract for similar was discussed a few times, but never resolved conclusively (e.g. similar and traits #18161, A new "outer_similar" #22218). I don't think we can touch that without side-effects. However, from these issues, it is clear that copy_oftype in LinearAlgebra (it is not used elsewhere) specifically means "give me an array like this one but with a different eltype". This in itself does not promise to preserve structure, but according to the contract it can, and it is usually the most efficient thing to do. This is known, and that is why in some cases copy_oftype is nót used: for example when passing an argument to ldiv! or rdiv!, knowing that the result may not have the same structure as A or that the structure of A is not writable in the way ldiv! expects (respecting symmetry at all times, say). One has to prepare for that, and that makes copy_oftype not safe to use.

The name copy_oftype hints at this contract and hides the detail of how it is implemented. It was implemented in terms of convert(AbstractArray{T}) before. The sequence went:

copy_oftype(A, T) -> convert(AbstractArray{T}, A) -> AbstractArray{T}(A) -> copyto!(similar(A, T), A).

I've removed the middlemen, precisely in order to preserve what copy_oftype already did. That is because the pull request changes what AbstractArray{T}(A) does (it now preserves mutability). It does not change what copy_oftype does in the end. That part isn't breaking, and even if it is, it is only internal.

  1. Is it breaking elsewhere? After all something did change. It depends on whether something else, in some package elsewhere, relies on AbstractArray{T}(A) being mutable even if A isn't. I'd classify that as a bug - mutability is a promise of similar but not the abstract constructor - but strictly speaking, such packages might now break. I can't rule it out. In fact, the code in LinearAlgebra itself depended on it! This pull request was motivated by the opposite: I considered AbstractArray{T}(A) returning something mutable for a non-mutable array to be, perhaps not a bug, but not optimal.

A curious exception was SymTridiagonal which in this line did define what AbstractArray{T}(A) means recursively in terms of the underlying data. Any code that relies on AbstractArray{T}(A) always being mutable would break as things stand for SymTridiagonal types. I think the constructor may have been added because the copyto! method was missing, so the fallback copyto!(similar...) would fail. I've added copyto!, but I don't know why it was missing. A test for it existed but was disabled.

  1. Since I went through the trouble of figuring it out, I wanted to document it. The name copy_oftype hides the implementation detail that it is using similar(A, S) to achieve its result and that's useful because this is subtle behaviour of similar. What similar does or should do is ambiguous, but what copy_oftype does is not. But that was not the case in situations where similar(A, T, size(A)) is used. In several places the code relies on the fact that this returns a mutable array free of structure constraints, but that was not clear (to me) from the syntax (nor is it clear that it will always be that way). So I've introduced a name for that purpose, copy_similar_oftype, to "hide" the implementation detail. Come to think of it, if the new name has the word "similar" in it, it actually exposes the implementation. That makes it a bad name :-) But it does to these situations, what copy_oftype does to the other cases. I've now seen that I missed some cases. Thinking further about it, perhaps this could be replaced by Matrix{T}(A).

In conclusion: (i) if the small chance of breaking something is not considered to be worth it, it is best to discard this pull request. I'm fine with that, I've updated it so that it could be closed regardless of whether it is merged or not, (ii) if I got anything wrong then I'd like to hear about it :-) and (iii) the copy_similar_oftype thing is independent of the rest and could be separated or removed.

@daanhb
Copy link
Contributor Author

daanhb commented May 18, 2021

Hmm, now that I put it this way, maybe removing middlemen is not the best idea here. If something somewhere does break, intercepting the right function call can always be used to fix it. But there are now fewer hooks to do so. On the other hand, if the conversion is not optimal, having those same hooks also provides means to fix it. I have experience with StaticArrays, but to be honest I don't know how this might interact with GPU arrays, or distributed arrays... Maybe it is not worth it at this point.

@dkarrasch
Copy link
Member

I still need to digest your comments, but as a quick comment: we could have a PkgEval run to see how/whether this affects other packages at some point.

@daanhb
Copy link
Contributor Author

daanhb commented May 20, 2021

Looking closer at a few cases, I don't think I got the semantics of copy_oftype right. It is assumed to "preserve structure" in some places, and that part is fine. It is also assumed that the output is writable in some places. For example, here the result of copy_oftype is passed to eigen!. This part is not okay, because if the eltype does not change then copy_oftype simply returns copy(A). But the result might not be mutable. Based on this, you can create examples that lead to failure of eigen, both before and after this PR.

A solution might be to go even further and replace the original definition

copy_oftype(A::AbstractArray{T}, ::Type{T}) where {T} = copy(A)
copy_oftype(A::AbstractArray{T,N}, ::Type{S}) where {T,N,S} = convert(AbstractArray{S,N}, A)

by

copy_oftype(A::AbstractArray, ::Type{T}) where {T} = copyto!(similar(A, T), T)

But that could have more side effects. I'll think about some examples to show what I mean here.

@daanhb
Copy link
Contributor Author

daanhb commented May 20, 2021

Here is an example on the use of similar(A, T) versus similar(A, T, size(A)), where the former preserves structure but the latter does not. It is in the multiplication of triangular with diagonal matrices, here and in the lines below:

(*)(A::AbstractTriangular, D::Diagonal) =
   rmul!(copyto!(similar(A, promote_op(*, eltype(A), eltype(D.diag))), A), D)
(*)(D::Diagonal, B::AbstractTriangular) =
   lmul!(D, copyto!(similar(B, promote_op(*, eltype(B), eltype(D.diag))), B))

(*)(A::AbstractMatrix, D::Diagonal) =
   rmul!(copyto!(similar(A, promote_op(*, eltype(A), eltype(D.diag)), size(A)), A), D)
(*)(D::Diagonal, A::AbstractMatrix) =
   lmul!(D, copyto!(similar(A, promote_op(*, eltype(A), eltype(D.diag)), size(A)), A))

The first two lines can use the similar(A, T) pattern, because the code knows that the result remains triangular here. The last two lines use similar(A, T, size(A)), because the result could be anything.

@daanhb
Copy link
Contributor Author

daanhb commented May 20, 2021

I came across an unrelated issue:

julia> using LinearAlgebra; D = Diagonal(1:3)
3×3 Diagonal{Int64, UnitRange{Int64}}:
 1    
   2  
     3

julia> D^2
ERROR: MethodError: no method matching Vector{Int64}(::Diagonal{Int64, UnitRange{Int64}})
Closest candidates are:
  Array{T, N}(::AbstractArray{S, N}) where {T, N, S} at array.jl:540
  Vector{T}() where T at boot.jl:467
  Vector{T}(::Core.Compiler.AbstractRange{T}) where T at range.jl:1042
  ...
Stacktrace:
 [1] convert(#unused#::Type{Vector{Int64}}, a::Diagonal{Int64, UnitRange{Int64}})
   @ Base ./array.jl:532
 [2] Diagonal{Int64, Vector{Int64}}(diag::Diagonal{Int64, UnitRange{Int64}})
   @ LinearAlgebra /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/diagonal.jl:10
 [3] convert(T::Type{Diagonal{Int64, Vector{Int64}}}, m::Diagonal{Int64, UnitRange{Int64}})
   @ LinearAlgebra /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/special.jl:53
 [4] to_power_type(x::Diagonal{Int64, UnitRange{Int64}})
   @ Base ./intfuncs.jl:240
 [5] power_by_squaring(x_::Diagonal{Int64, UnitRange{Int64}}, p::Int64)
   @ Base ./intfuncs.jl:255
 [6] ^(A::Diagonal{Int64, UnitRange{Int64}}, p::Int64)
   @ LinearAlgebra /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/dense.jl:447
 [7] macro expansion
   @ ./none:0 [inlined]
 [8] literal_pow(f::typeof(^), x::Diagonal{Int64, UnitRange{Int64}}, #unused#::Val{2})
   @ Base ./none:0
 [9] top-level scope
   @ REPL[41]:1

It fails because the Diagonal{T,V} constructor is being called with a diagonal matrix as argument (item [2] in the stacktrace above), whereas the constructor here expects a vector.

@dkarrasch
Copy link
Member

Yes, we should catch that by providing a method

(^)(D::Diagonal, a::Number) = Diagonal(D.diag.^a)

@daanhb
Copy link
Contributor Author

daanhb commented May 20, 2021

A solution might be to go even further and replace the original definition

copy_oftype(A::AbstractArray{T}, ::Type{T}) where {T} = copy(A)
copy_oftype(A::AbstractArray{T,N}, ::Type{S}) where {T,N,S} = convert(AbstractArray{S,N}, A)

by

copy_oftype(A::AbstractArray, ::Type{T}) where {T} = copyto!(similar(A, T), T)

But that could have more side effects. I'll think about some examples to show what I mean here.

This quickly leads to lots of errors, bad idea :-)

@daanhb
Copy link
Contributor Author

daanhb commented May 22, 2021

After some more study of the code I'm reaching new conclusions:

  • First of all, I think the result of copy_oftype should always be mutable. It is used mainly to pass arrays to in-place algorithms, like lu!, ldiv!, rdiv!, and so on. That makes no sense if the result is not mutable.

  • One can't generally expect that convert(AbstractArray{T}, A) returns a mutable array, even if T differs from eltype(A).

  • One also can't expect that copy(A) is mutable. A notable exception within Base is a range: copy(1:4) === 1:4. (Of course copying an immutable array makes little sense regardless.)

These considerations leave one possible implementation, namely

copy_oftype(A, T) = copyto!(similar(A, T), A)

The result is mutable and equals A. In spite of what I said above, this actually works - I had made unrelated changes that led to errors before.

@daanhb
Copy link
Contributor Author

daanhb commented May 22, 2021

It still leaves the question in my mind what copy_oftype is for, because it does not actually do what you would want it to do in various places.

As an example, lu(A) uses copy_oftype to pass a copy of A to lu!. This doesn't work for most matrices. Even the factorization of a diagonal matrix fails:

julia> using LinearAlgebra; lu(Diagonal(rand(3)))
ERROR: MethodError: no method matching lu!(::Diagonal{Float64, Vector{Float64}}, ::Val{true}; check=true)

In fact, almost all standard matrix decompositions fail for diagonal matrices, and for several other structured matrices of LinearAlgebra too. Sometimes for different reasons. E.g. qr actually calls the three-argument function similar(A, T, size(A)), svd has no fallback at all for abstract matrices.

The only one that reliably works is schur(A) because for abstract matrices it copies A to a dense Array. That strikes me as the most sensible thing to do. What all factorizations share is that they invoke an in-place function, which applies if A is strided. Over the years I've grown used to these errors of the in-place factorization functions, but now I'm wondering why. Since a copy of the data is being made anyway - why not copy any abstract matrix for all these factorizations?

@daanhb
Copy link
Contributor Author

daanhb commented May 22, 2021

I've tried it out for lu in the current update of the PR, to see what it might look like. The implementation is functionally equivalent to what existed before, but unless a specific implementation of lu! is availabe, lu(A) will now fall back to making a dense copy of A.

Of course it would be even better to look at all the structured matrices in LinearAlgebra and consider their factorization. The fallback for diagonal matrices is of course not efficient when you treat them as a dense array...

It seems beneficial to try to find a common way to go from lu to lu!, qr to qr!, and so on. Perhaps that should be a separate issue?
I think that copy_oftype is only useful in case an in-place algorithm is available (so that "structure is preserved"), and for an AbstractMatrix it makes sense to use a copy_to_array function. I have a feeling that all occurrences of copyto!(similar(A, T, size(A)), A) should be reconsidered.

@dkarrasch
Copy link
Member

In fact, almost all standard matrix decompositions fail for diagonal matrices

Yes, I think we have an issue for that, and I think I started to work on it. IIRC, the main issue was with Bidiagonal, for which I was unable to come up with a type-stable solution. But for the diagonal ones, I could see if I find my previous attempt.

@daanhb
Copy link
Contributor Author

daanhb commented May 31, 2021

Ok, thanks. In any case I'll remove the changes to lu again here, because it's conflating the issues.

Copy link
Member

@dkarrasch dkarrasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Very clear. To help users, I think you should add a few comments in the main file to explain the rationale behind the three copy* functions. And we really need to add lu methods for every special matrix type soon, now that lu starts working for them.

Comment on lines 162 to 164
copyto!(dest::SymTridiagonal, src::SymTridiagonal) =
(copyto!(dest.dv, src.dv); copyto!(dest.ev, src.ev); dest)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this better be moved to tridiag.jl?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes :-)
I moved it

@daanhb
Copy link
Contributor Author

daanhb commented Jun 16, 2021

I had to do a rebase, because the interface for lu changed in the meantime. It seems from the tests something went wrong, I'll do some more local tests first. I have yet to add comments to the copy* functions also.

@daanhb
Copy link
Contributor Author

daanhb commented Jun 17, 2021

Okay, I've made a new attempt at documenting the copy functions. Unfortunately it is a bit fuzzy, because the behaviour of similar(A, T) versus similar(A, T, size(A)) is a bit fuzzy. At least the fuzziness is now confined to these three functions, I guess.

I have yet to add some tests for lu. One complication in the code is that you want to continue to use copy_oftype for all the special matrices for which lu! is implemented, and copy_to_array only as a fallback, because the latter would result in a less efficient lu! being called (for dense matrices). That is why every specific implementation of lu! now also has a corresponding definition of lu, which invokes copy_oftype. I'm thinking about how to test for this.

@dkarrasch
Copy link
Member

I think this already in very good shape. The few extra lu methods are fine to me, since they seem to make a performance difference.

@daanhb
Copy link
Contributor Author

daanhb commented Jun 17, 2021

Alright, does that mean we're good here? Apart from a new apparent conflict in diagonal.jl.

@dkarrasch
Copy link
Member

I think we should make sure the new methods are tested, but otherwise we're good.

@dkarrasch
Copy link
Member

In diagonal, it's probably the removal of many methods by my very recent commit.

@daanhb
Copy link
Contributor Author

daanhb commented Jun 17, 2021

Do you mean adding tests for lu, or for the copy_* functions (or both)?

@dkarrasch
Copy link
Member

dkarrasch commented Jun 17, 2021

Ideally for just everything. 🤣 but one could argue that the copy_* methods are exercised excessively by all the other linalg stuff.

@daanhb
Copy link
Contributor Author

daanhb commented Jun 17, 2021

It turns out that Tridiagonal was the only type for which the return type of the LU factorization differs. For Symmetric and Hermitian matrices, I could only indirectly check that a different method of lu is invoked (using @which). This is not ideal, but would at least trigger an error if these intercepting methods are removed at some point. For StridedMatrix, I don't think there is a difference in practice between copy_oftype and copy_to_array.

@dkarrasch
Copy link
Member

Yes, actually the @which tests may be a bit too specific. We care very rarely about which code path is taken, and the copy_* functions are sufficiently internal, so I believe we could as well reduce the tests at this point.

@daanhb
Copy link
Contributor Author

daanhb commented Jun 17, 2021

So just keep the tests for lu itself (I selected a diagonal and bidiagonal matrix) and the one for lu of tridiagonal matrix perhaps?

@dkarrasch
Copy link
Member

Yes, sounds reasonable. I realized that the copy_* functions can be regarded as helper functions, and typically we don't test these explicitly, but only via the API. I was wondering if you could turn your comments into docstrings, so people could look them up. They wouldn't appear in the docs, so as long as it is helpful, I think it's okay if they are a bit rough.

stdlib/LinearAlgebra/test/lu.jl Outdated Show resolved Hide resolved
stdlib/LinearAlgebra/test/lu.jl Outdated Show resolved Hide resolved
@daanhb
Copy link
Contributor Author

daanhb commented Jun 18, 2021

I have applied these changes, but had also turned the comments for the copy_* functions into docstrings in the meantime, so I combined it into a separate commit -> hence the commit. Locally, all tests passed - just now the online build seems to give an error about a repository key. I'll give it some time to finish.

@dkarrasch
Copy link
Member

One last thing: I have just merged #41248. Could you please rebase and adjust those few lines to your new copy_* scheme? Once that's done, we merge, I promise! While we're at it, let's run pkgeval to check if this hits the ecosystem badly.

@nanosoldier runtests(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your package evaluation job has completed - possible new issues were detected. A full report can be found here. cc @maleadt

@dkarrasch
Copy link
Member

Aha, there seems to be only one relevant issue: https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_hash/ada4020_vs_3279e20/TaylorSeries.1.8.0-DEV-7bddc7efdc.log

The corresponding code is here:

https://github.com/JuliaDiff/TaylorSeries.jl/blob/4421e17e591769234031d6625c566b85d9700398/src/arithmetic.jl#L611-L619

Replacing the copy_oftype(A, S) there by an explicit convert(AbstractArray{S}, A), which is handled by the package itself should fix that Julia-version-independently? Oh, or they could overload 2-arg similar, a method that they don't currently have.

@dkarrasch
Copy link
Member

Ok, let's fix that issue once this is in master.

@andreasnoack
Copy link
Member

@daanhb Only now, I'm reading through this and I have a historical comment and few questions. The original motivation for copy_oftype was to avoid the potential aliasing convert(AbstractMatrix{T}, A::SomeMatrix{S}) when T==S.

Originally, we didn't really consider anything but Array. The non-Array-array-awareness only arrived later and has been implemented in a pretty ad hoc manner for which I should probably take most of the blame. So your changes here are, I believe, the right direction. The whole discussion around similar is pretty complicated, though, as can be seen in many issues. It's generally useful to have a generic function like similar but the exact behavior it is supposed to have is difficult to pin down. E.g. it should probably not be the same for the left- and the right-hand-side of a linear solve.

Now the questions, why didn't you use an Array constructor instead of copy_to_array? And why does e.g. lu and schur use copy_to_array while qr uses copy_similar?

@daanhb
Copy link
Contributor Author

daanhb commented Aug 24, 2021

@andreasnoack Thanks for your comments. I could find a lot of information in old issues and pull requests, a very valid resource, but I was not aware of the original motivation for copy_oftype.

The current implementation of copy_to_array comes from the way schur copied an AbstractMatrix in an earlier change here. That change had passed peer review and I believe the functionality is often used (schur is used a lot to compute functions of matrices) so it looked the safest option to me. It had not occurred to me to simply use Array.

Regarding qr and the inconsistency, I think all matrix factorizations should eventually copy to a dense Array as a fallback for abstract matrices (or even for any A?). This had already happened for schur of an AbstractMatrix, and I've added lu. I did not want to make the pull request even larger and do them all...

Also, the interface for lu is not ideal yet. What I mean is, right now almost every method of lu and lu! repeats the same default values for the pivot and check arguments. I tried, but did not see how to avoid that. Ideally, LinearAlgebra fills in the defaults once, such that anyone with a new structured matrix can just implement lu(A::MyMatrix, pivot; check) or something like that. The risk, in combination with a fallback for any A, is that after a while lu(A) ends up calling the fallback instead of the more specialized method. It may be worth pondering about good interface practices before generalizing the fallback to other factorizations, perhaps in a separate issue.

@daanhb
Copy link
Contributor Author

daanhb commented Jan 7, 2022

Regarding the issue that came up in TaylorSeries: this is likely to appear in other places as well. There is always a possibility for ambiguity if lu is defined for a different eltype. (Oops.)

I think it may be correct that we can leave out the lu(A::StridedMatrix...) method, but I'm not sure. Strictly speaking this does change the flow: currently a StridedMatrix will be passed to copy_oftype, while an AbstractMatrix goes to copy_to_array. That difference might not matter in practice, but I haven't checked all cases. Even then, a possible ambiguity remains with `HermOrSym' and different eltypes.

An alternative that is perhaps cleaner would be something like this:

function lu(A::AbstractMatrix, pivot::Union{RowMaximum,NoPivot} = RowMaximum(); check::Bool = true)
    B = lucopy(A, lutype(eltype(A))
    lu!(B, pivot; check = check)
end

lucopy(A::AbstractMatrix, T) = copy_to_array(A, T)
lucopy(A::StridedMatrix, T) = copy_oftype(A, T)
lucopy(A::HermOrSym, T) = copy_oftype(A, T)
lucopy(A::Tridiagonal, T) = copy_oftype(A, T)

That should be functionally equivalent to the current state, and avoid the ambiguity because only one method for lu remains which can be easily specialized without ambiguity. All the other methods can be removed. (Just an idea, I haven't checked!)


See also: `copy_similar`, `copy_to_array`.
"""
copy_oftype(A::AbstractArray, ::Type{T}) where {T} = copyto!(similar(A,T), A)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be renamed copymutable_oftype as it behaves very different than copy (which is not in general mutable)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:arrays [a, r, r, a, y, s] domain:linear algebra Linear algebra
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants