-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce RowVector
as the transpose of a vector
#19670
Conversation
OK, there's definitely some fallout in other tests that needs to be addressed. |
RowVector
as the transpose of a vectorRowVector
as the transpose of a vector
How does this compare to #6837? cc @andreasnoack |
|
||
# Some overloads from Base | ||
@inline transpose(r::Range) = RowVector(r) | ||
@inline ctranspose(r::Range) = RowVector(conj(r)) # is there such a thing as a complex range? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the approach in #6837 (which doesn't make a copy for ctranspose
) better.
It's also unfortunate that this doesn't handle (2d) matrices. |
Hi Steven, I agree completely with what you are saying - we need to handle conjugation and matrices in a similar manner. AFAICT #6837 doesn't extend well for RowVector
TransposedMatrix # non-conjugating, also living in LinAlg
ConjArray # Any dimension, living in Base However, the final two are optimizations and not semantically important in the linear algebra sense (the view vs. copy semantic certainly is important for programmers). I think for us as developers it is going to be much easier to have a modular system of simple views rather than type parameters or an interface to ask "are you conjugated?" to a complicated type (dispatch would be difficult). I was hoping we could deal with this in a process that goes something like this:
However, I don't see all this happening in 9 days! I thought I'd see if there was an appetite to deal with the linear algebra (row vector) semantics in v0.6 or not, with the "software" side of things (optimizations, view vs. copy, etc) to come later. |
Why should It's true that |
Maybe I misunderstood - in the A transposed matrix is just a matrix, but the transpose of a vector in some senses is not "really" a matrix or a vector (at least, that is what I'm aiming for here). To me, it made sense to separate the two concepts at the (primary) type level. For example, users might want to dispatch row vectors (or dual vectors) separately, and while
Hmm, this is where I disagree. I think that a modular system would help deal with each operation atomically, and thus reduce code complexity significantly. We still need just a single definition *(A::Transpose{Conj{M}} where M<:StridedMatrix, B::StridedVector) = ...
*(A::Transpose{Conj{M}} where M<:StridedMatrix, B::StridedMatrix) = ...
... (I do realize that BLAS is handled in *(A::Transpose{M} where M<:AbstractSparseMatrix, B::AbstractSparseMatrix) for a generic Julia implementation. Anyway, it seems prettier to separate the As to broadcast fusion (like |
We have a lot of separate code for Matrix and Vector, but they are still subtypes of the same type. With a separate Conj type, you would conceivably have to handle I guess it's not terrible to have separate Conj and Transpose types, but I'd really strongly prefer that (a) there be only one Transpose type of varying dimensions (not completely separate TransposedMatrix and RowVector types) and (b) that Conj and Transpose be implemented in the same PR to prevent regressions in |
OK - I think I could support that; The question now is, do we work like mad to make this happen this week for v0.6? And does anyone know if PS - this PR does already attempt to "canonicalize" |
it should be merged pretty soon. just needs some performance testing and review.
This type does not have any subtypes (e.g. does not have (*){M <: AbstractSparseMatrix}(A::Transpose{M}, B::AbstractSparseMatrix)
(*)(A::Transpose{M}, B::AbstractSparseMatrix) where M <: AbstractSparseMatrix |
You're right... I'm still in a bit of a headspin with the new syntax :) |
I think this is the one I like: (*)(A::(Transpose{M} where M <: AbstractArray), B::AbstractSparseMatrix) I don't see that the brackets should be necessary, though. (edit: xref #18457 (comment)) |
@@ -0,0 +1,183 @@ | |||
immutable RowVector{T,V<:AbstractVector} <: AbstractMatrix{T} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alanedelman and I both tend to think that Transpose{T,N,AbstractArray{T,N}} <: AbstractArray{T,N}
, so that a RowVector
(AbstractRowVector
? = Transpose{T,1,AbstractArray{T,1}}
) would be a subtype of AbstractVector
, i.e. a "1d" array object (an element of the dual space, with conjugation).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course, that necessitates a bit of care, since the assumption AbstractVector == ColumnVector
appears in a number of methods, but it doesn't seem too bad. There would need to be specialized +
, *
, and (most annoying) various broadcast_foo
methods.
@@ -349,6 +349,15 @@ A_mul_B!(C::AbstractVector, A::BiTri, B::AbstractVector) = A_mul_B_td!(C, A, B) | |||
A_mul_B!(C::AbstractMatrix, A::BiTri, B::AbstractVecOrMat) = A_mul_B_td!(C, A, B) | |||
A_mul_B!(C::AbstractVecOrMat, A::BiTri, B::AbstractVecOrMat) = A_mul_B_td!(C, A, B) | |||
|
|||
@inline \(::Diagonal, ::RowVector) = error("Cannot left-divide matrix by transposed vector") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need to inline a function that just throws an error.
We had a long discussion today with @JeffBezanson, @andreasnoack, and @alanedelman. We initially leaned towards making a RowVector an AbstractVector, but after Alan left we began leaning in the other direction, making a RowVector an AbstractMatrix as in this proposal. Making it an AbstractMatrix seems less likely to cause regressions, and will require less code to be modified, than making it an AbstractVector (since there is lots of linear-algebra code that assumes every AbstractVector is a column vector). |
@inline check_tail_indices(i1, i2, i3, is...) = i3 == 1 ? check_tail_indices(i1, i2, is...) : false | ||
|
||
# Some conversions | ||
convert(::Type{AbstractVector}, rowvec::RowVector) = rowvec.vec |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure we want this? There aren't currently any conversions between arrays with different numbers of dimensions. If there were, I suspect we'd only allow converting (n,)
-size arrays to (n,1)
, not (1,n)
-size arrays to (n,)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, this isn't very useful - you can just type .'
@inline *(::RowVector, ::RowVector) = error("Cannot multiply two transposed vectors") | ||
@inline *(vec::AbstractVector, rowvec::RowVector) = kron(vec, rowvec) | ||
@inline *(vec::AbstractVector, rowvec::AbstractVector) = error("Cannot multiply two vectors") | ||
@inline *(mat::AbstractMatrix, rowvec::RowVector) = error("Cannot right-multiply matrix by transposed vector") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are all of these error definitions necessary? We already throw errors for multiplying matrices and vectors of incompatible shapes, and since RowVector <: AbstractMatrix
the existing code should just work (i.e. continue to throw errors), no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted users to see that semantically we only allow
- matrix * matrix
- matrix * vector
- rowvector * matrix
- rowvector * vector
- vector * rowvector
So it's less about the specific shapes and more about the semantics of the type. I'm also deprecating (well, there should be a deprecation warning, not an error, but I haven't got to it yet) the vector * matrix method in favor for the last item on the list.
Ideally, there should only be four nice error message methods (a single one for each of vectormatrix, matrixrowvector, vectorvector, rowvectorrowvector), but they are repeated (a) because of ambiguities and (b) because of all the different A_mul_Bc
methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andyferris, but there are already error messages for multiplications that deviate from these. If you want to improve the error messages, can't that be a separate PR?
|
||
# Multiplication # | ||
|
||
@inline *(rowvec::RowVector, vec::AbstractVector) = reduce(+, map(At_mul_B, transpose(rowvec), vec)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sum( rowvec.vec[i].' * vec[i] for i = 1:length(vec) )
to fuse the map and reduce?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right! I always forget to use generators.
end | ||
A | ||
end | ||
@inline transpose(sv::SparseVector) = TransposedVector(sv) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-> RowVector
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks
|
||
# Multiplication # | ||
|
||
@inline *(rowvec::RowVector, vec::AbstractVector) = reduce(+, map(At_mul_B, transpose(rowvec), vec)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you are doing this in order to support vectors of matrices and similar, but it seems like we will also want to define specialized methods for RowVector{<:Number} * AbstractVector{<:Number}
. e.g. for BLAS scalars we want to call BLAS.dotu
.
(Even in the general case, can't you use mapreduce
with zip(rowvec.vec, vec)
to avoid the temporary arrays?, or sum
with a generator as suggested above?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and yes - agreed.
Care to share the preference for |
@musm, the basic arguments went like this, as I recall: If you make
If you make
Put another way, there are very few methods that take an |
Another argument in favor of |
Well said, @stevengj. I agree 100% with the above, but couldn't express it as elegantly. I also do view this as an incremental step. I have a "feeling" that a 1D dual vector would be a bit easier to deal with in a language with traits, so this could be revisited later if a row-shaped vector seems wrong in the future. |
Regarding changing this to a general
to be exported? |
If |
Oh dear - now we go full circle! 😄 I think either way can be made to work. For example. if they are separate types, we can even do typealias Transpose{A,T} Union{RowVector{T,A}, TransposedMatrix{T,A}} lol |
Sounds like we're ready to merge. |
Would be great to have a |
Aim to do so, within a week (hopefully less). |
See #20047 |
if length(rowvec) != length(vec) | ||
throw(DimensionMismatch("A has dimensions $(size(rowvec)) but B has dimensions $(size(vec))")) | ||
end | ||
sum(@inbounds(return rowvec[i]*vec[i]) for i = 1:length(vec)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be calling dot
for two real vectors? This should produce the same result, of course, but is there a performance difference (especially when BLAS is called)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good idea, I will address this in the new PR, along with the complex case.
I suppose I should go measure this, but is there much difference between native Julia and BLAS for level-1 operations, and if so, why? (BLAS might be multithreaded?) What about the outer product, which is now broadcast
, maybe there is a faster BLAS outer product? (Would it be a rank-1 update on a zero'd array, or a direct op?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BLAS might be multithreaded, but mainly it is the fact that it is hand-coded for SIMD. @simd
can make up for this gap, but not always, and in generic code like you used above you don't have @simd
anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For broadcast operations, the advantage of BLAS1 is wiped out by fusion in typical circumstances. Of course, it would be great if broadcast
could exploit SIMD well too, but we are dependent on the compiler for this.
rv = v.' | ||
|
||
@test (rv*d)::RowVector == [2,6,12].' | ||
@test_throws Exception d*rv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make these test a specific exception type! This test would pass if your implementation had a typo in it.
# This is broken by the introduction of RowVector... see brittle comment above. | ||
#@testset "issue #16548" begin | ||
# @test which(\, (SparseMatrixCSC, AbstractVecOrMat)).module == Base.SparseArrays | ||
#end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than simply commenting this test out, replace it with another test or set of tests that capture its spirit?
@test (cz*mat')::RowVector == [-2im 4 9] | ||
@test_throws Exception [1]*reshape([1],(1,1))' # no longer permitted | ||
@test cz*cz' === 15 + 0im | ||
@test_throws Exception z*vz' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for example, vz
is a typo and this test still passed because Exception
is far too broad a type to be testing for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I had to chuckle when I found this after you wrote about this possibility earlier. FYI its being addressed in #20047
@andyferris: I want to say that this is – despite my waffling about whether we should do this or something else – a pleasure to work with. Thanks for all the hard work on bringing it to fruition! |
Thanks Stefan - that means a lot to me :) |
This used to be the outer product but was removed by JuliaLang#19670. It makes much more sense to deprecate this than make it a breaking change.
@inline size(rowvec::RowVector, d) = ifelse(d==2, length(rowvec.vec), 1) | ||
@inline indices(rowvec::RowVector) = (Base.OneTo(1), indices(rowvec.vec)[1]) | ||
@inline indices(rowvec::RowVector, d) = ifelse(d == 2, indices(rowvec.vec)[1], Base.OneTo(1)) | ||
linearindexing{V<:RowVector}(::Union{V,Type{V}}) = LinearFast() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this be looking at the linearindexing of the wrapped type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this is what we want. It only wraps vectors, but is itself a two-dimensional object. If it passed this through, then it'd have pessimized performance for LinearSlow types unnecessarily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then are all AbstractVector types LinearFast? that currently isn't the case for e.g. SparseVector
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, this is something that'd be nice to change at some point. LinearFast (read: index-with-one-dimension) and LinearSlow (read: index-with-N-dimensions) represent the same thing when N=1. It'd be worth seeing if the performance of SparseVector changes if you make it LinearFast
— it may end up hitting simpler methods that way. If so, we could change the default for all vectors. And maybe eventually re-work this to be clearer (#20175 (comment)).
Hi Andy, I don't think I ever thanked you for this. I think we are all happy with this solution. One final comment that I wanted to make, rather now than after julia v1.0, is that I did come up with one other solution to the issue of (vector) transposes. I didn't want to pollute that infamous issue any further now that it has finally found its peace, so I am posting it here. That solution is to have two parameters Linear algebra would than primarily take place with |
Thank you, @Jutho! :) Yes, we've certainly have considered more complex combinations of dual and normal dimensions, as type parameters. I remember a chat with @jiahao and @mbauman where this was discussed, and Jiahao's talk at this year's JuliaCon ended by mentioning this. My understanding is that v1.0 will happen as soon as practically possible (to just implement any foreseeable breaking changes and other "essential" features, bug fixes/correctness/consistency issues and performance bottlenecks). I recommend we play with this stuff in community packages and if we come up with an awesome design for dual spaces and multi linear algebra, we can propose changes for v2.0. For obvious reasons, you'd be a great person to help out with or to discuss such a design :) For my part, I'm slightly confused about the complex case. If I have a complex matrix, the adjoint ( |
Actually If you have a matrix / linear map Transpose, on the other hand, maps I've been reading a bit about diagrammatic notation and category theory to formalize my understanding: https://arxiv.org/abs/0908.3347v1 |
Ok - thanks I'll check out that reference. So the discussion has been that we're more-or-less ditching transpose for linear algebra to replace it with |
I wonder whether a syntactically nice way of getting a row of a matrix as a |
That's a fair point – you just want a row slice there, no conjugate. |
We definitely need a way of getting a |
It's going to be really confusing that |
These "taking seriously" issues highlight that getting what we want for arrays and what we want for linear algebra simultaneously will be challenging. On one hand we want APL indexing. So, if we introduce dual vectors as 1D arrays, then this at least satisfies the So I'm wondering if we should leave the |
Ideally we could just have |
This is my take on #4774 (taking vector transposes seriously). The transpose of an
AbstractVector
becomes aRowVector
, which is a single concrete type which is a lightweight wrapper or "view" of the original vector, and is for all purposes outside of linear algebra it is a1 x n
sizedAbstractArray
. The inner productv' * v
is a scalar,v''
returnsv
, and so-on. For details of my thoughts while developing this, consider the README for the v0.5-compatible companion package here.I've also taken the liberty of moving most of the definition of
transpose
andctranspose
to theLinAlg
module sincectranspose
andtranspose
are quite arguably specific to linear algebra (as opposed topermutedims
which makes sense for an array).There are a few optimizations left to be made involving(done). Opinions, comments, feedback, etc are very welcome and since I think this is my first serious PR to julia - forgive me if I messed something up :)map
andreduce
used in the inner product, but I don't mind if they are done in this PR or separately later (as "feature freeze" is coming soon)