Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dot(A,B) for A,B matrices #11064

Closed
mschauer opened this issue Apr 30, 2015 · 18 comments
Closed

dot(A,B) for A,B matrices #11064

mschauer opened this issue Apr 30, 2015 · 18 comments
Labels

Comments

@mschauer
Copy link
Contributor

It would be nice to extend the implementation of
dot for matrices using the natural definition

dot(A,B) = diag(A'*B)

or more matlab like

dot(A,B) = diag(A'*B)' = sum(A.*B,1)
dot(A,B, d) = sum(A.*B,d)
@tkelman
Copy link
Contributor

tkelman commented Apr 30, 2015

This might lead to some unintentional confusion for numpy users where dot means matrix multiplication...

@andreasnoack
Copy link
Member

I think we want to think of this in connection with #4774, even though this is about matrices. Personally, I think that dot(A::Matrix,B::Matrix) = A'B is a better definition.

@mschauer
Copy link
Contributor Author

Just a data point: for Quaternions a, b where the dot product a.b as scaled angle is defined with physical meaning, this dot product coincides with trace(A'*B)/2 where A and B are quaternion matrices.

@stevengj
Copy link
Member

It seems to me that dot should be an inner product, hence it should always produce a scalar result. The natural (Euclidean) inner product of two matrices is dot(A,B) = tr(A' * B) = ∑ᵢ conj(A[i]) * B[i].

However, there is enough variability in what people seem to think that dot(A,B) should mean that it might be wiser not to have a default meaning.

Alternatively/additionally, we already have a vecnorm function that computes ∑ᵢ |A[i]|² for any iterable container. It would be natural to provide a vecdot function that computes the Euclidean dot product for any pair of iterable containers (which, for matrices, would give tr A'B)

(If you don't want a scalar result, you may be thinking of a tensor contraction (#3250).)

@stevengj stevengj added the domain:linear algebra Linear algebra label Apr 30, 2015
@jiahao
Copy link
Member

jiahao commented Apr 30, 2015

I agree with @stevengj that the dot product should produce a scalar. The definition as trace(A'B) is essentially standard in linear algebra, although I don't see it very widely used.

numpy and Mathematica's abuse of the dot product to mean matrix multiplication is unfortunately not going away. It may be safer to avoid defining the dot product of matrices altogether.

@jiahao
Copy link
Member

jiahao commented Apr 30, 2015

On the subject on vecnorm, the renaming of normfro to vecnorm is one of the worst things we've ever done. As I had stated in #7990, we went from an old name with a well-known meaning (the Frobenius norm) to a new name that is ambiguous, and therefore meaningless.

@andreasnoack
Copy link
Member

We are not always that strict, e.g. we allow norm(x, 0) even though it is not a norm. With the risk of revealing my lacking algebra training, it seems to me that X'Y is not that different from an inner product. The field is not a field, but more like a division ring so we'd have to be careful when multiplying.

  • Conjugate (transpose) symmetry: <X,Y> = X'Y = (Y'X)' = <Y,X>'
  • Linearity:
    • <XA,Y> = (XA)'Y = A'X'Y = A'<X,Y>
    • <X + Y, Z> = (X + Y)'Z = X'Z + Y'Z = <X,Z> + <Y,Z>
  • Positive (semi)-definiteness: X'X >= 0

@mschauer
Copy link
Contributor Author

I do run into these trace(A'B) operations from time to time https://github.com/mschauer/SDE.jl/blob/master/src/SDE.jl#L677

@stevengj
Copy link
Member

I don't get too excited about vecnorm vs. normfro. As @StefanKarpinski said, vecnorm is sensible if viewed as a contraction of norm(vec(A)), and there isn't a particularly good alternative name for this operation in general; we have to pick something. (Not normfro, because the Frobenius norm is only for matrices; you wouldn't want to call it normfro for 4-dimensional arrays, or sets, for example.)

@jiahao
Copy link
Member

jiahao commented Apr 30, 2015

@stevengj the problem with vecnorm is that it just screams "vector norm", which is simply not well defined. Generalizing normfro to the tensor case is the lesser evil IMO. But that's a separate issue.

@stevengj
Copy link
Member

norm(x) screams "norm", which is simply not well defined. You could criticize nearly any generalizable operation in Julia this way — sometimes, we pick a default meaning for things that could have other meanings too.

But honestly I don't care that much what we call it, vecnorm or normfro. My main point is that if we have vecnorm we should probably have vecdot (or dotfro) too.

@stevengj
Copy link
Member

@andreasnoack, the question seems moot since if you want A'B you should type A'B; defining a dot(A,B) = A'B method for this would be superfluous even if it weren't confusing.

@andreasnoack
Copy link
Member

The motivation was generic programming. In linear least squares you often calculate inv(x'x) to estimate the variance of the estimator and right now it breaks for vectors. It is also an example where the generalisation from vector to matrix is X'X and not trace(X'X). Maybe we should just fix #4774.

@toivoh
Copy link
Contributor

toivoh commented May 1, 2015

I also think that dot returning a scalar seems like the most reasonable definition. One thing to note is that evaluating it as trace(A'B) is O(n^3), while the actual dot product is O(n^2) (if A and B are n x n matrices, but you get the idea), so it should be implemented some other way.

@toivoh
Copy link
Contributor

toivoh commented May 1, 2015

@stevengj: I see that you added vecdot to take the dot product of the vectorization of arrays. But I think that for the dot product of eg two matrices, I would really like a function that checks that the dimensions match also. That should also allow a more efficient implementation for matrices without fast linear indexing.

@stevengj
Copy link
Member

stevengj commented May 1, 2015

@toivoh, that's a good point; please comment at the PR, though.

stevengj added a commit to JuliaLang/IJulia.jl that referenced this issue May 5, 2015
… override the defaults via LINES and COLUMNS environment variables in the shell where IJulia is launched
@stevengj
Copy link
Member

stevengj commented May 5, 2015

(Whoops, put the wrong issue number in an unrelated IJulia patch, sorry.)

@andreasnoack
Copy link
Member

Fixed by #11067

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants