Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

recursive vecdot and vecnorm #25093

Closed
wants to merge 0 commits into from
Closed

recursive vecdot and vecnorm #25093

wants to merge 0 commits into from

Conversation

Jutho
Copy link
Contributor

@Jutho Jutho commented Dec 15, 2017

For consistency with recursive adjoint (and to a lesser extent transpose), I think vecdot and vecnorm (at least for 2 norm) should work recursively. It already does now, but strangely enough using dot and norm within the definition of vecdot and vecnorm, respectively.

While doing so, I discovered that for a=[randn(2,2), randn(2,2)], the current behaviour of dot and vecdot differs greatly between 0.6.1 and 0.7.
We have

  • dot(a,a): matrix in 0.7, scalar in 0.6.1
  • vecdot(a,a): error in 0.7, scalar in 0.6.1

The underlying reason is that dot of two matrices, i.e. dot(a[1], a[1]) is defined in 0.6.1 (being BLAS.dot) and yields a scalar, whereas in 0.7 this gives an error, the reason being that BLAS.dot is never used anymore (not even for vectors) and the definition of dot in linalg/generic only accepts AbstractVector. Is this intentional? I don't find it in NEWS.md?

This PR will currently break tests, because it does also change behavior for vecnorm(a,p) with p != 2. Currently, generic_vecnormInf and friends just call norm on the elements, whereas this PR again calls vecnorm with the same p value. It seems to me that this is what one wants if one of the motivations for the recursive behavior is block matrices. I have no preference about this, but wanted to ask for opinions/thoughts.

@stevengj
Copy link
Member

I don't know, it seems like we should have vecnorm(a) == norm(vec(a)), which would imply calling norm recursively.

On the other hand, I can imagine cases where "recursive vecnorm" is useful too.

Remind me how this is connected to recursive adjoint? I thought adjoint was defined with respect to dot, so vecdot is not implicated?

@Jutho
Copy link
Contributor Author

Jutho commented Dec 15, 2017

Let me in this comment only address the case of (vec)dot:

Remind me how this is connected to recursive adjoint? I thought adjoint was defined with respect to dot, so vecdot is not implicated?

I don't think in math there is a notion of dot versus vecdot. Note that dot also does not recursive calls dot. dot(a,b) will call a[i]' * b[i] on the elements. a[i]' * b[i] will only yield dot if those are numbers or vectors, but not if matrices.

The problem is that in 0.7 there is no way of getting a scalar inner product out of vectors filled with matrices. For a=[randn(2,2),randn(2,2)], dot(a,a) yields a matrix because of the a[i]' * b[i] behaviour, whereas vecdot(a,a) errors out because dot on matrices is no longer defined.

in 0.6, dot of two matrices or even general arrays was defined (via BLAS.dot) and yields a number. Therefore, in 0.6, vecdot(a,a) does work and yields a number, but furthermore, there was also a line dot(x::AbstractVector, y::AbstractVector) = vecdot(x, y) in linalg/generic, so that also dot(a,a) yields a number.

@andreasnoack
Copy link
Member

@Jutho The behavior of dot on 0.6 was unintentional. See #22374.

@Jutho
Copy link
Contributor Author

Jutho commented Dec 15, 2017

Yes I just stumbled upon that PR, together with #22392. I agree that dot of matrices should not be defined, and one should use vecdot in that case. I am just arguing that vecdot should recursively call vecdot, if the point is that this is supposed to be the Euclidean inner product in more general vector spaces.

@Jutho Jutho changed the title recursive vecnorm recursive vecdot and vecnorm Dec 15, 2017
@kshyatt kshyatt added the linear algebra Linear algebra label Dec 16, 2017
@Jutho
Copy link
Contributor Author

Jutho commented Dec 18, 2017

Any further thoughts or input on this?

Copy link
Member

@andreasnoack andreasnoack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to use vecnorm and vecdot such that e.g.

julia> x = rand(1:10, 4, 2);

julia> y = [x[1:2,1:2], x[3:4,1:2]];

julia> vecdot(x,x) == vecdot(y,y)

This, of course, raises the question if vec should be recursive but let's leave that discussion for another issue.

@stevengj
Copy link
Member

stevengj commented Dec 19, 2017

I agree that a recursive vecdot is a potentially useful function. The whole question in my mind, as I commented above, is that now it would become somewhat somewhat inconsistent with vec. I'm not sure whether we should worry about this, but we should at least discuss it.

I'm very skeptical of making vec recursive. vec is normally defined to always produce a length-mn vector from an m×n matrix. If you make it recursive, then the length of the result becomes rather hard to predict (without looping recursively over the elements). Also, it wouldn't work recursively on arbitrary arrays unless you defined an odd-looking fallback vec(x::Any) = x.

Note that you'll need to update the docstrings too.

@andreasnoack
Copy link
Member

Also, it wouldn't work recursively on arbitrary arrays unless you defined an odd-looking fallback vec(x::Any) = x.

This is the one the lessons from making ctranspose/adjoint recursive. If you define an array function recursively then all possible element types should support the operation as well. However, if we decided to use some kind of trait to indicate "scalarness" then we could use it for adjoint as well as vec.

vec is normally defined to always produce a length-mn vector from an m×n matrix

I think the "normal" argument here is also very similar to our discussions of recursive adjoint. In most discussions of the concept, it is assumed that the elements are numbers so the question of recursiveness isn't really dealt with.

I think I'm in favor of recursive vec but I'm not completely sure. Can we come up with realistic examples where it would be a problem (besides the problem with the Any fallback)? E.g. where you have a matrix of matrices and want a Vector{Matrix}.

@Jutho
Copy link
Contributor Author

Jutho commented Dec 19, 2017

That's great. Before moving over to the vec discussion, preferably elsewhere, there is also the question about norms with p !=1, which are thus not related to vecdot via vecnorm(x) = sqrt(vecdot(x,x)).

In the existing code, something like vecnorm(x,p) calls the normal norm on the elements (which is 2-norm for vectors, induced 2-norm for matrices, ...), independent of the value of p. I changed this such that vecnorm(x,p) also calls vecnorm( . , p) on the elements, with the same p. That's why some tests currently break with this PR, because they are hard coded to support the old behavior. Is the new behavior ok. Is this considered breaking? How should I deal with this (aside from updating the docs).

@stevengj
Copy link
Member

stevengj commented Dec 19, 2017

This is definitely a breaking change.

Note that this would also break vecnorm for Array{T} where T is some general Banach space, which defines norm but not vecnorm. For example, consider Array{ApproxFun.Fun}.

I feel like recursive vec, vecdot, and vecnorm should be separate functions. vecr?

@stevengj
Copy link
Member

stevengj commented Dec 19, 2017

I guess you could have vecnorm(x::Any) = norm(x) as a fallback, but the p ≠ 2 case is still a problem.

@Jutho
Copy link
Contributor Author

Jutho commented Dec 19, 2017

Thanks for the feedback.

Regarding the behaviour: I am correct that we do want to compute a 1-norm or inf-norm (or any p-norm) recursively on the elements right? E.g. for a vector of vectors. How should such a breaking change be implemented? A deprecation cannot handle this, correct?

Regarding the implementation: how about a general vecnorm(x, p) = norm(x, p) fallback? I would personally find it great if I could do vecnorm(::Array{ApproxFun.Fun}) since I have an actual use case for this and was saddened by the fact that this did not work, which was part of my motivation for finally starting this PR.

@stevengj
Copy link
Member

stevengj commented Dec 20, 2017

@Jutho, vecnorm(::Array{ApproxFun.Fun}) should work as-is, since norm is defined for Fun, right?

As I said, my inclination is that "recursive vec" and related functions (dot product and norm) should be different functions (e.g. vecr). (It's also not clear to me that they need to be in Base.)

@Jutho
Copy link
Contributor Author

Jutho commented Dec 20, 2017

Ok, just so that I understand, this means essentially status quo for the current implementation of vecnorm (and I guess vecdot for consistency), and adding new functions instead.

Personally, I am not really convinced by the current implementations. In particular,

  • I don't understand why dot(x,y) would call x[i]'y[i] instead of just dot(x[i],y[i]) on its elements (this has been mentioned but not been discussed above).
  • I don't understand what the mathematical meaning or use case would be, for an array filled with matrices, to call vecnorm on the array and have this then call the matrix norm on its matrix-valued elements.

@Jutho
Copy link
Contributor Author

Jutho commented Dec 20, 2017

Also, I don't see the benefit of using vecnorm(x) as a simple substitute for norm(vec(x)). It should probably be called normvec then and is only 2 characters less. For a generic array filled with numbers, the time difference is also negligable.

I think it's more useful to have a norm that acts on the natural vector space that arises for any family of objects that you can multiply with scalars and that you can add together. That's at least how I always interpreted vecnorm and that requires a recursive action of vecnorm. If this should have a different name, fine, but then the point of keeping the current vecnorm escapes me.

@stevengj
Copy link
Member

vecnorm isn't defined for Vector{Matrix} because norm isn't defined for Matrix. norm(vec(A)) would require the construction of an intermediate vec object...

@Jutho
Copy link
Contributor Author

Jutho commented Dec 20, 2017

norm is defined for matrix and computes the induced matrix norm:

julia> v=[randn(2,2),randn(2,2)]
2-element Array{Array{Float64,2},1}:
 [1.43478 -1.06198; 0.843557 -0.714887]
 [-0.253848 0.532226; 2.03743 -1.05801]

julia> vecnorm(v)
3.146799307329483

julia> hypot(norm(v[1]),norm(v[2]))
3.146799307329483

Will the cost of the intermediate vec object (currently always a view but still allocating a little bit) not decrease as Julia is optimized further. Already now it looks negligible from a simple benchmark.

@stevengj
Copy link
Member

Sorry, I was thinking of vecdot for an array of matrices, not vecnorm. Or vecnorm for an array of 3d arrays.

@Jutho
Copy link
Contributor Author

Jutho commented Jan 3, 2018

What is the recommended course of action here? Do other contributors to LinAlg and transpose/adjoint contributors, e.g. @andyferris and @Sacha0, have an opinion about the 'correct' behavior of vecnorm and vecdot?

A summary. Current situation:

  • dot(x,y) calls x[i]'y[i], making it work on Vector{Matrix} and producing a Matrix (and thus non-scalar) output. For Vector{Vector{<:Number}}, the output is a scalar.
  • vecdot(x,y) calls dot(x[i],y[i]) thus resulting in an error for Vector{Matrix}, because dot is not defined for matrices.
  • norm(x,p) for x::Vector calls vecnorm(x,p), for x::Matrix computes the induced matrix norm.
  • vecnorm(x,p) calls norm(x[i]) on the elements, this always being the 2-norm, even if p!=2, and this being the induced matrix 2 norm if the elements are matrices, i.e. for x::Vector{Matrix}. Especially the latter behavior seems to be completely unrelated to any mathematically interesting definition of a norm to me.

My proposal:

  • dot(x,y) calls dot(x[i],y[i]), so that it still works like before on Vector{Vector} and yields a scalar, but errors on Vector{Matrix}, since dot is not defined for matrices.
  • vecdot(x,y) calls vecdot(x[i],y[i]), so that it will always will yield a scalar. I think this is the natural Euclidean inner product for objects that otherwise form a vector space (i.e. that can be added and multiplied by scalars).
  • norm(x,p) stays as is.
  • vecnorm(x,p) calls vecnorm(x[i],p) on the elements, so that for x::Vector{Vector}}, it is really the p-norm of the vector obtained by interpreting x as a block vector.

@andreasnoack
Copy link
Member

We spent some time on dot in #22220

@Jutho
Copy link
Contributor Author

Jutho commented Jan 3, 2018

Thanks for the link to that discussion. So it seems that the behavior of dot is intentional (though without a clear consensus between @andreasnoack and @stevengj ).

My main interest is vecdot and vecnorm, I would like these to act recursively, such that there is at least one way to get a scalar inner product from arbitrary vector space objects like ::Vector{Matrix{<:Number}} over the field of numbers.

Copy link
Member

@stevengj stevengj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that we should change these to be inconsistent with vec, and I'm not convinced the meaning of vec should be changed.

@andreasnoack andreasnoack added the triage This should be discussed on a triage call label Jan 3, 2018
@Jutho
Copy link
Contributor Author

Jutho commented Jan 3, 2018

I am surprised by that reaction, as, back in #22220, you were arguing that Vector{Matrix{<:Number}} is at least a vector space over numbers, and therefore, there should be at least a way to compute the Euclidean inner product in that space and obtain a number, whereas currently, this is impossible.

I am also not sure that the current implementation really satisfies vecdot(x,y) = dot(vec(x),vec(y)), since if both x and y are Vectors (but possibly with elements which are not Number), then vec is a no-op, and the left hand side calls dot(x[i],y[i]) whereas the right hand side calls x[i]'y[i].

@stevengj
Copy link
Member

stevengj commented Jan 3, 2018

@Jutho, I'm not arguing that it isn't useful to have a recursive vecdot-like function. Just that it shouldn't be called vecdot.

With regard to your other point, I think it would be reasonable to add a vecdot(x::Vector, y::Vector) = dot(x, y) method. (Just have to be careful to avoid method loops: at one point dot called vecdot IIRC!)

@Jutho
Copy link
Contributor Author

Jutho commented Jan 3, 2018

at one point dot called vecdot IIRC!)

This still is the case, for Vector{<:BlasFloat}.

I still don't quite get the argument. For most common types, my proposed vecdot(x,y) would still equal dot(vec(x),vec(y)), e.g. when x,y are Array{<:Number, N} with N>1.

For the cases where it would differ, it also differs with the current implementation, and this will not be fixed completely by adding vecdot(x::Vector, y::Vector) = dot(x, y). The fact that dot only accepts vectors and vecdot accepts general iterables is not the main difference between those two functions. The main difference is what is called on the elements contained in x and y. If you want vecdot(x,y) = dot(vec(x),vec(y)) for those types of x that support vec, than vecdot should also call x[i]'y[i] on the elements instead of dot(x[i],y[i]) (currently) or vecdot(x[i],y[i]) (my proposal).

But I don't see the point of a function vecdot(x,y) that is just a shorthand for dot(vec(x),vec(y)). The overhead of calling vec will be negligible for reasonably sized x and y. In addition, vec is a pretty specific function with only 5 method definitions

# 5 methods for generic function "vec":
[1] vec(v::Union{Adjoint{T,#s520}, Transpose{T,#s520}} where #s520<:(AbstractArray{T,1} where T) where T) in Base.LinAlg at linalg/adjtrans.jl:117
[2] vec(x::AbstractSparseArray{Tv,Ti,1} where Ti where Tv) in Base.SparseArrays at sparse/sparsevector.jl:876
[3] vec(rowvec::RowVector) in Base.LinAlg at deprecated.jl:2469
[4] vec(a::AbstractArray{T,1} where T) in Base at abstractarraymath.jl:39
[5] vec(a::AbstractArray) in Base at abstractarraymath.jl:38

most of which are trivial.

Furthermore, I don't understand why the generic adjoint function was insisted upon to act recursively, based on the argument that it needs to produce a scalar-valued inner product, when then the associated recursive dot and norm function that would return this scalar inner product has some obscure name or is not even defined in Julia Base.

@stevengj
Copy link
Member

stevengj commented Jan 3, 2018

@Jutho,

I wasn't suggesting vecdot(x::AbstractVector, y::AbstractVector) = dot(x, y) in the context of your recursive definition of vecdot. I was suggesting it in the context of the current definition, to ensure that vecdot(x,y) == dot(vec(x),vec(y)) — this is intended to be the case (and is already always true for vectors of numbers), but it could be broken if someone defines a custom dot product on a custom vector type. The point is, vec* functions should be consistent with vec(x) (and hence should be equivalent to non-vec functions for vectors). I'm suggesting that anyplace they currently disagree is a bug that should be fixed — your PR makes the disagreement worse.

(I also still dislike the change that was made to have dot call x[i]'y[i], rather than dot recursively, but perhaps that is water under the bridge at this point.)

adjoint should be recursive to be consistent with dot. It has nothing to do with vecdot. Different inner products induce different adjoints.

@stevengj
Copy link
Member

stevengj commented Jan 3, 2018

I see your point that a recursive vecdot might be more useful than the current one, given that you can do dot(vec(x),vec(y)) (albeit with a small performance hit). It irks me that it is inconsistent with vec (which I think should definitely not be recursive), but that's not the end of the world.

It also bugs me that vecdot and vecnorm in your definitions will cease to work on Container{T} where T is a general Hilbert/Banach space (supporting dot/norm), since your implementation assumes everything is either a Number or iterable.

@StefanKarpinski
Copy link
Member

Would this be a good place for a recurse=false|true keyword argument?

@Jutho
Copy link
Contributor Author

Jutho commented Jan 3, 2018

I see your point that a recursive vecdot might be more useful than the current one

Thanks for that. I was starting to doubt whether I was able to successfully get my point across.

I wasn't suggesting vecdot(x::AbstractVector, y::AbstractVector) = dot(x, y) in the context of your recursive definition of vecdot. I was suggesting it in the context of the current definition, to ensure that vecdot(x,y) == dot(vec(x),vec(y))

My argument was already for the current implementation, not my new proposal. Even currently, this extra definition is not sufficient, because of the way dot and vecdot differ in how they act on the elements, which is indeed the following:

(I also still dislike the change that was made to have dot call x[i]'y[i], rather than dot recursively, but perhaps that is water under the bridge at this point.)

Me too, I liked your arguments in the thread where that was discussed. Because of this, I rather have started using vecdot and vecnorm as the general functions for inner product and norm for my own custom type objects (which are not <:AbstractArray) but that do have natural vector space / Banach space structure. This also relates to your following comment:

It also bugs me that vecdot and vecnorm in your definitions will cease to work on Container{T} where T is a general Hilbert/Banach space (supporting dot/norm), since your implementation assumes everything is either a Number or iterable.

I would love to see ApproxFun have definitions of vecdot and vecnorm for their Fun objects. I have started working on some krylov methods, that use generic functions as operators, and whose vectors are not required to be (Abstract)Vector but just any type that supports scale!, axpy!, vecdot and vecnorm.

I see your point that a recursive vecdot might be more useful than the current one, given that you can do dot(vec(x),vec(y)) (albeit with a small performance hit). It irks me that it is inconsistent with vec (which I think should definitely not be recursive), but that's not the end of the world.

But vecdot is so much more than dot(vec(x),vec(y)), since vec is rather limited to higher dimensional arrays, whereas vecdot accepts iterables. If there is any comparision, it should be dot(collect(x),collect(y)) and hence, if that is the constraint it should satisfy be called collectdot :-). As stated above, I have started to interpret vecdot as the general inner product for objects that have a natural vector space structure.

Do you think there is any use case where dot(vec(x),vec(y)) and my definition of vecdot(x,y) differ, and where the performance hit would be noticable? I could see this matter for small arrays with elements <:Number, but in those cases, both definitions match.

@StefanKarpinski
Copy link
Member

I don't think this is a place where there is one right answer – some people have good reasons for one behavior while others have good reasons for the other. The non-recursive behavior seems simpler and more intuitive to me, so I would propose a recurse::Bool=false keyword, so doing vecnorm(a, recurse=true) would take a recursive norm and vecdot(a, b, recurse=true) would take a recursive inner product, both giving scalar results. This may be a type stability concern at the moment since I'm not sure how well we type infer keyword arguments at this point: @vtjnash, @Keno?

@StefanKarpinski
Copy link
Member

My admittedly somewhat naive impression is that if you're working with, say, vectors of 2x2 invertible matrices – i.e. elements of SO(2) – then your "scalar type" of the vector space is SO(2), so I would expect the norm of such a vector to be an element of SO(2), i.e. a matrix. That would seem to suggest a non-recursive norm rather than the recursive one.

@Jutho
Copy link
Contributor Author

Jutho commented Jan 4, 2018

I don't think SO(2) is correct here, it's GL(2) (genera linear group) but with also addition so that you have a ring. But that aside, I certainly get your point.

In fact, I have always liked that kind of interpretation: Julia Base should just defines linear algebra for Vector and Matrix one level deep, and whatever the entries are, they form the scalar type. Julia Base should then only make sure that the simple case, where the entries are <:Number, are supported (while still aiming at generality), but if the entries are of a different type, it is the user's responsibility to take care that that type supports all the correct behavior to act as a field or ring.

However, that interpretation is, from my point of view, in conflict with having adjoint and especially transpose act recursively down to scalar level, and with the whole "native support for block matrices" ideas that have influenced some design decision. I don't want to go that way again (I've made those arguments when I was contesting the recursive behaviour of adjoint and transpose). In the one-level deep interpretation, I just think it is unreasonable to even try to define dot, norm and adjoint, since, if the scalar type is not simply Number, the resulting vector space or module will certainly not have the natural Euclidean inner product.

But, now that I have finally accepted the recursive behavior down to the scalar level, I think the only consistent possibility is for dot to follow that recursive behaviour and always produce a scalar.

But really, I have no intent to stir up that discussion again. It's good that actual useful work is being done instead of time being invested in these endless discussions about formal mathematical generalizations. Github threads are probably not the right medium for that anyway, and often lead to misunderstanding each other's argument.

Unless any of you have a strong opinion about changing the current situation, I will just close this PR and let it rest.

@Jutho Jutho closed this Jan 4, 2018
@StefanKarpinski
Copy link
Member

StefanKarpinski commented Jan 5, 2018

Yes, thanks for the correction on SO(2) versus GL(2). On the triage call just now, I was just expressing my concern that we have these very similar debates regularly and neither side is demonstrably correct since they're both valid interpretations, but a different random camp wins the debate each time, which leaves the set of behaviors implemented as a pastiche of these different philosophies, so the point you make about recursive adjoint is a concern that resonates with me. Perhaps @stevengj can comment on how recursive adjoint (which he was in favor of, IIRC) and non-recursive vecnorm which he's also in favor of, dovetail with each other.

@andyferris
Copy link
Member

andyferris commented Jan 6, 2018

This is a very interesting discussion.

Like @Jutho, I feel that non-recursive behavior (AbstractArrays assume they contain scalar elements by default - users are free to defined more complex structures) would be simpler because of cases like this issue, but I won't try to argue that point but instead want to point out that IMHO what to do here to support block matrices is pretty well defined.

In #20978 (see #20978 (comment) and #20978 (comment)) we discussed this kind of thing a lot, and in particular there seems no way that that system can "infer" when the user wants their 2x2 submatrices to be identified as a single scalar element of GL(2), or as 4 numbers in a block matrix. This "interpretation" affects the result of many functions like trace, det, etc but also vecnorm, vecdot. So please let's focus specifically on block matrices (and similar cases) and let users wrap their scalar rings as a subtype of Number or some-such.

To support block vectors/matrices (of matching sizes) just as we do the flattened version (which we should definitely be doing, or else there's really no point of recursive adjoint...) we need the following:

  • vecnorm(blockmatrix, p) is recursive and gives the same answer as the flattened matrix.
  • vecdot(blockmatrix1, blockmatrix2) is recursive (to match vecdot of two flattened matrices).
  • norm(blockvector, p) should be the same as the norm of the flattened vector (i.e. recursive, also same answer as vecnorm)
  • dot(blockvector1, blockvector2) should be recursively dot - the same as the flattened vectors (also same answer as vecdot).
  • norm(blockmatrix, p) needs to be quite complex (depending on p) to get the same as the flattened matrix (we probably need things like svd(blockmatrix) to work properly!)
  • dot(blockmatrix1, blockmatrix2) does not need to be defined.

AFAICT this PR address all but norm(blockmatrix, p) (which is complicated), so that's great. :)

Can I also point out that vec seems to me to be a convenience function for reshaping data to a 1D array - I don't see that it has a lot of specific linear algebra meaning attached to it. I feel it's a red herring to worry about vec + dot = vecdot because it's just a linguistic coincidence in this case.

@andyferris andyferris reopened this Jan 6, 2018
@andyferris
Copy link
Member

(Umm... the interface shows I reopened this PR. FYI this was unintentional - I must have clicked the wrong button, sorry. Feel free to close though I do think it would be nice to have a proper resolution here).

@StefanKarpinski
Copy link
Member

Linalg folks need to come to a decision here.

@stevengj
Copy link
Member

stevengj commented Jan 18, 2018

Stefan's vecnorm(x, recurse=true) seems fine to me. But honestly, I'm not sure I care that much either way; I guess I wouldn't lament if were changed to be always recursive, either.

@Jutho
Copy link
Contributor Author

Jutho commented Jan 20, 2018

I can try to change my PR but I have a busy week ahead. I am fine either way but would like to link to this discussion which I think is important: #25565

Essentially, my motivation for this PR was that I am interpreting vecdot as the notion of inner product on arbitrary types which act as vector spaces, i.e. object of that type can be multiplied with scalars and added. This interpretation works for 'vectors' of type AbstractArray{T,N} with arbitrary N, but maybe that's just a coincidence, and an actual general inner method is missing from Base.

@andyferris
Copy link
Member

On that topic, I’d favor renaming vecdot to inner with the semantics as Jutho suggests (we implement it for things which are vector spaces under + and scalar *.

@Sacha0
Copy link
Member

Sacha0 commented Jan 25, 2018

After rereading this thread and #22220, I can appreciate arguments for both the existing and proposed definitions of vecdot and vecnorm. Both possible definitions have worthwhile use cases, as e.g. @stevengj and @Jutho illustrate on the one hand and the other.

I strongly sympathize, however, with @stevengj's arguments in #22220 (and the voices echoing those sentiments here) for dot(x, y) recursing via dot(x[i], y[i]) rather than x[i]'y[i]. (Though admittedly I haven't read Andreas's reference in #22220 (comment).)

The preceding sympathy derives from my interpretation of dot as Julia's name for inner product. And it seems dot's interpretation varies, there being other generalizations / domain definitions of the dot product, so I have some sympathy for clarifying renames.

Thanks for the lively discussion all! 😄 Best!

@StefanKarpinski
Copy link
Member

So what are we actually going to do here? There's a strong chance that today's trage will be the last before we cut an alpha release, so if any action is going to happen here, it should be proposed and acted upon post-haste. Otherwise I'm just going to close this issue and remove it from the milestone.

@andreasnoack
Copy link
Member

We have discussed several things in this thread but with #25093 (comment), it seems to me that making vecdot recursive is generally acceptable so maybe we shouldn't let disagreements on the other parts of this issue (name change and dot behavior) block that change.

@Sacha0
Copy link
Member

Sacha0 commented Jan 25, 2018

Mulling this issue over, I sympathize progressively more with the desire for recursive behavior. Perhaps more descriptive names for the proposed operations would be flatdot/flatnorm?

@Jutho
Copy link
Contributor Author

Jutho commented Jan 25, 2018

Thanks for the comments @Sacha0 and everybody. Please also have a look at #25565. I like the names vecdot and vecnorm because I read them as:
threat these objects as if they are vectors and compute their dot or norm. So yes, for nested arrays, recursive behavior is the way to go, but other user types which somehow form a vector space can provide a completely independent definition of vecdot. I realize this was probably not the original semantics intended when these methods were introduced, but I do think methods with those semantics should exist. That's what #25565 is about. That's why I don't like names which would be more descriptive like flatdot or recursivedot, because that's only an implementation detail for the specific case of nested arrays.

And as stated in my last comment in #25565, I think that interface would be completed by a corresponding veclength function, that has the semantics of returning the dimension of the corresponding vector-like objects. For nested arrays, it would again just be a recursive version of length. Hence the question, can I also introduce a corresponding veclength function?

@Jutho
Copy link
Contributor Author

Jutho commented Jan 31, 2018

I seem to have screwed up this PR by forced pushing latest julia master back to my own fork, forgetting that this PR was living there in the master branch. Unless anybody knows how to undo this, I will prepare a new PR based on the discussion so far.

@JeffBezanson JeffBezanson removed the triage This should be discussed on a triage call label Feb 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
linear algebra Linear algebra
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants