Taking vector transposes seriously #42
The double-dual of a finite-dimensional vector space is isomorphic to it, not identical. So I'm not clear on how this is bad mathematics. It's more that we tend to gloss over the distinction between things that are isomorphic in mathematics, because human brains are good at handling that sort of slippery ambiguity and just doing the right thing. That said, I agree that this should be improved, not because it's mathematically incorrect, but because it's annoying.
How can …
(Speaking as myself now) It is not just isomorphic, it is naturally isomorphic, i.e. the isomorphism is independent of the choice of basis. I can’t think of a practical application for which it would be worth distinguishing between this kind of isomorphism and an identity. IMO the annoyance factor comes from making this kind of distinction.
That was the impression I got out of this afternoon’s discussion with @alanedelman.
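For reference, the natural isomorphism being discussed can be written down explicitly; a standard statement of it (not from this thread) is:

```latex
% The canonical map from V into its double dual, defined with no choice of basis:
\iota : V \to V^{**}, \qquad \iota(v)(f) = f(v) \quad \text{for all } f \in V^{*}.
% \iota is linear and injective; when \dim V < \infty we have
% \dim V^{**} = \dim V, so \iota is an isomorphism ("naturally isomorphic").
```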
I think what Jeff is asking is right on the money ... it is starting to look like x'*y and x*y' is making more …
I'm in violent agreement with @Stefan. Bad mathematics was not meant to mean wrong mathematics; it was …
If we go with this logic, here are two choices we have: x*x remains an error ... perhaps with a suggestion "maybe you want to use dot".
I would point out that if we allow …
Here is a practical discussion of distinguishing "up tuples" and "down tuples" that I like: http://mitpress.mit.edu/sites/default/files/titles/content/sicm/book-Z-H-79.html#%_idx_3310 It carefully avoids words like "vector" and "dual", perhaps to avoid annoying people. I find the application to partial derivatives compelling though: http://mitpress.mit.edu/sites/default/files/titles/content/sicm/book-Z-H-79.html#%_sec_Temp_453
Another reason to distinguish …
I like the idea of up and down vectors. The trouble is generalizing it to higher dimensions in a way that's not completely insane. Or you can just make vectors a special case. But that feels wrong too.
It's true that up/down may be a separate theory. The approach to generalizing them seems to be nested structure, which takes things in a different direction. Very likely there is a reason they don't call them vectors.
Also, …
Yes. To me, that's completely unacceptable. I mean technically, floating-point …
We all agree that … The question remains whether we can think of … @JeffBezanson and I were chatting. A proposal is the following:
There is no distinction between row and column vectors. I liked this anyway. Users have to be aware that there are no row vectors ... period. Thus if …
Regarding the closely related issue in https://groups.google.com/forum/#!topic/julia-users/L3vPeZ7kews Mathematica compresses scalar-like array sections:
[Edit: code formatting – @StefanKarpinski]
do you mean M[1,:] is just a vector?
Yes, sorry. My mind meant M[1,:] but was processing the scalar 1 :-) Mathematica uses the period … There are no doubt many problems with the period, one of which is that it just doesn't … Unicode does provide very nicely a character named "the dot operator", so we could have … In summary, a current proposal subject for debate is:
a. if you stick to all vectors being column vectors, everything works
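To make the "all vectors are column vectors" reading concrete, here is a minimal pure-Python sketch; the names dot, outer, and vecmat are hypothetical, and plain nested lists stand in for arrays. Under this convention v'*v produces a scalar, v*v' a matrix, and v'*A a plain vector, so no row-vector type ever appears:

```python
# Sketch of the "no row vectors" proposal using plain nested lists.

def dot(v, w):
    """v'*w-style product: two vectors in, scalar out."""
    assert len(v) == len(w), "dimension mismatch"
    return sum(x * y for x, y in zip(v, w))

def outer(v, w):
    """v*w'-style product: two vectors in, matrix out."""
    return [[x * y for y in w] for x in v]

def vecmat(v, A):
    """v'*A-style product: result is a plain vector, not a 1xN matrix."""
    assert len(v) == len(A), "dimension mismatch"
    ncols = len(A[0])
    return [sum(v[i] * A[i][j] for i in range(len(v))) for j in range(ncols)]

v = [1, 2]
A = [[1, 2], [3, 4]]
print(dot(v, v))     # scalar: 5
print(outer(v, v))   # 2x2 matrix
print(vecmat(v, A))  # plain vector: [7, 10]
```

The point of the sketch is that each operation's result type is decided by the operation itself, so the row-vector case never needs its own representation.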
Suggestion 5) looks very odd to me. I prefer … I want to echo @StefanKarpinski that it would be rather unfortunate to lose our concise broadcasting behavior in all this. After this change, what is the concise syntax for taking a vector …
Good questions. My scheme does not preclude … we could eliminate 5. This approach to broadcasting is cute and kludgy. It has the benefit of documenting which dimension is being broadcast on. Oh, and the … I LIKE THESE TWO EXAMPLES because the first is a LINEAR ALGEBRA example …
It seems to me that while having the broadcasting behavior is nice some of the time, you end up having to squeeze those extra unit dims out just as often. So having to do the opposite some of the time is OK if the rest of the system is nicer (and I do think having scalar dimensions dropped will make the system nicer). Thus you will need a function like …
… which will allow code like …
M[:,0,:] and v[:,0] ?????
I'm more with @blakejohnson on the reduction issue; I personally think it's clearer to …
Regardless of whether we would widen or squeeze by default, I think that Julia should follow numpy's lead when it comes to widening and allow something like …
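The squeeze/widen pairing discussed above can be prototyped purely on shape tuples; in this sketch the function names squeeze_shape and widen_shape are made up for illustration. Squeeze drops singleton dimensions, and widen re-inserts one at a given position, much like numpy's newaxis:

```python
def squeeze_shape(shape):
    """Drop every singleton dimension from a shape tuple."""
    return tuple(d for d in shape if d != 1)

def widen_shape(shape, axis):
    """Insert a singleton dimension at position `axis` (newaxis-style)."""
    return shape[:axis] + (1,) + shape[axis:]

print(squeeze_shape((3, 1, 4, 1)))  # (3, 4)
print(widen_shape((3, 4), 1))       # (3, 1, 4)
print(widen_shape((3, 4), 0))       # (1, 3, 4)
```

Note that the two functions are not exact inverses: squeezing forgets where the singletons were, which is exactly why an explicit widening operation is needed if scalar dimensions are dropped by default.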
In the list of @alanedelman, "v*A returns a vector if A is a matrix" is not good. v*A should be an error if A is not 1x1 (mismatch of the range of index). One way to handle this issue is to automatically convert vector v to an nx1 matrix (when needed). I feel that this represents a consistent and uniform way to think about linear algebra (good mathematics). A uniform way to handle all those issues is to allow automatic (type?) conversion (when needed). In this case an array of size (1,1) can be converted to a number (when needed). (See JuliaLang/julia#4797)
Xiao-Gang (a physicist)
This leaves v'*A however ... I really want v'*A*w to work.
My impression of linear algebra in Julia is that it is very much organized like matrix algebra, although scalars and true vectors exist (which I think is good!). Let us consider how to handle a product like …
Ideally, we would like to arrange things so that we never need to represent row vectors, but it would be enough to implement operations like … But I'm beginning to suspect that banning row vectors in this kind of scheme will come at a high cost. Example: Consider how we would need to parenthesize a product to avoid forming a row vector at any intermediate step: ( …
To evaluate a product … Another reason not to fix the multiplication order in advance: I seem to remember that there was an idea to use dynamic programming to choose the optimal evaluation order of … So right now, introducing a transposed vector type seems like the most sane alternative to me. That, or doing everything as now but dropping trailing singleton dimensions when keeping them would result in an error.
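The dynamic-programming idea alluded to here is the classic matrix-chain-ordering problem. A minimal sketch (the textbook O(n^3) algorithm, not anything Julia actually ships) that computes the cheapest parenthesization cost from a list of dimensions:

```python
def matrix_chain_cost(dims):
    """Minimum number of scalar multiplications to evaluate A1*A2*...*An,
    where Ai has size dims[i-1] x dims[i]. Classic O(n^3) dynamic program."""
    n = len(dims) - 1  # number of matrices in the chain
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):            # length of the sub-chain
        for i in range(1, n - length + 2):
            j = i + length - 1
            # try every split point k between Ai..Ak and A(k+1)..Aj
            cost[i][j] = min(
                cost[i][k] + cost[k + 1][j] + dims[i - 1] * dims[k] * dims[j]
                for k in range(i, j)
            )
    return cost[1][n]

# A1: 10x30, A2: 30x5, A3: 5x60 -> cheapest is (A1*A2)*A3
print(matrix_chain_cost([10, 30, 5, 60]))  # 4500
```

This is precisely why fixing the evaluation order in the parser (e.g. forcing right-to-left to avoid row vectors) can be costly: the optimal split depends on the runtime dimensions.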
Transposition is just a particular way to permute modes. If you allow … Having a special type for row vectors doesn't look like a good solution to me, because what will you do about other types of mode-n vectors, e.g., …
@toivoh mentioned that "One approach would be to extend this definition so that n or m might be replaced by absent, which would act like a value of one as far as computing the product is concerned, but is used to distinguish scalars and vectors from matrices: …"
In multilinear algebra (or for high-rank tensors) the above proposal corresponds to using absent to represent … If we use this interpretation of absent, then 1. and 2. are OK and nice to have, but 3. may not be OK.
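The absent proposal can be prototyped as shape algebra. In this sketch (using Python's None for absent; the function name mul_shape is made up), an absent dimension behaves like 1 for conformability checks but is preserved in the result, so scalars and vectors stay distinct from 1x1 and 1xN matrices:

```python
ABSENT = None  # stand-in for the proposed `absent` dimension

def mul_shape(a, b):
    """Result shape of a matrix-style product of shapes a = (m, n) and
    b = (p, q). ABSENT acts like 1 when checking conformability, but
    survives into the result to mark scalar/vector-ness."""
    (m, n), (p, q) = a, b
    if (1 if n is ABSENT else n) != (1 if p is ABSENT else p):
        raise ValueError("dimension mismatch")
    return (m, q)

print(mul_shape((3, ABSENT), (ABSENT, 3)))  # vector * rowvector -> (3, 3) matrix
print(mul_shape((ABSENT, 3), (3, ABSENT)))  # rowvector * vector -> scalar
print(mul_shape((2, 3), (3, ABSENT)))       # matrix * vector -> vector
```

Under this scheme a (1, 1) matrix and a scalar really are different shapes, which is the distinction the proposal is after.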
I'm not a specialist in tensor theory, but I have used all the systems mentioned above (without any add-on packages) for substantial projects involving linear algebra. [TL;DR: skip to SUMMARY]

Here are the most common scenarios in which I have found a need for greater generality in array handling than commonplace matrix-vector operations:

(1) Functional analysis: For instance, as soon as you're using the Hessian of a vector-valued function, you need higher-order tensors to work. If you're writing a lot of math, it would be a huge pain to have to use special syntax for these cases.

(2) Evaluation control: For instance, given any product that can be computed, one should be able to compute any sub-entity of that product separately, because one might wish to combine it with multiple different sub-entities to form different products. Thus Toivo's concern about, e.g., …

(3) Simplifying code: Several times when dealing with arithmetic operations mapped over multi-dimensional data sets, I have found that 6-7 lines of inscrutable looping or function-mapping code can be replaced with one or two brief array operations, in systems that provide appropriate generality. Much more readable, and much faster.

Here are my general experiences with the above systems:

MATLAB: The core language is limited beyond commonplace matrix-vector ops, so you usually end up writing loops with indexing.

NumPy: More general capability than MATLAB, but messy and complicated. For almost every nontrivial problem instance, I had to refer to documentation, and even then sometimes found that I had to implement myself some array operation that I felt intuitively should have been defined. It seems like there are so many separate ideas in the system that any given user and developer will have trouble automatically guessing how the other will think about something. It is usually possible to find a short, efficient way to do it, but that way is not always obvious to the writer or reader.
In particular, I feel that the need for widening and singleton dimensions just reflects a lack of generality in the implementation for applying operators (though maybe some find it more intuitive).

Mathematica: Clean and very general; in particular, all relevant operators are designed with higher-order tensor behavior in mind. Besides Dot, see for example the docs on Transpose, Flatten / Partition, and Inner / Outer. By combining just these operations, you can already cover most array-juggling use cases, and in version 9 they even have additional tensor algebra operations added to the core language. The downside is that even though the Mathematica way of doing something is clean and makes sense (if you know the language), it may not obviously correspond to the usual mathematical notation for doing it. And of course, the generality makes it difficult to know how the code will perform.

scmutils: For functional analysis, it is clean, general, and provides the most mathematically intuitive operations (both to write and to read) of any of the above. The up/down tuple idea is really just a more consistent and more general extension of what people often do in written mathematics using transpose signs, differentiation conventions, and other semi-standardized notions; but everything just works. (To write my Ph.D. thesis, I ended up developing a consistent and unambiguous notation resembling traditional math notation but isomorphic to Sussman & Wisdom's SICM syntax.) They've also used it for a differential geometry implementation [1], which has inspired a port to SymPy [2]. I have not used it for data analysis, but I would expect that in a generic array context where you only wanted one kind of tuple (like Mathematica's List), you could just pick one ("up") by convention. Again, generality obscures performance considerations to the programmer, but I would hope this is an area where Julia can excel.
SUMMARY: I think the proposed transposed-vector type should be characterized as the more general "down" tuple in scmutils, while regular vectors would be the "up" tuples. Calling them something like "vector" and "transposed-vector" would probably make more sense to people than calling them "up" and "down" (at the cost of brevity). This would support three categories of use: (1) for data analysis, if people just want nested arrays, they only need "vector"; … I believe this approach reflects the emerging consensus above for the results of various operations, with the exception of those cases that earlier posts considered errors ( … )

[1] http://dspace.mit.edu/handle/1721.1/30520
@thomasmcoffee sounds like you are advocating for an explicit distinction between co- and contravariant vectors then.
No, JuliaLang/julia#11004 has more …
Sorry. You're right, I should have specified the open issue thread.
from @alanedelman:

We really should think carefully about how the transpose of a vector should dispatch the various A_*op*_B* methods. It must be possible to avoid new types and ugly mathematics. For example, vector'vector yielding a vector (#2472, #8), vector' yielding a matrix, and vector'' yielding a matrix (#2686) are all bad mathematics.

What works for me mathematically (which avoids introducing a new type) is that for a 1-dimensional Vector v: v' is a no-op (i.e. just returns v), v'v or v'*v is a scalar, v*v' is a matrix, and v'A or v'*A (where A is an AbstractMatrix) is a vector.

A general N-dimensional transpose reverses the order of indices. A vector, having one index, should be invariant under transposition.

In practice v' is rarely used in isolation, and is usually encountered in matrix-vector products and matrix-matrix products. A common example would be to construct bilinear forms v'A*w and quadratic forms v'A*v, which are used in conjugate gradients, Rayleigh quotients, etc.

The only reason to introduce a new Transpose{Vector} type would be to represent the difference between contravariant and covariant vectors, and I don't find this compelling enough.