RFC: Use superscript T as transpose operator instead of .' #19344

Closed
wants to merge 1 commit into from

andreasnoack
Member

@andreasnoack andreasnoack commented Nov 16, 2016

Now that we have gotten used to the nice dot broadcast syntax, our .' transpose seems like an odd inheritance from Matlab. Therefore in this PR, I'm trying out ᵀ as the symbol for (non-conjugate) transpose of a matrix, i.e.

julia> A = randn(4,3)
4×3 Array{Float64,2}:
 -0.440397   -1.4602      -0.28655
 -1.20043     1.15012     -0.908555
  0.455838   -0.00121238   0.101605
 -0.0430458   0.411735    -1.88737

julia> Aᵀ
3×4 Array{Float64,2}:
 -0.440397  -1.20043    0.455838    -0.0430458
 -1.4602     1.15012   -0.00121238   0.411735
 -0.28655   -0.908555   0.101605    -1.88737

For most of us, it is three keystrokes more (\^T[tab]) but I think it reads so much nicer that it is worth it. The only minor problem is if Aᵀ is already used as an identifier: it will error, and it will probably be a minor inconvenience to change the name.

@@ -1116,4 +1116,8 @@ Filesystem.stop_watching(stream::Filesystem._FDWatcher) = depwarn("stop_watching
@deprecate takebuf_array take!
@deprecate takebuf_string(b) String(take!(b))

# Deprecate .' syntax for transpose
matlabtranspose(x) = depwarn(string(".' is now deprecated in favor of ᵀ, i.e. superscript T which ",
"you can type \\^T[tab] in the REPL, Jupyter, and most editors"), :matlabtranspose)
Member

I don't think "most editors" really is true,

Member Author

@andreasnoack andreasnoack Nov 16, 2016

Probably not. Then "most common editors" maybe.

Member

You typically need plugins as well, right?

Member Author

...then maybe "most common editors have plugins supporting this"

Member

SGTM

@@ -662,6 +662,7 @@ export
lufact!,
lufact,
lyap,
matlabtranspose,
Member

Should go in deprecated.jl so we know to remove it eventually. I would also rename it something like nonctranspose. Alternatively, this function could be avoided entirely by giving the deprecation warning during lowering.

@JeffBezanson
Member

If we do this, maybe we should also add superscript H?

@dpsanders
Contributor

Maybe add \transpose<TAB> or similar, which is easier to type and completable.

@StefanKarpinski
Member

I'm a little concerned about the precedent set by making some superscripts operators while others are parts of variable names. Still, this is appealing.

@joa-quim

I find these non-ASCII chars an unnecessary pain that users would be subjected to when writing in a text editor. For example, I simply do not know how to do it, and imagine trying to do a transpose and having to go google how to do it. Might look nice but doesn't pay.

@stevengj
Member

stevengj commented Nov 17, 2016

I have to say I don't like this. It is nice to be able to do Aᵀ = A.' to define a variable to hold the transpose of a matrix, or e.g. to define AᵀA = A.'*A. This PR would eliminate the ability to use ᵀ in variable names, and I think it is frankly more useful there; .' works perfectly well and is commonplace thanks to Matlab.

Also, I think the adjoint ' is much more commonly useful than the transpose anyway, so it seems especially unfortunate to take over a useful Unicode character for an operator of lower utility. Of course, we could also define ᴴ, but that notation is far from universal for the adjoint: most mathematicians use *, and most physicists use † (but even with † I think it would be more useful to allow it in variable names a†b etc...currently we don't allow it at all).

@stevengj
Member

(@joa-quim, you can type ᵀ by \^T<tab> in the REPL or Jupyter and in many popular editors these days: in emacs and vim with Julia mode, and in Atom with the atom-latex-completions plugin.)

@stevengj
Member

stevengj commented Nov 17, 2016

Also, superscript T has other uses, e.g. in international macro rᵀ is an interest rate in the presence of taxes, here is another usage in economics, and here is a usage in genetics.

It just seems much more sensible to me to let all letterlike and numberlike characters form part of an identifier, rather than making a few special cases into operators.

@joa-quim

I know (sort of) how to use LaTeX in the REPL, and \^T<tab> works in Atom, which I do not use, but it doesn't in Sublime or Visual Studio Code, which are the editors I use.

But my point is not about me in particular. Just imagine the number of users who would get stuck when trying to do a transpose and find that they have to figure out how to print that Unicode character (or any other) to do what can simply and usually be done with '

@andreasnoack
Member Author

@joa-quim You can also do it in Sublime with the UnicodeCompletion package. I use that all the time. Generally, I also think we are beyond the point in Julia where we want to be restricted to ASCII. There is simply too much good stuff in Unicode that makes code easier to read, and with editor and REPL support these characters are easy to write. The policy has typically been that there should always be an ASCII alternative, e.g. you can write π or pi. That would also be the case for transpose, where you could simply write transpose(A) instead of Aᵀ.

The only drawback I see is the "minor problem" I mentioned initially which @stevengj also mentions, i.e. we lose Aᵀ as an identifier. So the question is how big a loss that is. I'd say that as an economist, I can easily live without ᵀ as an identifier. I can see that using one or two letters for this is quite arbitrary but the ' and .' are already a quite significant exception from our normal syntax and my proposal is to trade one exceptional syntax for another. Finally, as a data point, I'd mention that Mathematica uses Aᵀ and superscript (which we probably can't support) for the transpose and the conjugate transpose respectively.

I guess that most of the arguments have been mentioned at this point so we should probably just let the thumbs speak.

@Sacha0
Member

Sacha0 commented Nov 17, 2016

I can see that using one or two letters for this is quite arbitrary but the ' and .' are already a quite significant exception from our normal syntax and my proposal is to trade one exceptional syntax for another.

This argument makes a better case for nixing the exceptional syntax altogether than for changing the symbols involved :).

@joa-quim

@andreasnoack I installed UnicodeCompletion but nothing happens when I do \^T<tab>. Also installed a package to write Unicode symbols in Visual Studio Code. The amount of work needed to print one symbol really really discourages writing code that would need one of those symbols.

Really, one thing is to be able to use Unicode symbols. Another, very different, is to make its use almost mandatory (for some cases like this ofc).

@stevengj
Member

stevengj commented Nov 17, 2016

I really don't think the slight syntactic oddity of ' and .' (which is mitigated by the familiarity of this syntax from Matlab, the dominant player in numerical linear algebra) is worth the annoyance of removing one letter from the set of identifier superscripts, combined with the challenge (especially for newcomers) of typing ᵀ.

Moreover, transpose(A) is not a drop-in substitute, because e.g. A.'*B is parsed as a call to At_mul_B. This special parsing cannot be done for transpose(A)*B, because the parser does not know the binding of transpose (unless we make that a keyword).

@andreasnoack
Member Author

Moreover, transpose(A) is not a drop-in substitute, because e.g. A.'*B is parsed as a call to At_mul_B.

This will go away with a matrix transpose type which we plan to do anyway. At that point, transpose(A)*B and AᵀB will be exactly the same thing.
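
As a minimal sketch of what a matrix transpose type looks like, the following uses the lazy `Transpose` wrapper from the `LinearAlgebra` standard library in current Julia (1.x), which postdates this thread. It is only an illustration of the idea, not the design that was being discussed at the time; the matrices `A` and `B` are made up for the example.

```julia
using LinearAlgebra

A = [1.0 2.0; 3.0 4.0; 5.0 6.0]   # 3×2
B = [1.0 0.0; 0.0 1.0; 1.0 1.0]   # 3×2

# transpose(A) returns a lazy wrapper around A; no data is copied
At = transpose(A)
println(typeof(At))
println(parent(At) === A)

# Multiplication dispatches on the wrapper type, so no special parsing
# of the operator syntax is needed to avoid materializing the transpose.
C_lazy  = transpose(A) * B
C_eager = permutedims(A) * B      # explicit copy of the transposed data
println(C_lazy ≈ C_eager)
```

With a wrapper like this, the efficient product is selected by dispatch on the argument type rather than by the parser rewriting the call.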

@stevengj
Member

stevengj commented Nov 17, 2016

Yeah, we've been discussing a transpose type for ages (e.g. JuliaLang/LinearAlgebra.jl#42); it comes with its own tradeoffs. But at the very least this PR cannot even be considered until such a time as a Transpose type lands.

@andreasnoack
Member Author

Yeah, we've been discussing a transpose type for ages (#4774)

The issue is not JuliaLang/LinearAlgebra.jl#42. We can (and most likely will) have a matrix transpose type without/before a vector transpose type.

But at the very least this PR cannot even be considered until such a time as a Transpose type lands

I appreciate your critique but I think it's unfair to other participants to claim a veto right over what can and cannot be considered. Anyway, I made this RFC to get some feedback. I'm not in a hurry.

@joa-quim I confused two of my packages. The completion is in the "Julia" package not in the "UnicodeCompletion" package.

@kshyatt kshyatt added the linear algebra label Nov 17, 2016
@swissr
Contributor

swissr commented Nov 17, 2016

I would give a thumbs up if ᵀ were in addition: people who care have the option to beautify / 'mathematize' their code, analogous to e.g. in and ∈. But as an enforced replacement I feel the Unicode char complicates matters. Many know ' from Matlab and .' is the vectorized julian form. Julia will grow and it is important to keep the syntax as simple and predictable (?) as possible.

@StefanKarpinski
Member

I have to say I'm not sure what fundamental problem new syntax addresses here, aside from making the conjugate-transpose less likely to be used on values for which conjugation doesn't make sense. Wouldn't introducing a revdims function and telling people to use that to transpose matrices instead of using .' or ' address the real problem more directly?

@johnmyleswhite
Member

But .' is not a vectorized operator. It has semantics that are not derived from ' by making it perform ' element-wise. Instead it removes the automatic conjugation that ' performs. So it's clearly inconsistent with the rest of Julia's syntactic patterns and makes Julia harder to learn, not easier.

Addressing that is, of course, a separate issue from adding a new syntax for transposition.
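
The distinction can be sketched in current Julia (1.x), where `ctranspose` has since been renamed `adjoint` and the `.'` syntax no longer exists. The complex matrix below is made up for illustration.

```julia
using LinearAlgebra

A = [1+2im 3-1im; 0+1im 2+0im]

# The postfix ' operator is the adjoint: swap dimensions AND conjugate.
println(A' == conj.(permutedims(A)))

# transpose only swaps dimensions; this is what .' used to mean.
println(transpose(A) == permutedims(A))

# If .' had followed the dot convention, it would have applied ' to each
# element. For scalars that is plain conjugation, with the shape unchanged.
println(adjoint.(A) == conj.(A))
println(size(adjoint.(A)) == size(A))

# For a real matrix the adjoint and the transpose coincide.
R = [1.0 2.0; 3.0 4.0]
println(R' == transpose(R))
```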

@StefanKarpinski
Member

That's true, but I'm way less concerned with that one irregularity than I am about the fundamental problems we have with generic (c)transpose.

@nalimilan
Member

Indeed I think the fact that ' looks simpler than .' explains why people want to use it even when the goal is just to reshape the vector/matrix. A revdims function could improve things, but I'm afraid that won't be enough since ' is so appealing...

OTOH, using ᵀ as an operator doesn't follow any established pattern in Julia, and goes against the general rule that "there's an ASCII way to do it". If we support it, people will soon expect ² to be equivalent to ^2.

I think my vote would be to use ' for transpose, and only provide a function for conjugate transposition.

@andyferris
Member

I'm generally in favor of having a "beautified" Unicode version so long as it doesn't degrade the ASCII experience. However, to be honest, we are really running out of ASCII characters... consider function composition, which just plain looks better and more sensible as an operator f ∘ g ∘ h than something like compose(compose(f, g), h). So maybe the ASCII "operator" is a function, not an operator at all, and I think we all agree that in the long term we want and plan to have transpose() and .' behaving identically.

Thus I would be OK with having e.g. transpose(A) and Aᵀ being the same, for example. Along with ctranspose(A) and Aᴴ. But I also temper this with the fact that I want more Unicode superscript and subscript letters than Unicode defines (they don't even define the complete set of Latin characters, for some reason...).

A final idea: we can easily define † to be some singleton such that A^† becomes ctranspose(A). On the plus side, it is typed exactly like LaTeX, but it is a pun on exponentiation (and we don't support e.g. _ for "subscript" indexing). I also don't think it is reasonable to define T as a transposer, but there might be a good Unicode symbol for that somewhere (or else A^ᵀ and A^ᴴ)

Random thought: if ' can double up as ctranspose and char delimiters, what's to stop us using " as a postfix operator for something (e.g. if .' is "offensive" to dot-call syntax)?


@nalimilan

If we support it, people will soon expect ² to be equivalent to ^2.

True, and I agree, but OTOH how cool would it be if 3.0 × 10¹² parsed correctly??? :)


@andreasnoack wrote

We can (and most likely will) have a matrix transpose type without/before a vector transpose type.

Is that a challenge? ;)

@ararslan
Member

ararslan commented Nov 18, 2016

I wasn't sure at first but I've now decided that I'm against this change. While .' is weird and unfortunate syntax that, as John said, conflicts with other uses of ., I don't think we should require the use of Unicode characters for anything. If I'm not mistaken this would be the first such case. I think Milan's proposal to use ' for general transpose is a great one, and I personally don't feel a particular need for an ASCII shortcut for conjugate transpose.

@KristofferC
Member

Why would this require Unicode characters?

@stevengj
Member

If you have a real matrix, ' is just a transpose anyway. If you have a complex matrix, conjugate-transpose is almost always the right thing, for the same reason that dot(x,y) conjugates x. I strongly disagree with the proposal to make ' the unconjugated transpose.

@ararslan
Member

@KristofferC If we require ᵀ then we're requiring the use of a non-ASCII character with no ASCII synonym.

@stevengj Yeah, that's a good point...

@nalimilan
Member

@stevengj I would buy that argument if [:a :b]' didn't print a warning...

@stevengj
Member

stevengj commented Nov 18, 2016

@nalimilan, see JuliaLang/LinearAlgebra.jl#257 ... it's unfortunate that the permutedims usage of transposition is overloaded with the linear-algebra meaning, but since the latter is vastly more common I think it should determine how we define '. If you want permutedims, you should call permutedims, not transpose. A warning for [:a :b]' is appropriate, in my opinion—you are using a linear-algebra notation on a non-algebraic object.
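
For the non-algebraic case, a short sketch using `permutedims` in current Julia (1.x, which accepts a matrix without an explicit permutation). The symbol and string arrays are made up for illustration.

```julia
labels = [:a :b]                    # 1×2 Matrix{Symbol}

# permutedims rearranges the data without any linear-algebra semantics,
# so it works on element types that have no transpose or conjugate.
column = permutedims(labels)        # 2×1 Matrix{Symbol}
println(size(column))
println(column == reshape([:a, :b], 2, 1))

words = ["x" "y"; "z" "w"]
println(permutedims(words) == ["x" "z"; "y" "w"])
```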

@swissr
Contributor

swissr commented Nov 18, 2016

As stevengj said, I'm against using ' for transpose. This would be in opposition to Matlab where ' is ctranspose and .' is transpose.

My dot-vectorized argument was not good. But with a bit wider perspective one could maybe say that dots were introduced to ~save a 'oh so convenient' syntax (mostly vectors but sometimes, like here, special cases). I still think .' is ok (ignorant people (like me;) continue to use A' for real numbers and it works fine, non-ignorant people with complex numbers know what they do).

@Ismael-VC
Contributor

Ismael-VC commented Nov 18, 2016

I'm not an expert in these matters and I'm not suggesting by any means to do it this way, yet I kinda like my way of doing this, something like this:

julia> const ᵀ = ctranspose;

julia> Base.:*{T<:Number}(x::AbstractMatrix{T}, f::typeof(Base.ctranspose)) = f(x)

julia> A = rand(2, 2);                                                            

julia> Aᵀ = (A)ᵀ                                                                  
2×2 Array{Float64,2}:                                                             
 0.943234    0.64957                                                              
 0.00476442  0.596362                                                             

julia>                                                                            

In this example ᵀ is just an alias for ctranspose (could be transpose instead, or whatever), so the syntax reads as (A)ᵀ (but parentheses are required!). Yet this doesn't conflict with the ability to use ᵀ in the name of another variable, as it isn't specially handled, as shown by the ability to still define Aᵀ.

I came with this while playing around in twitter, see if I could do this in just one tweet!

So I'll leave this just for reference to you.

Still trying to generalize this syntax proves difficult as things are now, ie I want to say now (n)² and (n)³, but:

julia> square(n) = n^2; cube(n) = n^3;

julia> Base.:*(n, f::typeof(square)) = f(n)

julia> Base.:*(n, f::typeof(cube)) = f(n)

julia> const ² = square    # why?
syntax: invalid character "²"

julia> const ³ = cube    # why?
syntax: invalid character "³"

julia> n = 3;

julia> n² = (n)square    # I want: (n)² instead
9

julia> n³ = (n)cube    # I want: (n)³ instead
27

julia>

I still don't understand what is allowed and what not, the only way to tell is to try to do it, or try to understand the parser perhaps.

Edit: ...mmm in this case I think it's because both are number like:

julia> const ⁽²⁾ = square              
square (generic function with 1 method)

julia> n² = (n)⁽²⁾
9            

julia> const ⁽³⁾ = cube              
cube (generic function with 1 method)

julia> n³ = (n)⁽³⁾
27                                                             

This makes it legal but more impractical IMO.

@stevengj
Member

stevengj commented Nov 18, 2016

@Ismael-VC, superscripts are too useful for variable names, e.g. you can currently have a variable named χ⁽²⁾, to consider them for postfix exponentiation operators. (And the reason ² = square doesn't work is that digits are not allowed as the first character of an identifier. Even if they were allowed, x² would parse as an identifier, not as a function call.)

(This only goes to show what a can of worms ᵀ = transpose would introduce; it is much more comprehensible if we are consistent about superscripts letters/digits = identifiers rather than introducing odd exceptions. In contrast, the slight oddity of .' seems much more minor and isolated, particularly since it is a well-established convention in linear algebra.)

And ᵀ = ctranspose (instead of transpose) would run counter to standard algebraic notation.

@andyferris
Member

@nalimilan wrote

I would buy that argument if [:a :b]' didn't print a warning...

To me it seems crucial that the purpose of ' and .' are for performing linear algebra. Symbols are not fields. Use reshape for data - it reads much, much clearer!

@ararslan the ASCII version of the postfix operator would be the function transpose()... (it's not 100% true at the moment, but hopefully will be one day)

@stevengj wrote

(And the reason ² = square doesn't work is that digits are not allowed as the first character of an identifier. Even if they were allowed, x² would parse as an identifier, not as a function call.)

From my interpretation, I thought that @Ismael-VC was suggesting the brackets were mandatory, like (A)ᵀ. When he wrote Aᵀ = (A)ᵀ the left-hand-side was a new identifier. Similarly, we would have A² = (A)², but we could also write A_squared = (A)².

I don't understand why the parser treats ² as a digit when it is not treated as a number anywhere else? What is the purpose of this?

@Ismael-VC
Contributor

Ismael-VC commented Nov 18, 2016

From my interpretation, I thought that @Ismael-VC was suggesting the brackets were mandatory

@stevengj @andyferris yes, they are, this is not mock up syntax, this works already without touching the parser, we'd only need to define the Base.:* methods, and the use of implicit multiplication syntax, which in this case, makes use of parenthesis mandatory.

The whole example from the tweet was:

Suppose I would like to do: m̂ = Gᵀ(GGᵀ)⁻¹d², I could currently write this: m̂ = Gᵀ * (GGᵀ)⁻¹ * (d)⁽²⁾ valid Julia code, like this:

julia> for (func, n) in Dict(:square => 2, :cube => 3, :pow¹² => 12)
           @eval $(func)(x) = x^$n
       end;

julia> for (super_script, func) in Dict(
               :⁻¹ => :inv, :⁽²⁾ => :square, :⁽³⁾ => :cube, :⁽¹²⁾ => :pow¹²,
               :ᵀ => :transpose, :ᴴ => :ctranspose
           )
           @eval begin
               const $super_script = $func
               Base.:*{T <: Number}(x::T, f::typeof($func)) = f(x)
               Base.:*{T <: Number}(x::AbstractArray{T}, f::typeof($func)) = f(x)
           end
       end;

julia> d = 2; G = rand(d, d)
2×2 Array{Float64,2}:
 0.720101  0.967408
 0.724678  0.318131

julia> Gᵀ = (G)ᵀ    # implicit multiplication, see: http://bit.ly/NumericLiteralCoeficients_JL
2×2 Array{Float64,2}:
 0.720101  0.724678
 0.967408  0.318131

julia> GGᵀ = G * Gᵀ
2×2 Array{Float64,2}:
 1.45442   0.829604
 0.829604  0.626366

julia> m̂ = Gᵀ * (GGᵀ)⁻¹ * (d)⁽²⁾
2×2 Array{Float64,2}:
 -2.69618   8.19885
  6.1417   -6.1029

julia> m̂¹² = (m̂)⁽¹²⁾
2×2 Array{Float64,2}:
  2.51495e12  -3.68582e12
 -2.76102e12   4.04645e12

julia> :(m̂⁽¹²⁾= (m̂)⁽¹²⁾)
:(m̂⁽¹²⁾ = m̂ * ⁽¹²⁾)

julia>

It's effectively some kind of reverse application syntax, ie. instead of f(x); doing Base.:*(x, f) = f(x) makes writing (x)f the same! 😈
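
A variant of the same trick can be sketched in current Julia (1.x) syntax. The `Postfix` wrapper type here is hypothetical and not part of Base; it is used instead of adding `*` methods on `typeof(transpose)` so that the example does not extend Base functions on Base types.

```julia
# Hypothetical wrapper (an assumption for this sketch, not a Base type)
# marking a function as applicable in postfix position.
struct Postfix{F}
    f::F
end

# (x)p is parsed as the juxtaposition x * p, so this method turns
# implicit multiplication into function application.
Base.:*(x::AbstractMatrix, p::Postfix) = p.f(x)

const ᵀ = Postfix(transpose)

G = [1 2; 3 4]
Gᵀ = (G)ᵀ            # Gᵀ on the left-hand side is an ordinary identifier
println(Gᵀ == [1 3; 2 4])

# The parser sees a multiplication, not a dedicated postfix operator.
println(Meta.parse("(G)ᵀ"))
```

The parentheses remain mandatory: without them, `Gᵀ` is read as a single identifier, which is exactly the ambiguity discussed in this thread.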

@stevengj
Member

@andyferris, there is a difference between something being a digit allowed in a literal vs. a digit allowed (as the second or later character) in an identifier. The latter is anything in one of the three Unicode "number" categories (Nd, Nl, and No). Superscripts like ² are in No (Number, other).

Using Unicode character categories to decide whether a character is allowed in an identifier is a huge win. It allows us to include huge swaths of characters using sensible general rules, without having to special-case each one (especially hard in non-English languages), and is automatically updated when the Unicode standard is revised (utf8proc's tables are updated by a semi-automatic process).

In contrast, I don't think we want to allow numeric literals using anything other than ASCII decimals.
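
The categories can be inspected from Julia itself. This is a small sketch for current Julia (1.x); note that `Base.Unicode.category_abbrev` is an unexported internal helper, so relying on it is an assumption that may not hold across versions.

```julia
# Unexported internal helper; treated here as an assumption, not public API.
abbrev(c) = Base.Unicode.category_abbrev(c)

# ASCII digit, superscript two, Roman numeral eight, superscript T, plain letter
for c in ('5', '²', 'Ⅷ', 'ᵀ', 'x')
    println(repr(c), " => ", abbrev(c))
end

# A superscript digit after a letter is part of the identifier,
# so the left-hand side below is the single symbol x².
ex = Meta.parse("x² = 4")
println(ex.args[1] === Symbol("x²"))
```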

@stevengj
Member

stevengj commented Nov 21, 2016

That being said, there is a reasonable argument for allowing characters in categories No and Nl (Nl is already allowed) to be at the start of an identifier, e.g. to allow identifiers like or ¹x₂.

@stevengj
Member

(I just ran into a case today where it was really nice to be able to use ᵀ in a variable name, in order to write LDLᵀ = ldltfact!(T).)

@stevengj
Member

(See #20278 for allowing category No to start identifiers.)

@juliohm
Contributor

juliohm commented Aug 16, 2017

It is compelling to have this beautiful mathematical syntax in a language, but on the other hand it seems inconsistent with other decisions previously taken, like for instance A^2. Having to think if unicode means something or not can get in the way of programming productivity.

I am neutral on this one, it definitely requires deeper thought of the pros and cons.

@morningkyle

While the small number of ASCII operators should be enough for compilers in theory, it surely can't meet the simplicity requirement of the many people who are used to common mathematical operations from a mathematical background.
On the other hand, we have a huge number of words in the dictionary to express variable/function names, so keeping a few special symbols for variable/function names does not sound that appealing. Naming things with exact words and expressing operations with abstract symbols seems to be quite a general rule in the engineering and science world.
And most people spend much more time thinking about and reading code than typing it, so the slower speed of entering a special operator (symbol) is much less important than presenting a neat, readable code format.
Just some thoughts on why I like the superscript ᵀ operator.

@andyferris
Member

Thought it might be worth pointing out that in v0.7 we will have non recursive transpose (for data manipulation - can transpose a matrix or vector of strings) and recursive adjoint (replacing ctranspose) and that conj(adjoint(x)) isn't always the same as transpose(x).

Perhaps this might create the raison d'être for another operator (or a change in operators) in this space?

@stevengj
Member

Regardless of whether we want another operator, I don't like the idea of singling out one or two superscripts as operators. They are too useful in identifiers.

@Ismael-VC
Contributor

I don't like the idea of singling out one or two superscripts as operators. They are too useful in identifiers.

With my hack both kinda worked, but the idea is that:

  • xᶠ, xf is an identifier,
  • (x)ᶠ, (x)f is function application and (x).ᶠ, (x).f as well.

@andreasnoack
Member Author

We no longer have special syntax for the transpose so I'll close.

@andreasnoack andreasnoack deleted the anj/parse branch October 29, 2018 14:44
@musm
Contributor

musm commented Dec 4, 2018

Would it be at all possible to allow certain symbols to be used as both identifiers and operators?

Could we allow super script T to be used as an operator, but not define it in base?

I'd like to use it in https://github.com/musm/EasyTranspose.jl so that we can write Aᵀ instead of (A)ᵀ
