new syntax for transpose #21037
Comments
Andreas tried |
I kind of think that |
Moving my comment from the other issue (not that it solves anything, but...): +1 for using something else than I couldn't find languages with a special syntax for transposition, except for APL which uses the not-so-obvious See Rosetta code. (BTW, the Julia example actually illustrates conjugate transpose...) |
Could one of the other ticks be used? |
-100 to changing adjoint, since it's one of the awesome things that makes writing Julia code as clear as writing math, plus conjugate transpose is usually what you want anyway so it makes sense to have an abbreviated syntax for it. As long as we have the nice syntax for conjugate transpose, a postfix operator for regular transpose seems mostly unnecessary, so just having it be a regular function call seems fine to me. Using a different tick would be kind of weird, e.g. |
If we make the change in #20978, then a postfix transpose actually becomes more useful than it is now. e.g. if you have two vectors Honestly, I think our best option is still to leave it as-is. None of the suggestions seem like a clear improvement to me. In fact, we could even take it further, and make |
I like that plan a lot, @stevengj. |
One wrinkle: presumably the |
The main problem is finding a clean way to make Note that, for this to work properly, we might need to restore the fallback |
I think deciding whether transpose should be recursive or not is orthogonal to whether we make it participate in dot syntax fusion. The choice of making it non-recursive is not motivated by that. |
@StefanKarpinski, if we restore a fallback |
What's the problem if the fallback is restored but we still have the transpose be non-recursive? |
@jebej, recursive transpose is more correct when it is used as a mathematical operation on linear operators. If I remember correctly, the main reason for making it non-recursive was so that we don't have to define the But it would not be terrible to have the fallback but still be non-recursive. |
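To make the recursive versus non-recursive distinction concrete, here is a small sketch using a matrix whose elements are themselves matrices. It assumes the behavior of released Julia 1.x, where `transpose` from `LinearAlgebra` is recursive and `permutedims` is not; the names `A`, `B`, `Z`, `M`, `Mt`, and `Mp` are only illustrative.

```julia
using LinearAlgebra

A = [1 2; 3 4]
B = [5 6; 7 8]
Z = zeros(Int, 2, 2)

# A 2×2 matrix of 2×2 blocks, filled in column-major order:
# M[1,1] = A, M[2,1] = Z, M[1,2] = B, M[2,2] = A
M = reshape([A, Z, B, A], 2, 2)

Mt = transpose(M)    # recursive: swaps block positions and transposes each block
Mp = permutedims(M)  # non-recursive: swaps block positions only

Mt[2, 1] == transpose(B)  # the block was transposed as well
Mp[2, 1] == B             # the block was left untouched
```

Treating `M` as a linear operator, only the recursive version agrees with the transpose of the equivalent dense 4×4 matrix, which is the sense in which it is "more correct" for linear algebra.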
Let me add two comments (I have looked through the earlier discussion and did not notice them - sorry if I have omitted something):
|
What is your use case for transposing a vector of strings? |
Consider the following scenario for instance:
and I want to append Comes up in practice when incrementally logging text data to a |
Another use-case: transposing is the simplest way to make two vectors orthogonal to each other in order to broadcast a function over the cartesian product ( |
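A minimal sketch of the broadcasting use case, assuming Julia 1.x: turning one vector into a 1×n row makes a binary function broadcast over the Cartesian product. For non-numeric elements such as strings, the non-recursive `permutedims` is the safer choice, since `transpose` would try to transpose each element. The variable names are illustrative.

```julia
labels = ["a", "b", "c"]
suffixes = ["x", "y"]

# permutedims turns the length-2 vector into a 1×2 row without touching its elements,
# so broadcasting string concatenation yields a 3×2 matrix of all combinations.
grid = labels .* permutedims(suffixes)

# For plain numbers, transpose gives the same row shape.
sums = [1, 2, 3] .+ transpose([10, 20])
```

Here `grid` is a 3×2 `Matrix{String}` and `sums` is a 3×2 `Matrix{Int}`, one entry per pair of inputs.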
Data point: Yesterday I encountered a party confused by the postfix "broadcast-adjoint" operator and why it behaves like transpose. Best! |
FWIW, I strongly feel that we should get rid of the I think it's fine to just have |
That's a good point. Arguably only the adjoint needs super-compact syntax. |
Let's just call this |
It seems confusing that we would have |
Might it be better to split this discussion off into a new GitHub issue, since it's about |
Is there an automated way of splitting issues, or should I simply open a new one and link to the discussion here? |
The latter |
I guess I'm way too late for the discussion, but I'd like to point out one use that I think is worth mentioning: Applying the complex-step differentiation to a real-valued function which has I'll give an example with multiple occurrences of
using LinearAlgebra
# f : Rⁿ → R
# x ↦ f(x) = xᵀ * x / 2
f(x) = 0.5 * transpose(x) * x
# Fréchet derivative of f
# Df : Rⁿ → L(Rⁿ, R)
# x ↦ Df(x) : Rⁿ → R (linear, so expressed via multiplication)
# h ↦ Df(x)(h) = Df(x) * h
Df(x) = transpose(x)
# Complex-step method version of Df
function CSDf(x)
    out = zeros(eltype(x), 1, length(x))
    for i = 1:length(x)
        x2 = copy(x) .+ 0im
        h = x[i] * 1e-50
        x2[i] += im * h
        out[i] = imag(f(x2)) / h
    end
    return out
end
# 2nd Fréchet derivative
# D2f : Rⁿ → L(Rⁿ ⊗ Rⁿ, R)
# x ↦ D2f(x) : Rⁿ ⊗ Rⁿ → R (linear, so expressed via multiplication)
# h₁ ⊗ h₂ ↦ D2f(x)(h₁ ⊗ h₂) = h₁ᵀ * D2f(x) * h₂
D2f(x) = Matrix{eltype(x)}(I, length(x), length(x))
# Complex-step method version of D2f
function CSD2f(x)
    out = zeros(eltype(x), length(x), length(x))
    for i = 1:length(x)
        x2 = copy(x) .+ 0im
        h = x[i] * 1e-50
        x2[i] += im * h
        out[i, :] .= transpose(imag(Df(x2)) / h)
    end
    return out
end
# Test on random vector x of size n
n = 5
x = rand(n)
Df(x) ≈ CSDf(x)
D2f(x) ≈ CSD2f(x)
# test that the 1st derivative is correct Fréchet derivative
xϵ = √eps(norm(x))
for i = 1:10
    h = xϵ * randn(n) # random small h
    println(norm(f(x + h) - f(x) - Df(x) * h) / norm(h)) # Fréchet check
end
# test that the 2nd derivative is correct 2nd Fréchet derivative
for i = 1:10
    h₁ = randn(n) # random h₁
    h₂ = xϵ * randn(n) # random small h₂
    println(norm(Df(x + h₂) * h₁ - Df(x) * h₁ - transpose(h₁) * D2f(x) * h₂) / norm(h₂)) # Fréchet check
end
# Because f is quadratic, we can even check that f is equal to its Taylor expansion
h = rand(n)
f(x + h) ≈ f(x) + Df(x) * h + 0.5 * transpose(h) * D2f(x) * h
The point being that |
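As a hedged illustration of why the unconjugated transpose matters in this use case: the complex-step method reads the derivative off the imaginary part of `f(x + im*h)`, and a conjugating adjoint cancels exactly that imaginary part. The helper `complex_step_partial` and the names `f_transpose` and `f_adjoint` below are made up for this sketch and are not part of the comment above.

```julia
using LinearAlgebra

f_transpose(x) = 0.5 * transpose(x) * x  # unconjugated, analytic in x
f_adjoint(x) = 0.5 * x' * x              # conjugating, not analytic in x

# Hypothetical helper: i-th partial derivative via the complex-step method.
function complex_step_partial(f, x, i; h = 1e-20)
    xc = complex.(x)   # fresh complex copy of the real input
    xc[i] += im * h    # perturb along the imaginary axis
    return imag(f(xc)) / h
end

x = [1.0, 2.0, 3.0]
d_transpose = complex_step_partial(f_transpose, x, 2)  # close to x[2]
d_adjoint = complex_step_partial(f_adjoint, x, 2)      # the imaginary part cancels
```

With `transpose` the result matches the true partial derivative `x[2]`; with the adjoint, `x' * x` is real for any complex `x`, so the method returns zero.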
I don't think the complex-step method is super relevant in Julia. Isn't it a neat hack/workaround to get automatic differentiation in cases where a language supports efficient builtin complex numbers, but an equivalently efficient |
I agree about using dual numbers instead of the complex-step method, and that's a very good point you are making (I personally have already replaced all my complex-step-method evaluations with dual-number ones in Julia). However, I do think that this is still a valid use case, for demonstration purposes, teaching tricks (see, e.g., Nick Higham talking about the complex-step method at JuliaCon 2018), and portability (in other words, I worry that MATLAB's version of the code above using complex numbers would be cleaner). |
Coming from the world of engineers and possibly physicists who use complex arrays more than real arrays, not having a transpose operator is a bit of a pain. (Complex phasor representation for a harmonic time dependency is ubiquitous in our field.) I personally would favor the numpy syntax of x.H and x.T, though my only consideration is conciseness. The density of the transpose operator relative to Hermitian transpose is about 1 to 1 in my code, so the unconjugated transpose is equally important to me. A lot of the use of transpose is to create outer products and to size arrays correctly for interfacing to other code or for matrix multiplication. I intend for now to simply provide a macro or one-character function for the operation. However, what is the proper equivalent to the old functionality, transpose() or permutedims()? |
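For the `transpose()` versus `permutedims()` question, a small sketch of how the two behave on a complex vector in Julia 1.x may help. It assumes the released 1.x semantics: `transpose` and `'` return lazy wrappers around the parent array, while `permutedims` allocates a copy; neither `transpose` nor `permutedims` conjugates. The variable names are illustrative.

```julia
using LinearAlgebra

z = [1 + 2im, 3 - 1im]
w = [2 + 0im, 0 + 1im]

row_lazy = transpose(z)    # 1×2 lazy wrapper sharing memory with z, no conjugation
row_copy = permutedims(z)  # 1×2 Matrix with its own storage, no conjugation
row_adj = z'               # 1×2 lazy wrapper, conjugated

outer = w * transpose(z)   # 2×2 unconjugated outer product

z[1] = 10                  # mutate the parent after taking both rows
row_lazy[1, 1]             # reflects the new value, because the wrapper is a view
row_copy[1, 1]             # still the old value, because permutedims copied
```

For numeric linear algebra (outer products, matrix multiplication) `transpose` is the usual choice; for reshaping general data to interface with other code, `permutedims` avoids both recursion into elements and aliasing of the original array.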
It’s interesting you say you use transpose as much as adjoint. I used to be the same, but mostly because I tended to make mistakes where my data was real so I tended to transpose but actually the adjoint was the correct operation (generalized to the complex case - adjoint was the right operation for my algorithm). There are (many) valid exceptions, of course. |
In everything related to electrodynamics, you often use space-like vectors and want to use vector operations in R^n (typically n=3), i.e. That being said, when reading up on syntax discussions, I am often contemplating that for me, personally, I could not imagine that a slightly more verbose syntax is the thing that slows down my programming speed or efficiency. Thinking about the algorithm itself and the most natural/efficient way of implementing it takes much more time. |
Not necessarily. Often you want time-average quantities from the Fourier amplitudes, in which case you use the complex dot product, e.g. ½ℜ[𝐄*×𝐇] is the time-average Poynting flux from the complex Fourier components and ¼ε₀|𝐄|² is a time-average vacuum energy density. On the other hand, since the Maxwell operator is (typically) a complex-symmetric operator ("reciprocal"), you often use an unconjugated "inner product" for (infinite-dimensional) algebra on the fields 𝐄(𝐱) etc. over all space. |
That's true, I had the word often in the first sentence, but removed it apparently :-). |
Well, if you want to go there, electromagnetic quantities are even more concisely written in a Clifford algebraic formulation, often called geometric algebra. Those algebras have multiple automorphisms and antiautomorphisms that play a critical role in the formulation of the theory, especially when considering scattering problems. These algebras typically have a concise matrix representation, and those morphisms are often easily computed via complex transpose, Hermitian transpose and conjugation. Nevertheless, as I stated earlier, my primary use of transpose is often to arrange my arrays to interface with other arrays, other code, and to get matrix multiply to work against the correct dimension of a flattened array. |
Easy to implement now in 1.0, and should be efficient:
function Base.getproperty(x::AbstractMatrix, name::Symbol)
    if name === :T
        return transpose(x)
    #elseif name === :H # can also do this, though not sure why we'd want to overload with `'`
    #    return adjoint(x)
    else
        return getfield(x, name)
    end
end
This is surprisingly easy and kind of neat. The downside seems to be that orthogonal uses of |
Hmm. I wonder if that implies x.T "should" have been lowered to |
I’m sure everyone has their opinion - but to me it’s almost a feature that it’s hard to build a generic interface out of dot syntax. Don’t get me wrong, it’s a really great feature and wonderful for defining named tuple-like structs, and so-on. (It’s also possible to add a |
@c42f's code works like a charm. Unfortunately for me, I'm trying to write code that works on versions 0.6.4 and up, which forces me to use either transpose or my own defined function T(A) = transpose(A). Perhaps a macro would have been a little cleaner and slightly more efficient. |
To be clear, I'm not suggesting defining this particular But in a general sense, I do wonder why this kind of property use for defining "getters" in generic interfaces is actually bad. For example, generic field getter functions currently have a whopping big namespace problem which is simply solved by judicious use of |
Funny idea:
julia> x'̄
ERROR: syntax: invalid character "̄"
The character looks kind of like a |
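For readers on newer versions: as far as I know, Julia 1.6 and later parse a Unicode modifier suffix after the postfix `'` as a separate, user-definable operator, so something in the spirit of this idea can be written without changes to Base. The sketch below assumes Julia 1.6 or later and uses the suffix `ᵀ`; nothing in it is defined by Base or LinearAlgebra, and on older versions the same source parses differently.

```julia
using LinearAlgebra

# Assumes Julia ≥ 1.6, where a'ᵀ is parsed as a call to the function named 'ᵀ.
# This definition is user code, not something Base or LinearAlgebra provides.
var"'ᵀ"(x) = transpose(x)

z = [1 + 2im, 3 - 1im]

unconjugated = z'ᵀ * z  # sum of z[i]^2, no conjugation
conjugated = z' * z     # sum of abs2(z[i]), the usual inner product
```

The plain `'` keeps meaning adjoint, so existing code is unaffected, and the suffixed form stays local to whoever defines it.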
There is a certain arrogance to a statement like this. Consider that some finite proportion of developers explicitly do not want Case in point for us working with symbolic calculations for modeling the default Maybe the solution is some kind of compiler directive declaring the meaning of |
Now that `.op` is generally the vectorized form of `op`, it's very confusing that `.'` means transpose rather than the vectorized form of `'` (adjoint, aka ctranspose). This issue is for discussing alternative syntaxes for transpose and/or adjoint.