-
Couldn't load subscription status.
- Fork 27
Description
This issue is just putting down some of my notes on why I think the current status is ok.
The question is whether to_vec(::Diagonal) should return only the vector along the diagonal, or the whole densified matrix. I.e. should it do
julia> v, b = to_vec(Diagonal([1, 2])); v # struct-like
2-element Vector{Int64}:
1
2or
julia> v, b = to_vec(Diagonal([1, 2])); v # array-like
4-element Vector{Int64}:
1
0
0
2in other words, do we treat it as a struct or as an array? At present, we densify it, which feels wrong in the sense that the off-diagonal zeros are structural, i.e. they can't (conceptually) be perturbed. Practically, when we perturb the off-diagonal elements, we just get no effect and we waste computation.
On the other hand, it is practical. Consider test_rrule(Symmetric, randn(3, 3), :U), and we want to test whether the rule accepts Diagonal tangents as well, which we do by test_rrule(Symmetric, randn(3, 3), :U; check_inferred=false, output_tangent=Diagonal(rand(3, 3))). If we do not densify the Diagonal, the test fails. The reason is that the finite differences jacobian is computed for the output type (Symmetric) vector (length 9), while the non-densified vector for a 3by3 Diagonal has length 3, which causes an awkward DimensionMismatch("second dimension of A, 9, does not match length of x, 3") error, which is highly likely confusing to a newcomer.
So the tradeoff seems to be sacrificing convenience or wasting computational power. In the 21st century convenience is sacred and computation is cheap, so keeping Diagonal densified is ok with me.
Densifying used to cause different (and quite common) problems when testing rules in ChainRulesTestUtils, e.g. for d=Diagonal(rand(3)); test_rrule(*, d, d) because the rand_tangent would make a Tangent{Diagonal}(diag=[1., 2., 3.]) (vector of length 3), whereas the primal was a Diagonal (vector of length 9). However these have been mostly fixed by adding the ProjectTo into rand_tangent in ChainRulesCore 1.0.