Replies: 4 comments 11 replies
-
Have you tried using a |
Beta Was this translation helpful? Give feedback.
-
I believe the reference measure for LKJ is the Lebesgue measure on strict upper/lower triangle of the correlation matrix, so with these constraints the sensible reference measure is the Lebesgue measure on the same elements that are not constrained to be zero. For LKJCholesky, the reference measure is the (hemi-)spherical measure on the rows/columns of the factor. I think the sensible reference measure with these constraints would be more similar to the invariant measure on orthogonal matrices, but we would have to be careful defining it to know what the correct Jacobian correction should be. |
Beta Was this translation helpful? Give feedback.
-
Here's what I have so far: https://gist.github.com/sethaxen/0b1c9cf92cbea99035a0261e16921a69 It works by iteratively satisfying the constraints. For each column Quick demo: julia> b = VecCholeskyCorrBijectorWithZeros([(4, 3), (5, 3), (6, 1), (7, 2), (10, 8)]);
julia> y = randn(Bijectors.output_size(b, (10, 10)));
julia> x = inverse(b)(y);
julia> isposdef(x)
true
julia> Matrix(x)
10×10 Matrix{Float64}:
1.0 -0.0907855 0.998123 0.0309052 0.0382051 0.0 -0.436952 0.215967 0.983914 0.301468
-0.0907855 1.0 -0.138372 0.522425 0.805116 0.0598142 -8.89604e-17 -0.717 -0.258497 0.404107
0.998123 -0.138372 1.0 4.32063e-18 5.69099e-18 0.0248355 -0.461901 0.243942 0.991967 0.251734
0.0309052 0.522425 4.32063e-18 1.0 0.455836 0.240887 -0.0552835 -0.629001 -0.0816327 0.0239458
0.0382051 0.805116 5.69099e-18 0.455836 1.0 -0.0195799 0.180512 -0.330956 -0.102061 0.300346
0.0 0.0598142 0.0248355 0.240887 -0.0195799 1.0 -0.784509 -0.54447 0.0285604 -0.687657
-0.436952 -8.89604e-17 -0.461901 -0.0552835 0.180512 -0.784509 1.0 0.361692 -0.458932 0.438751
0.215967 -0.717 0.243942 -0.629001 -0.330956 -0.54447 0.361692 1.0 0.322642 -8.18197e-17
0.983914 -0.258497 0.991967 -0.0816327 -0.102061 0.0285604 -0.458932 0.322642 1.0 0.194298
0.301468 0.404107 0.251734 0.0239458 0.300346 -0.687657 0.438751 -8.18197e-17 0.194298 1.0 And sure, this is a straightforward generalization of julia> b = VecCholeskyCorrBijectorWithZeros(Tuple{Int,Int}[]);
julia> b2 = Bijectors.VecCholeskyBijector(:U);
julia> y = randn(Bijectors.output_size(b, (10, 10)));
julia> inverse(b)(y).U ≈ inverse(b2)(y).U
true To-do:
Here's the Jacobian: julia> using ForwardDiff, Plots
julia> J = ForwardDiff.jacobian(y) do y
x = inverse(b)(y)
R = Matrix(x)
z = similar(y)
k = 1
for j in axes(R, 2), i in 1:(j - 1)
if (i, j) ∈ b.zero_inds || (j, i) ∈ b.zero_inds
continue
end
z[k] = R[i, j]
k += 1
end
return z
end;
julia> heatmap((iszero.(J))[reverse(axes(J, 1)), :]; colormap=:grays, colorbar=false, aspect_ratio=1, size=(200, 200), dpi=600) |
Beta Was this translation helpful? Give feedback.
-
So it turns out the Jacobian determinant is a very simple modification to the standard Jacobian determinant, requiring only that one divides out the determinants of the R factors in the QR decompositions. I've updated the gist at https://gist.github.com/sethaxen/0b1c9cf92cbea99035a0261e16921a69 to include this. Here's a quick demo of how to use this code in Turing: julia> using Turing
julia> @model function foo(N, η, b, M = Bijectors.output_size(b, (N, N))[1])
y ~ filldist(Flat(), M)
Rchol, logJ = with_logabsdet_jacobian(inverse(b), y)
Turing.@addlogprob! logJ + logpdf(LKJCholesky(N, η), Rchol)
return (; R=Matrix(Rchol))
end;
julia> b = VecCholeskyCorrBijectorWithZeros([(4, 3), (5, 3), (6, 2), (6, 4), (7, 2), (10, 8)]);
julia> model = foo(10, 1.0, b);
julia> chns = sample(model, NUTS(), MCMCThreads(), 1_000, 4)
Chains MCMC chain (1000×51×4 Array{Float64, 3}):
Iterations = 501:1:1500
Number of chains = 4
Samples per chain = 1000
Wall duration = 2.58 seconds
Compute duration = 8.33 seconds
parameters = y[1], y[2], y[3], y[4], y[5], y[6], y[7], y[8], y[9], y[10], y[11], y[12], y[13], y[14], y[15], y[16], y[17], y[18], y[19], y[20], y[21], y[22], y[23], y[24], y[25], y[26], y[27], y[28], y[29], y[30], y[31], y[32], y[33], y[34], y[35], y[36], y[37], y[38], y[39]
internals = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size
Summary Statistics
parameters mean std mcse ess_bulk ess_tail rhat ess_per_sec
Symbol Float64 Float64 Float64 Float64 Float64 Float64 Float64
y[1] -0.0014 0.3262 0.0036 8306.1741 2650.6651 1.0046 997.4990
y[2] 0.0065 0.3370 0.0035 9004.9886 2239.7985 1.0030 1081.4205
y[3] -0.0000 0.3573 0.0036 9817.5543 2616.6600 1.0022 1179.0026
y[4] -0.0128 0.3851 0.0040 9337.8045 2719.1003 1.0018 1121.3888
y[5] -0.0010 0.3858 0.0040 9292.1477 2667.5592 1.0002 1115.9058
y[6] -0.0069 0.3661 0.0038 9555.6746 2645.6843 1.0031 1147.5531
y[7] 0.0047 0.3740 0.0041 8427.9599 2314.7639 1.0018 1012.1244
y[8] -0.0023 0.3961 0.0040 9777.3013 3027.9953 1.0008 1174.1685
y[9] 0.0019 0.3752 0.0041 8329.9344 2817.7893 1.0010 1000.3524
y[10] 0.0013 0.3996 0.0043 8699.3030 2772.0395 1.0003 1044.7103
...
julia> Rs = first.(generated_quantities(model, chns));
julia> Rs_arr = permutedims(stack(Rs), (3, 4, 1, 2));
julia> dropdims(mean(Rs_arr; dims=(1, 2)); dims=(1, 2))
10×10 Matrix{Float64}:
1.0 -0.000937718 0.000716397 0.00131169 -0.0112337 -0.00768082 -0.00955092 0.00433007 -0.000583165 -0.00612702
-0.000937718 1.0 0.00477415 -0.00758961 -0.00111306 1.40106e-18 -7.06457e-19 0.00305375 0.000326224 0.000141046
0.000716397 0.00477415 1.0 -3.49966e-19 2.00646e-18 -0.00141633 -0.000750534 0.00383163 -0.0052103 0.00375098
0.00131169 -0.00758961 -3.49966e-19 1.0 -0.00740308 2.88631e-19 0.00290981 0.000568475 0.00339934 0.00365722
-0.0112337 -0.00111306 2.00646e-18 -0.00740308 1.0 -0.00411793 -0.00111534 -0.00342769 -0.00271073 0.00231426
-0.00768082 1.40106e-18 -0.00141633 2.88631e-19 -0.00411793 1.0 0.000787432 -0.00914072 -0.00736483 -0.00295641
-0.00955092 -7.06457e-19 -0.000750534 0.00290981 -0.00111534 0.000787432 1.0 -0.00597936 0.00280843 -0.00632741
0.00433007 0.00305375 0.00383163 0.000568475 -0.00342769 -0.00914072 -0.00597936 1.0 -0.000178056 -1.17235e-19
-0.000583165 0.000326224 -0.0052103 0.00339934 -0.00271073 -0.00736483 0.00280843 -0.000178056 1.0 0.000328028
-0.00612702 0.000141046 0.00375098 0.00365722 0.00231426 -0.00295641 -0.00632741 -1.17235e-19 0.000328028 1.0
julia> dropdims(std(Rs_arr; dims=(1, 2)); dims=(1, 2))
10×10 Matrix{Float64}:
0.0 0.297735 0.295122 0.305087 0.314443 0.30063 0.298835 0.297618 0.303849 0.299865
0.297735 7.32965e-17 0.28859 0.313681 0.309783 6.41493e-17 6.1412e-17 0.295638 0.293659 0.303158
0.295122 0.28859 1.23095e-16 6.41305e-17 6.68502e-17 0.318834 0.300889 0.303083 0.298644 0.297913
0.305087 0.313681 6.41305e-17 1.28809e-16 0.322199 6.30431e-17 0.301314 0.303051 0.300869 0.300524
0.314443 0.309783 6.68502e-17 0.322199 1.52498e-16 0.301399 0.290621 0.297827 0.298588 0.300076
0.30063 6.41493e-17 0.318834 6.30431e-17 0.301399 1.54036e-16 0.315676 0.305576 0.30645 0.303368
0.298835 6.1412e-17 0.300889 0.301314 0.290621 0.315676 1.8129e-16 0.302016 0.303771 0.298926
0.297618 0.295638 0.303083 0.303051 0.297827 0.305576 0.302016 1.99332e-16 0.299137 6.72064e-17
0.303849 0.293659 0.298644 0.300869 0.298588 0.30645 0.303771 0.299137 2.08847e-16 0.296491
0.299865 0.303158 0.297913 0.300524 0.300076 0.303368 0.298926 6.72064e-17 0.296491 2.08714e-16
julia> ess(Rs_arr)
10×10 Matrix{Float64}:
NaN 9907.95 11237.6 2145.82 2093.74 4569.24 2003.48 9540.87 11305.5 3299.59
9907.95 4227.96 6579.85 9770.15 9294.1 3826.64 4053.54 6731.9 6292.09 8570.7
11237.6 6579.85 3842.25 3930.91 3927.51 7574.65 9761.37 5520.39 5772.56 5799.75
2145.82 9770.15 3930.91 4146.77 4391.51 3501.25 8235.1 6993.72 6392.95 6221.25
2093.74 9294.1 3927.51 4391.51 4164.1 7755.78 4670.25 3878.4 4921.05 4083.3
4569.24 3826.64 7574.65 3501.25 7755.78 4070.78 4844.72 3868.94 4155.53 4241.81
2003.48 4053.54 9761.37 8235.1 4670.25 4844.72 3879.63 3187.78 3069.21 3108.06
9540.87 6731.9 5520.39 6993.72 3878.4 3868.94 3187.78 3668.2 2537.59 3821.87
11305.5 6292.09 5772.56 6392.95 4921.05 4155.53 3069.21 2537.59 3844.7 2471.25
3299.59 8570.7 5799.75 6221.25 4083.3 4241.81 3108.06 3821.87 2471.25 3862.0
julia> rhat(Rs_arr)
10×10 Matrix{Float64}:
NaN 1.00113 1.00242 1.00069 1.00204 1.00113 1.0028 1.00227 1.00045 1.00066
1.00113 0.999754 1.00207 1.00176 1.00015 1.00051 1.00159 1.0004 1.00027 1.00152
1.00242 1.00207 1.001 1.00051 1.00013 1.00187 0.999544 1.00127 1.00014 1.00062
1.00069 1.00176 1.00051 0.999743 1.00143 1.00227 1.00114 1.00034 1.00035 1.00177
1.00204 1.00015 1.00013 1.00143 1.00062 1.00152 1.00262 1.00085 1.0006 1.00022
1.00113 1.00051 1.00187 1.00227 1.00152 0.999803 1.00058 1.00088 1.00099 1.00254
1.0028 1.00159 0.999544 1.00114 1.00262 1.00058 0.999535 1.00103 1.00037 0.999657
1.00227 1.0004 1.00127 1.00034 1.00085 1.00088 1.00103 0.999928 1.00093 1.00001
1.00045 1.00027 1.00014 1.00035 1.0006 1.00099 1.00037 1.00093 1.00115 1.00064
1.00066 1.00152 1.00062 1.00177 1.00022 1.00254 0.999657 1.00001 1.00064 1.00061 If this is sufficiently useful, we might consider adding this directly to Bijectors (would need to work out the unconstraining transform also). Note, however, that still the bijector would need to directly be used, unless one had a way to override LKJCholesky's default bijector to specify this one be used. @torfjelde this in some way relates to our Slack discussion. |
Beta Was this translation helpful? Give feedback.
-
I'm working on a Turing model for fitting a covariance matrix and have been following along with the threads here and here. In my use case, I'm working with an inverse covariance matrix,
Θ
, conditioned on an undirected graph.Θ[i,j]=0
if nodei
andj
are conditionally independent, i.e. if there is no edge connecting vertexi
andj
in the undirected graph. I'm sampling the prior fromLKJ()
, then zeroing the elements that don't share an edge in the undirected graph, but this leading to non-positive definite matrices,PosDefException: matrix is not positive definite; Cholesky factorization failed.
PositiveFactorizations
doesn't seem to be an appropriate solution here because it perturbs the zero entries ofΘ
.Any thoughts on work-arounds would be greatly appreciated.
Beta Was this translation helpful? Give feedback.
All reactions