-
I'm looking to learn correlation matrices (positive semi-definite), reparametrized as

L_flat = DenseLayer(ndim=3*3)(input)
L = L_flat.reshape(3, 3)
return L @ L.T

but this is wasteful, since it is sufficient to parametrize the triangular part of L in order to cover all possible correlation matrices. In our 3x3 case it's sufficient to have a 6-dimensional parametrization (n*(n+1)/2 entries for n = 3). What is an efficient way to fill the triangular part of L from a vector in JAX?
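A minimal sketch of that 6-parameter version, assuming the goal is a PSD matrix built from a dense lower-triangular factor (the helper name psd_from_vector is hypothetical):

import jax.numpy as jnp

def psd_from_vector(v):
    # v has shape (6,) = n*(n+1)/2 for n = 3
    L = jnp.zeros((3, 3)).at[jnp.tril_indices(3)].set(v)
    return L @ L.T  # positive semi-definite by construction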
-
This is what I've been using; it basically uses indexing as you said, but it's not too slow for my purposes:

import jax.numpy as jnp

def fill_lower_tri(v, dim, out_dtype=float):
    # number of entries strictly below the diagonal
    num_nonzero = dim * (dim - 1) // 2
    mask = jnp.tri(dim, dtype=bool, k=-1)
    mask_idx = jnp.nonzero(mask, size=num_nonzero)
    out = jnp.eye(dim, dtype=out_dtype).at[mask_idx].set(v)
    return out
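For example (a sketch; with dim = 3, v holds the three strictly-lower entries and the diagonal stays at ones):

import jax.numpy as jnp

print(fill_lower_tri(jnp.array([0.1, 0.2, 0.3]), 3))
# [[1.  0.  0. ]
#  [0.1 1.  0. ]
#  [0.2 0.3 1. ]]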
-
TensorFlow Probability (on JAX) has a bijector meant for this application (tfb.FillScaleTriL). It will convert a vector of unconstrained values into a PSD matrix (and it can also convert the PSD matrix back into the vector if you want). The implementation is based on the fill_triangular function, which concatenates a vector to the "tail" of itself, reshapes, then zeros out half of the matrix. Pasted from fill_triangular's docstring:

x = np.arange(15) + 1
xc = np.concatenate([x, x[5:][::-1]])
# ==> array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 15, 14, 13,
# 12, 11, 10, 9, 8, 7, 6])
# (We add one to the arange result to disambiguate the zeros below the
# diagonal of our upper-triangular matrix from the first entry in `x`.)
# Now, reshape this into a matrix:
y = np.reshape(xc, [5, 5])
# ==> array([[ 1, 2, 3, 4, 5],
# [ 6, 7, 8, 9, 10],
# [11, 12, 13, 14, 15],
# [15, 14, 13, 12, 11],
# [10, 9, 8, 7, 6]])
# Finally, zero the elements below the diagonal:
y = np.triu(y, k=0)
# ==> array([[ 1, 2, 3, 4, 5],
# [ 0, 7, 8, 9, 10],
# [ 0, 0, 13, 14, 15],
# [ 0, 0, 0, 12, 11],
# [ 0, 0, 0, 0, 6]])
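A hedged usage sketch of the bijector (assuming the tensorflow_probability JAX substrate is installed):

import jax.numpy as jnp
from tensorflow_probability.substrates import jax as tfp

bijector = tfp.bijectors.FillScaleTriL()
v = jnp.arange(6.0)           # 6 = 3*(3+1)/2 unconstrained values
L = bijector.forward(v)       # (3, 3) lower-triangular with positive diagonal
v_back = bijector.inverse(L)  # maps the matrix back to the vector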
-
Another alternative:

from functools import partial
import jax
import jax.numpy as jnp
import numpy as np

@partial(jax.jit, static_argnames='dim')
def fill_lower_tri(v, dim):
    # We could use jax.ensure_compile_time_eval + jnp.tri to do mask indexing,
    # but best practice is to use numpy for static values
    # (jnp.tril_indices is just a wrapper around np.tril_indices anyway).
    idx = np.tril_indices(dim)
    return jnp.zeros((dim, dim), dtype=v.dtype).at[idx].set(v)

print(fill_lower_tri(jnp.arange(6), 3))
-
Thanks everyone for the suggestions. Since I can only accept one answer, I chose @sharadmv's, as that is the approach I didn't think of myself :)
-
In reply to Dishank Bansal, who asked: "How can I make the function handle batches? i.e. where v is of shape (N, ...) and N is the batch length."

Use vmap.
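A minimal sketch of the vmap approach, reusing fill_lower_tri from the jit-based reply above (the (4, 6) batch shape is just an illustration):

import jax
import jax.numpy as jnp

batch = jnp.arange(24.0).reshape(4, 6)                 # (N, 6) batch of vectors
out = jax.vmap(lambda v: fill_lower_tri(v, 3))(batch)  # (4, 3, 3) stack of matrices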