-
Hi team! I know I'm not the first one to ask questions about dynamic input shapes, but maybe someone has some further ideas on how to optimize this (and maybe it will help someone else with a similar problem). I'm trying to run SVD on a bunch of matrices with different shapes. Here are the shapes and counts of my input data:

Counter({(768, 768): 48, (1280, 1280): 48, (320, 320): 40, (640, 640): 40, (768, 3072): 12, (1280, 768): 12, (3072, 768): 12, (320, 768): 10, (640, 768): 10, (1280, 5120): 6, (10240, 1280): 6, (320, 1280): 5, (640, 2560): 5, (2560, 320): 5, (5120, 640): 5})

The naive approach of going through these matrices one by one and computing the SVD is slow: as you know, the issue is that JAX re-compiles the SVD function for every shape. When using jit and saving the compilation cache with

from jax.experimental.compilation_cache import compilation_cache as cc
cc.initialize_cache("jax_cache")

I bring it down to 38s. When sorting/grouping the matrices by shape, I get a further speed-up and end up with 33s. I've also tried to use vmap to vectorize, but this leads to a slow-down (36s). I could use pmap, but that leads to issues when the number of items is not divisible by the number of devices. As of 2023, are there other ways to speed up this computation? Any help is welcome 🙏 !
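For context, a minimal sketch of the sorting/grouping approach mentioned above might look like this (the list name matrices is an assumption for illustration; the idea is that jit compiles once per distinct shape and reuses the compiled executable for every matrix of that shape):

import jax
import jax.numpy as jnp

jit_svd = jax.jit(jnp.linalg.svd)

def svd_grouped(matrices):
    # Process matrices in shape order, so each distinct shape is compiled once
    # and all matrices sharing that shape hit the cached executable.
    order = sorted(range(len(matrices)), key=lambda i: matrices[i].shape)
    results = [None] * len(matrices)
    for i in order:
        results[i] = jit_svd(matrices[i])
    return results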
-
One possibility: if you create a padded matrix with the smaller matrix in the upper left corner, the padded SVD output will contain the smaller SVD:

import jax.numpy as jnp
import numpy as np

# Embed a small example matrix in the upper-left corner of a larger zero matrix.
x = jnp.array(np.random.rand(3, 4))
x_padded = jnp.zeros((10, 10)).at[:3, :4].set(x)

u, s, vt = jnp.linalg.svd(x)
u_padded, s_padded, vt_padded = jnp.linalg.svd(x_padded)

# Compare each factor of the small SVD with the corresponding slice of the padded SVD.
print(u)
print(u_padded[:3, :3])
print()
print(s)
print(s_padded[:3])
print()
print(vt)
print(jnp.vstack([vt_padded[:3, :4], vt_padded[-1:, :4]]))

This would allow you to compute the outputs using only a single, fixed-size SVD call.
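As a rough illustration of how this trick could be wired up for the original batch of matrices (a sketch, not a benchmarked solution: matrices and pad are assumed names, and in practice one would likely pad within size buckets rather than to one global size):

import jax.numpy as jnp

def padded_batch_svd(matrices, pad):
    # Pad every matrix into the upper-left corner of a pad x pad zero matrix.
    padded = jnp.stack([
        jnp.zeros((pad, pad)).at[:m.shape[0], :m.shape[1]].set(m)
        for m in matrices
    ])
    # jnp.linalg.svd accepts stacked matrices, so this is a single
    # fixed-shape (and hence single-compilation) call.
    U, S, Vt = jnp.linalg.svd(padded)
    results = []
    for i, m in enumerate(matrices):
        k = min(m.shape)
        # For full-rank inputs, the singular values/vectors of the original
        # matrix sit in the leading entries/rows/columns of the padded factors
        # (up to sign flips of paired singular vectors).
        results.append((U[i, :m.shape[0], :k], S[i, :k], Vt[i, :k, :m.shape[1]]))
    return results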
-
Hi Jake - That's great, I didn't know about this trick to pad matrices for SVD. Comparing the padded and unpadded results, they indeed match. However, no matter what padding strategy (or bucketing) I apply, I don't get a speedup, as there's additional overhead from running the large padded matrices through SVD. Thanks for your help anyway!

PS. For the third output I used a slightly different transformation than yours. Note this is not based on any algebraic derivation (but it seemed to work 🤷 ):

x0, x1 = x.shape  # shape before padding
pad = 10
x_padded = jnp.zeros((pad, pad)).at[:x0, :x1].set(x)

U, S, Vh = jnp.linalg.svd(x_padded)

# Extract the factors of the original (unpadded) matrix from the padded SVD.
U = U[:x0, :x0]
S = S[:x0]
# Flip the sign of row x1-1, then slice Vh down to the original column count.
Vh = Vh.at[x1-1, :x1].set(-1 * Vh[x1-1, :x1])[:x1, :x1]
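For completeness, a quick sanity check of this extraction could look like the following (a sketch, assuming x0 <= x1 and a full-rank x as in the 3x4 example; the extracted vectors may differ from the direct SVD by sign flips of paired singular vectors, but should still reconstruct x):

import numpy as np

u_ref, s_ref, vh_ref = jnp.linalg.svd(x)                        # direct, unpadded SVD
print(np.allclose(S, s_ref, atol=1e-5))                         # singular values agree
print(np.allclose(U @ jnp.diag(S) @ Vh[:x0, :], x, atol=1e-5))  # reconstruction of x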