Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is P0 the initial state covariance matrix or the square root of the initial state covariance matrix? #325

Open
rklees opened this issue Apr 5, 2024 · 7 comments

Comments

@rklees
Copy link

rklees commented Apr 5, 2024

I wonder if someone could clarify what P0 really is. For instance, in the example notebooks to pymc-experimental , I found the following statement:

P0_diag = pm.Gamma("P0_diag", alpha=2, beta=5, dims=P0_dims[0])
P0 = pm.Deterministic("P0", pt.diag(P0_diag), dims=P0_dims)

A Gamma distribution is usually used as a prior for a standard deviation, not for a variance. So, does P0 as used above is a covariance matrix or the square-root of a covariance matrix?

Moreover, in structural.py, I frequently find statements like the following for a cycle:

if self.innovations:
sigma_cycle = self.make_and_register_variable(f"sigma_{self.name}", shape=(1,))
self.ssm["state_cov", :, :] = pt.eye(self.k_posdef) * sigma_cycle

So the name "state_cov" suggests that I define a variance, however, the right-hand side suggests that I define a standard deviation (sigma_cycle is the standard deviation of the cycle disturbance).

Any help would be appreciated.

@jessegrabowski
Copy link
Member

jessegrabowski commented Apr 5, 2024

P0 really is the full initial hidden states covariance matrix. In the code snippet above, P0_diag is a vector of variances. You are compeltely free to parameterize P0 as you like, up to and including estimation of the full dense matrix using e.g. pm.LKJCholeskyCov. In my testing, however, I have found that P0 tends to be only weakly identified, if identified at all. The typical thing to do is to use a "diffuse" initialization -- in statsmodels, they set non-stationary elements of P0 to be 1e6. You're free to do this, but I find the effect on the initial state variance to be way too strong, and even when using Kalman Smoothing it ruins the output plots. Plus there's really no justification for such a huge variance prior.

In the future, I'd like to hide P0 from users, and initialize it in a blockwise fashion, using stationary initialization where possible. It's on my to-do list, but I haven't had a lot of time to work on the statespace stuff recently.

As for sigma_cycle, any variable named "sigma" is a standard deviation. This was fixed in a previous PR that was already merged, you can check that the line you referenced is correctly squared on main. It's possible I missed some though, so let me know if you find anything that is not being correctly squared (or better yet, open a PR that adds the square!)

@rklees
Copy link
Author

rklees commented Apr 5, 2024 via email

@jessegrabowski
Copy link
Member

jessegrabowski commented Apr 7, 2024

I'm not sure I follow your point about P0 not mattering after the first update. x0 and P0 are taken to be predictions $x_{1|0}$ and $P_{1|0}$ (following Durbin and Koopsman), so the first update equation for the hidden state covariance is:

$$P_{1|1} = P_{1 | 0} - P_{1 | 0} Z^T (Z P_{1 | 0} Z^T + H)^{-1} Z P_{1 | 0}$$

So the filtered hidden states covariances clearly depend on the initial states. This is clear to see by playing with large diagonal values of P0.

The state names for Cycle and FrequencySeasonality are basically nonsense right now. If you actually look at how the cycle components and the frequency components are defined, you can see that they're not sine and cosine components, but mixtures. I'm open to suggestions on the names. Statsmodels doesn't even try; you just get back "state.1", "state.2", etc. That might be the best we can do here as well, although I would strongly prefer something else.

For the last point, I'd need to see full code for your final example. It shouldn't run as written, because the expected names for the variables are "annual_cycle and SA_cycle, not cycle1 and cycle2. The helper function unpack_symbolic_matrices_with_params matches on the name of the symbolic variable. Related to the above comment about what the states are, I don't think the first state will be a pure cosine amplitude? It should be $\gamma_0$, used to compute:

$$\gamma_1= \rho \gamma_0 \cos \left ( \frac{2\pi}{s} \right ) + \rho \gamma_{t-1}^\star \sin \left ( \frac{2\pi}{s} \right )+ \omega_{t}$$

If the second element ($\gamma^\star_0$) is zero you'll get a pure cosine function, but otherwise it will be a mixture. What should it be?

@rklees
Copy link
Author

rklees commented Apr 7, 2024 via email

@jessegrabowski
Copy link
Member

jessegrabowski commented Apr 7, 2024

Sure, we can change the order of the names as a short term solution. But I still think they are bad names, before if you set {"cycle_init":[A0, B0]} with neither A0 or B0 zero, the state is not a cosine or a sine, but a weighted combination of the two. I think it would be better if we could think up a more descriptive name, or at least a less deceptive one.

@rklees
Copy link
Author

rklees commented Apr 7, 2024 via email

@rklees
Copy link
Author

rklees commented Apr 8, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants