-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Adding support for adaptation of a dense mass matrix based on sample covariances #3596
Conversation
Thanks for the pull request! This is a great contribution!!! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks nice.
I'd love to have a tests that also samples using the new adaptation.
Eventually, we should expose this as adapt='jitter+adapt_dense'
in pm.sample
, but for now I'd rather hide it a bit, so that we can do some testing. I'd also add a warning that this is experimental for now.
There are a couple of improvements I could think of:
- Is it worth it to update the covariance in every step during tuning? It will have to do a cholesky decomposition each time.
- We never need the full matrix, but only matrix products and products with a decomposition. So we could just store the
(k, n)
matrixV
of samples (dim n, num samples k) and then do a svd of that. We could compute the matrix product as(I + sigma VV.T)
, and use the svd to draw samples. - What are good window sizes? This needs testing.
But I wouldn't let either of those stop us from merging this.
@dfm I wrote a very experimental version of mass matrix adaptation that uses low-rank approximations for the mass matrix, so it does not need to store a whole n by n matrix. I'd be interested to hear if this helps for your models. |
@aseyboldt: Thanks for the feedback! Your |
In my experience, it tends to be fine to just update the matrix at the end of each adaptation window (that's what Stan does too) but I couldn't figure out how to incorporate that with the current tuning implementation in PyMC3 because the step doesn't seem to know anything about how long tuning is and when it is going to finish. This seemed like a problem because then the last adaptation window might finish after tuning and the covariance estimate might never be updated based on the longest run. Does that make sense and is there something I'm missing?
This is a good suggestion. Would you rather try to merge or point people to
Agreed! |
Codecov Report
@@ Coverage Diff @@
## master #3596 +/- ##
==========================================
+ Coverage 89.9% 89.94% +0.04%
==========================================
Files 134 134
Lines 20269 20430 +161
==========================================
+ Hits 18222 18376 +154
- Misses 2047 2054 +7
|
Let me know if there's something you'd like to see added/removed or if you'd rather skip this and encourage people to use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A minor nitpick and a comment - thanks for the PR @dfm!
Co-Authored-By: George Ho <[email protected]>
This pull request suggests an interface for full mass matrix adaptation based on an online estimate of the sample covariance. This is similar to the existing
QuadPotentialDiagAdapt
implementation except it updates the full covariance matrix.I have blogged an example where the performance of the default adaptation routine is much poorer than it should be. While this example might seem a bit contrived, I've found that that dense mass matrix adaptation is absolutely crucial for many of the applications that I've been considering in astronomy (for example: the exoplanet library).
So based on this experience, I took a quick stab at implementing an interface like this directly into pymc3 and I'd love to get your feedback!
I also posted a short demo based on the blog post implemented using this pull request: https://gist.github.com/dfm/da1d0470d6fb54c63e6a913c1ef67a9e