Fix random sampling in Mixture#3004
junpenglao merged 9 commits into pymc-devs:master from junpenglao:fix_mixture_random
Conversation
Wow this is much more difficult than I thought with
Merge #2984 first? It doesn't break anything new...
Totally - is that ready?
Yep! Tests all pass now, and I think it just passes |
Also updated dependent_density_regression notebook
ColCarroll left a comment
looks like a tricky change! I made a few tiny suggestions (and a docstring suggestion that might be wrong?)
pymc3/distributions/mixture.py
Outdated
```
    the component standard deviations
tau : array of floats
    the component precisions
distshape : shape of the Normal component
```
it looks like this is already called dist_shape in generate_samples (instead of distshape)... is it easy to change that here as well?
I will change it to comp_shape to avoid confusion.
pymc3/distributions/mixture.py
Outdated
```
distshape : shape of the Normal component
    notice that it should be different than the shape
    of the mixture distribution, with one axis being
    the number of component.
```
small typo: "number of components". I am also not sure I understand the docstring either -- would adding "a mixture of three 2d normal distributions would have shape=(3, 2) and dist_shape=(2,)" be accurate?
Yeah this is a tricky one.
"a mixture of three 2d normal distributions would have shape=(3, 2) and dist_shape=(2,)" this is not correct. A mixture of three 2d normal distributions would have shape=(..., 2), and dist_shape=(2, 3). But I am not sure multi-D normal mixture actually works...
In practice, if you have a NormalMixture RV with shape=(a,b,c,...), the component shape gets one more axis, comp_shape=(a,b,c,...,k), with k components. However, passing comp_shape=(a,b,c,...,k) (which is also what the previous code was trying to do) breaks cases where you are using a theano.shared observed variable with sample_ppc.
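The comp_shape convention described above can be illustrated with a plain NumPy sketch (this is an illustrative reimplementation of the sampling idea, not pymc3's internals): for a mixture RV of shape (a,), the component draws carry one extra trailing axis of length k, and the mixture draw selects one component per point, collapsing that axis.

```python
import numpy as np

rng = np.random.default_rng(0)

a, k = 5, 3                      # mixture RV shape (a,), k components
w = np.array([0.2, 0.5, 0.3])    # mixing weights, sum to 1
mu = np.array([-3.0, 0.0, 3.0])  # one mean per component

# Component draws get one extra trailing axis: comp_shape = (a, k)
comp_samples = rng.normal(loc=mu, scale=1.0, size=(a, k))

# Pick a component index for each point, then select along the last axis
idx = rng.choice(k, size=a, p=w)
mixture_sample = np.take_along_axis(
    comp_samples, idx[:, None], axis=-1
).squeeze(-1)

print(mixture_sample.shape)  # (5,) -- the trailing component axis is gone
```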
Will merge if no more comments.
Of course, I am glad that bugs with this distribution are fixed 😄 . However, some of my code which relied on mixtures for multiple dimensions broke. Have you considered updating the Thanks! PS: I can provide a minimal breaking example if needed.
@ahmadsalim Please provide an example. |
Here you go:

```python
import pymc3 as pm
import numpy as np

with pm.Model() as model:
    mus = pm.Normal('mus', shape=(6, 12))
    taus = pm.Gamma('taus', alpha=1, beta=1, shape=(6, 12))
    ws = pm.Dirichlet('ws', np.ones(12))
    mixture = pm.NormalMixture('m', w=ws, mu=mus, tau=taus)
```

I get the error:
Can confirm - could you please raise an issue?
Yes, of course! |
The output from sample_ppc sometimes does not match the shape of the observed data, as the `distribution.shape` of ObservedRVs is `[]`. Initially, this PR added the specific shape (i.e., the shape of the data) to the ObservedRVs, so that generated random samples would have the correct shape, which makes working with the random method easier (e.g., PR #2984). However, doing so turned out to break a lot of examples where a `theano.shared` observed variable is used, since updating the value via `set_value` changes the shape. Now this PR just fixes the mixture random error (closes #2954).
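The problem with recording a fixed shape on the ObservedRV can be sketched in plain Python (a hypothetical illustration of the staleness issue, not the actual pymc3/theano machinery): a shape captured at model-build time goes stale the moment the underlying shared data is swapped out.

```python
import numpy as np

# Shape captured at model build time, analogous to storing it on the ObservedRV
data = np.zeros((100, 3))
recorded_shape = data.shape

# Analogous to shared.set_value(new_data) before calling sample_ppc
data = np.zeros((250, 3))

# The recorded shape no longer matches the data we now want samples for
print(recorded_shape, data.shape)  # (100, 3) (250, 3)
```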