
Mixture of mixtures works, but not Mixture of Mixture and Single distribution #3994

Closed
ricardoV94 opened this issue Jul 3, 2020 · 4 comments · Fixed by #3995

Comments

@ricardoV94
Member

ricardoV94 commented Jul 3, 2020

I am trying to model a Mixture between a Mixture and another distribution, but I am getting an error:

Minimal Example:

import numpy as np
import pymc3 as pm

with pm.Model() as m:
    a1 = pm.Normal.dist(mu=0, sigma=1)
    a2 = pm.Normal.dist(mu=0, sigma=1)
    a3 = pm.Normal.dist(mu=0, sigma=1)
    
    w1 = pm.Dirichlet('w1', np.array([1, 1]))    
    mix = pm.Mixture.dist(w=w1, comp_dists=[a1, a2])
    
    w2 = pm.Dirichlet('w2', np.array([1, 1]))
    like = pm.Mixture = pm.Mixture('like', w=w2, comp_dists=[mix, a3], observed=np.random.randn(20))

Traceback:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~/.local/lib/python3.8/site-packages/pymc3/distributions/mixture.py in _comp_modes(self)
    289         try:
--> 290             return tt.as_tensor_variable(self.comp_dists.mode)
    291         except AttributeError:

AttributeError: 'list' object has no attribute 'mode'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-8-dedf5c958f15> in <module>
      8 
      9     w2 = pm.Dirichlet('w2', np.array([1, 1]))
---> 10     like = pm.Mixture = pm.Mixture('like', w=w2, comp_dists=[mix, a3], observed=np.random.randn(20))

~/.local/lib/python3.8/site-packages/pymc3/distributions/distribution.py in __new__(cls, name, *args, **kwargs)
     44                 raise TypeError("observed needs to be data but got: {}".format(type(data)))
     45             total_size = kwargs.pop('total_size', None)
---> 46             dist = cls.dist(*args, **kwargs)
     47             return model.Var(name, dist, data, total_size)
     48         else:

~/.local/lib/python3.8/site-packages/pymc3/distributions/distribution.py in dist(cls, *args, **kwargs)
     55     def dist(cls, *args, **kwargs):
     56         dist = object.__new__(cls)
---> 57         dist.__init__(*args, **kwargs)
     58         return dist
     59 

~/.local/lib/python3.8/site-packages/pymc3/distributions/mixture.py in __init__(self, w, comp_dists, *args, **kwargs)
    139 
    140         try:
--> 141             comp_modes = self._comp_modes()
    142             comp_mode_logps = self.logp(comp_modes)
    143             self.mode = comp_modes[tt.argmax(w * comp_mode_logps, axis=-1)]

~/.local/lib/python3.8/site-packages/pymc3/distributions/mixture.py in _comp_modes(self)
    290             return tt.as_tensor_variable(self.comp_dists.mode)
    291         except AttributeError:
--> 292             return tt.squeeze(tt.stack([comp_dist.mode
    293                                         for comp_dist in self.comp_dists],
    294                                        axis=-1))

~/.local/lib/python3.8/site-packages/theano/tensor/basic.py in stack(*tensors, **kwargs)
   4726         dtype = scal.upcast(*[i.dtype for i in tensors])
   4727         return theano.tensor.opt.MakeVector(dtype)(*tensors)
-> 4728     return join(axis, *[shape_padaxis(t, axis) for t in tensors])
   4729 
   4730 

~/.local/lib/python3.8/site-packages/theano/tensor/basic.py in join(axis, *tensors_list)
   4500         return tensors_list[0]
   4501     else:
-> 4502         return join_(axis, *tensors_list)
   4503 
   4504 

~/.local/lib/python3.8/site-packages/theano/gof/op.py in __call__(self, *inputs, **kwargs)
    613         """
    614         return_list = kwargs.pop('return_list', False)
--> 615         node = self.make_node(*inputs, **kwargs)
    616 
    617         if config.compute_test_value != 'off':

~/.local/lib/python3.8/site-packages/theano/tensor/basic.py in make_node(self, *axis_and_tensors)
   4232             return tensor(dtype=out_dtype, broadcastable=bcastable)
   4233 
-> 4234         return self._make_node_internal(
   4235             axis, tensors, as_tensor_variable_args, output_maker)
   4236 

~/.local/lib/python3.8/site-packages/theano/tensor/basic.py in _make_node_internal(self, axis, tensors, as_tensor_variable_args, output_maker)
   4299         if not python_all([x.ndim == len(bcastable)
   4300                            for x in as_tensor_variable_args[1:]]):
-> 4301             raise TypeError("Join() can only join tensors with the same "
   4302                             "number of dimensions.")
   4303 

TypeError: Join() can only join tensors with the same number of dimensions.

However, if I create a fake Mixture dist for the third distribution, it seems to work:

import numpy as np
import pymc3 as pm

with pm.Model() as m:
    a1 = pm.Normal.dist(mu=0, sigma=1)
    a2 = pm.Normal.dist(mu=0, sigma=1)
    a3 = pm.Normal.dist(mu=0, sigma=1)
    
    w1 = pm.Dirichlet('w1', np.array([1, 1]))    
    mix = pm.Mixture.dist(w=w1, comp_dists=[a1, a2])
    
    fake_mix = pm.Mixture.dist(w=[1, 0], comp_dists=[a3, a3])
    
    w2 = pm.Dirichlet('w2', np.array([1, 1]))
    like = pm.Mixture('like', w=w2, comp_dists=[mix, fake_mix], observed=np.random.randn(20))

I understand that this construction might not be optimal in the first place, and it can certainly be coded as a custom distribution, but is this a design choice or a bug? It could also just be a question of shape handling, but I have no good intuition on how to check for that.

Versions and main components

  • PyMC3 Version: 3.8
  • Theano Version: 1.0.4
  • Python Version: 3.8.2
  • Operating system: Linux Ubuntu
  • How did you install PyMC3: pip
@brandonwillard
Contributor

The last line, `like = pm.Mixture = ...`, looks like a typo.

@ricardoV94
Member Author

@brandonwillard You are right, it was a copy/paste typo. The error still occurs without it, though.

@brandonwillard
Contributor

Looks like there might be a bug in the definition of Mixture.mode:

>>> mix.mean.tag.test_value
array(0.)
>>> mix.mode.tag.test_value
array([0., 0.])

I'm assuming that the mode should be over the joint distribution of the mixture and mixing terms, and return a single value—like the mean.
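For context, the traceback above shows the expression Mixture.__init__ uses to pick a single mode: comp_modes[tt.argmax(w * comp_mode_logps, axis=-1)]. A minimal NumPy sketch of that intent, assuming scalar components and hypothetical weights/modes (the values here are illustrative, not from the issue):

```python
import numpy as np

# Hypothetical inputs mirroring Mixture.__init__ (see traceback above):
w = np.array([0.3, 0.7])                  # mixing weights
comp_modes = np.array([0.0, 1.0])         # per-component modes
comp_mode_logps = np.array([-0.9, -1.4])  # logp of each component at its mode

# Pick the component mode with the largest weighted mode-logp,
# yielding a single scalar -- like the mean -- rather than a vector.
mode = comp_modes[np.argmax(w * comp_mode_logps, axis=-1)]
print(mode)
```

This only works when comp_mode_logps is one-dimensional; the shape problem described in the next comment breaks exactly this step.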

@brandonwillard
Contributor

brandonwillard commented Jul 3, 2020

OK, the problem seems to start in Mixture.__init__: the comp_mode_logps have shape (2, 1) while w has shape (2,), so the product w * comp_mode_logps has shape (2, 2), which is too many dimensions.

Looking into this a little more, the real problem is apparently here:

>>> mix._comp_logp(mix._comp_modes()).tag.test_value
array([[ -0.91893853, -50.91893853],
       [-50.91893853,  -0.91893853]])
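The shape inflation is plain NumPy-style broadcasting: a (2, 1) array times a (2,) array produces a (2, 2) matrix. A small sketch with stand-in values (not the actual tensors from the issue):

```python
import numpy as np

w = np.array([0.5, 0.5])                      # mixing weights, shape (2,)
comp_mode_logps = np.array([[-0.9], [-0.9]])  # shape (2, 1): one dim too many

# Broadcasting (2, 1) against (2,) yields (2, 2) -- the inflated shape
# that trips up the downstream argmax/indexing.
product = w * comp_mode_logps
print(product.shape)  # (2, 2)

# With the expected one-dimensional shape the product stays (2,):
fixed = w * comp_mode_logps.squeeze(-1)
print(fixed.shape)    # (2,)
```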

brandonwillard added a commit to brandonwillard/pymc that referenced this issue Jul 3, 2020
brandonwillard added a commit to brandonwillard/pymc that referenced this issue Jul 3, 2020
gmingas added a commit to alan-turing-institute/pymc3 that referenced this issue Jul 22, 2020
* Fix Mixture distribution mode computation and logp dimensions (Closes pymc-devs#3994)