
Mixture of mixtures works, but not Mixture of Mixture and Single distribution #3994

Closed
ricardoV94 opened this issue Jul 3, 2020 · 4 comments · Fixed by #3995

Comments

@ricardoV94
Member

ricardoV94 commented Jul 3, 2020

I am trying to model a Mixture between a Mixture and another distribution, but I am getting an error:

Minimal Example:

import numpy as np
import pymc3 as pm

with pm.Model() as m:
    a1 = pm.Normal.dist(mu=0, sigma=1)
    a2 = pm.Normal.dist(mu=0, sigma=1)
    a3 = pm.Normal.dist(mu=0, sigma=1)
    
    w1 = pm.Dirichlet('w1', np.array([1, 1]))    
    mix = pm.Mixture.dist(w=w1, comp_dists=[a1, a2])
    
    w2 = pm.Dirichlet('w2', np.array([1, 1]))
    like = pm.Mixture = pm.Mixture('like', w=w2, comp_dists=[mix, a3], observed=np.random.randn(20))

Traceback:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~/.local/lib/python3.8/site-packages/pymc3/distributions/mixture.py in _comp_modes(self)
    289         try:
--> 290             return tt.as_tensor_variable(self.comp_dists.mode)
    291         except AttributeError:

AttributeError: 'list' object has no attribute 'mode'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-8-dedf5c958f15> in <module>
      8 
      9     w2 = pm.Dirichlet('w2', np.array([1, 1]))
---> 10     like = pm.Mixture = pm.Mixture('like', w=w2, comp_dists=[mix, a3], observed=np.random.randn(20))

~/.local/lib/python3.8/site-packages/pymc3/distributions/distribution.py in __new__(cls, name, *args, **kwargs)
     44                 raise TypeError("observed needs to be data but got: {}".format(type(data)))
     45             total_size = kwargs.pop('total_size', None)
---> 46             dist = cls.dist(*args, **kwargs)
     47             return model.Var(name, dist, data, total_size)
     48         else:

~/.local/lib/python3.8/site-packages/pymc3/distributions/distribution.py in dist(cls, *args, **kwargs)
     55     def dist(cls, *args, **kwargs):
     56         dist = object.__new__(cls)
---> 57         dist.__init__(*args, **kwargs)
     58         return dist
     59 

~/.local/lib/python3.8/site-packages/pymc3/distributions/mixture.py in __init__(self, w, comp_dists, *args, **kwargs)
    139 
    140         try:
--> 141             comp_modes = self._comp_modes()
    142             comp_mode_logps = self.logp(comp_modes)
    143             self.mode = comp_modes[tt.argmax(w * comp_mode_logps, axis=-1)]

~/.local/lib/python3.8/site-packages/pymc3/distributions/mixture.py in _comp_modes(self)
    290             return tt.as_tensor_variable(self.comp_dists.mode)
    291         except AttributeError:
--> 292             return tt.squeeze(tt.stack([comp_dist.mode
    293                                         for comp_dist in self.comp_dists],
    294                                        axis=-1))

~/.local/lib/python3.8/site-packages/theano/tensor/basic.py in stack(*tensors, **kwargs)
   4726         dtype = scal.upcast(*[i.dtype for i in tensors])
   4727         return theano.tensor.opt.MakeVector(dtype)(*tensors)
-> 4728     return join(axis, *[shape_padaxis(t, axis) for t in tensors])
   4729 
   4730 

~/.local/lib/python3.8/site-packages/theano/tensor/basic.py in join(axis, *tensors_list)
   4500         return tensors_list[0]
   4501     else:
-> 4502         return join_(axis, *tensors_list)
   4503 
   4504 

~/.local/lib/python3.8/site-packages/theano/gof/op.py in __call__(self, *inputs, **kwargs)
    613         """
    614         return_list = kwargs.pop('return_list', False)
--> 615         node = self.make_node(*inputs, **kwargs)
    616 
    617         if config.compute_test_value != 'off':

~/.local/lib/python3.8/site-packages/theano/tensor/basic.py in make_node(self, *axis_and_tensors)
   4232             return tensor(dtype=out_dtype, broadcastable=bcastable)
   4233 
-> 4234         return self._make_node_internal(
   4235             axis, tensors, as_tensor_variable_args, output_maker)
   4236 

~/.local/lib/python3.8/site-packages/theano/tensor/basic.py in _make_node_internal(self, axis, tensors, as_tensor_variable_args, output_maker)
   4299         if not python_all([x.ndim == len(bcastable)
   4300                            for x in as_tensor_variable_args[1:]]):
-> 4301             raise TypeError("Join() can only join tensors with the same "
   4302                             "number of dimensions.")
   4303 

TypeError: Join() can only join tensors with the same number of dimensions.

However, if I create a fake Mixture dist for the third distribution, it seems to work:

import numpy as np
import pymc3 as pm

with pm.Model() as m:
    a1 = pm.Normal.dist(mu=0, sigma=1)
    a2 = pm.Normal.dist(mu=0, sigma=1)
    a3 = pm.Normal.dist(mu=0, sigma=1)
    
    w1 = pm.Dirichlet('w1', np.array([1, 1]))    
    mix = pm.Mixture.dist(w=w1, comp_dists=[a1, a2])
    
    fake_mix = pm.Mixture.dist(w=[1, 0], comp_dists=[a3, a3])
    
    w2 = pm.Dirichlet('w2', np.array([1, 1]))
    like = pm.Mixture('like', w=w2, comp_dists=[mix, fake_mix], observed=np.random.randn(20))

I understand that this construction might not be optimal in the first place, and it can certainly be coded as a custom distribution, but is this a design choice or a bug? It could also just be a question of shape handling, but I have no good intuition on how to check for that.

Versions and main components

  • PyMC3 Version: 3.8
  • Theano Version: 1.0.4
  • Python Version: 3.8.2
  • Operating system: Linux Ubuntu
  • How did you install PyMC3: pip
@brandonwillard
Contributor

The last line, `like = pm.Mixture = ...`, looks like a typo.

@ricardoV94
Member Author

@brandonwillard You are right, it was a copy/paste typo. The error still occurs without it, though.

@brandonwillard
Contributor

Looks like there might be a bug in the definition of Mixture.mode:

>>> mix.mean.tag.test_value
array(0.)
>>> mix.mode.tag.test_value
array([0., 0.])

I'm assuming that the mode should be over the joint distribution of the mixture and mixing terms, and return a single value—like the mean.
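For context, the traceback above shows the expression Mixture.__init__ uses to pick a single mode: comp_modes[tt.argmax(w * comp_mode_logps, axis=-1)]. A minimal NumPy sketch of that intent, assuming scalar components and hypothetical weights/modes (the values here are illustrative, not from the issue):

```python
import numpy as np

# Hypothetical inputs mirroring Mixture.__init__ (see traceback above):
w = np.array([0.3, 0.7])                  # mixing weights
comp_modes = np.array([0.0, 1.0])         # per-component modes
comp_mode_logps = np.array([-0.9, -1.4])  # logp of each component at its mode

# Pick the component mode with the largest weighted mode-logp,
# yielding a single scalar -- like the mean -- rather than a vector.
mode = comp_modes[np.argmax(w * comp_mode_logps, axis=-1)]
print(mode)
```

This only works when comp_mode_logps is one-dimensional; the shape problem described in the next comment breaks exactly this step.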

@brandonwillard
Contributor

brandonwillard commented Jul 3, 2020

OK, the problem seems to start in Mixture.__init__: the comp_mode_logps have shape (2, 1) while w has shape (2,), so the product w * comp_mode_logps has shape (2, 2), which is too many dimensions.

Looking into this a little more, the real problem is apparently here:

>>> mix._comp_logp(mix._comp_modes()).tag.test_value
array([[ -0.91893853, -50.91893853],
       [-50.91893853,  -0.91893853]])
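The shape inflation is plain NumPy-style broadcasting: a (2, 1) array times a (2,) array produces a (2, 2) matrix. A small sketch with stand-in values (not the actual tensors from the issue):

```python
import numpy as np

w = np.array([0.5, 0.5])                      # mixing weights, shape (2,)
comp_mode_logps = np.array([[-0.9], [-0.9]])  # shape (2, 1): one dim too many

# Broadcasting (2, 1) against (2,) yields (2, 2) -- the inflated shape
# that trips up the downstream argmax/indexing.
product = w * comp_mode_logps
print(product.shape)  # (2, 2)

# With the expected one-dimensional shape the product stays (2,):
fixed = w * comp_mode_logps.squeeze(-1)
print(fixed.shape)    # (2,)
```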

brandonwillard added a commit to brandonwillard/pymc that referenced this issue Jul 3, 2020
brandonwillard added a commit to brandonwillard/pymc that referenced this issue Jul 3, 2020
gmingas added a commit to alan-turing-institute/pymc3 that referenced this issue Jul 22, 2020
* Fix Mixture distribution mode computation and logp dimensions (Closes pymc-devs#3994)