problem while processing pymc3 data #1279

Closed
andyfaff opened this issue Jul 4, 2020 · 9 comments

@andyfaff

andyfaff commented Jul 4, 2020

I'm unable to sample a black-box likelihood in pymc3: an exception is raised when arviz is asked to process the trace. Based on the stack trace, I think the issue is on the arviz side.

Python 3.8.3
pymc3 3.9.2
arviz 0.9.0
theano 1.0.4
macOS

All packages were installed via pip into a clean conda env. The MWE is:

import numpy as np
import pymc3 as pm
import theano
import theano.tensor as tt
theano.config.exception_verbosity='high'


def line(theta, x, *args, **kwds):
    p_arr = np.squeeze(np.array(theta))
    return p_arr[1] + x * p_arr[0]


def my_loglike(theta, x, data, sigma):
    """
    A Gaussian log-likelihood function for a model with parameters given in theta
    """

    model = line(theta, x)
    return -(0.5/sigma**2)*np.sum((data - model)**2)


class LogLike(tt.Op):
    itypes = [tt.dvector] # expects a vector of parameter values when called
    otypes = [tt.dscalar] # outputs a single scalar value (the log likelihood)

    def __init__(self, loglike, data, x, sigma):

        # add inputs as class attributes
        self.likelihood = loglike
        self.x = x
        self.data = data
        self.sigma = sigma

    def perform(self, node, inputs, outputs):
        # the method that is used when calling the Op
        theta = inputs  # this will contain my variables
        # call the log-likelihood function
        logl = self.likelihood(theta, self.x, self.data, self.sigma)
        outputs[0][0] = np.array(logl)
           

# set up our data
N = 10  # number of data points
sigma = 1.  # standard deviation of noise
x = np.linspace(0., 9., N)

mtrue = 0.4  # true gradient
ctrue = 3.   # true y-intercept

truemodel = line([mtrue, ctrue], x)


# make data
np.random.seed(716742)  # set random seed, so the data is reproducible each time
data = sigma*np.random.randn(N) + truemodel

ndraws = 3000  # number of draws from the distribution
nburn = 1000   # number of "burn-in points" (which we'll discard)

logl = LogLike(my_loglike, data, x, sigma)

with pm.Model():
    # the external function takes two parameters, m and c, with Uniform priors
    m = pm.Uniform('m', lower=-10., upper=10.)
    c = pm.Uniform('c', lower=-10., upper=10.)

    # convert m and c to a tensor vector
    theta = tt.as_tensor_variable([m, c])

    # use a DensityDist (use a lambda function to "call" the Op)
    pm.DensityDist('likelihood', lambda v: logl(v), observed={'v': theta})
    trace = pm.sample(ndraws, tune=nburn, discard_tuned_samples=True)

gives the stack trace:

---------------------------------------------------------------------------
MissingInputError                         Traceback (most recent call last)
<ipython-input-5-a6fb0239c4a6> in <module>
     72     # use a DensityDist (use a lambda function to "call" the Op)
     73     pm.DensityDist('likelihood', lambda v: logl(v), observed={'v': theta})
---> 74     trace = pm.sample(ndraws, tune=nburn, discard_tuned_samples=True)

~/miniconda3/envs/dev3/lib/python3.8/site-packages/pymc3/sampling.py in sample(draws, step, init, n_init, start, trace, chain_idx, chains, cores, tune, progressbar, model, random_seed, discard_tuned_samples, compute_convergence_checks, callback, return_inferencedata, idata_kwargs, **kwargs)
    597         if idata_kwargs:
    598             ikwargs.update(idata_kwargs)
--> 599         idata = arviz.from_pymc3(trace, **ikwargs)
    600 
    601     if compute_convergence_checks:

~/miniconda3/envs/dev3/lib/python3.8/site-packages/arviz/data/io_pymc3.py in from_pymc3(trace, prior, posterior_predictive, log_likelihood, coords, dims, model, save_warmup)
    521     InferenceData
    522     """
--> 523     return PyMC3Converter(
    524         trace=trace,
    525         prior=prior,

~/miniconda3/envs/dev3/lib/python3.8/site-packages/arviz/data/io_pymc3.py in __init__(self, trace, prior, posterior_predictive, log_likelihood, predictions, coords, dims, model, save_warmup)
    157             self.dims = {**model_dims, **self.dims}
    158 
--> 159         self.observations, self.multi_observations = self.find_observations()
    160 
    161     def find_observations(self) -> Tuple[Optional[Dict[str, Var]], Optional[Dict[str, Var]]]:

~/miniconda3/envs/dev3/lib/python3.8/site-packages/arviz/data/io_pymc3.py in find_observations(self)
    170             elif hasattr(obs, "data"):
    171                 for key, val in obs.data.items():
--> 172                     multi_observations[key] = val.eval() if hasattr(val, "eval") else val
    173         return observations, multi_observations
    174 

~/miniconda3/envs/dev3/lib/python3.8/site-packages/theano/gof/graph.py in eval(self, inputs_to_values)
    520         inputs = tuple(sorted(inputs_to_values.keys(), key=id))
    521         if inputs not in self._fn_cache:
--> 522             self._fn_cache[inputs] = theano.function(inputs, self)
    523         args = [inputs_to_values[param] for param in inputs]
    524 

~/miniconda3/envs/dev3/lib/python3.8/site-packages/theano/compile/function.py in function(inputs, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input)
    304         # note: pfunc will also call orig_function -- orig_function is
    305         #      a choke point that all compilation must pass through
--> 306         fn = pfunc(params=inputs,
    307                    outputs=outputs,
    308                    mode=mode,

~/miniconda3/envs/dev3/lib/python3.8/site-packages/theano/compile/pfunc.py in pfunc(params, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input, output_keys)
    481         inputs.append(si)
    482 
--> 483     return orig_function(inputs, cloned_outputs, mode,
    484                          accept_inplace=accept_inplace, name=name,
    485                          profile=profile, on_unused_input=on_unused_input,

~/miniconda3/envs/dev3/lib/python3.8/site-packages/theano/compile/function_module.py in orig_function(inputs, outputs, mode, accept_inplace, name, profile, on_unused_input, output_keys)
   1830     try:
   1831         Maker = getattr(mode, 'function_maker', FunctionMaker)
-> 1832         m = Maker(inputs,
   1833                   outputs,
   1834                   mode,

~/miniconda3/envs/dev3/lib/python3.8/site-packages/theano/compile/function_module.py in __init__(self, inputs, outputs, mode, accept_inplace, function_builder, profile, on_unused_input, fgraph, output_keys, name)
   1484             # make the fgraph (copies the graph, creates NEW INPUT AND
   1485             # OUTPUT VARIABLES)
-> 1486             fgraph, additional_outputs = std_fgraph(inputs, outputs,
   1487                                                     accept_inplace)
   1488             fgraph.profile = profile

~/miniconda3/envs/dev3/lib/python3.8/site-packages/theano/compile/function_module.py in std_fgraph(input_specs, output_specs, accept_inplace)
    178     orig_outputs = [spec.variable for spec in output_specs] + updates
    179 
--> 180     fgraph = gof.fg.FunctionGraph(orig_inputs, orig_outputs,
    181                                   update_mapping=update_mapping)
    182 

~/miniconda3/envs/dev3/lib/python3.8/site-packages/theano/gof/fg.py in __init__(self, inputs, outputs, features, clone, update_mapping)
    173 
    174         for output in outputs:
--> 175             self.__import_r__(output, reason="init")
    176         for i, output in enumerate(outputs):
    177             output.clients.append(('output', i))

~/miniconda3/envs/dev3/lib/python3.8/site-packages/theano/gof/fg.py in __import_r__(self, variable, reason)
    344         # Imports the owners of the variables
    345         if variable.owner and variable.owner not in self.apply_nodes:
--> 346                 self.__import__(variable.owner, reason=reason)
    347         elif (variable.owner is None and
    348                 not isinstance(variable, graph.Constant) and

~/miniconda3/envs/dev3/lib/python3.8/site-packages/theano/gof/fg.py in __import__(self, apply_node, check, reason)
    389                                      "for more information on this error."
    390                                      % (node.inputs.index(r), str(node)))
--> 391                         raise MissingInputError(error_msg, variable=r)
    392 
    393         for node in new_nodes:

MissingInputError: Input 0 of the graph (indices start from 0), used to compute sigmoid(c_interval__), was not provided and not given a value. Use the Theano flag exception_verbosity='high', for more information on this error.
@OriolAbril
Member

Thanks for reporting. I have to admit I am completely at a loss as to why the "observed data" (in this case, given the structure of the DensityDist, I think theta is treated as observed_data on the ArviZ side) has an eval method but cannot be evaluated; maybe @rpgoldman has some idea on this? It may be because theta/v is "sampled" and thus has different values for each draw and chain.

In the meantime I'd recommend using pm.Potential instead. Replacing the DensityDist with the line below should fix the problem:

pm.Potential('likelihood', logl(theta))

I think both alternatives are equivalent implementations of the exact same model; the difference is that, because pm.Potential has no observed argument, ArviZ does not try to retrieve the values passed to the observed kwarg and store them in the observed_data group.
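
For completeness, a minimal sketch of the workaround in context, reusing the model and the logl Op from the report above (only the DensityDist line changes):

with pm.Model():
    m = pm.Uniform('m', lower=-10., upper=10.)
    c = pm.Uniform('c', lower=-10., upper=10.)

    # convert m and c to a tensor vector
    theta = tt.as_tensor_variable([m, c])

    # pm.Potential adds the log-likelihood term directly; it has no observed
    # kwarg, so ArviZ never tries to eval theta as observed data
    pm.Potential('likelihood', logl(theta))
    trace = pm.sample(ndraws, tune=nburn, discard_tuned_samples=True)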

@rpgoldman
Contributor

rpgoldman commented Jul 5, 2020

This is interesting to me -- when sampling starts, I see this, which I have never seen before:

Initializing NUTS using jitter+adapt_diag...
Initializing NUTS failed. Falling back to elementwise auto-assignment.
Multiprocess sampling (4 chains in 4 jobs)
CompoundStep
>Slice: [c]
>Slice: [m]

@OriolAbril Are you familiar with this? Any idea whether this could be related to the later problem (which I am investigating now)?

@rpgoldman
Contributor

rpgoldman commented Jul 5, 2020

The error occurs while groveling over the Theano graph to categorize variables and find the observations. It involves the MultiObservedRV class, which I don't understand at all and have never encountered before.

This docstring is almost entirely unhelpful:

    """Observed random variable that a model is specified in terms of.
    Potentially partially observed.
    """

I have no clue what distinguishes a MultiObservedRV from an ObservedRV. @ferrine @aseyboldt @twiecki, I see your names in the output of git blame: could any of you explain what this class is for, and how it differs from a (singly) ObservedRV?

@OriolAbril
Member

Initializing NUTS failed. Falling back to elementwise auto-assignment.

Yes, I also got this message, which I had never seen before, so I don't know whether it could be related to the eval problem.

@rpgoldman
Contributor

Actually... isn't the answer to this simple? We have a variable whose "observed" values are not constant. So we shouldn't expect this to work, should we?

I'm pretty sure that PyMC3 expects observations not to be random variables, right?

I should mention that the Quickstart notebook for PyMC3 says:

"observed supports lists, numpy.ndarray, theano and pandas data structures."

The inclusion of theano data structures suggests that having random variables here would be OK, but I'm pretty sure that is not true. pm.Data is OK, but it is effectively constant, at least with respect to a single session of inference.
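
To illustrate the distinction, a minimal sketch of a hypothetical model (not from this issue) where the observations pass through pm.Data; they can be swapped between runs with pm.set_data, but they are constant during a single pm.sample call:

import numpy as np
import pymc3 as pm

with pm.Model():
    # pm.Data wraps a concrete array, so ArviZ can evaluate and store it
    # as observed_data without needing any sampled values
    y_obs = pm.Data('y_obs', np.random.randn(100))
    mu = pm.Normal('mu', 0., 1.)
    pm.Normal('y', mu=mu, sigma=1., observed=y_obs)
    trace = pm.sample(1000)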

@andyfaff
Author

andyfaff commented Jul 5, 2020

Thank you for the detective work on this so far. pm.Potential works in the meantime.

I should've pointed out that the example comes from a pymc3 tutorial itself. I've been looking for exactly this kind of tutorial notebook for a long time; black-box likelihoods are quite common in the area I work in. Once the problem has been resolved, that tutorial may need to be updated.

The NUTS initialisation fails because there is no gradient specified in the tensor Op (so far).
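
For reference, a hedged sketch of how a gradient could be attached so that NUTS initialisation succeeds, following the general pattern from the same tutorial; the LogLikeGrad class and the finite-difference step size below are illustrative, not the tutorial's exact code:

import numpy as np
import theano.tensor as tt


class LogLikeGrad(tt.Op):
    # maps a parameter vector to the vector d(logl)/d(theta)
    itypes = [tt.dvector]
    otypes = [tt.dvector]

    def __init__(self, loglike, data, x, sigma):
        self.likelihood = loglike
        self.data = data
        self.x = x
        self.sigma = sigma

    def perform(self, node, inputs, outputs):
        (theta,) = inputs
        # central finite differences, perturbing one parameter at a time
        eps = 1e-6  # illustrative step size
        grads = np.zeros_like(theta)
        for i in range(len(theta)):
            step = np.zeros_like(theta)
            step[i] = eps
            grads[i] = (
                self.likelihood(theta + step, self.x, self.data, self.sigma)
                - self.likelihood(theta - step, self.x, self.data, self.sigma)
            ) / (2 * eps)
        outputs[0][0] = grads


class LogLikeWithGrad(tt.Op):
    itypes = [tt.dvector]
    otypes = [tt.dscalar]

    def __init__(self, loglike, data, x, sigma):
        self.likelihood = loglike
        self.data = data
        self.x = x
        self.sigma = sigma
        self.logpgrad = LogLikeGrad(loglike, data, x, sigma)

    def perform(self, node, inputs, outputs):
        (theta,) = inputs
        logl = self.likelihood(theta, self.x, self.data, self.sigma)
        outputs[0][0] = np.array(logl)

    def grad(self, inputs, g):
        # chain rule: g[0] is the gradient arriving from downstream nodes
        (theta,) = inputs
        return [g[0] * self.logpgrad(theta)]

With logl = LogLikeWithGrad(my_loglike, data, x, sigma), NUTS should no longer fall back to Slice sampling.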

gmingas added a commit to alan-turing-institute/pymc3 that referenced this issue Jul 21, 2020
Switch from pm.DensityDist to pm.Potential to describe the likelihood in MLDA notebooks and script examples. This is done because of the bug described in arviz-devs/arviz#1279. The commit also changes a few parameters in the MLDA .py example to match the ones in the equivalent notebook.
gmingas added a commit to alan-turing-institute/pymc3 that referenced this issue Jul 22, 2020
gmingas added a commit to alan-turing-institute/pymc3 that referenced this issue Jul 31, 2020
gmingas added a commit to alan-turing-institute/pymc3 that referenced this issue Aug 10, 2020
gmingas added a commit to alan-turing-institute/pymc3 that referenced this issue Aug 11, 2020
gmingas added a commit to alan-turing-institute/pymc3 that referenced this issue Sep 14, 2020
twiecki pushed a commit to pymc-devs/pymc that referenced this issue Sep 15, 2020
* Add basic MLDA-MCMC algorithm (proposal and step method)

* Fix pm.DensityDist in notebooks/scripts by replacing with pm.Potential

This is done because of the bug described in arviz-devs/arviz#1279. The commit also changes a few parameters in the MLDA .py example to match the ones in the equivalent notebook.
@canyon289
Member

I'm not sure, but I believe this issue has been resolved. Closing for now; if I'm mistaken, comment and we can reopen.

@yanpanlau

I can confirm that pm.Potential() works beautifully, but pm.DensityDist() does not.

@burnpanck

burnpanck commented Jan 6, 2021

I also ran into this issue, likewise while following the PyMC3 tutorials to add a custom (black-box) likelihood. I don't know how responsibility is split between arviz and pymc3, but this is definitely still a bug in one of the two libraries, or at least in the pymc3 documentation.
