Add alternative default priors #360

Merged
tomicapretto merged 26 commits into bambinos:master from tomicapretto:priors
Jul 12, 2021
@tomicapretto
Collaborator

@tomicapretto tomicapretto commented Jun 21, 2021

This PR adds alternative default priors, which will be the defaults for models where we lack statsmodels support. These new priors do not replace the existing defaults (until we have evidence that they're equivalent). I will keep updating the following list of changes as I commit to this PR.

Changes

  • Remove Model._match_derived_terms(). It was unused because a categorical group-specific term is now a single term in the model, not several terms (one per dummy in the encoding of the categorical variable) as used to be the case.
  • Constant terms (categoricals with one level and numerics with a unique value) are flagged more appropriately.
  • Split PriorFactory._get_prior() into several methods with clearer names and goals. Also modified config.json as proposed in "Use objects instead of arrays in the config.json of the priors" (#361).
  • Prior._auto_scale is now Prior.auto_scale. It's a nuisance to add pylint exceptions all the time.
  • Nuisance parameters of the response distribution are scaled with the prior scaler and not when the prior term is added.
  • Added alternative automatic priors (inspired by rstanarm priors). See PriorScaler2.
  • Our tests should run faster because I removed unnecessary Model.build() calls.
  • Family names and priors are checked slightly differently, which makes the code simpler.
  • Removed methods and attributes that were used when we had multiple backends. Now that we only have PyMC3, it does not make sense to keep asking which backend is being used.
  • Added more tests.
  • Model has a new argument, priors_cor. It accepts dictionaries where keys are the names of the groups, and values are the eta parameter in the LKJ distribution for correlation matrices. If such a dictionary is present, priors for group-specific terms are a multivariate normal distribution with a non-zero correlation.
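To give a feel for what the eta values passed through priors_cor control, here is a minimal sketch. It is not Bambi code, and the function name is hypothetical. For a 2x2 correlation matrix with off-diagonal correlation rho, the LKJ density is proportional to det(R)^(eta - 1) = (1 - rho^2)^(eta - 1).

```python
def lkj_unnormalized_density_2d(rho, eta):
    """Unnormalized LKJ density for the correlation matrix [[1, rho], [rho, 1]].

    Hypothetical helper for illustration only; it is not part of Bambi or PyMC3.
    """
    if not -1 < rho < 1:
        raise ValueError("rho must lie strictly between -1 and 1")
    if eta <= 0:
        raise ValueError("eta must be positive")
    determinant = 1 - rho ** 2  # determinant of the 2x2 correlation matrix
    return determinant ** (eta - 1)


# Compare how strongly different eta values penalize large correlations.
for eta in (1, 2, 4):
    values = [lkj_unnormalized_density_2d(rho, eta) for rho in (0.0, 0.5, 0.9)]
    print(eta, values)
```

With eta = 1 the density is flat in rho, while larger values of eta concentrate prior mass around rho = 0, that is, around weakly correlated group-specific effects.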

@codecov-commenter

codecov-commenter commented Jun 21, 2021

Codecov Report

Attention: Patch coverage is 83.24873% with 66 lines in your changes missing coverage. Please review.

Project coverage is 90.20%. Comparing base (c0ca107) to head (f77d302).

Files with missing lines          Patch %   Lines
bambi/backends/pymc.py            56.00%    33 Missing ⚠️
bambi/models.py                   64.63%    29 Missing ⚠️
bambi/priors/scaler_default.py    95.16%    3 Missing ⚠️
bambi/priors/scaler_mle.py        92.30%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #360      +/-   ##
==========================================
+ Coverage   88.87%   90.20%   +1.33%     
==========================================
  Files          16       17       +1     
  Lines        1411     1613     +202     
==========================================
+ Hits         1254     1455     +201     
- Misses        157      158       +1     


@tomicapretto tomicapretto changed the title [WIP] Add alternative default priors Add alternative default priors Jun 28, 2021
@tomicapretto
Collaborator Author

I think this is ready for a review @aloctavodia, @canyon289, @twiecki. I know there's a lot going on in this PR and many things may not be that clear. Please ask as many questions as you want.

The TL;DR of this PR would be

  • We can now use an LKJ prior for the correlation matrix of the prior of the group-specific terms.
  • We have an alternative method to compute default priors, inspired by rstanarm priors. These are going to be used in the coming implementations for the t family, beta family, etc.

Collaborator

@aloctavodia aloctavodia left a comment

A few small comments; overall it seems good. There are many small details that improve readability, and also some changes that I need to read with more care to understand. I will try to keep reading tomorrow.

sigma = pm.HalfNormal.dist(sigma=sigma, shape=rows)

# Obtain Cholesky factor for the covariance
lkj_decomp, corr, sigma = pm.LKJCholeskyCov( # pylint: disable=unused-variable
Collaborator

You are not using corr, right? Then do:

lkj_decomp, _, sigma

Collaborator

Also, can we use a different name for the returned sigma and the input sigma?

Collaborator Author

I would like to use corr in the near future. I've been thinking we should report it even when independent priors are used. That's why it's there.

But in the meantime, I have no problem if you think the underscore is more appropriate.

Collaborator Author

And sigma... Well, they represent the same random variable in the model. The problem is that the first one is a .dist, so I have to recover the one returned by LKJCholeskyCov and add it to the trace.

I don't know if there are plans in PyMC3 to allow a random variable in LKJCholeskyCov; that would be the best solution, I think.

Collaborator Author

I think this is done in this PR
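
For readers following this thread, here is a minimal pure-Python sketch of what a Cholesky factor of the covariance encodes in the two-dimensional case. It is illustrative only and does not use PyMC3; all helper names are hypothetical. Given standard deviations s1 and s2 and a correlation rho, the lower-triangular factor L satisfies L L^T = covariance, and multiplying L by independent standard normal offsets yields correlated group-specific effects.

```python
import math


def cholesky_2d(s1, s2, rho):
    """Lower-triangular Cholesky factor of [[s1^2, rho*s1*s2], [rho*s1*s2, s2^2]]."""
    return [[s1, 0.0], [rho * s2, s2 * math.sqrt(1 - rho ** 2)]]


def covariance_from_cholesky(chol):
    """Recompute the covariance matrix as chol times its transpose, without NumPy."""
    n = len(chol)
    return [
        [sum(chol[i][k] * chol[j][k] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def correlated_effects(chol, offsets):
    """Map independent standard normal offsets to correlated effects (chol times offsets)."""
    return [
        sum(chol[i][k] * offsets[k] for k in range(len(offsets)))
        for i in range(len(chol))
    ]


chol = cholesky_2d(2.0, 0.5, 0.3)
cov = covariance_from_cholesky(chol)
print(cov)
```

The diagonal of the recovered covariance holds the squared standard deviations, which is why the standard deviations can be recovered from the factor itself rather than from the distribution passed in as input.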

bambi/models.py Outdated
automatic_priors: str
An optional specification to compute/scale automatic priors. ``"default"`` means to use
Bambi's default method. ``"rstanarm"`` means to use default priors from the R rstanarm
library. The latter are available in more scenarios because they don't depend on MLE.
Collaborator

Can we detect when default Bambi priors fail and switch to rstanarm's priors?

What is the advantage of continuing to use Bambi's defaults if they are more restricted?

Collaborator Author

I'm going to change the defaults to the rstanarm-inspired priors. This will make it simpler to implement the t and beta families.
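
As a rough illustration of why data-based prior scales do not need a maximum likelihood fit, the sketch below follows the autoscaling convention documented for rstanarm's Gaussian models: a slope scale of 2.5 * sd(y) / sd(x) and an intercept scale of 2.5 * sd(y). It is an assumption that the implementation in this PR uses exactly these constants. The function names are hypothetical, and the population standard deviation is used only to mirror np.std.

```python
import statistics


def slope_prior_scale(x, y, multiplier=2.5):
    """Scale of a Normal prior on a slope, computed from the data alone.

    Hypothetical helper; the 2.5 multiplier follows the rstanarm convention.
    """
    sd_x = statistics.pstdev(x)
    if sd_x == 0:
        # A constant predictor carries no information about the slope scale.
        raise ValueError("constant predictor: its prior scale is undefined")
    return multiplier * statistics.pstdev(y) / sd_x


def intercept_prior_scale(y, multiplier=2.5):
    """Scale of a Normal prior on the intercept, based on the spread of the response."""
    return multiplier * statistics.pstdev(y)


x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
print(slope_prior_scale(x, y), intercept_prior_scale(y))
```

Because only standard deviations of the observed columns are needed, the same computation is available for any family, which matches the docstring's point that these priors do not depend on MLE.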



def add_lkj(terms, eta=1):
# Parameters
Collaborator

I would make this a full docstring in NumPy style; that way it shows up under add_lkj.__doc__.

Collaborator Author

Makes sense!
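
A minimal sketch of what the suggestion amounts to: a NumPy-style docstring placed directly under the def line is stored in __doc__, whereas a leading `# Parameters` comment is not. The signature matches the snippet under review, but the parameter descriptions and the function body are placeholders assumed for illustration, not the actual implementation.

```python
def add_lkj(terms, eta=1):
    """Describe an LKJ prior shared by a set of group-specific terms.

    Parameters
    ----------
    terms : list of str
        Names of the group-specific terms that share one correlation matrix.
    eta : float, optional
        Positive shape parameter of the LKJ distribution. Defaults to 1.

    Returns
    -------
    dict
        A plain description with the keys ``"terms"`` and ``"eta"``.
    """
    # Placeholder body: the real function builds model variables instead.
    if eta <= 0:
        raise ValueError("eta must be positive")
    return {"terms": list(terms), "eta": eta}


# The docstring is available at runtime, for example to help() and to doc builders.
print(add_lkj.__doc__)
```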

if self.model.family.name == "gaussian":
    sigma = np.std(self.model.response.data)
    self.model.response.prior.update(sigma=Prior("HalfStudentT", nu=4, sigma=sigma))
# Add cases for other families
Collaborator

What's this comment for?

Collaborator Author

For example, the Gamma family has an auxiliary parameter alpha whose prior is HalfCauchy(beta=1). I left the comment to remember that it might be good to consider updating beta to another value based on the data, as with the sigma in the HalfStudentT prior above.

if self.mle is None:
    self.fit_mle()

# Scale it
Collaborator

Expand it to the full name; that makes the comment easier to read in isolation.

Collaborator Author

What do you mean by full name? Something like self.fit_maximum_likelihood_estimator? Or is it about the comment below?

if set(nuisance_params).intersection(set(priors)):
return {k: priors[k] for k in nuisance_params if k in priors}
return {k: priors.pop(k) for k in nuisance_params if k in priors}
return None
Collaborator

What happens if None is returned to the downstream call?

Collaborator Author

Nothing. When None is returned it means the user didn't pass any prior for any parameter in the response distribution and there's no need to update them.
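
The behaviour described in this answer can be sketched as follows. This is a simplified, self-contained illustration with hypothetical function names, not the code under review: user-supplied priors for nuisance parameters are popped out of the priors dictionary, and a None result tells the caller that there is nothing to update.

```python
def extract_family_prior(nuisance_params, priors):
    """Pop and return user-supplied priors for nuisance parameters, or None.

    Hypothetical helper: `priors` is mutated so that the remaining entries
    refer only to model terms.
    """
    if set(nuisance_params).intersection(priors):
        return {k: priors.pop(k) for k in nuisance_params if k in priors}
    return None


def update_response_prior(response_prior, family_prior):
    """Apply user overrides to the response prior; None means nothing to update."""
    if family_prior is not None:
        response_prior.update(family_prior)
    return response_prior


user_priors = {"sigma": "HalfNormal(sigma=2)", "x": "Normal(0, 1)"}
family_prior = extract_family_prior(["sigma"], user_priors)
response_prior = update_response_prior({"sigma": "default"}, family_prior)
print(family_prior, user_priors, response_prior)
```

Returning None rather than an empty dictionary keeps the "no user input" case explicit, at the cost of requiring the guard shown in update_response_prior.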

Collaborator

@canyon289 canyon289 left a comment

Short review. I will do a more in-depth review in the next 24 hours.

@canyon289
Collaborator

I'm sorry for missing this @mention. I get so many GitHub emails that things get lost. If I'm not responding to a PR fast enough, message me on Slack, or even message me proactively. That greatly reduces the chances that I'll miss PRs.

@tomicapretto
Collaborator Author

Thanks for the feedback Ravin. I'll message on Slack next time then! :D

@canyon289
Collaborator

I also missed doing the more in-depth review, but I will get it in this week :(

@tomicapretto
Collaborator Author

I've just realized this PR will close #320 since we're changing default priors

@tomicapretto tomicapretto merged commit edb3fe8 into bambinos:master Jul 12, 2021
@tomicapretto tomicapretto deleted the priors branch July 13, 2021 16:23

4 participants