return_inferencedata option for pm.sample #3911

michaelosthege · 2020-05-04T11:43:10Z

Most PyMC3 modeling workflows (should) at some point use ArviZ to save and load traces as InferenceData objects.

With a sample(…, return_inferencedata=True/False) option, the conversion may happen as early as possible, encouraging the user to use ArviZ from the start.

- wait for next ArviZ release to happen
- update PR such that
  - pin to latest ArviZ version
  - take out the NotImplementedError
- there are no breaking changes to the user-facing API, but a FutureWarning is added, allowing us to default to return_inferencedata=True in v4.0.0 if it turns out to be a good idea
- in "use inference data in end of sampling report" (#3883), Oriol made some optimizations to avoid duplicate conversion work. This PR does the same by doing the conversion already in sample() and passing the idata to the convergence checks.
- are the changes, especially new features, covered by tests and docstrings? A test was added; the other tests remain untouched, so they assert that the user-facing API did not break
- consider adding/updating relevant example notebooks: I'll do that in a future PR
- right before it's ready to merge, mention the PR in the RELEASE-NOTES.md
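
The kwarg handling described above (unset by default, falling back to False, with a FutureWarning about a possible later default of True) could be sketched roughly as follows. This is a minimal illustration using only the standard library; the helper name `resolve_return_inferencedata` and the warning text are assumptions for this sketch and are not taken from the PR diff.

```python
import warnings


def resolve_return_inferencedata(return_inferencedata=None):
    """Hypothetical helper: turn the tri-state kwarg (None/True/False) into a bool."""
    if return_inferencedata is None:
        # No explicit choice was made: warn about the possible future default,
        # then keep the current behaviour (a MultiTrace, i.e. False).
        warnings.warn(
            "The default of return_inferencedata may change to True in a future "
            "release. Pass return_inferencedata=True or False explicitly to "
            "silence this warning.",
            FutureWarning,
            stacklevel=2,
        )
        return False
    return bool(return_inferencedata)
```

Using None as the default lets the function distinguish "the user did not choose" from "the user chose False", so only the first case triggers the warning.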

michaelosthege · 2020-05-04T12:28:50Z

This is the first stage - converting to InferenceData earlier.

I'm going to wait for the tests before adding the return_inferencedata option.
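
The "convert early, reuse for the checks" idea of this first stage can be illustrated with a small stand-alone sketch. All names here (`convert`, `run_convergence_checks`, `sample_sketch`) are hypothetical stand-ins, plain dictionaries take the place of real traces and InferenceData objects, and none of it is the actual code in pymc3/sampling.py.

```python
def convert(trace, counter):
    """Stand-in for an expensive trace -> InferenceData conversion."""
    counter["conversions"] += 1
    return {"posterior": list(trace)}


def run_convergence_checks(idata):
    """Stand-in diagnostic that consumes the already-converted object."""
    return len(idata["posterior"]) > 0


def sample_sketch(trace, return_inferencedata=False):
    """Convert once after sampling, then reuse the result everywhere."""
    counter = {"conversions": 0}
    idata = convert(trace, counter)      # single conversion, right after sampling
    ok = run_convergence_checks(idata)   # checks reuse idata instead of converting again
    result = idata if return_inferencedata else trace
    return result, ok, counter["conversions"]
```

Whichever object is handed back to the caller, the conversion counter stays at one, which is the duplicate work the PR description says it wants to avoid.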

codecov · 2020-05-04T13:11:18Z

Codecov Report

Merging #3911 into master will decrease coverage by 2.90%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #3911      +/-   ##
==========================================
- Coverage   86.40%   83.49%   -2.91%     
==========================================
  Files          86      103      +17     
  Lines       13728    14192     +464     
==========================================
- Hits        11861    11849      -12     
- Misses       1867     2343     +476

Impacted Files	Coverage Δ
pymc3/backends/ndarray.py	`92.63% <ø> (ø)`
pymc3/backends/text.py	`97.16% <ø> (ø)`
pymc3/backends/report.py	`95.31% <100.00%> (+2.23%)`	⬆️
pymc3/sampling.py	`86.47% <100.00%> (+0.27%)`	⬆️
pymc3/gp/util.py	`50.74% <0.00%> (-29.26%)`	⬇️
pymc3/distributions/dist_math.py	`91.50% <0.00%> (-0.20%)`	⬇️
pymc3/tuning/starting.py	`80.76% <0.00%> (-0.15%)`	⬇️
pymc3/distributions/continuous.py	`80.01% <0.00%> (ø)`
pymc3/examples/gelman_bioassay.py	`0.00% <0.00%> (ø)`
pymc3/examples/factor_potential.py	`0.00% <0.00%> (ø)`
... and 16 more

michaelosthege · 2020-05-04T13:16:11Z

Tests are green.
Now pushing the addition of the return_inferencedata kwarg. The tests should remain green...

rpgoldman

Looks good to me, pending tests.

Probably needs some additional tests that exercise returning InferenceData.

michaelosthege · 2020-05-04T14:45:23Z

I've added a test for the new kwarg.

As .sample( appears another 78 times in the tests, I'd propose to update those tests with the future PR that changes the default behavior.

So this PR is ready for review - I'll update the release notes when I address your upcoming feedback.

michaelosthege · 2020-05-04T14:49:24Z

You'll notice a NotImplementedError when return_inferencedata==True and discard_tuned_samples==False.

I'm in contact with Oriol to get arviz.from_pymc3 support for warmup draws. If this can be arranged in an ArviZ release in the near future, we can use that and pin that ArviZ version for the 3.9 release?
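
The temporary restriction mentioned in this comment might look roughly like the guard below. The function name and the error message are hypothetical, chosen only for illustration; the PR description lists removing this NotImplementedError as a follow-up step once ArviZ supports warmup draws.

```python
def check_warmup_request(return_inferencedata, discard_tuned_samples):
    """Hypothetical guard: reject the combination that is not supported yet."""
    if return_inferencedata and not discard_tuned_samples:
        # Keeping warmup draws requires converter support for them.
        raise NotImplementedError(
            "Returning InferenceData together with warmup draws is not "
            "supported in this sketch."
        )
```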

AlexAndorra

Thanks Michael, this looks good to me 👌
I'll merge tomorrow morning, in case @rpgoldman or @ColCarroll request some last-minute changes.

AlexAndorra · 2020-05-10T16:32:58Z

@michaelosthege, as discussed, @OriolAbril's warmup PR was merged on ArviZ repo, and the next ArviZ release will be 0.8.0, so I think we can reactivate work on this PR and address the comments we made above 👌

michaelosthege · 2020-05-10T16:44:00Z

Yes, we can wait for the ArviZ release.
I'll address the feedback when the release is out.

fonnesbeck · 2020-05-10T18:42:24Z

We should probably either convert one of the example notebooks to using this throughout, or create a new one. The api_quickstart is probably the place to put it.

michaelosthege · 2020-05-25T23:28:08Z

@AlexAndorra and @OriolAbril thank you for the feedback on the notebook! The suggestions were really straightforward to apply.

AlexAndorra

Thanks a lot @michaelosthege ! There was a small typo left, and I left a question about how to handle warmups 😉 Going to review the NB now.

review-notebook-app · 2020-05-26T10:37:17Z

AlexAndorra commented on 2020-05-26T10:37:17Z
----------------------------------------------------------------

Can you set tune to 1500 or 2000? As the default is now 1000, not sure it's a good idea to nudge users towards using less than default, especially as lots of them already don't pay a lot of attention to tuning samples. Plus, the sampler is complaining because not enough tuning here ;)

review-notebook-app · 2020-05-26T10:37:18Z

AlexAndorra commented on 2020-05-26T10:37:18Z
----------------------------------------------------------------

Typo: "With discard_tuned_samples=False", not True

Also, maybe put this line below the next cell, otherwise it makes it seem like you're showing the place where tuning samples are kept in ID.

review-notebook-app · 2020-05-26T10:37:19Z

AlexAndorra commented on 2020-05-26T10:37:18Z
----------------------------------------------------------------

You have to set the kwarg r_hat=True

review-notebook-app · 2020-05-26T10:37:19Z

AlexAndorra commented on 2020-05-26T10:37:19Z
----------------------------------------------------------------

To use the Data container through the end you can set the new values like this, in the model context (see this tutorial for more):

with model:
    pm.set_data({'x_obs': [-1, 0, 1.0], "y_obs": [0, 0, 0]})
    post_pred = pm.sample_posterior_predictive(idata.posterior)

Let's also use the default number of samples, as earlier.

AlexAndorra · 2020-05-26T10:41:16Z

@michaelosthege , finished reviewing the NB -- just a few tweaks to add / typos to fix and we'll be good to go 👌

michaelosthege · 2020-05-26T12:17:41Z

The pipeline fails because of arviz-devs/arviz#1210

I could make a workaround for the test case, but I also think having draws=0 is a valid use case..

AlexAndorra

The NB is a marvel now, thanks a lot for your patience and perseverance on this @michaelosthege !
Waiting for ArviZ 0.8.3 to be released, and I'll merge if tests pass then -- I really, really hope no other edge case will pop up this time 😅 🤞

AlexAndorra · 2020-05-27T15:38:13Z

Problems seem to be fixed on ArviZ master, but before releasing 0.8.3, we need to make sure this really works. To do that, the best would be that Travis tests against ArviZ master, not only against the latest release -- @canyon289 do you know if it's possible and how easy it would be?
Also pinging @twiecki and @ColCarroll, who could have valuable input.

OriolAbril · 2020-05-27T22:11:32Z

Given that @michaelosthege has successfully run the tests locally, I'll merge tomorrow to make release 0.8.3 and hope for the best.

AlexAndorra · 2020-05-28T15:32:21Z

@OriolAbril was emboldened by your successful local tests and released ArviZ 0.8.3 @michaelosthege, so you can push your last changes and I'll merge if tests pass 🤞

OriolAbril · 2020-05-28T17:52:02Z

🎉

AlexAndorra

We've done it, well done guys 👏 Quite a team 💪

michaelosthege added 4 commits May 4, 2020 12:41

mention arviz functions by name in warning

a00fff8

convert to InferenceData already in sample function

6dec963

+ convert to InferenceData and save metadata to it already in sample() + pass idata instead of trace to convergence check, to avoid duplicate work + directly use arviz diagnostics instead of pymc3 aliases

fix refactoring bugs

7c4ae31

fix indentation

653e3d2

add return_inferencedata option

7789c2c

+ set to None + defaults to False

Fix numpy docstring format.

d2b312f

Replaced "<varname>: <type>" with "<varname> : <type>" per numpy guidelines. Fix spelling typo.

rpgoldman approved these changes May 4, 2020

michaelosthege and others added 3 commits May 4, 2020 16:30

pass model to from_pymc3 because of deprecation warning

1c558d4

add test for return_inferencedata option

2cf5e54

Merge branch 'master' into return-inferencedata-option

bb3aef0

michaelosthege changed the title ~~[WIP] return_inferencedata option for pm.sample~~ return_inferencedata option for pm.sample May 4, 2020

michaelosthege added enhancements request discussion labels May 4, 2020

AlexAndorra approved these changes May 4, 2020

michaelosthege added 3 commits May 4, 2020 20:30

advise against keeping warmup draws in a MultiTrace

67d9c58

mention pymc-devs#3911

c910221

Merge branch 'return-inferencedata-option' of https://github.com/mich…

0a06da1

…aelosthege/pymc3 into return-inferencedata-option

AlexAndorra requested changes May 4, 2020

AlexAndorra reviewed May 4, 2020

OriolAbril reviewed May 4, 2020

michaelosthege removed the request discussion label May 6, 2020

michaelosthege added the don't merge label May 10, 2020

update arviz to 0.8.1 because of bugfix

96a2387

AlexAndorra reviewed May 25, 2020

michaelosthege added 2 commits May 26, 2020 01:12

incorporate review feedback

6ff17a2

+ more direct use of ArviZ + some wording things

use arviz plot_ppc

fa324ff

michaelosthege added 3 commits May 26, 2020 11:38

also ignore Visual Studio cache

9d208c4

fix warmup saving logic and test

d154a04

require latest ArviZ patch

2912825

AlexAndorra requested changes May 26, 2020

michaelosthege added 2 commits May 26, 2020 11:58

change warning to nuget users towards InferenceData

50d7588

update ArviZ minimum version

ff7e1ea

michaelosthege added 2 commits May 26, 2020 14:40

address review feedback

e698387

start showing the FutureWarning about return_inferencedata in minor r…

3f6aafd

…elease >=3.10

AlexAndorra reviewed May 26, 2020

require arviz>=0.8.3 for latest bugfix

0f756e3

OriolAbril approved these changes May 28, 2020

AlexAndorra approved these changes May 28, 2020

AlexAndorra merged commit aafa00b into pymc-devs:master May 28, 2020

michaelosthege deleted the return-inferencedata-option branch May 28, 2020 18:40

eigenfoo mentioned this pull request Jun 1, 2020

New commits to pymc3/sampling.py or pymc3/step_methods/hmc/ eigenfoo/littlemcmc#66

Closed
