Let plot_posterior_predictive_glm work with inferencedata too #4234

MarcoGorelli · 2020-11-19T16:57:12Z

BTW, extending it to work with InferenceData and time series data would be awesome 🤩 So the best may be to add it to Bambi instead

I haven't (yet) made sense of Bambi (though it does look like an awesome project), so for now here's a little PR to slightly extend plot_posterior_predictive_glm's functionality - then, the glm notebooks can be re-run with return_inferencedata=True

BTW, what's the rule with the copyright header? I see it in some files, but not all

RELEASE-NOTES.md

codecov · 2020-11-20T04:58:09Z

Codecov Report

Merging #4234 (2a38f67) into master (b7b145d) will increase coverage by 0.09%.
The diff coverage is 88.88%.

@@            Coverage Diff             @@
##           master    #4234      +/-   ##
==========================================
+ Coverage   87.85%   87.95%   +0.09%     
==========================================
  Files          88       88              
  Lines       14495    14499       +4     
==========================================
+ Hits        12734    12752      +18     
+ Misses       1761     1747      -14

Impacted Files	Coverage Δ
pymc3/plots/posteriorplot.py	`95.65% <88.88%> (+74.59%)`	⬆️

AlexAndorra

Thanks @MarcoGorelli, this is a nice start! I think the plotting functions can actually be refactored into one unique function -- see my comments below and feel free to ask if anything is unclear 😉

AlexAndorra · 2020-11-20T08:42:57Z

pymc3/plots/posteriorplot.py

+def _plot_multitrace(trace, eval, lm, samples, kwargs):
    for rand_loc in np.random.randint(0, len(trace), samples):
        rand_sample = trace[rand_loc]
        plt.plot(eval, lm(eval, rand_sample), **kwargs)
        # Make sure to not plot label multiple times
        kwargs.pop("label", None)

-    plt.title("Posterior predictive")
+
+def _plot_inferencedata(trace, eval, lm, samples, kwargs):
+    trace_df = trace.posterior.to_dataframe()
+    for rand_loc in np.random.randint(0, len(trace_df), samples):
+        rand_sample = trace_df.iloc[rand_loc]
+        plt.plot(eval, lm(eval, rand_sample), **kwargs)
+        # Make sure to not plot label multiple times
+        kwargs.pop("label", None)


These two functions have a lot of duplicated lines; I think they can be merged into one by checking if isinstance(trace, MultiTrace) at the beginning of the function (or just before) and casting the InferenceData to_array (I think this is the name of the function but you can check on ArviZ website) instead of to a dataframe.
After that, the handling should be the same as you're dealing with numpy arrays in both cases

if we cast to array then I think we wouldn't be able to access the different parameters (e.g. 'Intercept' or 'x'), which appear in lm. Each element here is a dict in the multitrace case:

> /home/mgorelli/pymc3-dev/pymc3/plots/posteriorplot.py(61)_plot_multitrace()
-> plt.plot(eval, lm(eval, rand_sample), **kwargs)
(Pdb) type(rand_sample)
<class 'dict'>
(Pdb) rand_sample
{'x': 1.0, 'Intercept': 1.0}

at this point, the only lines they have in common are

plt.plot(eval, lm(eval, rand_sample), **kwargs)
# Make sure to not plot label multiple times
kwargs.pop("label", None)

the others are slightly different. My reason for making two separate helper functions is that I thought it'd be more readable than a single function with many if/then statements - I'll go with whatever you think is best though 😇

Ah right, I forgot the whole trace was given here, and not only trace["y"] for instance. But then, wouldn't trace.posterior.to_dataframe().to_dict() get the format we want? That way we'd need only one plotting function

sure - I think this'd be slightly more expensive, but arguably it's worth it for the sake of much simpler code
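
To make the format concrete, here is a small sketch (not the code in this PR) of what the to_dataframe() route produces. The DataFrame below is a hand-built stand-in for trace.posterior.to_dataframe(), the values are invented, and sample_lines is a hypothetical helper. One assumption worth flagging: pandas' default to_dict() orientation nests values by column, so orient="records" is used here to get one dict per draw, matching the {'x': ..., 'Intercept': ...} shape that lm expects.

```python
import random

import pandas as pd

# Stand-in for trace.posterior.to_dataframe(): one row per (chain, draw),
# one column per scalar parameter. The values are made up for illustration.
index = pd.MultiIndex.from_product([[0, 1], [0, 1, 2]], names=["chain", "draw"])
posterior_df = pd.DataFrame(
    {
        "Intercept": [1.0, 1.1, 0.9, 1.2, 1.0, 0.8],
        "x": [2.0, 2.1, 1.9, 2.2, 2.0, 1.8],
    },
    index=index,
)

# orient="records" yields one {parameter: value} dict per posterior draw.
records = posterior_df.to_dict(orient="records")


def lm(x, sample):
    """Linear model evaluated at x for a single posterior sample."""
    return sample["Intercept"] + sample["x"] * x


def sample_lines(records, xs, n_samples, rng):
    """Evaluate lm on n_samples randomly chosen draws (with replacement)."""
    lines = []
    for _ in range(n_samples):
        sample = records[rng.randrange(len(records))]
        lines.append([lm(x, sample) for x in xs])
    return lines


lines = sample_lines(records, xs=[0.0, 1.0], n_samples=4, rng=random.Random(0))
```

Each entry of lines holds the y-values that would be handed to plt.plot for one sampled draw; plotting is left out so the sketch runs without matplotlib.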

pymc3/tests/test_plots.py

MarcoGorelli · 2020-11-26T11:15:42Z

pymc3/plots/posteriorplot.py

@@ -12,18 +12,28 @@
 #   See the License for the specific language governing permissions and
 #   limitations under the License.

-try:
-    import matplotlib.pyplot as plt
-except ImportError:  # mpl is optional


I don't think mpl is optional anymore - arviz is a required dependency, and mpl is a required dependency of arviz

michaelosthege · 2020-12-15T20:32:40Z

Without going into the details myself: could you consider marking the non-InferenceData-based API as deprecated?
GLMs should work just fine with pm.sample(return_inferencedata=True), right?

MarcoGorelli · 2020-12-16T14:25:17Z

> GLMs should work just fine with pm.sample(return_inferencedata=True), right?

Not currently, no

Is pm.sample no longer going to return MultiTrace objects at all, or is that just no longer going to be the default?

michaelosthege · 2020-12-16T14:33:21Z

> > GLMs should work just fine with pm.sample(return_inferencedata=True), right?
>
> Not currently, no
>
> Is pm.sample no longer going to return MultiTrace objects at all, or is that just no longer going to be the default?

I don't think we can or should drop the MultiTrace option as long as it's still used internally.
If at some point we implement an xarray-backend, we should do it.

But I think we should switch the default.

MarcoGorelli · 2020-12-16T14:40:04Z

Sure, but then why should it be marked as deprecated in plot_posterior_predictive_glm if it's not going to be removed in the future?

I may have misunderstood - could you clarify what exactly should throw a deprecation warning?

michaelosthege · 2020-12-16T14:41:29Z

> Sure, but then why should it be marked as deprecated in plot_posterior_predictive_glm if it's not going to be removed in the future?
>
> I may have misunderstood - could you clarify what exactly should throw a deprecation warning?

Oh, you're right. Unless GLMs work fine with InferenceData, we can't take away MultiTrace support from the plotting.

MarcoGorelli · 2020-12-16T14:48:39Z

Here's the issue: currently, this works

with glm_model:
    trace = pm.sample()
pm.plot_posterior_predictive_glm(trace)

but this doesn't

with glm_model:
    trace = pm.sample(return_inferencedata=True)
pm.plot_posterior_predictive_glm(trace)

This PR would let both of the above work. Unless pm.sample(return_inferencedata=False) will be deprecated, I don't think we need to deprecate plot_posterior_predictive_glm taking a MultiTrace
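
One way to picture a single entry point that accepts both return types is sketched below. This is an illustration, not the merged implementation: as_point_list and the Stub* classes are hypothetical stand-ins, and real code would check against pymc3's MultiTrace and ArviZ's InferenceData instead.

```python
import pandas as pd


class StubMultiTrace:
    """Hypothetical stand-in: indexable, each item is a {parameter: value} dict."""

    def __init__(self, points):
        self._points = list(points)

    def __len__(self):
        return len(self._points)

    def __getitem__(self, idx):
        return self._points[idx]


class StubPosterior:
    """Hypothetical stand-in for the posterior group; only to_dataframe() is mimicked."""

    def __init__(self, frame):
        self._frame = frame

    def to_dataframe(self):
        return self._frame


class StubInferenceData:
    """Hypothetical stand-in exposing just a .posterior attribute."""

    def __init__(self, frame):
        self.posterior = StubPosterior(frame)


def as_point_list(trace):
    """Return something indexable by draw, whichever object pm.sample returned."""
    if isinstance(trace, StubInferenceData):
        return trace.posterior.to_dataframe().to_dict(orient="records")
    if isinstance(trace, StubMultiTrace):
        return trace
    raise TypeError(f"unsupported trace type: {type(trace).__name__}")


points = [{"Intercept": 1.0, "x": 2.0}, {"Intercept": 1.5, "x": 2.5}]
from_multitrace = as_point_list(StubMultiTrace(points))
from_idata = as_point_list(StubInferenceData(pd.DataFrame(points)))
```

After this normalization step, the sampling and plotting loop can index either result the same way, which is the point of keeping MultiTrace support alongside InferenceData.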

AlexAndorra · 2020-12-17T14:37:00Z

Is this ready for review @MarcoGorelli ?

MarcoGorelli · 2020-12-17T14:41:36Z

sure!

AlexAndorra

All good now, thanks for sticking to it @MarcoGorelli !

MarcoGorelli added 6 commits November 19, 2020 13:23

adapt for both

cbab50c

add multitrace test

2eeb5ae

add inferencedata test, cover all lines

22991c9

update release notes

3eb3410

sort imports (it's not checked yet?)

5c37ef7

fixup test

58d0839

MarcoGorelli commented Nov 19, 2020

fixup PR number

c145b0a

AlexAndorra requested changes Nov 20, 2020

MarcoGorelli added 5 commits November 20, 2020 11:43

🏷️ add type annotations, correct docstring

e28f2b8

Merge branch 'extend-plot_posterior_predictive_glm' of github.com:marcogorelli/pymc3 into extend-plot_posterior_predictive_glm

08c11fc

sort

987e170

sort

f93da72

add copyright note

40ac14b

michaelosthege added the enhancements label Nov 20, 2020

MarcoGorelli added 8 commits November 20, 2020 15:08

use todict

9540abb

🎨

a0fa02b

don't cover import error (matplotlib is a dev requirement)

258018a

don't cover import typechecking

9147b34

remove optional mpl import

ddcebeb

remove optional mpl import

fa7687f

remove optional mpl import

2ceaa13

Merge remote-tracking branch 'upstream/master' into extend-plot_posterior_predictive_glm

9cf211a

MarcoGorelli commented Nov 26, 2020

michaelosthege added this to the vNext (3.11.0) milestone Dec 15, 2020

MarcoGorelli added 5 commits December 16, 2020 14:49

Merge branch 'master' into extend-plot_posterior_predictive_glm

63a5a85

Update RELEASE-NOTES.md

0106d77

use Python3.7+ type hints

3fe8210

Merge remote-tracking branch 'upstream/master' into extend-plot_posterior_predictive_glm

9e012fb

Merge branch 'extend-plot_posterior_predictive_glm' of github.com:MarcoGorelli/pymc3 into extend-plot_posterior_predictive_glm

2a38f67

AlexAndorra approved these changes Dec 17, 2020

AlexAndorra merged commit 4e5edd5 into pymc-devs:master Dec 17, 2020

MarcoGorelli deleted the extend-plot_posterior_predictive_glm branch December 17, 2020 15:24

MarcoGorelli mentioned this pull request Feb 16, 2021

Re-run glm-linear pymc-devs/pymc-examples#38

Merged
