
Proposal: Dist shape refactor #1125

Closed
wants to merge 17 commits

Conversation

@twiecki (Member) commented May 23, 2016

No description provided.

@twiecki changed the title from "Dist shape refactor" to "Proposal: Dist shape refactor" on May 23, 2016
@@ -194,7 +210,7 @@ def logp(self, value):
                  tau > 0, sd > 0)


-class HalfNormal(PositiveContinuous):
+class HalfNormal(PositiveUnivariateContinuous):
@twiecki (Member Author):

This might be a case where we want to start using multiple inheritance, e.g. HalfNormal(Positive, Univariate, Continuous)

@brandonwillard (Contributor) commented May 23, 2016:

Yeah, I thought so, too, but the old Java side of me tries to avoid it.

@twiecki (Member Author):

It seems pretty innocuous here.

@twiecki (Member Author):

I think we should go that route here unless it would cause other unanticipated problems.

@brandonwillard (Contributor):

Started splitting those up in the most recent commit; however, it became apparent that constructor call order, and other confusing things, could arise in this scenario.
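
For reference, a minimal sketch of the mixin layout under discussion, using the class names from the suggestion above (the __init__ bodies are hypothetical); the cooperative super() chain is exactly where the constructor-call-order subtleties come from:

class Positive(object):
    def __init__(self, *args, **kwargs):
        # hypothetical: e.g. attach a log transform for positive support
        super(Positive, self).__init__(*args, **kwargs)

class Univariate(object):
    def __init__(self, *args, **kwargs):
        # hypothetical: e.g. declare scalar support, shape_supp = tuple()
        super(Univariate, self).__init__(*args, **kwargs)

class Continuous(object):
    def __init__(self, *args, **kwargs):
        super(Continuous, self).__init__(*args, **kwargs)

class HalfNormal(Positive, Univariate, Continuous):
    # MRO: HalfNormal -> Positive -> Univariate -> Continuous -> object.
    # Every __init__ must cooperate via super(), otherwise the chain breaks.
    pass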

@brandonwillard (Contributor):

These changes should help address #535.

@twiecki (Member Author) commented May 30, 2016:

@brandonwillard Are you going to make more changes (like the multiple inheritance) or want another round of review?

@brandonwillard (Contributor):

I planned on spending some part of tomorrow getting the current tests to pass, so I might have a lot to push soon.

@twiecki (Member Author) commented May 30, 2016:

Great! You'll probably also want to rebase.

@brandonwillard (Contributor) commented May 31, 2016:

@twiecki: To really go forward with symbolic scalars in RV shapes (again, the number of scalars is not symbolic, since Theano doesn't handle that), it looks like we would have to put some/all of the MCMC sampling logic into a Theano gof.Op.
The block-step code seems to depend on explicit shape values (within ArrayOrdering), and a different, non-Op approach to those operations doesn't immediately come to mind. Otherwise, compiling Theano functions to return the exact shapes seems lame, but feasible.
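
For what it's worth, the "compile Theano functions to return the exact shapes" fallback would look something like this minimal sketch:

import numpy as np
import theano
import theano.tensor as tt

x = tt.matrix('x')
# A compiled function whose only job is to recover concrete shape values.
shape_fn = theano.function([x], x.shape)
shape_fn(np.zeros((3, 4)))  # -> array([3, 4])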

We could always limit the symbolic shape capabilities by assuming that the shape scalars are always constant values; then we can keep the new Theano shape logic, add some steps to extract the constant values, and work toward more Theano integration as we go. In this case, the additions would simply serve to establish some organizational elements that are useful for multivariate work (i.e. the separate shape tuples for support, replications, etc.) and some Theano manipulations that attempt to perform automatic shape inference.

Any thoughts?

@twiecki (Member Author) commented May 31, 2016:

@brandonwillard

I think rewriting the samplers in Theano would make things way more complicated. Samplers have loops and if/then clauses, both of which are quite tricky in Theano. The benefit would be GPU sampling, but that would be its own project.

I think unified shape handling (even if constant) is a major improvement. Note that there are also inconsistencies in the univariate RVs (e.g. #1079) that hopefully can be fixed with this (is it already?).

And as you said, we can always move closer to symbolic shapes later on. Out of curiosity, would this allow changing the dimensionality of the model while sampling? Thinking of non-parametrics.

Should we take a closer look? It still needs to be rebased, which I suspect will cause some hiccups with the T -> tt change that we already have on master now.

> The latter should at least make

Seems like that stopped mid-way through.

@twiecki (Member Author) commented May 31, 2016:

4ccac8a: this commit doesn't seem to change anything with regard to random behavior?

@brandonwillard (Contributor) commented May 31, 2016:

Yeah, that change in the commit might just be my automatic formatting.

@brandonwillard (Contributor) commented May 31, 2016:

Regarding the samplers, we might not want/need to use the high-level Theano operations. I was actually thinking that we move to more low-level Theano, such as implementing a theano.Op. That way we might be able to put most of the current non-theano Python in a theano.Op.perform method. There would be a little work in matching up the inputs and outputs, but it could be worth it in the end.
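
Roughly, I mean something like the following sketch (the Op and the astep interface here are hypothetical, just to illustrate where the existing Python would live):

import numpy as np
import theano.tensor as tt
from theano.gof import Op, Apply

class SamplerStepOp(Op):
    """Hypothetical: wrap an existing Python-side step method in an Op."""

    def __init__(self, step):
        # `step` is assumed to expose an `astep(point) -> point` method.
        self.step = step

    def make_node(self, q0):
        q0 = tt.as_tensor_variable(q0)
        return Apply(self, [q0], [q0.type()])

    def perform(self, node, inputs, output_storage):
        # All the current non-Theano sampling Python would run here.
        (q0,) = inputs
        output_storage[0][0] = np.asarray(self.step.astep(q0))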

@brandonwillard (Contributor) commented May 31, 2016:

Oh, and #1079 would be addressed by these changes (might need the associated changes to the Distribution.random framework, though), since it specifies an order of dimensions for all distributions (i.e. shape_reps + shape_ind + shape_supp).

For the exact problem in #1079, these changes intend to allow the following:

import numpy as np
import pymc3 as pm

# artificial data
k = 2
ndata = 10
data = np.array([1, 2.2, 2.5, 2.8, 4, 6, 7.2, 7.5, 7.8, 9])

# model
alpha = 0.1 * np.ones((ndata, k))
with pm.Model() as model:
    # Dirichlet will know its support is a vector of length k, i.e. shape_supp = (k,),
    # and that there are ndata-many independent copies, i.e. shape_ind = (ndata,),
    # finally, there are no replications, so shape_reps = tuple()
    p = pm.Dirichlet('p', alpha)
    # Notice below that the "shape" keyword has been exchanged for "size", and 
    # both Normal and Uniform have scalar support, i.e. shape_supp = tuple(), so 
    # we've really just specified k-many replications, i.e. shape_reps = (k,).
    # Also, shape_ind = tuple(), but would generally be whatever shape the parameters
    # are.
    mu = pm.Normal('mu', mu=5, sd=3, size=k)
    sd = pm.Uniform('sd', lower=0.1, upper=0.5, size=k)
    # Now, the following should work, since shape_supp = p.shape_supp,
    # shape_ind = p.shape_ind, shape_reps = p.shape_reps.
    categ = pm.Categorical('categ', p=p)  

    ...

The shapes for the last statement involving the Categorical distribution could be obtained directly, and in a generic way, from its parameter p, with shape_supp = p.shape[-1] and shape_ind = p.shape[:-1], since we know--by definition--that the Categorical distribution has vector support.
If we tried this in the current codebase, p.shape[-1] would return a symbolic Theano scalar object and break things down the line.
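
Spelled out as code (a sketch; note that slicing with [-1:] keeps a 1-d shape piece matching the tuple convention above, whereas p.shape[-1] alone is a scalar):

# Sketch: deriving the Categorical's shape components from p alone.
shape_supp = p.shape[-1:]  # the support shape, i.e. (k,)
shape_ind = p.shape[:-1]   # the independent-copy shape, i.e. (ndata,)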


Parameters
----------
shape_supp
@twiecki (Member Author):

This needs to be reformatted eventually to match the numpy-doc style.

@brandonwillard (Contributor):

No problem. Documenting is much easier with a standard!

@twiecki (Member Author) commented Jun 1, 2016:

OK, I really like where this is going. The example is super helpful. Would you mind also adding one with multiple multivariate distributions? #1153 seems to be another good example.

Also want to get @fonnesbeck to take a look at the API.

So if I passed pm.Dirichlet('p', alpha.T), would shape_supp be ndata and shape_ind be k? What if I did pm.Dirichlet('p', 0.1 * np.ones(k), size=ndata) -- would that be equivalent?

I'm not sure I fully grok the difference between shape_ind and shape_reps. For the Dirichlet, couldn't ndata be the shape_reps?

What's left to do here? Certainly changing the examples (I can help with that), adding some unittests, random support (do you think this will be tricky?).

@brandonwillard (Contributor) commented Jun 1, 2016:

Yeah, in many cases shape_ind and shape_reps will serve the same purpose, but the distinction should be made. For example, the issue I was running into with bounded/truncated sampling arose because there was no distinction. Again, you can efficiently perform rejection sampling on many independent, identically distributed variables, but as soon as some of those variables have non-identical distributions (e.g. different means) you must handle those separately.
When someone instantiates a univariate distribution, like Normal, with non-scalar parameters, I'm just automatically assuming that the parameter tensors could have different values, making them independent but not identical. The size parameter, in contrast, explicitly serves to produce independent and identical copies of a distribution, since there's no way of specifying different distribution parameters with size alone.
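
Concretely, under the size keyword proposed here (a sketch; the variable names are just illustrative):

import numpy as np
import pymc3 as pm

k = 3
with pm.Model():
    # k-many iid copies of one distribution:
    # shape_reps = (k,), shape_ind = tuple()
    mu_reps = pm.Normal('mu_reps', mu=0., sd=1., size=k)
    # k-many independent but non-identical normals (a vector of means):
    # shape_ind = (k,), shape_reps = tuple()
    mu_ind = pm.Normal('mu_ind', mu=np.arange(k, dtype='float64'), sd=1.)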

To get this truly working, we need to address every current use of the shape information and either Theano-ize those parts--if possible--or restrict the symbolic scalar functionality by forcibly extracting the constant values from the shape scalars (and raising errors if they aren't effectively constant). Similarly, we could create Theano functions that evaluate the symbolic shapes in those non-Theano places where they're used. Of course, those Theano functions should not take arguments (e.g. other random variables, etc.), but perhaps they could depend on shared variables, which, I believe, can be evaluated at any time.
The random support is probably just a lot of typing and connecting the dots with Theano's random functions. However, there is the matter of symbolic random values (that's what Theano's random actually returns); we might want to integrate that idea/type of object into the API. Gotta think about that one, though.

@springcoil (Contributor):

I think this needs a rebase. What else needs to be done?

@fonnesbeck (Member):

I need to take a close look. Also, there are no tests written yet, which will be important.

@brandonwillard (Contributor) commented Jun 5, 2016:

There's a lot to be done; this branch was only intended to facilitate a discussion on symbolic shape handling. It does seem possible to make a real PR out of it, though.

In summary:

  • Update the current non-symbolic uses of shape information. This may involve further integration with Theano or on-the-spot evaluation of shapes. See my comments above.
    • In ArrayOrdering
    • In Distribution.random()
      • This also seems like a great time and place to start using Theano's random code and RandomFunction class.
    • ...
  • Implement the size parameter for univariate distributions, so that they work like the example described above.
  • Finish updating multivariate distributions so that they follow the shape convention described in the example above.
  • Update old tests and/or (re-)add a shape keyword for backward compatibility.

@twiecki (Member Author) commented Jun 6, 2016:

Waiting for @fonnesbeck to comment on the API.

@brandonwillard (Contributor) commented Jun 8, 2016:

On a very related note, why aren't the random variable classes (i.e. FreeRV, ObservedRV, TransformedRV) Theano SharedVariables?
PyMC3's use of test/default values sort of implies that SharedVariables might be reasonable. Also, the random variables in PyMC3 would operate more like they do in PyMC2; that is, they would have a value member, as well.

FYI: Theano's random framework appears to use a gof.Op (RandomFunction, specifically) for the type of object PyMC3 refers to as a random variable. That Op then works like a factory, producing a TensorType variable and updating the rng state.
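
For reference, that machinery in its simplest form (a sketch):

import theano
from theano.tensor.shared_randomstreams import RandomStreams

srng = RandomStreams(seed=234)
# `normal` builds a RandomFunction Op node; `rv_n` is a symbolic tensor,
# and the rng state is advanced through the function's implicit updates.
rv_n = srng.normal(size=(2,))
f = theano.function([], rv_n)
f()  # each call returns fresh draws because the rng state is updated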

@fonnesbeck (Member):

@brandonwillard you'd have to ask @jsalvatier to comment on that, as he designed most of PyMC3. PyMC3 is more functional in design than PyMC2, so I believe the idea was to have the values pass through the random variable directly to the trace, thereby relieving the random variable of having to manage its value. I can see the advantages (and disadvantages) of both approaches -- having the values together in one place (the Point) versus distributed across their respective variables.

@jsalvatier (Member):

@fonnesbeck @brandonwillard Yup, that's basically the reason I didn't choose SharedVariable. In retrospect, things have moved more in the direction of using shared variables, since samplers replace some variables with shared variables. It's possible that it would be good to make them shared variables, but that would require some investigation.

@brandonwillard (Contributor):

Ah, ok; I think I can see what you mean. A random variable is expected to depend on other symbolic variables, and I'm not clear on what a "dependent" SharedVariable in Theano would be.

@twiecki (Member Author) commented Jun 15, 2016:

@brandonwillard Sorry for being unresponsive. What's the status on this?

 updates.update(
-    {v: (n * tt.exp(w) + u).reshape(v.tag.test_value.shape)})
+    {v: (n * tt.exp(w) + u).reshape(np.shape(v.tag.test_value))})
@twiecki (Member Author):

Why is that required?

@brandonwillard (Contributor):

test_value was returning a primitive (not a numpy array) in the tests. Not sure if that's due to another issue, yet; still clearing out the big issues following the evaluation hack I just added. Will comment on all this once the tests are passing again (or sooner, if necessary).

@twiecki (Member Author):

Not necessary, was just curious. Great to see you picking this back up!

@brandonwillard (Contributor):

Quick update: I've put in a small hack that should allow these Distribution changes to function in most cases (and, thus, provide the details necessary for addressing issues like #1069).

The basic idea is that we're fine evaluating a symbolic Theano shape object if all its inputs are constants. We can then use the evaluated shape in all the places that aren't ready for purely symbolic shape objects. At the very least it allows us to start setting up automatic [Theano] shape logic, which could make the high level interface simpler.

Some of this shape logic is demonstrated for the simplest case (univariate) here.

I added a function has_const_inputs that helps determine if all inputs are constant. Seems like a thing that might already exist in Theano, but nothing came to mind and that was simple enough to write. The function theano.tensor.get_scalar_constant_value seemed like the right direction, but it doesn't handle enough cases to work for this.
Originally, the idea was to compile a Theano function to evaluate the shape object, which could also be used when the shape has variable dependencies. The eval approach is just for simplicity, and, since I'm sure that some of the intermediate steps in constructing a proper Theano function could replace has_const_inputs, I'll revisit this Theano function approach soon.
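
The gist of has_const_inputs, sketched (the branch's actual implementation may differ):

import theano.tensor as tt
from theano import gof

def has_const_inputs(var):
    # True when every leaf of `var`'s graph is a constant, in which
    # case eagerly evaluating the symbolic shape is safe.
    return all(isinstance(inp, gof.Constant)
               for inp in gof.graph.inputs([var]))

shape = tt.stack([tt.constant(3), tt.constant(4)])
if has_const_inputs(shape):
    concrete_shape = tuple(shape.eval())  # (3, 4)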

@brandonwillard (Contributor):

I gathered all the shape logic and put it into one class (here). That should cover univariate and multivariate cases, as well as keep the code organized and contained. Next step is to go through the distributions and wire them to use that DistributionShape class.

I've also added some logic to handle "old style" distributions that only specify shape. Hopefully that will put things back on track when everything is set up.
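
As a rough picture of that container (a sketch; the field names follow the shape_reps + shape_ind + shape_supp convention from the discussion above, and the actual class in the branch may differ):

class DistributionShape(object):
    """Sketch: one place to hold a distribution's three shape components."""

    def __init__(self, shape_supp=(), shape_ind=(), shape_reps=()):
        self.shape_supp = tuple(shape_supp)  # shape of a single draw's support
        self.shape_ind = tuple(shape_ind)    # independent, non-identical copies
        self.shape_reps = tuple(shape_reps)  # iid replications (e.g. from `size`)

    @property
    def shape(self):
        # The full shape under the fixed dimension ordering.
        return self.shape_reps + self.shape_ind + self.shape_supp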

@twiecki (Member Author) commented Jan 6, 2017:

@brandonwillard that looks like a useful abstraction. Btw, will this allow dynamic shapes? CC @ferrine

@brandonwillard (Contributor):

@twiecki, it won't allow dynamic shapes, but it should get rid of the need to specify shape in cases where the shape is actually fixed/constant. More important, it leaves us in a position to start gradually enabling dynamic shapes.

@twiecki (Member Author) commented Jan 18, 2017:

@brandonwillard That would already be a huge win.

@brandonwillard (Contributor):

Looks like #2833 is doing some very related work.

@twiecki (Member Author) commented Feb 16, 2018 via email

@brandonwillard (Contributor):

Looking closer at #2833 (and its possible source/motivation, #2406), it is addressing the same thing(s).

My first-pass general solution to the issue of symbolic shapes is here. It's self-contained and requires more work in order to be used within the Distribution class(es), but it does single out and provide the needed shape functionality. Also, it might provide some shape-handling steps that can be generalized (e.g. to other backends).

In the meantime, I'm looking over the current state of the project, since I've been out of the loop for a while, and I'll try to follow up soon. Also, this PR could use a description summarizing its efforts! If it isn't salvageable, relative to the current master, then perhaps I'll just submit a new PR to replace this one (and give it a description).

@twiecki (Member Author) commented Feb 19, 2018:

@brandonwillard Thanks for following up. Ideally we'd come up with a joint proposal. Perhaps @aseyboldt could read through your approach and we discuss if and what we want to take. I do like the modularization of this approach.

@twiecki (Member Author) commented Dec 22, 2018:

Closing because of age. Feel free to reopen if you want to fix this up.

@twiecki closed this Dec 22, 2018
@brandonwillard (Contributor):

@twiecki Here's a draft describing the kinds of things I was trying to do with PyMC3, how they relate to the issue(s) in this PR, and some proposals that could become PRs (picking up from this one, perhaps). I'll keep updating it as I find more of my old work—which I'll most likely spread across a couple posts—but input is more than welcome at any time.

@junpenglao (Member) commented Dec 25, 2018:

This is pretty nice, @brandonwillard! Thanks for sharing.
I remember seeing something about combining sympy with theano to do symbolic computation - is your work in any way related to that?

In case you missed it: we recently published our developer guide, https://docs.pymc.io/developer_guide.html; I think some of the graph-related explanation in your blog post could go into the developer guide as well.

@twiecki (Member Author) commented Dec 25, 2018:

@brandonwillard That is extremely cool! I'm excited about this deep thinking and I think it's a good time for some more low-level changes, as we're already changing a few things around for 3.7 (dropping python 2 support, sd->sigma renaming).
