Proposal: Dist shape refactor #1125
Conversation
```diff
@@ -194,7 +210,7 @@ def logp(self, value):
                      tau > 0, sd > 0)


-class HalfNormal(PositiveContinuous):
+class HalfNormal(PositiveUnivariateContinuous):
```
This might be a case where we want to start using multiple inheritance, e.g. HalfNormal(Positive, Univariate, Continuous)
Yeah, I thought so, too, but the old Java side of me tries to avoid it.
It seems pretty innocuous here.
I think we should go that route here unless it would cause other unanticipated problems.
Started splitting those up in the most recent commit; however, it became apparent that issues with constructor call order, and other confusing behavior, could arise in this scenario.
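For reference, a rough sketch of what the mixin route could look like, including the MRO that drives the constructor-call-order concern; the class contents here are illustrative assumptions, not the actual PyMC3 hierarchy:

```python
class Distribution(object):
    def __init__(self, shape_supp=(), shape_ind=(), shape_reps=(), **kwargs):
        self.shape_supp = shape_supp
        self.shape_ind = shape_ind
        self.shape_reps = shape_reps
        super(Distribution, self).__init__(**kwargs)


class Continuous(Distribution):
    pass


class Univariate(object):
    """Marker mixin: scalar support."""
    pass


class Positive(object):
    """Marker mixin: support restricted to the positive reals."""
    pass


class HalfNormal(Positive, Univariate, Continuous):
    def __init__(self, sd=1.0, **kwargs):
        self.sd = sd
        # Cooperative super() call; the MRO below determines which __init__
        # actually runs next (the "constructor call order" concern).
        super(HalfNormal, self).__init__(**kwargs)


print([c.__name__ for c in HalfNormal.__mro__])
# ['HalfNormal', 'Positive', 'Univariate', 'Continuous', 'Distribution', 'object']
hn = HalfNormal(sd=2.0)
```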
These changes should help address #535.
@brandonwillard Are you going to make more changes (like the multiple inheritance) or want another round of review?
I planned on spending some part of tomorrow getting the current tests to pass, so I might have a lot to push soon.
Great! You'll probably also want to rebase.
@twiecki: To really go forward with symbolic scalars in RV shapes (again, the number of scalars is not symbolic, since Theano doesn't handle that), it looks like we would have to put some or all of the MCMC sampling logic into a Theano graph.

We could always limit the symbolic shape capabilities by assuming that the shape scalars are always constant values; then we can keep the new Theano shape logic, add some steps to extract the constant values, and work toward more Theano integration as we go. In this case, the additions would simply serve to establish some organizational elements that are useful for multivariate work (i.e. the separate shape tuples for support, replications, etc.) and some Theano manipulations that attempt to perform automatic shape inference.

Any thoughts?
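As a small illustration of the "extract the constant values" option (not code from this branch): Theano can already pull concrete values out of shape scalars that are effectively constant, and it signals when they are not.

```python
import theano.tensor as tt

# A shape scalar that is effectively constant can be reduced to a plain value...
k = tt.constant(6)
print(tt.get_scalar_constant_value(k))   # 6

# ...whereas one that depends on a free variable cannot; this is where the
# proposal would raise an error (or fall back to fully symbolic handling).
n = tt.iscalar('n')
try:
    tt.get_scalar_constant_value(n)
except tt.basic.NotScalarConstantError:
    print('not a constant shape scalar')
```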
I think rewriting the samplers in Theano would make things way more complicated. Samplers have loops and if/then clauses, both things that are quite tricky in Theano. The benefit would be GPU sampling, but that would be its own project.

I think unified shape handling (even if constant) is a major improvement. Note that there are also inconsistencies in the univariate RVs (e.g. #1079) that hopefully can be fixed with this (is it already?). And as you said, we can always move closer to symbolic shapes later on.

Out of curiosity, would this allow changing the dimensionality of the model while sampling? Thinking of non-parametrics.

Should we take a closer look? This still needs to be rebased, which I suspect will cause some hiccups with the …
Seems like that stopped midway through.
4ccac8a: this commit doesn't seem to change anything with regard to random behavior?
Yeah, that change in the commit might just be my automatic formatting.
Regarding the samplers, we might not want/need to use the high-level Theano operations. I was actually thinking that we could move to more low-level Theano, such as implementing a …
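The exact suggestion is cut off above, but, assuming the idea is something like a custom Theano `Op` that performs the random draw in its `perform` method, a minimal sketch might look like this (illustrative only, not code from this PR):

```python
import numpy as np
import theano
import theano.tensor as tt


class NormalDrawOp(theano.Op):
    """Hypothetical low-level Op that draws `size` normal samples in perform()."""
    __props__ = ()

    def make_node(self, mu, sd, size):
        mu = tt.as_tensor_variable(mu)
        sd = tt.as_tensor_variable(sd)
        size = tt.as_tensor_variable(size)
        # The output is a 1-D float64 tensor of length `size`.
        return theano.Apply(self, [mu, sd, size], [tt.dvector()])

    def perform(self, node, inputs, output_storage):
        mu, sd, size = inputs
        output_storage[0][0] = np.random.normal(mu, sd, size=int(size))


mu, sd = tt.dscalars('mu', 'sd')
size = tt.iscalar('size')
draw = NormalDrawOp()(mu, sd, size)
f = theano.function([mu, sd, size], draw)
print(f(0.0, 1.0, 5))  # five draws from N(0, 1)
```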
Oh, and #1079 would be addressed by these changes (might need the associated changes to the …). For the exact problem in #1079, these changes intend to allow the following:

```python
import numpy as np
import pymc3 as pm

# artificial data
k = 2
ndata = 10
data = np.array([1, 2.2, 2.5, 2.8, 4, 6, 7.2, 7.5, 7.8, 9])

# model
alpha = 0.1 * np.ones((ndata, k))
with pm.Model() as model:
    # Dirichlet will know its support is a vector of length k, i.e. shape_supp = (k,),
    # and that there are ndata-many independent copies, i.e. shape_ind = (ndata,);
    # finally, there are no replications, so shape_reps = tuple().
    p = pm.Dirichlet('p', alpha)

    # Notice below that the "shape" keyword has been exchanged for "size", and
    # both Normal and Uniform have scalar support, i.e. shape_supp = tuple(), so
    # we've really just specified k-many replications, i.e. shape_reps = (k,).
    # Also, shape_ind = tuple(), but would generally be whatever shape the
    # parameters are.
    mu = pm.Normal('mu', mu=5, sd=3, size=k)
    sd = pm.Uniform('sd', lower=0.1, upper=0.5, size=k)

    # Now, the following should work, since shape_supp = p.shape_supp,
    # shape_ind = p.shape_ind, shape_reps = p.shape_reps.
    categ = pm.Categorical('categ', p=p)
    ...
```

The shapes for the last statement involving the `Categorical` would then follow from `p`'s shape tuples.
```
Parameters
----------
shape_supp
```
This needs to be reformatted eventually to match the numpy-doc style.
No problem. Documenting is much easier with a standard!
OK, I really like where this is going. The example is super helpful. Would you mind also adding one with multiple multivariate distributions? #1153 seems to be another good example. Also want to get @fonnesbeck to take a look at the API.

So if I passed …, I'm not sure I fully grok the difference between ….

What's left to do here? Certainly changing the examples (I can help with that), adding some unittests, and random support (do you think this will be tricky?).
Yeah, in many cases ….

To get this truly working, we need to address every current use of the shape information and either Theano-ize those parts (if possible) or restrict the symbolic scalar functionality by forcibly extracting the constant values from the shape scalars (and raising errors if they aren't effectively constant). Similarly, we could create Theano functions that evaluate the symbolic shapes in those non-Theano places they're used. Of course, those Theano functions should not take arguments (e.g. other random variables, etc.), but perhaps they could depend on shared variables, which can be evaluated at any time, I believe.
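For what it's worth, here is a rough sketch of the kind of second example with multiple multivariate distributions that was requested above, written against the proposed `size`/shape-tuple behavior; the `size` keyword and the automatic shape inference are assumptions about the proposal, not existing PyMC3 behavior.

```python
import numpy as np
import pymc3 as pm

n, k = 10, 3
data = np.random.randn(n, k)  # hypothetical n observations of a k-vector

with pm.Model() as model:
    # MvNormal has vector support, so shape_supp = (k,).  Under the proposal,
    # size=n would request n replications, i.e. shape_reps = (n,).
    mu = pm.MvNormal('mu', mu=np.zeros(k), cov=np.eye(k), size=n)

    # A second multivariate distribution whose support shape (k,) and
    # replication shape (n,) would be inferred from mu and the observed data,
    # with no explicit `shape` keyword anywhere.
    y = pm.MvNormal('y', mu=mu, cov=np.eye(k), observed=data)
```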
I think this needs a rebase. What else needs to be done?
I need to take a close look. Also, there are no tests written yet, which will be important.
There's a lot to be done; this branch was only intended to facilitate a discussion on symbolic shape handling. It does seem possible to make a real PR out of it, though. In summary:
Force-pushed from f2ee8ca to 23b9cea.
Waiting for @fonnesbeck to comment on the API.
On a very related note, why aren't the random variable classes (i.e. …) …?

FYI: Theano's random framework appears to use a …
@brandonwillard you'd have to ask @jsalvatier to comment on that, as he designed most of PyMC3. PyMC3 is more functional in design than PyMC2, so I believe the idea was to have the values pass through the random variable directly to the trace, thereby relieving the random variable of having to manage its value. I can see the advantages (and disadvantages) of both approaches: having the values together in one place (the …
@fonnesbeck @brandonwillard yup, that's basically the reason I didn't choose …
Ah, OK; I think I can see what you mean. A random variable is expected to depend on other symbolic variables, and I'm not clear on what a "dependent" …
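For orientation, since the code-formatted references above were lost, here is a minimal example of Theano's own random framework, which keeps the random state in shared variables and builds the draws into the graph:

```python
import theano
from theano.tensor.shared_randomstreams import RandomStreams

srng = RandomStreams(seed=1234)

# rv_n is a symbolic variable; the random state lives in shared variables
# that Theano updates each time the compiled function is called.
rv_n = srng.normal(size=(2, 2))
f = theano.function([], rv_n)

print(f())  # a new 2x2 draw on every call
print(f())
```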
@brandonwillard Sorry for being unresponsive. What's the status on this?
Force-pushed from 24c0180 to e6e1b4a.
```diff
         updates.update(
-            {v: (n * tt.exp(w) + u).reshape(v.tag.test_value.shape)})
+            {v: (n * tt.exp(w) + u).reshape(np.shape(v.tag.test_value))})
```
Why is that required?
`test_value` was returning a primitive (not a NumPy array) in the tests. Not sure yet whether that's due to another issue; still clearing out the big issues following the evaluation hack I just added. Will comment on all this once the tests are passing again (or sooner, if necessary).
Not necessary, was just curious. Great to see you picking this back up!
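As a quick illustration of why the `np.shape(...)` form in the diff above is the safer choice when `test_value` can come back as a plain Python scalar:

```python
import numpy as np

x = 3.0                      # a plain Python float, like the test_value in question
# x.shape                    # would raise AttributeError: no 'shape' on a float
print(np.shape(x))           # () -- np.shape handles primitives and arrays alike
print(np.shape(np.ones(3)))  # (3,)
```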
Quick update: I've put in a small hack that should allow these ….

The basic idea is that we're fine evaluating a symbolic Theano shape object if all its inputs are constants. We can then use the evaluated shape in all the places that aren't ready for purely symbolic shape objects. At the very least, it allows us to start setting up automatic [Theano] shape logic, which could make the high-level interface simpler. Some of this shape logic is demonstrated for the simplest case (univariate) here. I added a function …
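A minimal sketch of the kind of evaluate-if-constant check described here; the helper name and the exact mechanism used in the branch are assumptions:

```python
import theano.tensor as tt
from theano.gof import Constant
from theano.gof.graph import inputs as graph_inputs


def try_constant_shape(shape_var):
    """Hypothetical helper: return a concrete tuple if every input of the
    symbolic shape graph is a constant, otherwise None."""
    if all(isinstance(i, Constant) for i in graph_inputs([shape_var])):
        return tuple(int(s) for s in shape_var.eval())
    return None


x = tt.alloc(0.0, 3, 4)               # shape built entirely from constants
print(try_constant_shape(x.shape))    # (3, 4)

y = tt.vector('y')                    # shape depends on a free input
print(try_constant_shape(y.shape))    # None
```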
I gathered all the shape logic and put it into one class (here). That should cover univariate and multivariate cases, as well as keep the code organized and contained. The next step is to go through the distributions and wire them up to use that class.

I've also added some logic to handle "old style" distributions that only specify a `shape`.
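To make the idea concrete, here is a toy sketch of what such a shape-logic container might look like; the class name and the `reps + ind + supp` ordering convention are my assumptions, not the actual class in the branch:

```python
class DistShape(object):
    """Toy container for the three shape tuples discussed in this PR."""

    def __init__(self, shape_supp=(), shape_ind=(), shape_reps=()):
        self.shape_supp = tuple(shape_supp)
        self.shape_ind = tuple(shape_ind)
        self.shape_reps = tuple(shape_reps)

    @property
    def full_shape(self):
        # Replications vary slowest, then independent copies, then the support
        # dimensions; this matches the Dirichlet example above, where
        # shape_ind = (ndata,) and shape_supp = (k,) give an (ndata, k) value.
        return self.shape_reps + self.shape_ind + self.shape_supp


k, ndata = 2, 10
s = DistShape(shape_supp=(k,), shape_ind=(ndata,))
print(s.full_shape)  # (10, 2)
```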
@brandonwillard That looks like a useful abstraction. BTW, will this allow dynamic shapes? CC @ferrine
@twiecki It won't allow dynamic shapes, but it should get rid of the need to specify `shape`.
@brandonwillard That would already be a huge win.
Looks like #2833 is doing some very related work.
Yes, been meaning to reach out to you about that. Would be great to get your input on that PR.
Looking closer at #2833 (and its possible source/motivation, #2406), it is addressing the same thing(s). My first-pass general solution to the issue of symbolic shapes is here. It's self-contained and requires more work in order to be used within the ….

In the meantime, I'm looking over the current state of the project, since I've been out of the loop for a while, and I'll try to follow up soon. Also, this PR could use a description summarizing its efforts! If it isn't salvageable (relative to the current master), then perhaps I'll just submit a new PR to replace this one (and give it a description).
@brandonwillard Thanks for following up. Ideally we'd come up with a joint proposal. Perhaps @aseyboldt could read through your approach, and then we can discuss if and what we want to take. I do like the modularization of this approach.
Closing because of age. Feel free to reopen if you want to fix this up.
@twiecki Here's a draft describing the kinds of things I was trying to do with PyMC3, how they relate to the issue(s) in this PR, and some proposals that could become PRs (picking up from this one, perhaps). I'll keep updating it as I find more of my old work (which I'll most likely spread across a couple of posts), but input is more than welcome at any time.
This is pretty nice, @brandonwillard! Thanks for sharing. In case you missed it: we recently published our developer guide, https://docs.pymc.io/developer_guide.html. I think some of the graph-related explanation in your blog post could go into the developer guide as well.
@brandonwillard That is extremely cool! I'm excited about this deep thinking, and I think it's a good time for some more low-level changes, as we're already changing a few things around for 3.7 (dropping Python 2 support, the sd -> sigma renaming).
No description provided.