API proposal: container or data class #2056
Comments
there are several problems.
data_gen = pm.Data(numpy_array, in_memory=10000, minibatch=500, memory_update=custom_generator)
# creates shared with size 10000 and slices randomly for 500 samples
# also has data_gen.callback with signature `(*_)`
# callback updates in_memory storage
# model collects those callbacks and creates a single callback `minibatch_update`
# minibatch_update should be called on demand; the recommended delay is 10000/500 |
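To make the buffering idea above concrete, here is a minimal pure-Python sketch. It is not PyMC3 or Theano code: `BufferedData`, `make_minibatch_update`, `sample`, and `recommended_delay` are hypothetical names, a plain list stands in for the shared variable, and an ordinary iterable stands in for `custom_generator`.

```python
import random


class BufferedData:
    """Toy stand-in for the proposed container: an in-memory window plus random minibatches."""

    def __init__(self, source, in_memory, minibatch, seed=None):
        self._source = iter(source)   # hypothetical stand-in for `custom_generator`
        self.in_memory = in_memory
        self.minibatch = minibatch
        self._rng = random.Random(seed)
        self.buffer = []
        self.callback()               # fill the buffer once up front

    def callback(self, *_):
        """Refill the in-memory buffer from the source; extra arguments are ignored."""
        fresh = []
        for item in self._source:
            fresh.append(item)
            if len(fresh) == self.in_memory:
                break
        if fresh:                     # keep the old buffer if the source is exhausted
            self.buffer = fresh

    def sample(self):
        """Return a random minibatch drawn without replacement from the buffer."""
        size = min(self.minibatch, len(self.buffer))
        return self._rng.sample(self.buffer, size)

    @property
    def recommended_delay(self):
        """Minibatches to draw before refreshing the buffer (in_memory / minibatch)."""
        return self.in_memory // self.minibatch


def make_minibatch_update(containers):
    """Combine the per-container callbacks into one callback, as a model might do."""
    def minibatch_update(*_):
        for container in containers:
            container.callback()
    return minibatch_update


data_gen = BufferedData(range(100_000), in_memory=10_000, minibatch=500, seed=0)
minibatch_update = make_minibatch_update([data_gen])
batch = data_gen.sample()             # 500 values drawn from the first 10000
```

Calling `minibatch_update()` roughly every `data_gen.recommended_delay` minibatches would swap in the next 10000 values from the source.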
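It would be nice if our PyMC nodes were able to infer shape from the data, and change when data are swapped out. Not sure why we would want the Data object to be independent of the model. I thought that would be part of the point of having a class. What's your thinking there @ferrine? I like the Data name. |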
Hi all,
Agreed with what's been said so far from @fonnesbeck and @twiecki.
I like the whole data class - I think it's intuitive and is what we do.
On Wed, Apr 19, 2017 at 3:05 PM, Chris Fonnesbeck wrote:
It would be nice if our PyMC nodes were able to infer shape from the data, and change when data are swapped out.
Not sure why we would want the Data object to be independent of the model. I thought that would be part of the point of having a class. What's your thinking there @ferrine?
I like the Data name.
--
Peadar Coyle
Skype: springcoilarch
www.twitter.com/springcoil
peadarcoyle.wordpress.com
|
Agree it should be part of the model, that's the point. |
But we don't currently have consistent shape inference. For example, try parameterizing a |
How can we track shape changes through arbitrary theano operations? I think it's like a dream.
|
Well, I did say "it would be nice" ... but, if we have a robust |
@ferrine That's possible if |
Or with a |
I would like to work on this issue. What is the solution you're looking for:
|
Sounds great @jmloyola! It's the first: create a new data container (that probably inherits from theano.shared) that registers itself to the model to allow the API outlined above. |
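One way the "registers itself to the model" part could work is through a stack of active model contexts. The sketch below is pure Python under that assumption: `Model` and `Data` here are toy classes, not the PyMC3 ones, and a plain attribute stands in for the Theano shared variable.

```python
class Model:
    """Minimal stand-in for a model context manager; not the PyMC3 class."""

    _stack = []                       # models currently entered via `with`

    def __init__(self):
        self.data = {}                # name -> Data container

    def __enter__(self):
        Model._stack.append(self)
        return self

    def __exit__(self, *exc_info):
        Model._stack.pop()
        return False                  # do not swallow exceptions

    @classmethod
    def current(cls):
        """Return the innermost active model, or fail if there is none."""
        if not cls._stack:
            raise RuntimeError("Data must be created inside a `with Model():` block")
        return cls._stack[-1]


class Data:
    """Named, swappable value that registers itself with the enclosing model."""

    def __init__(self, name, value):
        model = Model.current()
        if name in model.data:
            raise ValueError(f"Data named {name!r} is already registered")
        self.name = name
        self._value = value
        model.data[name] = self       # the model now knows about this input

    def get_value(self):
        return self._value

    def set_value(self, value):
        self._value = value


with Model() as model:
    x = Data("x", [1, 2, 3])

# Because the model holds a reference, callers can swap data by name.
model.data["x"].set_value([4, 5, 6])
```

With this layout the user no longer has to keep their own handle to the shared variable; anything that has the model can look the container up by name.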
This issue can be closed. 😃 |
Currently it's a bit clunky (but at least possible) to change out data in a model. E.g.:
Then, if I want to predict on new data, I have to call:
The fact that the model has no idea of the data is a bit problematic for cases where we want to automate this process, e.g. when predicting or running ppc on hold-out data.
I'm proposing a `pm.Data` container (better ideas for names welcome) that would work like this:
Then, if I want to predict on new data, I have to call:
or, with nicer api:
and `predictions` would be a dict like `{'y': [[4, 5, 6], ...]}`
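The swap-and-predict flow described above could be sketched without PyMC3 or Theano as follows. Everything here is hypothetical and for illustration only: `Shared`, `swapped`, and `predict` are made-up helpers, the "trace" is a list of dicts, and the toy model is a deterministic `y = slope * x` with no sampling noise.

```python
from contextlib import contextmanager


class Shared:
    """Tiny stand-in for a shared variable: a mutable holder for one value."""

    def __init__(self, value):
        self.value = value


@contextmanager
def swapped(data, replacements):
    """Temporarily replace registered values, restoring the originals afterwards."""
    originals = {name: data[name].value for name in replacements}
    try:
        for name, value in replacements.items():
            data[name].value = value
        yield
    finally:
        for name, value in originals.items():
            data[name].value = value


def predict(trace, data, outputs, replacements):
    """Evaluate each output once per posterior draw, using the replaced inputs."""
    with swapped(data, replacements):
        return {
            name: [fn(draw, data) for draw in trace]
            for name, fn in outputs.items()
        }


# Toy model: y = slope * x, giving one row of predictions per posterior draw.
data = {"x": Shared([1, 2, 3])}
outputs = {"y": lambda draw, data: [draw["slope"] * xi for xi in data["x"].value]}
trace = [{"slope": 1.0}, {"slope": 2.0}]

predictions = predict(trace, data, outputs, {"x": [4, 5, 6]})
# predictions == {'y': [[4.0, 5.0, 6.0], [8.0, 10.0, 12.0]]}
```

Restoring the original values in a `finally` block is one possible design choice; it keeps the training data in place even if evaluating an output fails.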
What is `pm.Data`? Just a `theano.shared` that is known to the model.
The model is now aware of its in- and outputs. For example, if the model is a glm, we could very easily have API that just plots the PPC over a range, e.g. `with model: pm.plot_posterior_glm(trace, eval={'x': np.linspace(-3, 3, 100)})`. Behind the scenes it would replace the value, call ppc, and plot the result.
Another thing I haven't thought about is whether this can also help with mini-batching, making that API nicer. Maybe @ferrine has some thoughts on this.