Add save_trace and load_trace #2975
Conversation
|
THANK YOU! This will solve so many pickling issues! |
|
Just so I understand, what are the pickle issues? Incompatibility between Python versions and security concerns? |
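On the security side of that question, here is a minimal stdlib-only sketch of why loading an untrusted pickle is risky. The `Payload` class is hypothetical and exists only for illustration; it uses `print` as a harmless stand-in for whatever callable an attacker might choose.

```python
import json
import pickle


class Payload:
    """Hypothetical class: pickle calls whatever __reduce__ returns on load."""

    def __reduce__(self):
        # pickle.loads will call print(...) here; it could be any callable.
        return (print, ("code executed during pickle.loads",))


blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # runs print(...) and returns its return value

# A data-only format can only ever hand back plain values.
restored = json.loads(json.dumps({"weights": [0.1, -0.3]}))
```

Loading the pickle runs the callable as a side effect, whereas the JSON round trip only reconstructs data.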
|
This looks good to me and ready to merge. Any objections, @ColCarroll? |
|
Great stuff! |
|
Oh, I forgot, we should add this to the release-notes, and also add some example docs somewhere. |
|
A separate PR for that would work. I'll have a stab at adding some docs on
this this afternoon.
|
|
I will add release notes, and I realized that there's an edge case I didn't cover that requires deleting the directory before writing to it (if you save a model with variables |
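A rough sketch of the "delete the directory before writing to it" step mentioned above, using only the standard library. The helper name `prepare_directory` and its `overwrite` flag are assumptions for illustration, not necessarily what this PR implements.

```python
import os
import shutil
import tempfile


def prepare_directory(directory, overwrite=False):
    """Hypothetical helper: leave `directory` empty before a save writes into it."""
    if os.path.isdir(directory):
        if not overwrite:
            raise OSError(f"{directory!r} already exists; pass overwrite=True to replace it")
        # Remove everything left over from an earlier save.
        shutil.rmtree(directory)
    os.makedirs(directory)
    return directory


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "my_trace.trace")
    os.makedirs(target)
    # Simulate a file left behind by a previous save.
    open(os.path.join(target, "stale_variable.txt"), "w").close()

    prepare_directory(target, overwrite=True)
    leftover = os.listdir(target)

    try:
        prepare_directory(target)  # overwrite defaults to False
        refused = False
    except OSError:
        refused = True
```

Refusing by default avoids silently destroying an existing trace; clearing the directory on request avoids mixing old and new files.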
|
Does anyone have sample code for predicting with this trace-loading functionality? I have a Gaussian model: my objective is to predict f(x) for a new x in a new Python session without rerunning the model training piece... |
|
I might need more detail for what you're trying to do. Here's an example, though:
First, generate a random model:
import os
import numpy as np
import matplotlib.pyplot as plt
import pymc3 as pm
import theano
import theano.tensor as tt

dims = 2
N = 100
true_weights = np.random.normal(size=(dims,))
data = np.random.normal(size=(N, dims))
noise = np.random.normal(0, 0.5, size=N)
y = np.dot(data, true_weights) + noise
print(true_weights)
Now do a cached prediction -- running this multiple times will work, even changing the predict_data.
cache_file = 'my_trace.trace'
# A shared variable lets the inputs be swapped out after the model is built.
s_data = theano.shared(data)
with pm.Model() as model:
    weights = pm.Normal('weights', mu=0, sd=1, shape=dims)
    y_obs = pm.Normal('y_obs', mu=tt.dot(s_data, weights), sd=0.5, observed=y, shape=s_data.shape[0].eval())

# Sample and save only if there is no cached trace; otherwise load it.
if not os.path.exists(cache_file):
    with model:
        trace = pm.sample()
    pm.save_trace(trace, directory=cache_file)
else:
    trace = pm.load_trace(cache_file, model=model)

predict_data = np.array([
    [0, 1],
    [1, 0],
    [1, 1],
    [2, 2],
])
s_data.set_value(predict_data)
with model:
    ppc = pm.sample_ppc(trace)
print(trace['weights'].mean(axis=0))  # pretty close to true weights
print(ppc['y_obs'].mean(axis=0))  # should be reasonable |
|
Thanks very much. Here is what I'm trying to do....
*Defining Priors:*
with pm.Model() as gp_fit:
    ρ = pm.Gamma('ρ', 1, 2)
    η = pm.Gamma('η', 1, 2)
    K = η * pm.gp.cov.ExpQuad(2, ρ)
with gp_fit:
    M = pm.gp.mean.Zero()
    σ = pm.HalfCauchy('σ', 0.5)
*Initial Pseudo Points:*
k_m_point = 20
Xu_init = pm.gp.util.kmeans_inducing_points(k_m_point, Xs)
*Sparse Gaussian Optimization:*
with gp_fit:
    gp = pm.gp.MarginalSparse(cov_func=K, approx="VFE")
    #Xu=Xu_init
    Xu = pm.Flat("Xu", shape=(20, 2), testval=Xu_init)
    y_ = gp.marginal_likelihood("y_", X=Xs, Xu=Xu, y=y, noise=σ)
    #y_ = gp.marginal_likelihood("y_", X=Xs, Xu=Xu, y=y, noise=σ)
    mp = pm.find_MAP()
    trace = pm.sample(500, n_init=1000)
*Like to save the fitted model at this stage....*
*Prediction:*
mu_dev, var_dev = gp.predict(X_new, point=mp, diag=True)
I would like to do this prediction without the preceding steps by loading
the model/trace... What all do I need to save? Do I need to also load the
pseudo points at the time of inference?
|

This provides functions to save and load traces, avoiding pickle. My main use would be saving traces while running a large notebook, or distributing the traces with code containing the models used to produce them.
Pros:
Cons:
trace.report yet (though that could be added without breaking compatibility)
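To illustrate the general idea of a pickle-free, directory-based save, here is a minimal stdlib-only sketch. The function names, the one-subdirectory-per-chain layout, and the `samples.json` file name are all assumptions made for this example; they are not the format this PR writes.

```python
import json
import os
import tempfile


def save_chains(chains, directory):
    """Write each chain's samples to <directory>/<chain index>/samples.json."""
    for idx, samples in enumerate(chains):
        chain_dir = os.path.join(directory, str(idx))
        os.makedirs(chain_dir, exist_ok=True)
        with open(os.path.join(chain_dir, "samples.json"), "w") as fh:
            json.dump(samples, fh)


def load_chains(directory):
    """Read the chains back in index order; only plain data is parsed."""
    chains = []
    for name in sorted(os.listdir(directory), key=int):
        with open(os.path.join(directory, name, "samples.json")) as fh:
            chains.append(json.load(fh))
    return chains


# Hypothetical samples: two chains, two draws each of a 2-d 'weights' variable.
chains = [
    {"weights": [[0.1, -0.3], [0.2, -0.2]]},
    {"weights": [[0.0, -0.4], [0.1, -0.1]]},
]
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "my_trace.trace")
    save_chains(chains, target)
    restored = load_chains(target)
```

Because only values are stored, whoever loads the files still needs the code that defines the model, which matches the use case of distributing traces alongside the model code.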