
Maximum a posteriori for Turing models #605


Closed · yebai opened this issue Nov 22, 2018 · 13 comments

@yebai (Member) commented Nov 22, 2018

Excerpt from Slack discussions:

@goedman

Not sure how often this channel is used, but I wonder if Turing has a way to retrieve the model portion passed into the @model ... macro. Before actually running the sampler, I would like to compute the maximum a posteriori (MAP) estimate (as is done in McElreath's book Statistical Rethinking), e.g. using Optim.jl.

@mohamed82008

using Turing

@model gdemo(x, y) = begin
    s ~ InverseGamma(2, 3)
    m ~ Normal(0, sqrt(s))
    x ~ Normal(m, sqrt(s))
    y ~ Normal(m, sqrt(s))
    return s, m
end

x = 1.5
y = 2.0
model = gdemo(x, y)
vi = Turing.VarInfo()
model(vi, Turing.SampleFromPrior())

function nlogp(sm)
    vi.vals .= sm
    model(vi, Turing.SampleFromPrior())
    -vi.logp
end

using Optim
sm_0 = Float64.(vi.vals)
lb = [0.0, -Inf]
ub = [Inf, Inf]
result = optimize(nlogp, lb, ub, sm_0, Fminbox())
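
To read the estimate back out afterwards, a small illustrative addition using standard Optim.jl accessors (the ordering [s, m] follows the declaration order in the model, matching lb and ub above):

# Illustrative follow-up: extract the optimum; ordering is [s, m].
s_map, m_map = Optim.minimizer(result)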

Related: #421 #72

@yebai (Member, Author) commented Nov 22, 2018

@cpfiffer, could you add some docs (perhaps under the internals section) on this topic?

@goedman commented Nov 22, 2018

Hi @yebai, I hesitated about switching to Slack when I asked the question. If you prefer that folks ask these questions via GitHub issues, please let me know.

@yebai (Member, Author) commented Nov 22, 2018

@goedman No worries, either Slack or GitHub works! I moved this discussion here because it's something we want to solve in the near future.

@cpfiffer (Member) commented:

I'll add a little section about this behavior.

@cpfiffer (Member) commented:

I have typed up something very quick, which mostly just liberally copies @mohamed82008's great little hack. Once I put up a pull request I'll probably stick it at the bottom of the Advanced section. Any comments to add here?

Maximum a Posteriori Estimation

Turing does not currently have built-in methods for calculating the maximum a posteriori (MAP) estimate for a model. This is a goal for Turing's implementation, but for the moment we present a method for estimating the MAP using Optim.jl.

using Turing

# Define the simple gdemo model.
@model gdemo(x, y) = begin
    s ~ InverseGamma(2,3)
    m ~ Normal(0,sqrt(s))
    x ~ Normal(m, sqrt(s))
    y ~ Normal(m, sqrt(s))
    return s, m
end

# Define our data points.
x = 1.5
y = 2.0

# Set up the model call, sample from the prior.
model = gdemo(x, y)
vi = Turing.VarInfo()
model(vi, Turing.SampleFromPrior())

# Define a function to optimize.
function nlogp(sm)
    vi.vals .= sm
    model(vi, Turing.SampleFromPrior())
    -vi.logp
end

# Import Optim.jl.
using Optim

# Create a starting point, call the optimizer.
sm_0 = Float64.(vi.vals)
lb = [0.0, -Inf]
ub = [Inf, Inf]
result = optimize(nlogp, lb, ub, sm_0, Fminbox())
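
One caveat, as an assumption about later Optim.jl releases rather than part of the original draft: newer versions of Fminbox expect an explicit inner optimizer, so the last call would look roughly like:

# On newer Optim.jl releases, Fminbox wraps an inner optimizer explicitly;
# the zero-argument Fminbox() above matches the API at the time of writing.
result = optimize(nlogp, lb, ub, sm_0, Fminbox(GradientDescent()))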

@goedman commented Nov 24, 2018

Thank you, Cameron (@cpfiffer).

I have been experimenting a bit with creating a function for the lower part. Would you mind taking a quick peek as a sanity check?

using Turing, Optim

function maximum_a_posteriori(model, lower_bound, upper_bound)
    vi = Turing.VarInfo()
    model(vi, Turing.SampleFromPrior())

    # Define a function to optimize.
    function nlogp(sm)
        vi.vals .= sm
        model(vi, Turing.SampleFromPrior())
        -vi.logp
    end

    start_value = Float64.(vi.vals)
    optimize(nlogp, lower_bound, upper_bound, start_value, Fminbox())
end


# Define the simple gdemo model.
@model gdemo(x, y) = begin
    s ~ InverseGamma(2,3)
    m ~ Normal(0,sqrt(s))
    x ~ Normal(m, sqrt(s))
    y ~ Normal(m, sqrt(s))
    return s, m
end

# Define our data points.
x = 1.5
y = 2.0

# Set search bounds
lb = [0.0, -Inf]
ub = [Inf, Inf]

model = gdemo(x, y)

result = maximum_a_posteriori(model, lb, ub)
println()

result.minimizer

which returns:

julia> include("/Users/rob/.julia/dev/StatisticalRethinking/chapters/02/map/map05.jl")
loaded
[ Info:  Assume - `s` is a parameter
[ Info:  Assume - `m` is a parameter
[ Info:  Observe - `x` is an observation
[ Info:  Observe - `y` is an observation

2-element Array{Float64,1}:
 0.9074074076939405
 1.1666666666666472

I assume the "loaded" comes from the Turing compiler?

@cpfiffer (Member) commented:

Looks sane to me! The `loaded` line, I believe, comes from Julia when it loads some package.

@joshualeond commented:

Looks like this was removed from the docs. Has MAP been added to Turing yet? Cool to see how easy it is to implement with Optim though.

@goedman commented Feb 15, 2019

Did you use it? I can easily put it back into the TuringModels.jl repo. Having CmdStan, Mamba, DynamicHMC and Turing all in a single project became simply too slow to work on. Let me know.

I typically use a log likelihood function + Optim. But getting it as rock-solid as quap() in rethinking is tricky with respect to the initial values, e.g. in 04/clip-24-29s.jl and 10/m10.02d1.

In fact, I should put it back!
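
For illustration, here is a minimal sketch of that log-likelihood-plus-Optim approach on the gdemo data from above (the function name and starting values are illustrative, not from the thread):

using Distributions, Optim

x, y = 1.5, 2.0

# Negative log likelihood for the gdemo data; θ = [s, m].
function neg_loglik(θ)
    s, m = θ
    s <= 0 && return Inf  # keep the variance parameter positive
    -(logpdf(Normal(m, sqrt(s)), x) + logpdf(Normal(m, sqrt(s)), y))
end

# A derivative-free method sidesteps AD; as noted above, the result can
# be sensitive to the starting values.
mle = optimize(neg_loglik, [1.0, 0.0], NelderMead())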

@cpfiffer (Member) commented:

Strange. I seem to recall having put the MAP guide into the docs, but the history shows it never actually made it in. I'll add it (back?) to the docs section, at least until we have a more formal implementation for MAP estimation.

@joshualeond commented:

Hey @goedman, I hadn't used it before. I've been going through McElreath's book with R and thought it would be fun to reproduce it in Julia. Just stumbled across this issue after some searching. I think the DynamicHMC examples are really interesting to see.

@goedman commented Feb 18, 2019

Hi @joshualeond, thank you.

I too have been very impressed by DynamicHMC's performance and by how models are formulated in it. In fact, I've started to use the logpdf formulations more often in CmdStan. At least one model in DynamicHMC does still run into (AD?) problems (m10.4d.jl is not correct).

I can't wait until (maybe early next year?) there are two additional pure-Julia options (DynamicHMC and Turing) available for MCMC!

I did add the maximum_a_posteriori example back in m2.1t.jl and in the master docs for TuringModels. I hadn't realized Cameron had documented it as well.

yebai mentioned this issue Jul 4, 2019
yebai added this to the 0.8 milestone Dec 17, 2019
@yebai (Member, Author) commented Dec 16, 2021

Fixed by #1448
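
For reference, a minimal sketch of the built-in interface that PR introduced (assuming the mode-estimation API shape it added; see the current Turing docs for the exact form):

using Turing, Optim

@model gdemo(x, y) = begin
    s ~ InverseGamma(2, 3)
    m ~ Normal(0, sqrt(s))
    x ~ Normal(m, sqrt(s))
    y ~ Normal(m, sqrt(s))
end

model = gdemo(1.5, 2.0)

# MAP estimation through Turing's mode-estimation interface.
map_estimate = optimize(model, MAP())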

yebai closed this as completed Dec 16, 2021