Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Diagnostic plotting with ArviZ.jl #113

Closed
kskyten opened this issue Nov 22, 2019 · 4 comments
Closed

Diagnostic plotting with ArviZ.jl #113

kskyten opened this issue Nov 22, 2019 · 4 comments

Comments

@kskyten
Copy link

kskyten commented Nov 22, 2019

ArviZ.jl is a package for exploratory analysis of Bayesian models and has many useful diagnostics plots. I'm trying to convert the diagnostics from DynamicHMC into a compatible format (See also: arviz-devs/ArviZ.jl#4 and arviz-devs/ArviZ.jl#25). Here is the code that I currently have

using DynamicHMC

function diverging(tree_stat)
    return DynamicHMC.Diagnostics.is_divergent(tree_stat.termination)
end

"Compute the sample stats for ArviZ from the output of mcmc_with_warmup."
function sample_stats(results, with_warmup=true)
    treestats = results.tree_statistics
    if with_warmup == true
        stats = Dict(
            "lp" => getfield.(treestats, ),
            "tune" => fill(false, length(results.chain)),
            "depth" => getfield.(treestats, :depth),
            "tree_size" => getfield.(treestats, :steps),
            "mean_tree_accept" => getfield.(treestats, :acceptance_rate),
            "diverging" => diverging.(treestats),
            "energy" => .- getfield.(treestats, )
            # "step_size" =>,
            # "step_size_bar" =>,
            # "energy_error" =>,
            # "max_energy_error" =>
        )
    end
    return stats
end

The specification for sample_stats is not properly documented (https://arviz-devs.github.io/arviz/schema/schema.html#sample-stats), so I'm not certain if the code is correct. @sethaxen clarified the fields in a Slack discussion:

Those stats are more or less copied from PyMC3's so that might clarify: https://docs.pymc.io/api/inference.html#pymc3.step_methods.hmc.nuts.NUTS.
step_size will always be the “unjittered” step size. Stan internally calls it the “nom_step_size” (nominal). step_size_bar is a parameter of the step size adaptation and is only relevant during warmup. AFAIK, Stan doesn’t return this, nor does Turing; only PyMC3 does. tree_size is the number of leapfrogs taken before acceptance. It’s usually close to but less than 2^depth. energy_error is the difference between the initial energy and energy of the proposed point on the trajectory (in a perfect integrator, the error is 0 over the whole trajectory). max_energy_error is the same quantity but over the entire trajectory. mean_tree_accept is the mean acceptance ratio over the entire proposed balanced binary tree.

How would I compute the remaining fields? Would it make sense to include this code in DynamicHMC? It doesn't add any dependencies, but perhaps there is a better solution for the sampler outputs that could be standardized (see TuringLang/AdvancedHMC.jl#101).

@tpapp
Copy link
Owner

tpapp commented Nov 22, 2019

I am not familiar with PyMC3, this package is not related to that in any way (except, I assume, in the sense that both impement NUTS), so I am not sure what you are asking for. But fields so far look OK.

If the step size refers to the leapfrog step, that is called ϵ in this package. It is not returned for every sample, since it only changes during the warmup. Then it is returned as a vector, see warmup in mcmc.jl. I don't know what the other quantities are, but if you explain them I will be happy to help.

(note that most of the code in this package follows the Betancourt paper, so reading that may make it easier to understand).

@kskyten
Copy link
Author

kskyten commented Nov 23, 2019

ArviZ is agnostic to the sampler, so it is not really related to PyMC3 either (except for being compatible with its outputs). However, they based their abstractions on PyMC3, which is currently documented better.

I think I have everything else figured out except for how to compute energy_error and max_energy_error. energy_error is the difference between the energies of the initial point and the proposed (and accepted) point on the Hamiltonian trajectory. max_energy_error is the maximum value of this difference over the whole trajectory.

@tpapp
Copy link
Owner

tpapp commented Nov 24, 2019

energy_error is the difference between the energies of the initial point and the proposed (and accepted) point on the Hamiltonian trajectory.

You can calculate this as the difference between subsequent π fields of TreeStatisticsNUTS.

max_energy_error is the maximum value of this difference over the whole trajectory.

This is not saved. I don't know how this is used, but if you link the some literature on this I would consider adding it (please open another issue for this though).

@tpapp
Copy link
Owner

tpapp commented Feb 9, 2021

Closing for lack of activity and/or underspecified feature request. Feel free to ping here if you want to reopen.

@tpapp tpapp closed this as completed Feb 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants