Diagnostic plotting with ArviZ.jl #113

kskyten · 2019-11-22T17:45:45Z

ArviZ.jl is a package for exploratory analysis of Bayesian models and has many useful diagnostic plots. I'm trying to convert the diagnostics from DynamicHMC into a compatible format (see also arviz-devs/ArviZ.jl#4 and arviz-devs/ArviZ.jl#25). Here is the code that I currently have:

using DynamicHMC

function diverging(tree_stat)
    return DynamicHMC.Diagnostics.is_divergent(tree_stat.termination)
end

"Compute the sample stats for ArviZ from the output of mcmc_with_warmup."
function sample_stats(results, with_warmup=true)
    # Only the with_warmup=true case is handled so far; fail with a clear
    # message instead of an UndefVarError on `stats`.
    with_warmup || throw(ArgumentError("with_warmup=false is not supported yet"))
    treestats = results.tree_statistics
    return Dict(
        "lp" => getfield.(treestats, :π),
        "tune" => fill(false, length(results.chain)),
        "depth" => getfield.(treestats, :depth),
        "tree_size" => getfield.(treestats, :steps),
        "mean_tree_accept" => getfield.(treestats, :acceptance_rate),
        "diverging" => diverging.(treestats),
        # π is treated as the negative energy here
        "energy" => .-getfield.(treestats, :π),
        # Fields still to be filled in:
        # "step_size" =>,
        # "step_size_bar" =>,
        # "energy_error" =>,
        # "max_energy_error" =>
    )
end

The specification for sample_stats is not properly documented (https://arviz-devs.github.io/arviz/schema/schema.html#sample-stats), so I'm not certain if the code is correct. @sethaxen clarified the fields in a Slack discussion:

Those stats are more or less copied from PyMC3's so that might clarify: https://docs.pymc.io/api/inference.html#pymc3.step_methods.hmc.nuts.NUTS.
step_size will always be the “unjittered” step size. Stan internally calls it the “nom_step_size” (nominal).
step_size_bar is a parameter of the step size adaptation and is only relevant during warmup. AFAIK, Stan doesn’t return this, nor does Turing; only PyMC3 does.
tree_size is the number of leapfrogs taken before acceptance. It’s usually close to but less than 2^depth.
energy_error is the difference between the initial energy and energy of the proposed point on the trajectory (in a perfect integrator, the error is 0 over the whole trajectory).
max_energy_error is the same quantity but over the entire trajectory.
mean_tree_accept is the mean acceptance ratio over the entire proposed balanced binary tree.

How would I compute the remaining fields? Would it make sense to include this code in DynamicHMC? It doesn't add any dependencies, but perhaps there is a better solution for the sampler outputs that could be standardized (see TuringLang/AdvancedHMC.jl#101).

tpapp · 2019-11-22T18:17:35Z

I am not familiar with PyMC3, and this package is not related to it in any way (except, I assume, in the sense that both implement NUTS), so I am not sure what you are asking for. But the fields so far look OK.

If the step size refers to the leapfrog step, that is called ϵ in this package. It is not returned for every sample, since it only changes during the warmup. Then it is returned as a vector, see warmup in mcmc.jl. I don't know what the other quantities are, but if you explain them I will be happy to help.
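As a rough sketch of how a constant post-warmup step size could fill the per-draw "step_size" field: the example below assumes the sampler output exposes the final step size as a field named ϵ next to tree_statistics. The results value is a hypothetical stand-in, not real sampler output, and constant_step_size is a made-up helper name.

```julia
# Hypothetical stand-in for the sampler output; only the fields used below.
# Assumption: the final (post-warmup) step size is available as `results.ϵ`.
results = (tree_statistics = [(depth = 2,), (depth = 3,), (depth = 3,)], ϵ = 0.1)

"Repeat the post-warmup step size once per draw so it lines up with the other stats."
constant_step_size(results) = fill(results.ϵ, length(results.tree_statistics))

step_size = constant_step_size(results)
```

Repeating the value keeps every entry of the stats dictionary the same length, which is convenient when the fields are later stacked per draw.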

(Note that most of the code in this package follows the Betancourt paper, so reading that may make it easier to understand.)

kskyten · 2019-11-23T18:10:42Z

ArviZ is agnostic to the sampler, so it is not really related to PyMC3 either (except for being compatible with its outputs). However, they based their abstractions on PyMC3, which is currently documented better.

I think I have everything else figured out except for how to compute energy_error and max_energy_error. energy_error is the difference between the energies of the initial point and the proposed (and accepted) point on the Hamiltonian trajectory. max_energy_error is the maximum value of this difference over the whole trajectory.

tpapp · 2019-11-24T07:37:19Z

> energy_error is the difference between the energies of the initial point and the proposed (and accepted) point on the Hamiltonian trajectory.

You can calculate this as the difference between subsequent π fields of TreeStatisticsNUTS.
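The suggestion above could be sketched roughly as follows. FakeTreeStat is a hypothetical stand-in that models only the π field, and the sign convention (energy taken as -π, later draw minus earlier draw) is an assumption rather than something the ArviZ schema confirms.

```julia
# Hypothetical stand-in for a tree statistics record: only the π field
# (log density, treated here as negative energy) is modelled.
struct FakeTreeStat
    π::Float64
end

"Energy differences between consecutive draws, with energy taken as -π."
function energy_differences(treestats)
    energy = .-getfield.(treestats, :π)
    return diff(energy)  # one element fewer than the number of draws
end

treestats = FakeTreeStat.([-3.0, -2.5, -4.0])
errors = energy_differences(treestats)
```

Note that the result has one element fewer than the number of draws, so the first draw would need a placeholder (for example NaN) before the vector can sit alongside the other per-draw fields.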

> max_energy_error is the maximum value of this difference over the whole trajectory.

This is not saved. I don't know how this is used, but if you link some literature on this I would consider adding it (please open another issue for this though).

tpapp · 2021-02-09T16:15:07Z

Closing for lack of activity and/or underspecified feature request. Feel free to ping here if you want to reopen.

tpapp closed this as completed Feb 9, 2021
