
Commit 4825447

horacegrlouf authored and committed
Create documentation and github workflow (blackjax-devs#155)
1 parent 6f728a9 commit 4825447

18 files changed: +383 −62 lines

.github/workflows/build_doc.yml

+33

@@ -0,0 +1,33 @@
+name: Publish docs
+
+on:
+  push:
+    branches:
+      - main
+
+jobs:
+  build-and-deploy:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v2
+        with:
+          persist-credentials: false
+
+      - name: Set up Python 3.9
+        uses: actions/setup-python@v1
+        with:
+          python-version: 3.9
+
+      - name: Build docs
+        run: |
+          pip install -r requirements-dev.txt
+          sphinx-build -b html docs docs/build/html
+
+      - name: Publish docs
+        uses: JamesIves/[email protected]
+        with:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          BRANCH: gh-pages
+          FOLDER: docs/build/html
+          CLEAN: true

.gitignore

+1

@@ -44,6 +44,7 @@ pip-delete-this-directory.txt

 # Sphinx documentation
 docs/_build/
+docs/_autosummary

 # pyenv
 .python-version

Makefile

+4
Original file line numberDiff line numberDiff line change
@@ -8,3 +8,7 @@ test:
88
publish:
99
git tag -a $(LIB_VERSION) -m $(LIB_VERSION)
1010
git push --tag
11+
12+
13+
build-docs:
14+
sphinx-build -b html docs docs/_build/html

blackjax/adaptation/step_size.py

+21 −20

@@ -63,8 +63,8 @@ def dual_averaging_adaptation(
     the error at time t. We would like to find a procedure that adapts the
     value of :math:`\\epsilon` such that :math:`h(x) = \\mathbb{E}\\left[H_t|\\epsilon\\right] = 0`

-    Following [1]_, the authors of [2]_ proposed the following update scheme. If
-    we note :math:``x = \\log \\epsilon` we follow:
+    Following [Nesterov2009]_, the authors of [Hoffman2014]_ proposed the following update scheme. If
+    we note :math:`x = \\log \\epsilon` we follow:

     .. math::
         x_{t+1} \\longleftarrow \\mu - \\frac{\\sqrt{t}}{\\gamma} \\frac{1}{t+t_0} \\sum_{i=1}^t H_i

@@ -74,21 +74,21 @@ def dual_averaging_adaptation(
     :math:`h(\\overline{x}_t)` converges to 0, i.e. the Metropolis acceptance
     rate converges to the desired rate.

-    See reference [2]_ (section 3.2.1) for a detailed discussion.
+    See reference [Hoffman2014]_ (section 3.2.1) for a detailed discussion.

     Parameters
     ----------
     t0: float >= 0
         Free parameter that stabilizes the initial iterations of the algorithm.
-        Large values may slow down convergence. Introduced in [2]_ with a default
+        Large values may slow down convergence. Introduced in [Hoffman2014]_ with a default
         value of 10.
-    gamma
-        Controls the speed of convergence of the scheme. The authors of [2]_ recommend
+    gamma:
+        Controls the speed of convergence of the scheme. The authors of [Hoffman2014]_ recommend
         a value of 0.05.
     kappa: float in ]0.5, 1]
         Controls the weights of past steps in the current update. The scheme will
-        quickly forget earlier steps for a small value of `kappa`. Introduced
-        in [2]_, with a recommended value of .75
+        quickly forget earlier steps for a small value of `kappa`. Introduced
+        in [Hoffman2014]_, with a recommended value of 0.75.
     target:
         Target acceptance rate.

@@ -102,11 +102,11 @@ def dual_averaging_adaptation(
     References
     ----------
-    .. [1]: Nesterov, Yurii. "Primal-dual subgradient methods for convex
+    .. [Nesterov2009] Nesterov, Yurii. "Primal-dual subgradient methods for convex
        problems." Mathematical Programming 120.1 (2009): 221-259.
-    .. [2]: Hoffman, Matthew D., and Andrew Gelman. "The No-U-Turn sampler:
-       adaptively setting path lengths in Hamiltonian Monte Carlo." Journal
-       of Machine Learning Research 15.1 (2014): 1593-1623.
+    .. [Hoffman2014] Hoffman, Matthew D., and Andrew Gelman. "The No-U-Turn sampler:
+       adaptively setting path lengths in Hamiltonian Monte Carlo." Journal
+       of Machine Learning Research 15.1 (2014): 1593-1623.
     """
     da_init, da_update, da_final = optimizers.dual_averaging(t0, gamma, kappa)
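As a reading aid, here is a minimal sketch of the update scheme the docstring above describes, written in plain JAX. The names (`DAState`, `dual_averaging_update`) are illustrative, not the API of blackjax's `optimizers.dual_averaging`; t0, gamma and kappa defaults come from the docstring, while the 0.65 target and the shrinkage point `mu` are stand-in values.

from typing import NamedTuple

import jax.numpy as jnp


class DAState(NamedTuple):
    log_step_size: float      # x_t = log(epsilon_t)
    log_step_size_avg: float  # the running average x_bar_t
    gradient_avg: float       # running average of the statistics H_1, ..., H_t
    step: int


def dual_averaging_update(state, acceptance_rate, target=0.65, t0=10.0,
                          gamma=0.05, kappa=0.75, mu=0.0):
    """One step of the scheme, with H_t = target - acceptance rate."""
    x, x_avg, h_avg, t = state
    t = t + 1
    # Running average of the error statistics, stabilized by t0.
    h_avg = (1.0 - 1.0 / (t + t0)) * h_avg + (target - acceptance_rate) / (t + t0)
    # x_{t+1} <- mu - sqrt(t) / gamma * (average of H_1, ..., H_t)
    x = mu - (jnp.sqrt(t) / gamma) * h_avg
    # x_bar_{t+1} <- t^{-kappa} * x_{t+1} + (1 - t^{-kappa}) * x_bar_t
    weight = t ** (-kappa)
    x_avg = weight * x + (1.0 - weight) * x_avg
    return DAState(x, x_avg, h_avg, t)

Once adaptation ends, exp(x_bar_t), the averaged iterate, is the natural value to keep as the step size.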

@@ -183,7 +183,7 @@ def find_reasonable_step_size(
     value for the step size starting from any value, choosing a good first
     value can speed up the convergence. This heuristic doubles and halves the
     step size until the acceptance probability of the HMC proposal crosses the
-    target value.
+    target value [Hoffman2014]_.

     Parameters
     ----------

@@ -208,11 +208,12 @@ def find_reasonable_step_size(
     float
         A reasonable first value for the step size.

-    Reference
-    ---------
-    .. [1]: Hoffman, Matthew D., and Andrew Gelman. "The No-U-Turn sampler:
-       adaptively setting path lengths in Hamiltonian Monte Carlo." Journal
-       of Machine Learning Research 15.1 (2014): 1593-1623.
+    References
+    ----------
+    .. [Hoffman2014] Hoffman, Matthew D., and Andrew Gelman. "The No-U-Turn sampler:
+       adaptively setting path lengths in Hamiltonian Monte Carlo." Journal
+       of Machine Learning Research 15.1 (2014): 1593-1623.
+
     """
     fp_limit = jnp.finfo(jax.lax.dtype(initial_step_size))

@@ -228,9 +229,9 @@ def do_continue(rss_state: ReasonableStepSizeState) -> bool:
     occur any performance penalty when calling it repeatedly inside this
     function.

-    Reference
-    ---------
-    .. [1]: jax.numpy.finfo documentation. https://jax.readthedocs.io/en/latest/_autosummary/jax.numpy.finfo.html
+    References
+    ----------
+    .. [1] jax.numpy.finfo documentation. https://jax.readthedocs.io/en/latest/_autosummary/jax.numpy.finfo.html

     """
     _, direction, previous_direction, step_size = rss_state
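The doubling-and-halving heuristic is simple enough to sketch in a few lines. This is a toy version, not the function in this diff: `acceptance_probability` is a hypothetical stand-in for computing the acceptance probability of one HMC proposal at a given step size.

def toy_find_reasonable_step_size(acceptance_probability, initial_step_size=1.0,
                                  target=0.65, max_iter=50):
    step_size = initial_step_size
    # Double (+1) or halve (-1), depending on which side of the target we start.
    direction = 1.0 if acceptance_probability(step_size) > target else -1.0
    for _ in range(max_iter):
        step_size = step_size * 2.0 ** direction
        p_accept = acceptance_probability(step_size)
        # Stop as soon as the acceptance probability crosses the target value.
        if (direction > 0) == (p_accept < target):
            break
    return step_size


# With a synthetic acceptance curve that decays with the step size, e.g.
# toy_find_reasonable_step_size(lambda eps: 2.0 ** (-eps)), the search starts
# at 0.5 < 0.65, halves once to 0.5 where acceptance is ~0.71, and stops.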

blackjax/diagnostics.py

+9 −9

@@ -41,8 +41,8 @@ def potential_scale_reduction(

     References
     ----------
-    .. [1]: https://mc-stan.org/docs/2_27/reference-manual/notation-for-samples-chains-and-draws.html#potential-scale-reduction
-    .. [2]: Gelman, Andrew, and Donald B. Rubin. (1992) “Inference from Iterative Simulation Using Multiple Sequences.” Statistical Science 7 (4): 457–72.
+    .. [1] https://mc-stan.org/docs/2_27/reference-manual/notation-for-samples-chains-and-draws.html#potential-scale-reduction
+    .. [2] Gelman, Andrew, and Donald B. Rubin. (1992) “Inference from Iterative Simulation Using Multiple Sequences.” Statistical Science 7 (4): 457–72.

     """
     assert (

@@ -95,19 +95,19 @@ def effective_sample_size(
     .. math:: \\hat{\\tau} = -1 + 2 \\sum_{t'=0}^K \\hat{P}_{t'}

     where :math:`M` is the number of chains, :math:`N` the number of draws,
-    :math:`\\hat{\rho}_t` is the estimated _autocorrelation at lag :math:`t`, and
-    :math:`K` is the last integer for which :math:`\\hat{P}_{K} = \\hat{\rho}_{2K} +
-    \\hat{\rho}_{2K+1}` is still positive.
+    :math:`\\hat{\\rho}_t` is the estimated autocorrelation at lag :math:`t`, and
+    :math:`K` is the last integer for which :math:`\\hat{P}_{K} = \\hat{\\rho}_{2K} +
+    \\hat{\\rho}_{2K+1}` is still positive.

     The current implementation is similar to Stan's, which uses Geyer's initial monotone sequence
     criterion (Geyer, 1992; Geyer, 2011).

     References
     ----------
-    .. [1]: https://mc-stan.org/docs/2_27/reference-manual/effective-sample-size-section.html
-    .. [2]: Gelman, Andrew, J. B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. (2013). Bayesian Data Analysis. Third Edition. Chapman; Hall/CRC.
-    .. [3]: Geyer, Charles J. (1992). “Practical Markov Chain Monte Carlo.” Statistical Science, 473–83.
-    .. [4]: Geyer, Charles J. (2011). “Introduction to Markov Chain Monte Carlo.” In Handbook of Markov Chain Monte Carlo, edited by Steve Brooks, Andrew Gelman, Galin L. Jones, and Xiao-Li Meng, 3–48. Chapman; Hall/CRC.
+    .. [1] https://mc-stan.org/docs/2_27/reference-manual/effective-sample-size-section.html
+    .. [2] Gelman, Andrew, J. B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. (2013). Bayesian Data Analysis. Third Edition. Chapman & Hall/CRC.
+    .. [3] Geyer, Charles J. (1992). “Practical Markov Chain Monte Carlo.” Statistical Science, 473–83.
+    .. [4] Geyer, Charles J. (2011). “Introduction to Markov Chain Monte Carlo.” In Handbook of Markov Chain Monte Carlo, edited by Steve Brooks, Andrew Gelman, Galin L. Jones, and Xiao-Li Meng, 3–48. Chapman & Hall/CRC.

     """
     input_shape = input_array.shape
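Since the docstring spells out the estimator, a self-contained toy version may help. This sketch uses a plain pooled autocorrelation and skips the within/between-chain variance weighting that the full multi-chain estimator applies, so it illustrates only the Geyer pairing and truncation rule; `toy_effective_sample_size` is not the function added in this commit.

import numpy as np


def toy_effective_sample_size(chains):
    """Toy ESS for an array of shape (n_chains, n_samples)."""
    m, n = chains.shape
    x = chains - chains.mean()
    # Pooled autocorrelation estimate rho_hat_t (biased; for illustration only).
    rho = np.array([np.mean(x[:, : n - t] * x[:, t:]) for t in range(n)]) / x.var()
    # Geyer pairs P_hat_k = rho_hat_{2k} + rho_hat_{2k+1} ...
    pairs = rho[0 : n - 1 : 2] + rho[1:n:2]
    # ... summed while they remain positive (the initial positive sequence).
    k = np.argmax(pairs < 0) if np.any(pairs < 0) else len(pairs)
    tau_hat = -1.0 + 2.0 * pairs[:k].sum()
    return m * n / tau_hat

For nearly independent draws tau_hat is close to 1 and the toy estimate approaches M * N, the total number of draws, as expected.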

blackjax/nuts.py

+24 −23

@@ -67,13 +67,36 @@ def kernel(
 ) -> Callable:
     """Build an iterative NUTS kernel.

+    This algorithm is an iteration on the original NUTS algorithm [Hoffman2014]_ with two major differences:
+    - We do not use slice sampling but multinomial sampling for the proposal [Betancourt2017]_;
+    - The trajectory expansion is not recursive but iterative [Phan2019]_, [Lao2020]_.
+
+    The implementation can seem unusual to those familiar with similar
+    algorithms. Indeed, we do not conceptualize the trajectory construction as
+    building a tree. We feel that the tree lingo, inherited from the recursive
+    version, is unnecessarily complicated and hides the more general concepts
+    on which the NUTS algorithm is built.
+
+    NUTS, in essence, consists in sampling a trajectory by iteratively choosing
+    a direction at random and integrating in this direction a number of times
+    that doubles at every step. From this trajectory we continuously sample a
+    proposal. When the trajectory turns back on itself, or when we have reached
+    the maximum trajectory length, we return the current proposal.
+
     Parameters
     ----------
     logprob_fn
         Log probability function we wish to sample from.
     parameters
         A NamedTuple that contains the parameters of the kernel to be built.

+    References
+    ----------
+    .. [Hoffman2014] Hoffman, Matthew D., and Andrew Gelman. "The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo." J. Mach. Learn. Res. 15.1 (2014): 1593-1623.
+    .. [Betancourt2017] Betancourt, Michael. "A conceptual introduction to Hamiltonian Monte Carlo." arXiv preprint arXiv:1701.02434 (2017).
+    .. [Phan2019] Phan, Du, Neeraj Pradhan, and Martin Jankowiak. "Composable effects for flexible and accelerated probabilistic programming in NumPyro." arXiv preprint arXiv:1912.11554 (2019).
+    .. [Lao2020] Lao, Junpeng, et al. "tfp.mcmc: Modern Markov chain Monte Carlo tools built for modern hardware." arXiv preprint arXiv:2002.01184 (2020).
+
     """

     def potential_fn(x):

@@ -105,23 +128,7 @@ def iterative_nuts_proposal(
     max_num_expansions: int = 10,
     divergence_threshold: float = 1000,
 ) -> Callable:
-    """Iterative NUTS algorithm.
-
-    This algorithm is an iteration on the original NUTS algorithm [1]_ with two major differences:
-    - We do not use slice samplig but multinomial sampling for the proposal [2]_;
-    - The trajectory expansion is not recursive but iterative [3,4]_.
-
-    The implementation can seem unusual for those familiar with similar
-    algorithms. Indeed, we do not conceptualize the trajectory construction as
-    building a tree. We feel that the tree lingo, inherited from the recursive
-    version, is unnecessarily complicated and hides the more general concepts
-    on which the NUTS algorithm is built.
-
-    NUTS, in essence, consists in sampling a trajectory by iteratively choosing
-    a direction at random and integrating in this direction a number of times
-    that doubles at every step. From this trajectory we continuously sample a
-    proposal. When the trajectory turns on itself or when we have reached the
-    maximum trajectory length we return the current proposal.
+    """Iterative NUTS proposal.

     Parameters
     ----------

@@ -142,12 +149,6 @@ def iterative_nuts_proposal(
     -------
     A kernel that generates a new chain state and information about the transition.

-    References
-    ----------
-    .. [1]: Hoffman, Matthew D., and Andrew Gelman. "The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo." J. Mach. Learn. Res. 15.1 (2014): 1593-1623.
-    .. [2]: Betancourt, Michael. "A conceptual introduction to Hamiltonian Monte Carlo." arXiv preprint arXiv:1701.02434 (2017).
-    .. [3]: Phan, Du, Neeraj Pradhan, and Martin Jankowiak. "Composable effects for flexible and accelerated probabilistic programming in NumPyro." arXiv preprint arXiv:1912.11554 (2019).
-    .. [4]: Lao, Junpeng, et al. "tfp. mcmc: Modern markov chain monte carlo tools built for modern hardware." arXiv preprint arXiv:2002.01184 (2020).
     """
     (
         new_termination_state,
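The trajectory-doubling idea in the relocated docstring can be illustrated without any of the surrounding machinery. The sketch below is a deliberately naive 1D version: a standard Gaussian target, a fixed step size, no multinomial proposal sampling and no divergence checks. It is not BlackJAX's implementation; `leapfrog` and `toy_trajectory` are illustrative names.

import jax
import jax.numpy as jnp


def leapfrog(q, p, step_size, num_steps, grad_potential):
    """Integrate Hamilton's equations for `num_steps` leapfrog steps."""
    for _ in range(num_steps):
        p = p - 0.5 * step_size * grad_potential(q)
        q = q + step_size * p
        p = p - 0.5 * step_size * grad_potential(q)
    return q, p


def toy_trajectory(rng_key, q0, p0, step_size=0.1, max_num_expansions=10):
    grad_potential = jax.grad(lambda q: 0.5 * q**2)  # standard Gaussian target
    q_left = q_right = q0
    p_left = p_right = p0
    for k in range(max_num_expansions):
        rng_key, subkey = jax.random.split(rng_key)
        # Choose a direction at random; the number of steps doubles each time.
        direction = int(jax.random.choice(subkey, jnp.array([-1, 1])))
        if direction == 1:
            q_right, p_right = leapfrog(q_right, p_right, step_size, 2**k, grad_potential)
        else:
            # A negative step size integrates the left end backward in time.
            q_left, p_left = leapfrog(q_left, p_left, -step_size, 2**k, grad_potential)
        # Stop when the trajectory turns back on itself (the "U-turn" criterion).
        if (q_right - q_left) * p_left < 0 or (q_right - q_left) * p_right < 0:
            break
    return q_left, q_right

The continuous proposal sampling along the trajectory, which the docstring mentions, is omitted here for brevity.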

blackjax/stan_warmup.py

+2-3
Original file line numberDiff line numberDiff line change
@@ -115,12 +115,11 @@ def stan_warmup(
115115
116116
Schematically:
117117
118-
```
119118
+---------+---+------+------------+------------------------+------+
120119
| fast | s | slow | slow | slow | fast |
121120
+---------+---+------+------------+------------------------+------+
122-
1 2 3 3 3 3
123-
```
121+
|1 |2 |3 |3 |3 |3 |
122+
+---------+---+------+------------+------------------------+------+
124123
125124
Step (1) consists in find a "reasonable" first step size that is used to
126125
initialize the dual averaging scheme. In (2) we initialize the mass matrix

blackjax/tempered_smc.py

+1 −1

@@ -116,7 +116,7 @@ def tempered_smc(

     Tempered SMC uses tempering to sample from a distribution given by

-    :math..
+    .. math::
         p(x) \\propto p_0(x) \\exp(-V(x)) \\mathrm{d}x

     where :math:`p_0` is the prior distribution, typically easy to sample from and for which the density
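To make the tempering explicit: a ladder of intermediate densities interpolates between the prior (lambda = 0) and the target (lambda = 1). The sketch below is illustrative only; the densities and the names `log_prior`, `potential` and `tempered_logdensity` are stand-ins, not the module's API.

import jax.numpy as jnp


def log_prior(x):  # log p_0, a standard Gaussian prior
    return -0.5 * jnp.sum(x**2)


def potential(x):  # V(x), the negative log-likelihood
    return 0.5 * jnp.sum((x - 2.0) ** 2)


def tempered_logdensity(x, lmbda):
    """log p_lambda(x) = log p_0(x) - lambda * V(x), up to a constant."""
    return log_prior(x) - lmbda * potential(x)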

blackjax/types.py

+4

@@ -3,10 +3,14 @@
 import jax.numpy as jnp
 import numpy as np

+#: JAX or NumPy array
 Array = Union[np.ndarray, jnp.ndarray]
+
+#: JAX PyTrees
 PyTree = Union[Array, Iterable[Array], Mapping[Any, Array]]
 # It is not currently tested but we also support recursive PyTrees.
 # Once recursive typing is fully supported (https://github.com/python/mypy/issues/731), we can uncomment the line below.
 # PyTree = Union[Array, Iterable["PyTree"], Mapping[Any, "PyTree"]]

+#: JAX PRNGKey
 PRNGKey = jnp.ndarray
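A quick illustration of values that fit these aliases, assuming they are imported from `blackjax.types` as defined above:

import jax
import jax.numpy as jnp
import numpy as np

from blackjax.types import Array, PRNGKey, PyTree

x: Array = jnp.ones(3)                    # a JAX array
y: Array = np.zeros((2, 2))               # a NumPy array also qualifies
state: PyTree = {"position": x, "mass": y}  # a container of Arrays
rng_key: PRNGKey = jax.random.PRNGKey(0)  # in JAX, a PRNG key is a uint32 array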

docs/_static/blackjax.png

131 KB

docs/_static/custom.css

+6
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
#site-navigation{
2+
background: #E6E7E8;
3+
h1.site-logo {
4+
font-weight: bold;
5+
}
6+
}

docs/api.rst

+86

@@ -0,0 +1,86 @@
+Common Kernels
+==============
+
+.. currentmodule:: blackjax
+
+.. autosummary::
+
+   hmc
+   nuts
+   rmh
+   tempered_smc
+
+HMC
+~~~
+
+.. automodule:: blackjax.hmc
+   :members: HMCInfo, kernel, new_state
+
+NUTS
+~~~~
+
+.. automodule:: blackjax.nuts
+   :members: NUTSInfo, kernel, new_state
+
+RMH
+~~~
+
+.. automodule:: blackjax.rmh
+   :members:
+   :undoc-members:
+
+Tempered SMC
+~~~~~~~~~~~~
+
+.. automodule:: blackjax.tempered_smc
+   :members: TemperedSMCState, adaptive_tempered_smc, tempered_smc
+
+
+Adaptation
+==========
+
+
+Stan full warmup
+~~~~~~~~~~~~~~~~
+
+.. currentmodule:: blackjax
+
+.. automodule:: blackjax.stan_warmup
+   :members: run
+
+Step-size adaptation
+~~~~~~~~~~~~~~~~~~~~
+
+.. currentmodule:: blackjax.adaptation.step_size
+
+.. autofunction:: dual_averaging_adaptation
+
+.. autofunction:: find_reasonable_step_size
+
+Mass matrix adaptation
+~~~~~~~~~~~~~~~~~~~~~~
+
+.. currentmodule:: blackjax.adaptation.mass_matrix
+
+.. autofunction:: mass_matrix_adaptation
+
+Diagnostics
+===========
+
+.. currentmodule:: blackjax.diagnostics
+
+.. autosummary::
+
+   effective_sample_size
+   potential_scale_reduction
+
+Effective sample size
+~~~~~~~~~~~~~~~~~~~~~
+
+.. autofunction:: effective_sample_size
+
+
+Potential scale reduction
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autofunction:: potential_scale_reduction
