What? Pytest plugin for testing and releasing notebook documentation
Why? To raise the quality of scientific material through better automation
Who is this for? Research/Machine Learning Software Engineers who maintain packages/teaching materials with documentation written in notebooks.
- Executes notebooks using pytest and nbclient, allowing parallel notebook testing
- Optionally writes back to the repo, allowing faster building of nbsphinx or jupyter book docs
If you have a notebook that runs interactively using an ipython kernel, you can try testing it automatically as follows:
pip install pytest nbmake
pytest --nbmake **/*ipynb
You can configure the cell timeout with the following pytest flag:
pytest --nbmake --nbmake-timeout=3000 # allows each cell 3000 seconds to finish
To allow errors for a whole notebook, or to set a notebook-specific timeout, add an execution block to the notebook's top-level metadata (not cell-level metadata).
Your notebook should look like this:
{
"cells": [ ... ],
"metadata": {
"kernelspec": { ... },
"execution": {
"allow_errors": true,
"timeout": 300
}
}
}
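If you maintain many notebooks, editing this metadata by hand gets tedious. The following is a minimal sketch, using only the standard library, that sets the execution block on a notebook held as a dict. The helper name `set_execution_metadata` is hypothetical and not part of nbmake.

```python
import json


def set_execution_metadata(notebook, allow_errors=True, timeout=300):
    """Set the top-level execution block on a notebook dict (hypothetical helper)."""
    # The block lives under the notebook's top-level "metadata" key,
    # next to "kernelspec", not inside any individual cell.
    metadata = notebook.setdefault("metadata", {})
    metadata["execution"] = {"allow_errors": allow_errors, "timeout": timeout}
    return notebook


# A tiny in-memory notebook stands in for a real .ipynb file here.
notebook = {"cells": [], "metadata": {"kernelspec": {"name": "python3"}}}
set_execution_metadata(notebook, allow_errors=True, timeout=300)
print(json.dumps(notebook["metadata"], indent=1))
```

For a real file, you would load it with `json.load`, apply the helper, and write it back with `json.dump`.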
A cell with the following metadata can throw an exception without failing the test:
"metadata": {
"tags": [
"raises-exception"
]
}
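Tags like this are usually added through the notebook editor, but they can also be set by script. This is a small sketch, assuming the cell is a plain dict with tags stored under `metadata.tags` as shown above. `add_cell_tag` is a hypothetical helper, not an nbmake function.

```python
def add_cell_tag(cell, tag):
    """Add a tag to a cell's metadata if it is not already present."""
    tags = cell.setdefault("metadata", {}).setdefault("tags", [])
    if tag not in tags:
        tags.append(tag)
    return cell


# An illustrative cell that would raise ZeroDivisionError when executed.
cell = {"cell_type": "code", "metadata": {}, "outputs": [], "source": ["1 / 0"]}
add_cell_tag(cell, "raises-exception")
add_cell_tag(cell, "raises-exception")  # the second call changes nothing
print(cell["metadata"])
```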
A cell with the following metadata will not be executed by nbmake:
{
"language": "python",
"custom": {
"metadata": {
"tags": [
"skip-execution"
]
}
}
}
Regardless of the kernel configured in the notebook JSON, you can force nbmake to use a specific kernel when testing:
pytest --nbmake --nbmake-kernel=mycustomkernel
If you are not using the flag above and are using a kernel name other than the default "python3", you will see an error message when executing your notebooks in a fresh CI environment: Error - No such kernel: 'mycustomkernel'
Use ipykernel to install the custom kernel:
python -m ipykernel install --user --name mycustomkernel
If you are using another language, such as C++, in your notebooks, you may have a different process for installing your kernel.
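Before setting up CI, it can help to know which kernel names your notebooks declare. The sketch below reads `metadata.kernelspec.name` from each notebook's JSON using only the standard library. The function name `kernel_names` and the demo notebook are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def kernel_names(root):
    """Map each notebook under root (by relative path) to its declared kernel name."""
    root = Path(root)
    names = {}
    for path in sorted(root.rglob("*.ipynb")):
        notebook = json.loads(path.read_text(encoding="utf-8"))
        kernelspec = notebook.get("metadata", {}).get("kernelspec", {})
        # None means the notebook does not declare a kernel at all.
        names[str(path.relative_to(root))] = kernelspec.get("name")
    return names


# Demo: write one throwaway notebook to a temporary directory and scan it.
with tempfile.TemporaryDirectory() as tmp:
    demo = {
        "cells": [],
        "metadata": {"kernelspec": {"name": "mycustomkernel"}},
        "nbformat": 4,
        "nbformat_minor": 5,
    }
    (Path(tmp) / "demo.ipynb").write_text(json.dumps(demo), encoding="utf-8")
    found = kernel_names(tmp)

print(found)
```

Any name other than python3 in the result is a kernel that needs installing in CI, unless you override it with --nbmake-kernel.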
For repos containing a large number of notebooks that run slowly, you can run each notebook in parallel using pytest-xdist.
pip install pytest-xdist
pytest --nbmake -n=auto
It is also possible to parallelise at the CI level using strategies (see example).
Using xdist and the --overwrite flag lets you build a large jupyter book repo faster:
pytest --nbmake --overwrite -n=auto examples
jb build examples
It's not always feasible to get notebooks running from top to bottom from the start.
You can, however, use nbmake to check that there are no ModuleNotFoundErrors:
# --nbmake-find-import-errors: ignore all errors except ModuleNotFoundError
# --nbmake-timeout=20: skip past cells that run longer than 20s
pytest \
  --nbmake \
  --nbmake-find-import-errors \
  --nbmake-timeout=20
If your notebook runs a training process that takes a long time to run, you can use nbmake's mocking feature to overwrite variables after a cell runs:
{
"cells": [
...,
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbmake": {
"mock": {
// these keys will override global variables after this cell runs
"epochs": 2,
"config": "/test/config.json",
"args": {
"env": "test"
}
}
}
},
"outputs": [],
"source": [
"epochs = 10\n",
"..."
]
},
...
],
...
}
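To make the effect of mocking concrete, here is a rough simulation in plain Python. It is not nbmake's implementation, only a sketch of the behaviour described above: the cell runs as written, then the mocked keys replace global variables of the same name. `run_cell_with_mock` is a hypothetical name.

```python
def run_cell_with_mock(source, mock, namespace):
    """Execute cell source, then overwrite globals with the mock values."""
    exec(source, namespace)   # the cell runs unchanged
    namespace.update(mock)    # mocked keys replace same-named globals
    return namespace


namespace = {}
source = "epochs = 10\n"
mock = {"epochs": 2, "config": "/test/config.json", "args": {"env": "test"}}
run_cell_with_mock(source, mock, namespace)

# Later cells see the mocked value, not the one assigned in the cell.
print(namespace["epochs"])  # 2
```

Because the override happens after the cell finishes, the mock only helps when the expensive work happens in a later cell that reads these variables.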
You can fetch CI secrets and run assertions after any cell by putting scripts in the cell metadata under nbmake.post_cell_execute:
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbmake": {
"post_cell_execute": [
"y = 3",
"z = x+y"
]
}
},
"outputs": [],
"source": [
"x = 1\n",
"y = 2\n",
"z = 0\n",
"# this cell has a post_cell_execute that assigns y and z"
]
},
...
],
...
}
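The example above can be traced by hand with a short simulation. This is a sketch of the described behaviour, not nbmake's code. It assumes the post_cell_execute lines are joined into one script and run in the same namespace as the cell, and `run_cell_with_post_execute` is a hypothetical name.

```python
def run_cell_with_post_execute(cell, namespace):
    """Run a cell's source, then its post_cell_execute script, in one namespace."""
    exec("".join(cell["source"]), namespace)
    post_lines = cell.get("metadata", {}).get("nbmake", {}).get("post_cell_execute", [])
    # Assumption: the lines form a single script sharing the cell's globals.
    exec("\n".join(post_lines), namespace)
    return namespace


cell = {
    "cell_type": "code",
    "metadata": {"nbmake": {"post_cell_execute": ["y = 3", "z = x+y"]}},
    "outputs": [],
    "source": ["x = 1\n", "y = 2\n", "z = 0\n"],
}
namespace = run_cell_with_post_execute(cell, {})
print(namespace["x"], namespace["y"], namespace["z"])  # 1 3 4
```

The cell sets y to 2 and z to 0, and the post-cell script then reassigns y to 3 and z to x + y, which is 4.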
nbmake is best used in a scenario where you use the ipynb files only for development. Consumption of notebooks is primarily done via a docs site, built through jupyter book, nbsphinx, or some other means. If using one of these tools, you are able to write assertion code in cells which will be hidden from readers.
Treating notebooks like source files lets you keep your repo minimal. Some tools, such as plotly, may drop several megabytes of JavaScript into your output cells, so stripping notebook outputs on pre-commit is advisable:
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/kynan/nbstripout
    rev: master
    hooks:
      - id: nbstripout
See https://pre-commit.com/ for more...
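To show what stripping means at the JSON level, here is a minimal standard-library sketch that clears outputs and execution counts from code cells. It is an illustration only: nbstripout itself handles more cases, and `strip_outputs` is a hypothetical name.

```python
import copy


def strip_outputs(notebook):
    """Return a copy of the notebook with code-cell outputs and counts cleared."""
    stripped = copy.deepcopy(notebook)
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


# A made-up notebook with one executed code cell and one markdown cell.
notebook = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 7,
            "metadata": {},
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["hi\n"]}],
            "source": ["print('hi')"],
        },
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
    ],
    "metadata": {},
}
clean = strip_outputs(notebook)
print(clean["cells"][0]["outputs"], clean["cells"][0]["execution_count"])
```

The original dict is left untouched, so the sketch can be used to compare a notebook before and after stripping.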
To disable nbmake implicitly, run pytest without the --nbmake flag:
pytest
To disable it explicitly, block the plugin:
pytest -p no:nbmake
- A more in-depth intro to nbmake running on Semaphore CI
- nbmake action
- pytest
- jupyter book
- jupyter cache
- MyST-NB