This repository contains the code developed for the paper "Data-driven personalised recommendations for eczema treatment using a Bayesian model of severity dynamics" (submitted for publication, preprint here).
The code is written in the R language for statistical computing and the probabilistic programming language Stan for the models.
This repository is organized as a research compendium, with a structure similar to that of an R package. However, the project is not a literal package with a DESCRIPTION file: it uses renv to manage package dependencies (see the Reproducibility section) and git tags for version control.
Functions specific to this project are located in the `R/` directory.
All Stan models are implemented using a single Stan file, FullModel.stan, with optional parameters that can be switched on and off for evaluating the contribution of the different model components.
The model is manipulated using `ScoradPred` objects (inheriting from the class `EczemaModel` defined in EczemaPred).
The analysis code is located in the `analysis/` directory:
- `01_check_models.R`: conduct prior predictive checks and fake data checks.
- `02_run_fit.R`: fit the model to data.
- `03_check_fit.Rmd`: diagnose fit, inspect posterior, posterior predictive checks.
- `04a_run_validation.R`: run the validation process (forward chaining).
- `04b_run_validation_reference.R`: run the validation process (forward chaining) for the reference (univariate) models.
- `05_check_performance.Rmd`: analyse validation results, performance.
- `06_analyse_recommendations.Rmd`: generate and analyse treatment recommendations.
- Scripts to generate figures for the paper:
In addition, `generate_reports.R` renders reports from `03_check_fit.Rmd` and `05_check_performance.Rmd` for all models and severity items/scores.
`view_reports.R` creates an HTML document to easily browse these reports.
- "ScoradPred" refers to the base model (independent state-space models for all severity items, each defined by an ordered logistic measurement distribution and latent random walk dynamics).
- Modifications/improvements of the "ScoradPred" model are referred to as "ScoradPred+improvement". For example, the model consisting of an ordered logistic measurement distribution for all severity items and a latent multivariate random walk (i.e. modelling the correlations between changes of latent severity) is denoted as "ScoradPred+corr".
- The different flavours of the base ScoradPred model are implemented in a model class also called ScoradPred.
- The Stan file implementing all of these models is called "FullModel.stan".
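To make the description of the base model more concrete, the following sketch simulates a single severity item with a latent Gaussian random walk and an ordered logistic measurement distribution. It is only an illustration in base R, not the code of FullModel.stan: the function name `simulate_item`, the number of categories, the cutpoints, the random walk standard deviation and the initial value are all assumptions chosen for the example, and the actual model may be parametrised differently.

```r
# Illustrative simulation of one severity item (hypothetical helper,
# NOT the implementation used in FullModel.stan or EczemaPred).
simulate_item <- function(n_days = 30, cutpoints = c(-1, 0, 1), sigma = 0.3, y0 = 0) {
  # Latent random walk: today's latent severity is yesterday's plus Gaussian noise
  y_lat <- y0 + cumsum(c(0, rnorm(n_days - 1, mean = 0, sd = sigma)))
  # Ordered logistic measurement: P(score <= k) = plogis(cutpoints[k + 1] - y_lat)
  score <- vapply(y_lat, function(eta) {
    cum_p <- c(plogis(cutpoints - eta), 1) # cumulative probabilities
    p <- diff(c(0, cum_p))                 # probability of each category
    sample(seq_along(p) - 1L, size = 1, prob = p)
  }, integer(1))
  data.frame(day = seq_len(n_days), latent = y_lat, score = score)
}

set.seed(1)
sim <- simulate_item()
head(sim)
```

With three cutpoints the simulated score takes values in 0 to 3. In the "ScoradPred+corr" variant, the independent noise terms would be replaced by draws from a multivariate normal distribution shared across severity items.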
This project uses renv to manage R package dependencies.
The details of the packages needed to reproduce the analysis are stored in `renv.lock`, and the configuration files and the project library (ignored by git) are stored in `renv/`.
After installing renv itself (`install.packages("renv")`), the project library can be restored by calling `renv::restore(exclude = "TanakaData")`.
Note that this command explicitly avoids installing the TanakaData package, a proprietary (unavailable) package containing the data used in this project.
The data is not available at the time of writing.
In addition, we provide a Dockerfile to fully reproduce the computational environment with Docker:
- building the image: `docker build . -t eczematreat`. In addition to installing R packages using renv, the Docker image will also install the correct version of R and the system dependencies required to use Stan.
- running the container: `docker run -d --rm -p 1212:8787 -e ROOT=TRUE -e DISABLE_AUTH=true -v ${PWD}:/home/rstudio/EczemaTreat -v /home/rstudio/EczemaTreat/renv eczematreat`. This command launches an RStudio Server session (without authentication, giving the user root access) accessible at http://localhost:1212/, while mounting the current directory into the container.
After that, to reproduce the analysis, we suggest opening the RStudio project (`.Rproj` file) and running the analysis scripts in the order indicated by their prefix.
Intermediate and output files are saved to a `results/` directory.
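As a convenience, the run order implied by the file prefixes can be listed programmatically. The helper below is a minimal sketch and not part of the repository: the name `ordered_analysis_files` and the file name pattern (two digits, an optional letter, then an underscore) are assumptions based on the file names listed above.

```r
# Hypothetical helper: list the analysis files in the order given by their prefix.
ordered_analysis_files <- function(dir = "analysis") {
  # Keep files such as 01_check_models.R or 04a_run_validation.R (.R or .Rmd)
  files <- list.files(dir, pattern = "^[0-9]{2}[a-z]?_.*\\.(R|Rmd)$")
  # Radix sort is locale-independent, so 04a_ always comes before 04b_
  sort(files, method = "radix")
}

# Print the suggested run order (empty if the directory does not exist)
print(ordered_analysis_files("analysis"))
```

Each `.R` script could then be run with `source()` and each `.Rmd` report rendered with `rmarkdown::render()`, although the reports may expect parameters (see `generate_reports.R`), so running the files one at a time from RStudio is probably the safer option.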
NB: this project relies on EczemaPred version v0.3.0.
The open source version of this project is licensed under the GPLv3 license, which can be found in the LICENSE file.