Add package scope page #44

Closed
wants to merge 15 commits into from
1 change: 1 addition & 0 deletions _quarto.yml
@@ -8,6 +8,7 @@ book:
- principles.qmd
- git-branching-merging.qmd
- code-review.qmd
- package-scope.qmd

format:
html:
55 changes: 55 additions & 0 deletions package-scope.qmd
@@ -0,0 +1,55 @@
---
title: "Package scope"
---

## Visualisation in R packages

Visualisation is a fundamental step in scientific analysis and a [core component of data science](https://r4ds.had.co.nz/introduction.html). However, whether a package should include plotting functionality is not an obvious decision.

::: {.callout-note title="Not all package plotting requirements are equal"}

This section details the Epiverse-TRACE reasoning for the inclusion or exclusion of functionality around plotting. It is not intended as a commentary on, or a generalisation to, all R packages.

:::

## Epiverse-TRACE philosophy

The Epiverse-TRACE philosophy is to only include minimal, dependency-free, plotting functionality, and to instead provide examples and templates of plotting using {ggplot2} in vignettes. An example of the inclusion of a plotting function is the [`<epidist>` plotting method](https://epiverse-trace.github.io/epiparameter/reference/plot.epidist.html) in {epiparameter}.
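As a rough illustration of what minimal, dependency-free plotting can look like, the sketch below defines an S3 `plot()` method using only the {graphics} package that ships with R. The `epi_curve` class, its constructor and the data are hypothetical and are not taken from any Epiverse-TRACE package.

```r
# Hypothetical constructor for a simple model-output class (illustration only)
new_epi_curve <- function(time, cases) {
  stopifnot(length(time) == length(cases))
  structure(list(time = time, cases = cases), class = "epi_curve")
}

# S3 plot method that relies only on base R graphics, so the package
# would not need to import any plotting dependency
plot.epi_curve <- function(x, ...) {
  plot(x$time, x$cases, type = "h", xlab = "Time", ylab = "Cases", ...)
  invisible(x)
}

toy_curve <- new_epi_curve(
  time = 1:10,
  cases = c(1, 3, 6, 10, 14, 12, 9, 5, 2, 1)
)

# plot() dispatches to plot.epi_curve(); extra arguments are passed on
plot(toy_curve, main = "Toy epidemic curve")
```

Because the method passes `...` on to `plot()`, users can still adjust titles or colours without the package taking on extra dependencies.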

Epiverse-TRACE packages aim to provide comprehensive documentation that explains exported functions and demonstrates their applications, both through [{roxygen2}](https://roxygen2.r-lib.org/) function-specific documentation and via longer-form vignettes. It is in the vignettes that Epiverse-TRACE packages show examples of how the data or model outputs can be visualised, with the associated code chunks required to produce the plot.

By providing these plotting code chunks, we hope to give the community a resource from which they can produce their desired plots. Vignettes use the {ggplot2} package to provide plots that are easily extended by adding [layers](https://rpubs.com/hadley/ggplot2-layers).
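A vignette code chunk of this kind might look like the sketch below. The `model_output` data frame is a hypothetical stand-in for output returned by a package function, and the chunk is guarded so that it only runs when {ggplot2} is installed.

```r
# Hypothetical model output; in a real vignette this would come from
# a function exported by the package
model_output <- data.frame(
  time = 1:10,
  cases = c(1, 3, 6, 10, 14, 12, 9, 5, 2, 1)
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # starting point that a user can copy from the vignette
  base_plot <- ggplot(model_output, aes(x = time, y = cases)) +
    geom_col()

  # the plot is extended by adding further layers, labels and themes
  extended_plot <- base_plot +
    geom_line(colour = "firebrick") +
    labs(x = "Time", y = "Cases") +
    theme_minimal()

  print(extended_plot)
}
```

Users can copy `base_plot` and add or swap layers to suit their needs, which is the extensibility the vignettes aim to offer.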

Epiverse-TRACE packages are not developed uniformly and certain packages may deviate from this philosophy. In these cases the reasoning for the difference should be documented in that package's design vignette.

## Epiverse-TRACE reasoning

A benefit of including plotting functions is that it enables package users to quickly inspect a feature of the package (e.g. model output) without having to write their own script or function. Despite this clear benefit of shipping plotting with a package, there are several drawbacks. Here we outline the reasons why only dependency-free plotting (also commonly referred to as base R plotting) is included in Epiverse-TRACE R packages.

The most frequently used plotting package in R is [{ggplot2}](https://ggplot2.tidyverse.org/index.html). This package implements the grammar of graphics and provides users with an extensive and flexible set of tools to build custom plots. These are the reasons it is used for plotting in package vignettes. However, in order for an R package to export plotting functions that use {ggplot2}, it must list this package as a [dependency](https://cran.r-project.org/doc/manuals/R-exts.html#Package-Dependencies). {ggplot2} is a "heavy" package, not so much from the size of the code base (as CRAN limits the size of packages to 5MB), but due to its own dependencies.

```{r, pkg-deps, eval=FALSE}
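# direct dependencies of {ggplot2} (Depends, Imports and LinkingTo)
tools::package_dependencies(packages = "ggplot2", recursive = FALSE)

# all dependencies, including the dependencies of dependencies
tools::package_dependencies(packages = "ggplot2", recursive = TRUE)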
```

The relatively large number of dependencies introduces a couple of issues:

1. Breaking changes in, or deprecation of, {ggplot2} or one of its dependencies may cause breaking changes in Epiverse-TRACE packages, hampering sustainable package maintenance.
2. If depended on, {ggplot2} and its own dependencies would potentially be installed on a user's device.
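The difference between direct and recursive dependencies can be seen without querying CRAN by passing a small package database to `tools::package_dependencies()`. In the sketch below, `toy_db`, the package names in it and the `count_deps()` helper are all hypothetical; with `db = NULL` (the default) the same call would use `utils::available.packages()` instead.

```r
# Toy package database standing in for utils::available.packages();
# all package names here are made up for illustration
toy_db <- cbind(
  Package   = c("pkgA", "pkgB", "pkgC", "pkgD"),
  Depends   = NA_character_,
  Imports   = c("pkgB, pkgC", "pkgD", NA, NA),
  LinkingTo = NA_character_
)

# Hypothetical helper: count direct and recursive dependencies of a package
count_deps <- function(pkg, db) {
  direct <- tools::package_dependencies(pkg, db = db, recursive = FALSE)[[pkg]]
  all_deps <- tools::package_dependencies(pkg, db = db, recursive = TRUE)[[pkg]]
  c(direct = length(direct), recursive = length(all_deps))
}

dep_counts <- count_deps("pkgA", toy_db)
dep_counts
```

Here `pkgA` imports two packages directly, but a third is pulled in through `pkgB`, which is the kind of growth that makes a dependency "heavy".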

## Alternative options

There is a middle ground between providing users with dependency-laden plotting functions and relying on them to write their own or copy from documentation. This is to [extend {ggplot2}](https://ggplot2.tidyverse.org/articles/extending-ggplot2.html) by providing custom `geom` objects (for example [{ggtree}](https://github.com/YuLab-SMU/ggtree)) or `scales` objects (no examples outside of {ggplot2} exist for these). At present, there are no Epiverse-TRACE packages using this strategy.

Another option is to provide a separate R package whose sole purpose is to provide plotting functions. This gives the user the choice to install the plotting package if deemed necessary. This has been found to be a good option to ease the maintenance of both the core package and its plotting accessory package. A marginal downside of this approach is that it requires the user to install and attach both packages, and assumes that they know the plotting package exists. There are currently no plotting-specific packages within Epiverse-TRACE.

## Flexible philosophy

Lastly, most Epiverse-TRACE R packages are under active development and what is considered optimal and discussed above may evolve over time. User feedback and user training may elucidate a need that was overlooked in initial development and the inclusion or exclusion of dependencies may change.

---