epiforecasts · nikosbosse · Nov 22, 2021 · Jul 23, 2021
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -72,3 +72,4 @@ BugReports: https://github.com/epiforecasts/scoringutils/issues
 VignetteBuilder: knitr
 Depends: 
     R (>= 2.10)
+Roxygen: list(markdown = TRUE)
diff --git a/R/absolute_error.R b/R/absolute_error.R
@@ -53,7 +53,7 @@ ae_median_sample <- function(true_values, predictions) {
 #' @param quantiles numeric vector that denotes the quantile for the values
 #' in `predictions`. Only those predictions where `quantiles == 0.5` will
 #' be kept. If `quantiles` is `NULL`, then all `predictions` and
-#' `true_values` will be used (this is then the same as `absolute_error()`)
+#' `true_values` will be used (this is then the same as [absolute_error()])
 #' @param verbose logical, return a warning if something unexpected happens
 #' @return vector with the scoring values
 #' @importFrom stats median

diff --git a/R/bias.R b/R/bias.R
@@ -34,7 +34,7 @@
 #' number of Monte Carlo samples
 #' @return vector of length n with the biases of the predictive samples with
 #' respect to the true values.
-#' @author Nikos Bosse \email{[email protected]}
+#' @author Nikos Bosse \email{nikosbosse@@gmail.com}
 #' @examples
 #'
 #' ## integer valued forecasts
@@ -51,8 +51,12 @@
 #' @export
 #' @references
 #' The integer valued Bias function is discussed in
-#' Assessing the performance of real-time epidemic forecasts: A case study of Ebola in the Western Area region of Sierra Leone, 2014-15
-#' Funk S, Camacho A, Kucharski AJ, Lowe R, Eggo RM, et al. (2019) Assessing the performance of real-time epidemic forecasts: A case study of Ebola in the Western Area region of Sierra Leone, 2014-15. PLOS Computational Biology 15(2): e1006785. https://doi.org/10.1371/journal.pcbi.1006785
+#' Assessing the performance of real-time epidemic forecasts: A case study of
+#' Ebola in the Western Area region of Sierra Leone, 2014-15 Funk S, Camacho A,
+#' Kucharski AJ, Lowe R, Eggo RM, et al. (2019) Assessing the performance of
+#' real-time epidemic forecasts: A case study of Ebola in the Western Area
+#' region of Sierra Leone, 2014-15. PLOS Computational Biology 15(2): e1006785.
+#' <doi:10.1371/journal.pcbi.1006785>
 
 
 bias <- function(true_values, predictions) {
@@ -160,7 +164,7 @@ bias <- function(true_values, predictions) {
 #' of the central prediction interval
 #' @param true_value a single true value
 #' @return scalar with the quantile bias for a single quantile prediction
-#' @author Nikos Bosse \email{[email protected]}
+#' @author Nikos Bosse \email{nikosbosse@@gmail.com}
 #' @examples
 #'
 #' lower <- c(6341.000, 6329.500, 6087.014, 5703.500,

diff --git a/R/eval_forecasts.R b/R/eval_forecasts.R
@@ -1,114 +1,111 @@
 #' @title Evaluate forecasts
 #'
-#' @description The function \code{eval_forecasts} is an easy to use wrapper
-#' of the lower level functions in the \code{scoringutils} package.
+#' @description The function `eval_forecasts` is an easy to use wrapper
+#' of the lower level functions in the \pkg{scoringutils} package.
 #' It can be used to score probabilistic or quantile forecasts of
 #' continuous, integer-valued or binary variables.
 #'
 #' @details the following metrics are used where appropriate:
 #' \itemize{
 #'   \item {Interval Score} for quantile forecasts. Smaller is better. See
-#'   \code{\link{interval_score}} for more information. By default, the
+#'   [interval_score()] for more information. By default, the
 #'   weighted interval score is used.
 #'   \item {Brier Score} for a probability forecast of a binary outcome.
-#'   Smaller is better. See \code{\link{brier_score}} for more information.
+#'   Smaller is better. See [brier_score()] for more information.
 #'   \item {aem} Absolute error of the median prediction
 #'   \item {Bias} 0 is good, 1 and -1 are bad.
-#'   See \code{\link{bias}} for more information.
-#'   \item {Sharpness} Smaller is better. See \code{\link{sharpness}} for more
+#'   See [bias()] for more information.
+#'   \item {Sharpness} Smaller is better. See [sharpness()] for more
 #'   information.
 #'   \item {Calibration} represented through the p-value of the
 #'   Anderson-Darling test for the uniformity of the Probability Integral
 #'   Transformation (PIT). For integer valued forecasts, this p-value also
 #'   has a standard deviation. Larger is better.
-#'   See \code{\link{pit}} for more information.
+#'   See [pit()] for more information.
 #'   \item {DSS} Dawid-Sebastiani-Score. Smaller is better.
-#'   See \code{\link{dss}} for more information.
+#'   See [dss()] for more information.
 #'   \item {CRPS} Continuous Ranked Probability Score. Smaller is better.
-#'   See \code{\link{crps}} for more information.
+#'   See [crps()] for more information.
 #'   \item {Log Score} Smaller is better. Only for continuous forecasts.
-#'   See \code{\link{logs}} for more information.
+#'   See [logs()] for more information.
 #' }
 #'
 #' @param data A data.frame or data.table with the predictions and observations.
 #' Note: it is easiest to have a look at the example files provided in the
 #' package and in the examples below.
 #' The following columns need to be present:
 #' \itemize{
-#'   \item \code{true_value} - the true observed values
-#'   \item \code{prediction} - predictions or predictive samples for one
+#'   \item `true_value` - the true observed values
+#'   \item `prediction` - predictions or predictive samples for one
 #'   true value. (You only don't need to provide a prediction column if
 #'   you want to score quantile forecasts in a wide range format.)}
-#' For integer and continuous forecasts a \code{sample} column is needed:
+#' For integer and continuous forecasts a `sample` column is needed:
 #' \itemize{
-#'   \item \code{sample} - an index to identify the predictive samples in the
+#'   \item `sample` - an index to identify the predictive samples in the
 #'   prediction column generated by one model for one true value. Only
 #'   necessary for continuous and integer forecasts, not for
 #'   binary predictions.}
 #' For quantile forecasts the data can be provided in a variety of formats. You
 #' can either use a range-based format or a quantile-based format. (You can
-#' convert between formats using \code{\link{quantile_to_range_long}},
-#' \code{\link{range_long_to_quantile}},
-#' \code{\link{sample_to_range_long}},
-#' \code{\link{sample_to_quantile}})
+#' convert between formats using [quantile_to_range_long()],
+#' [range_long_to_quantile()],
+#' [sample_to_range_long()],
+#' [sample_to_quantile()])
 #' For a quantile-format forecast you should provide:
-#' \itemize{
-#'   \item {prediction} - prediction to the corresponding quantile
-#'   \item {quantile} - quantile to which the prediction corresponds}
+#'   - `prediction`: prediction to the corresponding quantile
+#'   - `quantile`: quantile to which the prediction corresponds
 #' For a range format (long) forecast you need
-#' \itemize{
-#'   \item \code{prediction} the quantile forecasts
-#'   \item \code{boundary} values should be either "lower" or "upper", depending
+#'   - `prediction`: the quantile forecasts
+#'   - `boundary`: values should be either "lower" or "upper", depending
 #'   on whether the prediction is for the lower or upper bound of a given range
-#'   \item {range} the range for which a forecast was made. For a 50\% interval
-#'   the range should be 50. The forecast for the 25\% quantile should have
-#'   the value in the \code{prediction} column, the value of \code{range}
-#'   should be 50 and the value of \code{boundary} should be "lower".
-#'   If you want to score the median (i.e. \code{range = 0}), you still
+#'   - `range`: the range for which a forecast was made. For a 50%% interval
+#'   the range should be 50. The forecast for the 25%% quantile should have
+#'   the value in the `prediction` column, the value of `range`
+#'   should be 50 and the value of `boundary` should be "lower".
+#'   If you want to score the median (i.e. `range = 0`), you still
 #'   need to include a lower and an upper estimate, so the median has to
-#'   appear twice.}
+#'   appear twice.
 #' Alternatively you can also provide the format in a wide range format.
-#' This format needs
-#' \itemize{
-#'   \item pairs of columns called something like 'upper_90' and 'lower_90',
+#' This format needs:
+#'   - pairs of columns called something like 'upper_90' and 'lower_90',
 #'   or 'upper_50' and 'lower_50', where the number denotes the interval range.
-#'   For the median, you need to provide columns called 'upper_0' and 'lower_0'}
+#'   For the median, you need to provide columns called 'upper_0' and 'lower_0'
 #' @param by character vector of columns to group scoring by. This should be the
 #' lowest level of grouping possible, i.e. the unit of the individual
 #' observation. This is important as many functions work on individual
 #' observations. If you want a different level of aggregation, you should use
-#' \code{summarise_by} to aggregate the individual scores.
-#' Also not that the pit will be computed using \code{summarise_by}
-#' instead of \code{by}
+#' `summarise_by` to aggregate the individual scores.
+#' Also note that the pit will be computed using `summarise_by`
+#' instead of `by`
 #' @param summarise_by character vector of columns to group the summary by. By
 #' default, this is equal to `by` and no summary takes place.
 #' But sometimes you may want to summarise
 #' over categories different from the scoring.
-#' \code{summarise_by} is also the grouping level used to compute
+#' `summarise_by` is also the grouping level used to compute
 #' (and possibly plot) the probability integral transform(pit).
 #' @param metrics the metrics you want to have in the output. If `NULL` (the
 #' default), all available metrics will be computed.
 #' @param quantiles numeric vector of quantiles to be returned when summarising.
 #' Instead of just returning a mean, quantiles will be returned for the
 #' groups specified through `summarise_by`. By default, no quantiles are
 #' returned.
-#' @param sd if TRUE (the default is FALSE) the standard deviation of all
+#' @param sd if `TRUE` (the default is `FALSE`) the standard deviation of all
 #' metrics will be returned when summarising.
-#' @param pit_plots if TRUE (not the default), pit plots will be returned. For
-#' details see \code{\link{pit}}.
+#' @param pit_plots if `TRUE` (not the default), pit plots will be returned. For
+#' details see [pit()].
 #' @param interval_score_arguments list with arguments for the calculation of
 #' the interval score. These arguments get passed down to
-#' \code{interval_score}, except for the argument `count_median_twice` that
+#' `interval_score`, except for the argument `count_median_twice` that
 #' controls how the interval scores for different intervals are summed up. This
-#' should be a logical (default is FALSE) that indicates whether or not
+#' should be a logical (default is `FALSE`) that indicates whether or not
 #' to count the median twice when summarising. This would conceptually treat the
 #' median as a 0\% prediction interval, where the median is the lower as well as
 #' the upper bound. The alternative is to treat the median as a single quantile
 #' forecast instead of an interval. The interval score would then
 #' be better understood as an average of quantile scores.
 #' @param summarised Summarise arguments (i.e. take the mean per group
-#' specified in group_by. Default is TRUE.
-#' @param verbose print out additional helpful messages (default is TRUE)
+#' specified in group_by). Default is `TRUE`.
+#' @param verbose print out additional helpful messages (default is `TRUE`)
 #' @param forecasts data.frame with forecasts, that should follow the same
 #' general guidelines as the `data` input. Argument can be used to supply
 #' forecasts and truth data independently. Default is `NULL`.
@@ -118,9 +115,9 @@
 #' `truth_data` should be merged on. Default is `NULL` and merge will be
 #' attempted automatically.
 #' @param compute_relative_skill logical, whether or not to compute relative
-#' performance between models. If `TRUE` (default is FALSE), then a column called
+#' performance between models. If `TRUE` (default is `FALSE`), then a column called
 #' 'model' must be present in the input data. For more information on
-#' the computation of relative skill, see \code{\link{pairwise_comparison}}.
+#' the computation of relative skill, see [pairwise_comparison()].
 #' Relative skill will be calculated for the aggregation level specified in
 #' `summarise_by`.
 #' @param rel_skill_metric character string with the name of the metric for which
@@ -139,7 +136,7 @@
 #' forecasts, pit_sd is returned (to account for the randomised PIT),
 #' but no Log Score is returned (the internal estimation relies on a
 #' kernel density estimate which is difficult for integer-valued forecasts).
-#' If \code{summarise_by} is specified differently from \code{by},
+#' If `summarise_by` is specified differently from `by`,
 #' the average score per summary unit is returned.
 #' If specified, quantiles and standard deviation of scores can also be returned
 #' when summarising.
@@ -190,11 +187,11 @@
 #'                                      sd = TRUE,
 #'                                      summarise_by = c("model"))
 #'
-#' @author Nikos Bosse \email{[email protected]}
+#' @author Nikos Bosse \email{nikosbosse@@gmail.com}
 #' @references Funk S, Camacho A, Kucharski AJ, Lowe R, Eggo RM, Edmunds WJ
 #' (2019) Assessing the performance of real-time epidemic forecasts: A
 #' case study of Ebola in the Western Area region of Sierra Leone, 2014-15.
-#' PLoS Comput Biol 15(2): e1006785. <doi.org/10.1371/journal.pcbi.1006785>
+#' PLoS Comput Biol 15(2): e1006785. <doi:10.1371/journal.pcbi.1006785>
 #' @export
 
 eval_forecasts <- function(data = NULL,
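Reviewer note: the range-based long format described for the `data` parameter in this file can be illustrated with a small base-R data frame. This is only a sketch under stated assumptions: the `model` column and every number are made up for illustration and are not taken from the package's example data. The `quantile` column is derived by hand here rather than with the package's conversion helpers.

```r
# Hypothetical forecast for a single observation in range-based long format.
# All values are invented for illustration.
range_long <- data.frame(
  model      = "example_model",
  true_value = 12,
  range      = c(0, 0, 50, 50, 90, 90),
  boundary   = rep(c("lower", "upper"), times = 3),
  prediction = c(10, 10, 8, 13, 5, 18)
)

# The median (range = 0) appears twice: once as the lower and once as the
# upper bound, as the documentation requires.
median_rows <- range_long[range_long$range == 0, ]

# Each (range, boundary) pair corresponds to one quantile level:
# lower -> (100 - range) / 200, upper -> (100 + range) / 200.
# E.g. range = 50 with boundary "lower" is the 25% quantile.
range_long$quantile <- ifelse(
  range_long$boundary == "lower",
  (100 - range_long$range) / 200,
  (100 + range_long$range) / 200
)

print(range_long)
```

The derived `quantile` column shows how the range format and the quantile format describe the same forecast: the 50% interval is bounded by the 0.25 and 0.75 quantiles, and the 90% interval by the 0.05 and 0.95 quantiles.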

diff --git a/R/eval_forecasts_binary.R b/R/eval_forecasts_binary.R
@@ -2,7 +2,7 @@
 #'
 #' @inheritParams eval_forecasts
 #' @return A data.table with appropriate scores. For more information see
-#' \code{\link{eval_forecasts}}
+#' [eval_forecasts()]
 #'
 #' @importFrom data.table ':='
 #'
@@ -14,7 +14,7 @@
 #'                                      quantiles = c(0.5), sd = TRUE,
 #'                                      verbose = FALSE)
 #'
-#' @author Nikos Bosse \email{[email protected]}
+#' @author Nikos Bosse \email{nikosbosse@@gmail.com}
 
 eval_forecasts_binary <- function(data,
                                   by,

diff --git a/R/eval_forecasts_continuous_integer.R b/R/eval_forecasts_continuous_integer.R
@@ -5,7 +5,7 @@
 #' @param prediction_type character, should be either "continuous" or "integer"
 #'
 #' @return A data.table with appropriate scores. For more information see
-#' \code{\link{eval_forecasts}}
+#' [eval_forecasts()]
 #'
 #' @importFrom data.table ':=' as.data.table rbindlist %like%
 #'
@@ -30,11 +30,12 @@
 #'                                      sd = TRUE,
 #'                                      summarise_by = c("model"))
 #'
-#' @author Nikos Bosse \email{[email protected]}
 #' @references Funk S, Camacho A, Kucharski AJ, Lowe R, Eggo RM, Edmunds WJ
 #' (2019) Assessing the performance of real-time epidemic forecasts: A
 #' case study of Ebola in the Western Area region of Sierra Leone, 2014-15.
 #' PLoS Comput Biol 15(2): e1006785. <doi:10.1371/journal.pcbi.1006785>
+#' @author Nikos Bosse \email{nikosbosse@@gmail.com}
+#' @inherit eval_forecasts references
 
 
 eval_forecasts_sample <- function(data,

diff --git a/R/eval_forecasts_helper.R b/R/eval_forecasts_helper.R
@@ -6,7 +6,7 @@
 #' @param dt the data.table operated on
 #' @param varnames names of the variables for which to calculate quantiles
 #' @param quantiles the desired quantiles
-#' @param by grouping variable in `eval_forecasts()
+#' @param by grouping variable in [eval_forecasts()]
 #'
 #' @return `data.table` with quantiles added
 #'
@@ -30,7 +30,7 @@ add_quantiles <- function(dt, varnames, quantiles, by) {
 #' Helper function used within eval_forecasts
 #' @param dt the data.table operated on
 #' @param varnames names of the variables for which to calculate the sd
-#' @param by grouping variable in `eval_forecasts()
+#' @param by grouping variable in [eval_forecasts()]
 #' @importFrom data.table `%like%`
 #' @return `data.table` with sd added
 #'

diff --git a/R/interval_score.R b/R/interval_score.R
@@ -16,7 +16,7 @@
 #' To improve usability, the user is asked to provide an interval range in
 #' percentage terms, i.e. interval_range = 90 (percent) for a 90 percent
 #' prediction interval. Correspondingly, the user would have to provide the
-#' 5\% and 95\% quantiles (the corresponding alpha would then be 0.1).
+#' 5%% and 95%% quantiles (the corresponding alpha would then be 0.1).
 #' No specific distribution is assumed,
 #' but the range has to be symmetric (i.e you can't use the 0.1 quantile
 #' as the lower bound and the 0.7 quantile as the upper).
@@ -34,13 +34,13 @@
 #' to alpha.
 #' @param weigh if TRUE, weigh the score by alpha / 4, so it can be averaged
 #' into an interval score that, in the limit, corresponds to CRPS. Default:
-#' FALSE.
-#' @param separate_results if TRUE (default is FALSE), then the separate parts
-#' of the interval score (sharpness, penalties for over- and under-prediction
-#' get returned as separate elements of a list). If you want a `data.frame`
-#' instead, simply call `as.data.frmae()` on the output.
+#' `FALSE`.
+#' @param separate_results if `TRUE` (default is `FALSE`), then the separate
+#' parts of the interval score (sharpness, penalties for over- and
+#' under-prediction) get returned as separate elements of a list. If you want a
+#' `data.frame` instead, simply call [as.data.frame()] on the output.
 #' @return vector with the scoring values, or a list with separate entries if
-#' \code{separate_results} is TRUE.
+#' `separate_results` is `TRUE`.
 #' @examples
 #' true_values <- rnorm(30, mean = 1:30)
 #' interval_range = rep(90, 30)
@@ -65,10 +65,7 @@
 #'
 #' Evaluating epidemic forecasts in an interval format,
 #' Johannes Bracher, Evan L. Ray, Tilmann Gneiting and Nicholas G. Reich,
-#' <arXiv:2005.12881v1>
-#'
-#' Bracher J, Ray E, Gneiting T, Reich, N (2020) Evaluating epidemic forecasts
-#' in an interval format. \url{https://arxiv.org/abs/2005.12881}
+#' <https://arxiv.org/abs/2005.12881>
 #'