-
Notifications
You must be signed in to change notification settings - Fork 7
/
how-to-bench.Rmd
295 lines (251 loc) · 11.2 KB
/
how-to-bench.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
---
title: "Benchmarking with decoupleR Tutorial"
author:
- name: "Daniel Dimitrov"
affiliation:
- Saezlab
email: [email protected]
output:
BiocStyle::html_document:
self_contained: true
toc: true
toc_float: true
toc_depth: 3
code_folding: show
package: "`r pkg_ver('decoupleRBench')`"
vignette: >
%\VignetteIndexEntry{Benchmarking with decoupleR Tutorial}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r chunk_setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
# Prerequsites
This benchmark setting builds on `decoupleR`, and more specifically the
`decouple` wrapper function. As such, it requires the decoupleR package to be
installed and it is recommended that the user is familiar with the [basics of decoupleR](https://saezlab.github.io/decoupleR/articles/decoupleR.html#basics-1).
The benchmark pipeline requires an input tibble with user-specified settings,
benchmark data in the form of a count table and a corresponding metadata table.
```{r load_decoupler, include = TRUE}
# load dependencies
library(decoupleRBench)
library(ggplot2)
library(pheatmap)
library(dplyr)
library(tibble)
# We load the files directly from drive urls
# example expression data
bexample_url <- 'https://docs.google.com/uc?export=download&id=1S9YcfQuG2cKLhUJKQjCtxtqdAvFbm9B9'
# benchmark metadata
bmeta_url <- "https://docs.google.com/uc?export=download&id=1nkvdtR0M1CvQPvFfz5OzJ75eicgE-0wu"
# dorothea gene sets
source_url <- "https://docs.google.com/uc?export=download&id=1CRrEcLxUqIp6qKyTIpF7Wuv1dZsiqe0j"
```
## Input tibble
The input tibble serves as the description for each benchmark run and each of
its rows corresponds to a separate call of the `decouple` wrapper. This enables
the user to specify any number of 'benchmark experiments' or combinations of
benchmark data, networks (or any other type of gene sets), and statistical
methods that they wish to explore.
The ease of use and flexibility of this approach is demonstrated here - e.g.
to run the pipeline with a network resource filtered for a given condition only
requires the filtering criteria to be changed. This also applies to the
statistical methods and their settings, different set sources/networks,
and benchmark data sets.
Note that here we use only an example of the DoRoThEA benchmarking set
[(Holland et al., 2020)](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1949-z).
```{r input_example}
design_row <-
tibble(
set_name = "dorothea", # name of the set resource
bench_name = "dbd", # name of the benchmark data
stats_list = list( # a list of the stats to call
c(
"wmean",
"mlm",
"gsva",
"ora"
)
),
opts_list = list(list( # list of options for each stat method
wmean = list(times = 100, .mor='mor', .likelihood='likelihood'),
mlm = list(.mor='mor', .likelihood='likelihood'),
gsva = list(verbose = FALSE, method = "gsva"),
ora = list(n_up = 300, n_bottom = 300, n_background = 20000)
)),
bexpr_loc = bexample_url, # benchmark data location
bmeta_loc = bmeta_url, # metadata location
source_loc = source_url, # set source location
source_col = "tf", # source name of the gene set source
target_col = "target", # target name of the set source
filter_col = "confidence", # column by which we wish to filter
filter_crit = list(c("A")), # criteria by which we wish to filter
)
# input tibble where each run/row filters according to different
# confidence level combinations from dorothea.
input_tibble <- bind_rows(
design_row,
design_row %>%
mutate(filter_crit = list(c("A", "B", "C")))
)
input_tibble <- input_tibble[2,]
input_tibble %>%
rmarkdown::paged_table()
```
**Attention** should be paid to the arguments (options) passed to each
statistical method. In particular, if the set resource contains columns such
as weights (=`.likelihood`) or mode of regulation (=`.mor`) then the names of
these columns should also be passed as arguments.
In the case when these columns are absent, or not provided, arbitrary values
will be assigned.
Please refer to the `run_*` functions for more information about
statistic-specific arguments.
## Benchmark data and metadata
The benchmark data should be in the form of a tibble with count data
(e.g. gene expression table) with the ID of each perturbation experiment as
column names and gene IDs as rows.
```{r bexpr_example}
decoupleRBench::readRDS_helper(bexample_url, TRUE) %>%
# show only first 10 genes of first 3 experiments
slice_head(n = 10) %>%
select(1:3)
```
The benchmark metadata should also be in the form of a tibble containing the
'meta' information with rows corresponding to each perturbation experiment
or condition, with columns containing information about perturbation target,
cell line, etc.
```{r bmeta_example}
readRDS_helper(bmeta_url, TRUE) %>%
slice_head(n = 10) %>%
# select to show only most relevant columns
select(id, target, platform, sign)
```
Note that the Experiment ID is used as the key to join the benchmark data and
metadata.
## Set Source
Here, we define a 'Set Source' as any resource containing information about sets
of sources and targets, where a `source` (e.g. Transcription factors (TFs) or
Kinases in regulatory networks or GO terms in Gene Ontologies) is linked to its
`targets` (e.g. the Genes or Phosphosites composing the set of targets for a
given source).
It is worth pointing out that similarly to `.mor` and `.likelihood`,
`source_col`, `target_col`, and `filter_col` can take any user-specified name.
The `filter_col` and hence source_set pre-filtering can be omitted by passing
`NA` as its name.
In this tutorial, we will use [DoRothEA](https://github.com/saezlab/dorothea) -
a gene set resource containing signed TF-target interactions.
```{r source_example}
readRDS_helper(source_url, TRUE) %>%
slice_head(n = 10) # show only first 10 rows
```
# Benchmark run
Now that we have checked the prerequisites and we are confident that they are
formatted appropriately, we can proceed with running the decoupleR benchmark
pipeline.
## Benchmark Settings and Assumptions
Here we run the benchmark pipeline with DoRoThEA using the default settings.
Note that the benchmark pipeline currently evaluates precision and that here
we define the 'Condition positive' as the perturbed targets for each
condition/experiment.
Then we make the assumption that 'True positives (TPs)' would get a 'score'
from each statistical method that would rank them within the top `n` scores,
where `n` = the number of experiments covered by the set resource. On the other
hand, a 'True Negative (TN)' is defined as a TF not perturbed and not ranked within
the top `n` of TFs. 'Condition Negatives' are defined in two ways depending
whether downsampling is performed.
By default, Condition Negatives are defined as the set of any TF regulon not
perturbed in the current experiment. As such, each of these non-perturbed
TF sets is appended and considered a Condition negative. This results in
Condition Negatives (equal to the number of non-perturbed TFs for all of the
experiments) vastly outnumbering the Condition positives (equal to the number of
truly perturbed target in each experiment).
When downsampling is performed, the number of Condition Negatives is set to be
the same as the number of Condition Positives (i.e. number of experiments covered
by the gene set).
In some cases, assuming that the perturbed target (in this case a TF) in a given
experiment would be the most deregulated one (i.e. the one with highest `score`)
is a flawed assumption due to the nature of regulation in biology. For example,
a TF correlated or downstream of the perturbed target might be more perturbed
than the experiment target itself. However, at this stage, we believe that this
is likely our best choice. Nonetheless, we are open to suggestions and implementations
of alternative source set performance methods.
```{r bench_run, echo=TRUE, results='hide', message=FALSE, warning=FALSE}
dor_run <- run_benchmark(
.design = input_tibble, # provide input tibble
.minsize = 5, # filter gene sets with size < 10
.form = TRUE, # format the benchmark results
.perform = TRUE, # evaluate benchmarking performance
.silent = FALSE, # silently run the pipeline
.downsample_pr = TRUE, # downsample TNs for precision-recall curve
.downsample_roc = TRUE, # downsample TNs for ROC
.downsample_times = 100, # downsampling iterations
.url_bool = TRUE # whether to load from url
)
```
## Pipeline Output
The output of the pipeline is a pre-defined object with benchmark results,
results summary and plots, and the input tibble.
### Benchmarking results
The benchmarking results provide meta information about the run: set_name (name
of the set source), bench_name (name of the benchmark dataset), filter_crit
(filter criteria), and statistic (statistical method).
`Activity` contains tibbles returned from each call of the decouple wrapper.
`roc` contains Receiver Operator Curve calculations for each decouple run.
`prc` contains Precision-Recall Curve calculations for each decouple run.
```{r bench_results}
dor_run@bench_res %>%
rmarkdown::paged_table()
```
### Benchmarking Summary
First, we explore the `Summary table`.
It is a rather lengthy, yet highly informative table, containing information
such as: meta information for each run, AUROC (auc), Precision-Recall Curve AUC
(pr_auc), source_cov (number of TFs), condition_cov (number of experiments),
number of condition negatives for the roc (roc_neg) and pr curve (pr_neg),
statistic_time (Computational time for the statistical method), regulon_time
(combined time for each of the statistics for each network/set_source variation).
```{r bench_summary}
dor_run@summary$summary_table %>%
rmarkdown::paged_table()
```
### Benchmark Plots
Now we look at the different plots:
*ROC plot*
```{r bench_roc}
dor_run@summary$roc_plot
```
*Precision-Recall Curve*
```{r bench_prc}
dor_run@summary$pr_plot
```
Note that the straight lines represent random baselines for each network variant
and are proportional to the number condition positives (i.e. perturbed/predicted
experiments) against the total number of instances (positives+negatives).
*AUROC heatmap*
```{r bench_auroc}
dor_run@summary$auroc_heat
```
*PR AUC heatmap*
```{r bench_prauc}
dor_run@summary$pr_heat
```
Please note that these plots are ggplot2 objects and are thus customisable.
# Contributing to decoupleR
We are committed to the further development of decoupleR and we hope
that with the help of the community, it would become the go-to place for
benchmarking any combination of set sources and statistical methods. As such,
we invite the community to share or implement any ideas, statistical methods,
network or gene resources, and benchmark data sets.
Are you interested in adding a new statistical method or in extending the
benchmark environment of decoupleR? Please check out our
[contribution guide](https://saezlab.github.io/decoupleR/CONTRIBUTING.html)
and do not hesitate to contact me!
# Session Info
```{r sessionInfo, echo=FALSE}
options(width = 120)
sessioninfo::session_info()
```