The workers package is a platform-agnostic, R-focused parallel job scheduler. Schedulers matter for computationally demanding workflows: some tasks must finish before others start (for example, the data munging steps that precede analysis), and workers takes advantage of parallel computing opportunities while saving you the trouble of figuring out what needs to run when.
devtools::install_github("wlandau/workers")
Represent your workflow as a dependency graph with functions as vertex attributes. Each function is a step in the pipeline, and an edge from one step to another means the second step must wait for the first.
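# Each step ends by returning a future that resolves to list(success = TRUE).
success <- function() {
  future::future(list(success = TRUE))
}
# The steps share results through global assignment (<<-).
code <- list(
  a = function() {
    x <<- 2
    success()
  },
  b = function() {
    y <<- x + 1
    success()
  },
  c = function() {
    z <<- x * 2
    success()
  },
  d = function() {
    w <<- 3 * y + z
    success()
  }
)
# One vertex per step, with the step's function stored in the `code` column.
vertices <- tibble::tibble(name = letters[1:4], code)
# b and c depend on a; d depends on both b and c.
edges <- tibble::tibble(
  from = c("a", "a", "b", "c"),
  to = c("b", "c", "d", "d")
)
graph <- igraph::graph_from_data_frame(edges, vertices = vertices)
plot(graph)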
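To see where the parallelism in this graph comes from, the base-R sketch below groups the steps into "waves" whose members have no unfinished upstream steps. The helper `parallel_waves()` is hypothetical and written only for illustration; it is not part of workers, and the package may order or dispatch jobs differently.

```r
# Hypothetical helper, not part of the workers package.
# The same edge list as above, as a plain data frame.
edges <- data.frame(
  from = c("a", "a", "b", "c"),
  to = c("b", "c", "d", "d"),
  stringsAsFactors = FALSE
)
steps <- c("a", "b", "c", "d")

parallel_waves <- function(steps, edges) {
  waves <- list()
  done <- character(0)
  while (length(done) < length(steps)) {
    remaining <- setdiff(steps, done)
    # A step is ready when every step pointing to it has finished.
    ready <- Filter(function(s) {
      all(edges$from[edges$to == s] %in% done)
    }, remaining)
    if (length(ready) == 0) stop("The graph has a cycle.")
    waves[[length(waves) + 1]] <- ready
    done <- c(done, ready)
  }
  waves
}

waves <- parallel_waves(steps, edges)
print(waves)
```

In this example, a runs alone, b and c can run side by side once a is done, and d waits for both.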
Then, run your workflow with schedule(graph).
library(workers)
schedule(graph)
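As a sanity check on the arithmetic, here is a plain-R walk-through of the same four steps, run by hand in dependency order with no futures and no scheduler. It stores values in a hypothetical environment `env` rather than using `<<-`, so it illustrates the values a correctly ordered run should produce, not what schedule() does internally.

```r
# Illustration only: the four steps run sequentially, without workers.
env <- new.env()
steps <- list(
  a = function() env$x <- 2,
  b = function() env$y <- env$x + 1,
  c = function() env$z <- env$x * 2,
  d = function() env$w <- 3 * env$y + env$z
)
# Any order that respects the edges works; a, b, c, d is one of them.
for (name in c("a", "b", "c", "d")) {
  steps[[name]]()
}
print(env$w)
```

Running d before b or c would fail here because y or z would not exist yet, which is the kind of ordering mistake a scheduler is meant to prevent.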