16 changes: 13 additions & 3 deletions R/pkg/R/install.R
@@ -21,9 +21,9 @@
 #' Download and Install Apache Spark to a Local Directory
 #'
 #' \code{install.spark} downloads and installs Spark to a local directory if
-#' it is not found. The Spark version we use is the same as the SparkR version.
-#' Users can specify a desired Hadoop version, the remote mirror site, and
-#' the directory where the package is installed locally.
+#' it is not found. If SPARK_HOME is set in the environment, and that directory is found, that is
+#' returned. The Spark version we use is the same as the SparkR version. Users can specify a desired
+#' Hadoop version, the remote mirror site, and the directory where the package is installed locally.
 #'
 #' The full url of remote file is inferred from \code{mirrorUrl} and \code{hadoopVersion}.
 #' \code{mirrorUrl} specifies the remote path to a Spark folder. It is followed by a subfolder
@@ -68,6 +68,16 @@
 #' \href{http://spark.apache.org/downloads.html}{Apache Spark}
 install.spark <- function(hadoopVersion = "2.7", mirrorUrl = NULL,
                           localDir = NULL, overwrite = FALSE) {
+  sparkHome <- Sys.getenv("SPARK_HOME")
+  if (isSparkRShell()) {
+    stopifnot(nchar(sparkHome) > 0)
+    message("Spark is already running in sparkR shell.")
+    return(invisible(sparkHome))
+  } else if (!is.na(file.info(sparkHome)$isdir)) {
+    message("Spark package found in SPARK_HOME: ", sparkHome)
+    return(invisible(sparkHome))
+  }
+
   version <- paste0("spark-", packageVersion("SparkR"))
   hadoopVersion <- tolower(hadoopVersion)
   hadoopVersionName <- hadoopVersionName(hadoopVersion)
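The added block short-circuits `install.spark` when SPARK_HOME already points to something on disk, relying on `file.info()` returning NA fields for a path that does not exist. A minimal sketch of how that check behaves, assuming only base R semantics; the mirror URL below is a placeholder and not taken from the patch:

```r
# Sketch (not part of the diff): the existing-installation check in isolation.
# file.info(path)$isdir is NA when the path does not exist (the empty string
# from an unset variable included), TRUE for a directory and FALSE for a plain
# file, so !is.na(...) means "the path exists" and the download can be skipped.
sparkHome <- Sys.getenv("SPARK_HOME")   # "" when the variable is unset
if (!is.na(file.info(sparkHome)$isdir)) {
  message("Reusing Spark at: ", sparkHome)
} else {
  # Illustrative call only; the mirror URL is a placeholder.
  install.spark(hadoopVersion = "2.7", mirrorUrl = "http://apache.osuosl.org/spark")
}
```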
6 changes: 2 additions & 4 deletions R/pkg/R/sparkR.R
@@ -588,13 +588,11 @@ processSparkPackages <- function(packages) {
 sparkCheckInstall <- function(sparkHome, master, deployMode) {
   if (!isSparkRShell()) {
     if (!is.na(file.info(sparkHome)$isdir)) {
-      msg <- paste0("Spark package found in SPARK_HOME: ", sparkHome)
-      message(msg)
+      message("Spark package found in SPARK_HOME: ", sparkHome)
       NULL
     } else {
       if (interactive() || isMasterLocal(master)) {
-        msg <- paste0("Spark not found in SPARK_HOME: ", sparkHome)
-        message(msg)
+        message("Spark not found in SPARK_HOME: ", sparkHome)
         packageLocalDir <- install.spark()
         packageLocalDir
       } else if (isClientMode(master) || deployMode == "client") {
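The two-line `paste0()` + `message()` pattern collapses into a single call because `message()` already concatenates its `...` arguments before emitting the condition. A small sketch showing the two forms produce the same output (the path is hypothetical):

```r
# Sketch only: message() pastes its arguments together itself,
# so building the string with paste0() first is redundant.
sparkHome <- "/opt/spark"                                   # hypothetical path
msg <- paste0("Spark package found in SPARK_HOME: ", sparkHome)
message(msg)                                                # old form
message("Spark package found in SPARK_HOME: ", sparkHome)   # new form, same output
```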
3 changes: 3 additions & 0 deletions R/pkg/tests/run-all.R
@@ -21,4 +21,7 @@ library(SparkR)
 # Turn all warnings into errors
 options("warn" = 2)
 
+# Setup global test environment
+install.spark()
+
 test_package("SparkR")
3 changes: 3 additions & 0 deletions R/pkg/vignettes/sparkr-vignettes.Rmd
@@ -27,6 +27,9 @@ library(SparkR)

 We use default settings in which it runs in local mode. It auto downloads Spark package in the background if no previous installation is found. For more details about setup, see [Spark Session](#SetupSparkSession).
 
+```{r, include=FALSE}
+install.spark()
+```
 ```{r, message=FALSE, results="hide"}
 sparkR.session()
 ```
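In the vignette, `include=FALSE` evaluates `install.spark()` while keeping both the chunk's code and its output out of the rendered document, whereas the existing session chunk stays visible with only its messages and results suppressed. As an illustration of the setup the vignette performs, a hedged sketch with placeholder values (the master string and config are assumptions, not taken from the vignette):

```r
# Illustrative only: the same setup, written out explicitly.
library(SparkR)
install.spark()   # returns immediately when SPARK_HOME already points to an existing installation
sparkR.session(master = "local[*]",
               sparkConfig = list(spark.driver.memory = "2g"))
```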