11 changes: 6 additions & 5 deletions R/pkg/R/sparkR.R
@@ -585,16 +585,17 @@ processSparkPackages <- function(packages) {
 # @param deployMode whether to deploy your driver on the worker nodes (cluster)
 #                   or locally as an external client (client).
 # @return NULL if no need to update sparkHome, and new sparkHome otherwise.
-sparkCheckInstall <- function(sparkHome, master, deployMode) {
+sparkCheckInstall <- function(
+    sparkHome = Sys.getenv("SPARK_HOME"),
+    master = "local",
+    deployMode = "") {
   if (!isSparkRShell()) {
     if (!is.na(file.info(sparkHome)$isdir)) {
-      msg <- paste0("Spark package found in SPARK_HOME: ", sparkHome)
-      message(msg)
+      message("Spark package found in SPARK_HOME: ", sparkHome)
       NULL
     } else {
       if (interactive() || isMasterLocal(master)) {
-        msg <- paste0("Spark not found in SPARK_HOME: ", sparkHome)
-        message(msg)
+        message("Spark not found in SPARK_HOME: ", sparkHome)
         packageLocalDir <- install.spark()
         packageLocalDir
       } else if (isClientMode(master) || deployMode == "client") {
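For readers skimming the diff: the point of the new defaults is that callers (the test files below) can invoke the helper with no arguments. A minimal sketch of the two call styles, assuming SparkR is loaded; the explicit values are illustrative, not taken from this PR:

```r
# With the defaults added here, a bare call reads SPARK_HOME from the
# environment and assumes a local master, which is what the test files below
# rely on. (Outside the package's own tests the function is internal, so it
# would be reached as SparkR:::sparkCheckInstall().)
sparkCheckInstall()

# Roughly equivalent explicit call; under the old signature every argument
# had to be supplied by the caller.
sparkCheckInstall(sparkHome = Sys.getenv("SPARK_HOME"), master = "local", deployMode = "")
```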
3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_Serde.R
@@ -17,6 +17,9 @@

context("SerDe functionality")

# Ensure Spark is installed
sparkCheckInstall()

sparkSession <- sparkR.session(enableHiveSupport = FALSE)

test_that("SerDe of primitive types", {
3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_binaryFile.R
@@ -17,6 +17,9 @@

context("functions on binary files")

# Ensure Spark is installed
sparkCheckInstall()

# JavaSparkContext handle
sparkSession <- sparkR.session(enableHiveSupport = FALSE)
sc <- callJStatic("org.apache.spark.sql.api.r.SQLUtils", "getJavaSparkContext", sparkSession)
3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_binary_function.R
@@ -17,6 +17,9 @@

context("binary functions")

# Ensure Spark is installed
sparkCheckInstall()

# JavaSparkContext handle
sparkSession <- sparkR.session(enableHiveSupport = FALSE)
sc <- callJStatic("org.apache.spark.sql.api.r.SQLUtils", "getJavaSparkContext", sparkSession)
3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_broadcast.R
@@ -17,6 +17,9 @@

context("broadcast variables")

# Ensure Spark is installed
sparkCheckInstall()

# JavaSparkContext handle
sparkSession <- sparkR.session(enableHiveSupport = FALSE)
sc <- callJStatic("org.apache.spark.sql.api.r.SQLUtils", "getJavaSparkContext", sparkSession)
3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_context.R
@@ -17,6 +17,9 @@

context("test functions in sparkR.R")

# Ensure Spark is installed
sparkCheckInstall()

test_that("Check masked functions", {
# Check that we are not masking any new function from base, stats, testthat unexpectedly
# NOTE: We should avoid adding entries to *namesOfMaskedCompletely* as masked functions make it
3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_includePackage.R
@@ -17,6 +17,9 @@

context("include R packages")

# Ensure Spark is installed
sparkCheckInstall()

# JavaSparkContext handle
sparkSession <- sparkR.session(enableHiveSupport = FALSE)
sc <- callJStatic("org.apache.spark.sql.api.r.SQLUtils", "getJavaSparkContext", sparkSession)
3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_jvm_api.R
@@ -17,6 +17,9 @@

context("JVM API")

# Ensure Spark is installed
sparkCheckInstall()

sparkSession <- sparkR.session(enableHiveSupport = FALSE)

test_that("Create and call methods on object", {
3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_mllib_classification.R
@@ -19,6 +19,9 @@ library(testthat)

context("MLlib classification algorithms, except for tree-based algorithms")

# Ensure Spark is installed
sparkCheckInstall()

# Tests for MLlib classification algorithms in SparkR
sparkSession <- sparkR.session(enableHiveSupport = FALSE)

3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_mllib_clustering.R
@@ -19,6 +19,9 @@ library(testthat)

context("MLlib clustering algorithms")

# Ensure Spark is installed
sparkCheckInstall()

# Tests for MLlib clustering algorithms in SparkR
sparkSession <- sparkR.session(enableHiveSupport = FALSE)

3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_mllib_recommendation.R
@@ -19,6 +19,9 @@ library(testthat)

context("MLlib recommendation algorithms")

# Ensure Spark is installed
sparkCheckInstall()

# Tests for MLlib recommendation algorithms in SparkR
sparkSession <- sparkR.session(enableHiveSupport = FALSE)

3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_mllib_regression.R
@@ -19,6 +19,9 @@ library(testthat)

context("MLlib regression algorithms, except for tree-based algorithms")

# Ensure Spark is installed
sparkCheckInstall()

# Tests for MLlib regression algorithms in SparkR
sparkSession <- sparkR.session(enableHiveSupport = FALSE)

3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_mllib_stat.R
@@ -19,6 +19,9 @@ library(testthat)

context("MLlib statistics algorithms")

# Ensure Spark is installed
sparkCheckInstall()

# Tests for MLlib statistics algorithms in SparkR
sparkSession <- sparkR.session(enableHiveSupport = FALSE)

3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_mllib_tree.R
@@ -19,6 +19,9 @@ library(testthat)

context("MLlib tree-based algorithms")

# Ensure Spark is installed
sparkCheckInstall()

# Tests for MLlib tree-based algorithms in SparkR
sparkSession <- sparkR.session(enableHiveSupport = FALSE)

3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_parallelize_collect.R
@@ -17,6 +17,9 @@

context("parallelize() and collect()")

# Ensure Spark is installed
sparkCheckInstall()

# Mock data
numVector <- c(-10:97)
numList <- list(sqrt(1), sqrt(2), sqrt(3), 4 ** 10)
3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_rdd.R
@@ -17,6 +17,9 @@

context("basic RDD functions")

# Ensure Spark is installed
sparkCheckInstall()

# JavaSparkContext handle
sparkSession <- sparkR.session(enableHiveSupport = FALSE)
sc <- callJStatic("org.apache.spark.sql.api.r.SQLUtils", "getJavaSparkContext", sparkSession)
3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_shuffle.R
@@ -17,6 +17,9 @@

context("partitionBy, groupByKey, reduceByKey etc.")

# Ensure Spark is installed
sparkCheckInstall()

# JavaSparkContext handle
sparkSession <- sparkR.session(enableHiveSupport = FALSE)
sc <- callJStatic("org.apache.spark.sql.api.r.SQLUtils", "getJavaSparkContext", sparkSession)
3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_sparkSQL.R
@@ -19,6 +19,9 @@ library(testthat)

context("SparkSQL functions")

# Ensure Spark is installed
sparkCheckInstall()

# Utility function for easily checking the values of a StructField
checkStructField <- function(actual, expectedName, expectedType, expectedNullable) {
expect_equal(class(actual), "structField")
3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_take.R
@@ -17,6 +17,9 @@

context("tests RDD function take()")

# Ensure Spark is installed
sparkCheckInstall()

# Mock data
numVector <- c(-10:97)
numList <- list(sqrt(1), sqrt(2), sqrt(3), 4 ** 10)
3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_textFile.R
@@ -17,6 +17,9 @@

context("the textFile() function")

# Ensure Spark is installed
sparkCheckInstall()

# JavaSparkContext handle
sparkSession <- sparkR.session(enableHiveSupport = FALSE)
sc <- callJStatic("org.apache.spark.sql.api.r.SQLUtils", "getJavaSparkContext", sparkSession)
3 changes: 3 additions & 0 deletions R/pkg/inst/tests/testthat/test_utils.R
@@ -17,6 +17,9 @@

context("functions in utils.R")

# Ensure Spark is installed
sparkCheckInstall()
Contributor:
What I had in mind was to combine the sparkR.session and this sparkCheckInstall into one function so it's easy to remember for a new test file. Any thoughts on this?

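A rough sketch of the kind of combined helper being suggested here; the name sparkRTestSession and its body are hypothetical, not something this PR adds:

```r
# Hypothetical convenience wrapper for test files: make sure a Spark
# distribution is available, then start (or reuse) a SparkSession.
sparkRTestSession <- function(...) {
  sparkCheckInstall()  # downloads Spark via install.spark() if SPARK_HOME is not usable
  sparkR.session(enableHiveSupport = FALSE, ...)
}

# A new test file would then only need:
# sparkSession <- sparkRTestSession()
```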
Member Author (@felixcheung, Feb 1, 2017):
I understand that, but as pointed out in #16720 (comment), some tests don't need a SparkSession, some tests will create/stop one as needed, and having a function that does all of that would just mean more complexity?

Contributor:
Sure, that sounds fine. I was looking to see if testthat has any support for writing a setup that gets called before each test; it doesn't look like it does.

Member Author:

Contributor:
Ah, that's a great idea. Can you see if that works? (Unfortunately it needs manual verification.)

Contributor:
Any luck testing this out?

Member Author:
Sorry, I'm really swamped; I haven't had the chance to test that out yet.

Contributor:
I just tested this by putting SparkR:::checkInstall in run-all.R (before calling test_package) and that seems to do the trick on a custom 2.1.0 build!

@felixcheung, when you get a chance can you update the PR with that? The only thing I'm concerned about is calling a private function from run-all.R; we could either export this function or move some of this functionality into install.spark.

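A minimal sketch of the run-all.R arrangement described above; the exact file contents are an assumption about the standard testthat driver, not part of this diff:

```r
# R/pkg/tests/run-all.R (sketch)
library(testthat)
library(SparkR)

# Make sure a Spark distribution is available before any test file starts a
# session; the function is internal to SparkR, hence the ::: access the
# reviewers are debating.
SparkR:::sparkCheckInstall()

test_package("SparkR")
```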

# JavaSparkContext handle
sparkSession <- sparkR.session(enableHiveSupport = FALSE)
sc <- callJStatic("org.apache.spark.sql.api.r.SQLUtils", "getJavaSparkContext", sparkSession)
3 changes: 3 additions & 0 deletions R/pkg/vignettes/sparkr-vignettes.Rmd
@@ -27,6 +27,9 @@ library(SparkR)

We use default settings in which it runs in local mode. It auto downloads Spark package in the background if no previous installation is found. For more details about setup, see [Spark Session](#SetupSparkSession).

```{r, include=FALSE}
SparkR:::sparkCheckInstall()
Contributor:
Is it OK to include a ::: function in the vignette?

Member Author:
This has include=FALSE, so it will run but the code and output will not be included in the vignette text.

Contributor:
Is the Rmd file part of the install that users see? I just don't want to put in any code that people might copy-paste. Is it not good enough to pass in master=local[*] here?

Contributor:
FWIW, these vignette changes are still needed even if we update run-all.R.

```
```{r, message=FALSE, results="hide"}
sparkR.session()
```