
Commit be7425e

Felix Cheung authored and committed
[SPARKR][DOCS] update R API doc for subset/extract
## What changes were proposed in this pull request? With extract `[[` or replace `[[<-`, the parameter `i` is a column index, that needs to be corrected in doc. Also a few minor updates: examples, links. ## How was this patch tested? manual Author: Felix Cheung <[email protected]> Closes apache#16721 from felixcheung/rsubsetdoc.
1 parent f9156d2 commit be7425e

File tree

5 files changed: +20 additions, -9 deletions


R/pkg/R/DataFrame.R

Lines changed: 12 additions & 1 deletion
@@ -1831,6 +1831,8 @@ setMethod("[", signature(x = "SparkDataFrame"),
 #' Return subsets of SparkDataFrame according to given conditions
 #' @param x a SparkDataFrame.
 #' @param i,subset (Optional) a logical expression to filter on rows.
+#' For extract operator [[ and replacement operator [[<-, the indexing parameter for
+#' a single Column.
 #' @param j,select expression for the single Column or a list of columns to select from the SparkDataFrame.
 #' @param drop if TRUE, a Column will be returned if the resulting dataset has only one column.
 #' Otherwise, a SparkDataFrame will always be returned.
@@ -1841,6 +1843,7 @@ setMethod("[", signature(x = "SparkDataFrame"),
 #' @export
 #' @family SparkDataFrame functions
 #' @aliases subset,SparkDataFrame-method
+#' @seealso \link{withColumn}
 #' @rdname subset
 #' @name subset
 #' @family subsetting functions
@@ -1858,6 +1861,10 @@ setMethod("[", signature(x = "SparkDataFrame"),
 #' subset(df, df$age %in% c(19, 30), 1:2)
 #' subset(df, df$age %in% c(19), select = c(1,2))
 #' subset(df, select = c(1,2))
+#' # Columns can be selected and set
+#' df[["age"]] <- 23
+#' df[[1]] <- df$age
+#' df[[2]] <- NULL # drop column
 #' }
 #' @note subset since 1.5.0
 setMethod("subset", signature(x = "SparkDataFrame"),
@@ -1982,7 +1989,7 @@ setMethod("selectExpr",
 #' @aliases withColumn,SparkDataFrame,character-method
 #' @rdname withColumn
 #' @name withColumn
-#' @seealso \link{rename} \link{mutate}
+#' @seealso \link{rename} \link{mutate} \link{subset}
 #' @export
 #' @examples
 #'\dontrun{
@@ -1993,6 +2000,10 @@ setMethod("selectExpr",
 #' # Replace an existing column
 #' newDF2 <- withColumn(newDF, "newCol", newDF$col1)
 #' newDF3 <- withColumn(newDF, "newCol", 42)
+#' # Use extract operator to set an existing or new column
+#' df[["age"]] <- 23
+#' df[[2]] <- df$col1
+#' df[[2]] <- NULL # drop column
 #' }
 #' @note withColumn since 1.4.0
 setMethod("withColumn",
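The doc fix above hinges on `i` being a column index for the extract operator `[[` and the replacement operator `[[<-`. A minimal sketch of exercising both on a SparkDataFrame (assumes a running Spark session; `faithful` is R's built-in dataset, used here only for illustration):

```r
library(SparkR)
sparkR.session()

df <- createDataFrame(faithful)

# Extract: i is a column index (by name or by position), not a row index
w <- df[["waiting"]]   # a Column, same as df$waiting
e <- df[[1]]           # first column by position

# Replace: set an existing or new column to a literal or another Column
df[["waiting"]] <- 23        # every row gets the literal 23
df[[2]] <- df$eruptions      # replace the second column with another Column
df[[2]] <- NULL              # assigning NULL drops the column
```

This mirrors the examples the commit adds to the `subset` and `withColumn` roxygen blocks.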

R/pkg/R/mllib_classification.R

Lines changed: 1 addition & 1 deletion
@@ -41,7 +41,7 @@ setClass("NaiveBayesModel", representation(jobj = "jobj"))

 #' Logistic Regression Model
 #'
-#' Fits an logistic regression model against a Spark DataFrame. It supports "binomial": Binary logistic regression
+#' Fits an logistic regression model against a SparkDataFrame. It supports "binomial": Binary logistic regression
 #' with pivoting; "multinomial": Multinomial logistic (softmax) regression without pivoting, similar to glmnet.
 #' Users can print, make predictions on the produced model and save the model to the input path.
 #'

R/pkg/R/mllib_clustering.R

Lines changed: 3 additions & 3 deletions
@@ -47,7 +47,7 @@ setClass("LDAModel", representation(jobj = "jobj"))

 #' Bisecting K-Means Clustering Model
 #'
-#' Fits a bisecting k-means clustering model against a Spark DataFrame.
+#' Fits a bisecting k-means clustering model against a SparkDataFrame.
 #' Users can call \code{summary} to print a summary of the fitted model, \code{predict} to make
 #' predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
 #'
@@ -189,7 +189,7 @@ setMethod("write.ml", signature(object = "BisectingKMeansModel", path = "charact

 #' Multivariate Gaussian Mixture Model (GMM)
 #'
-#' Fits multivariate gaussian mixture model against a Spark DataFrame, similarly to R's
+#' Fits multivariate gaussian mixture model against a SparkDataFrame, similarly to R's
 #' mvnormalmixEM(). Users can call \code{summary} to print a summary of the fitted model,
 #' \code{predict} to make predictions on new data, and \code{write.ml}/\code{read.ml}
 #' to save/load fitted models.
@@ -314,7 +314,7 @@ setMethod("write.ml", signature(object = "GaussianMixtureModel", path = "charact

 #' K-Means Clustering Model
 #'
-#' Fits a k-means clustering model against a Spark DataFrame, similarly to R's kmeans().
+#' Fits a k-means clustering model against a SparkDataFrame, similarly to R's kmeans().
 #' Users can call \code{summary} to print a summary of the fitted model, \code{predict} to make
 #' predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
 #'

R/pkg/R/mllib_regression.R

Lines changed: 2 additions & 2 deletions
@@ -41,7 +41,7 @@ setClass("IsotonicRegressionModel", representation(jobj = "jobj"))

 #' Generalized Linear Models
 #'
-#' Fits generalized linear model against a Spark DataFrame.
+#' Fits generalized linear model against a SparkDataFrame.
 #' Users can call \code{summary} to print a summary of the fitted model, \code{predict} to make
 #' predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
 #'
@@ -259,7 +259,7 @@ setMethod("write.ml", signature(object = "GeneralizedLinearRegressionModel", pat

 #' Isotonic Regression Model
 #'
-#' Fits an Isotonic Regression model against a Spark DataFrame, similarly to R's isoreg().
+#' Fits an Isotonic Regression model against a SparkDataFrame, similarly to R's isoreg().
 #' Users can print, make predictions on the produced model and save the model to the input path.
 #'
 #' @param data SparkDataFrame for training.

R/pkg/vignettes/sparkr-vignettes.Rmd

Lines changed: 2 additions & 2 deletions
@@ -923,9 +923,9 @@ The main method calls of actual computation happen in the Spark JVM of the drive

 Two kinds of RPCs are supported in the SparkR JVM backend: method invocation and creating new objects. Method invocation can be done in two ways.

-* `sparkR.invokeJMethod` takes a reference to an existing Java object and a list of arguments to be passed on to the method.
+* `sparkR.callJMethod` takes a reference to an existing Java object and a list of arguments to be passed on to the method.

-* `sparkR.invokeJStatic` takes a class name for static method and a list of arguments to be passed on to the method.
+* `sparkR.callJStatic` takes a class name for static method and a list of arguments to be passed on to the method.

 The arguments are serialized using our custom wire format which is then deserialized on the JVM side. We then use Java reflection to invoke the appropriate method.
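The vignette fix above renames the RPC helpers to their actual exported names. A small sketch of both invocation kinds (assumes a running Spark session so the JVM backend is up; the Java classes shown are standard JDK APIs):

```r
library(SparkR)
sparkR.session()

# Static method invocation: class name, method name, then arguments
m <- sparkR.callJStatic("java.lang.Math", "min", 1L, 2L)

# Instance method invocation: create a JVM object, then invoke a
# method on the returned reference
s <- sparkR.newJObject("java.lang.String", "hello")
n <- sparkR.callJMethod(s, "length")
```

The arguments travel over SparkR's custom wire format, and the JVM side dispatches the call via Java reflection, as the vignette text describes.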
