90 changes: 42 additions & 48 deletions R/pkg/R/mllib.R
@@ -218,9 +218,10 @@ print.summary.GeneralizedLinearRegressionModel <- function(x, ...) {

# Makes predictions from a generalized linear model produced by glm() or spark.glm(),
# similarly to R's predict().
#'

Contributor Author

this fixes small style issues from #13820

#' @param newData SparkDataFrame for testing
#' @return \code{predict} returns a SparkDataFrame containing predicted labels in a column named "prediction"
#' @return \code{predict} returns a SparkDataFrame containing predicted labels in a column named
#' "prediction"
#' @rdname spark.glm
#' @export
#' @note predict(GeneralizedLinearRegressionModel) since 1.5.0
@@ -229,41 +230,26 @@ setMethod("predict", signature(object = "GeneralizedLinearRegressionModel"),
return(dataFrame(callJMethod(object@jobj, "transform", newData@sdf)))
})

#' Predicted values based on model
#'
#' Makes predictions from a naive Bayes model or a model produced by spark.naiveBayes(),
#' similarly to R package e1071's predict.
#'
#' @param object A fitted naive Bayes model
#' @rdname predict
# Makes predictions from a naive Bayes model or a model produced by spark.naiveBayes(),
# similarly to R package e1071's predict.

#' @rdname spark.naiveBayes
#' @return \code{predict} returns a SparkDataFrame containing predicted labels in a column named
#' "prediction"
#' @export
#' @examples
#' \dontrun{
#' model <- spark.naiveBayes(trainingData, y ~ x)
#' predicted <- predict(model, testData)
#' showDF(predicted)
#'}
#' @note predict(NaiveBayesModel) since 2.0.0
setMethod("predict", signature(object = "NaiveBayesModel"),
function(object, newData) {
return(dataFrame(callJMethod(object@jobj, "transform", newData@sdf)))
})

#' Get the summary of a naive Bayes model
#'
#' Returns the summary of a naive Bayes model produced by spark.naiveBayes(),
#' similarly to R's summary().
#'
#' @param object A fitted MLlib model
#' @return a list containing 'apriori', the label distribution, and 'tables', conditional
# probabilities given the target label
#' @rdname summary
# Returns the summary of a naive Bayes model produced by \code{spark.naiveBayes}

#' @param object A naive Bayes model fitted by \code{spark.naiveBayes}
#' @return \code{summary} returns a list containing \code{apriori}, the label distribution, and
#' \code{tables}, conditional probabilities given the target label
#' @rdname spark.naiveBayes
#' @export
#' @examples
#' \dontrun{
#' model <- spark.naiveBayes(trainingData, y ~ x)
#' summary(model)
#'}
#' @note summary(NaiveBayesModel) since 2.0.0
setMethod("summary", signature(object = "NaiveBayesModel"),
function(object, ...) {
@@ -390,23 +376,41 @@ setMethod("predict", signature(object = "KMeansModel"),
return(dataFrame(callJMethod(object@jobj, "transform", newData@sdf)))
})

#' Fit a Bernoulli naive Bayes model
#' Naive Bayes Models
#'
#' Fit a Bernoulli naive Bayes model on a Spark DataFrame (only categorical data is supported).
#' \code{spark.naiveBayes} fits a Bernoulli naive Bayes model against a SparkDataFrame.
#' Users can call \code{summary} to print a summary of the fitted model, \code{predict} to make
#' predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
#' Only categorical data is supported.
#'
#' @param data SparkDataFrame for training
#' @param data A \code{SparkDataFrame} of observations and labels for model fitting
#' @param formula A symbolic description of the model to be fitted. Currently only a few formula
#' operators are supported, including '~', '.', ':', '+', and '-'.
#' @param smoothing Smoothing parameter
#' @return a fitted naive Bayes model
#' @return \code{spark.naiveBayes} returns a fitted naive Bayes model
#' @rdname spark.naiveBayes
#' @name spark.naiveBayes
#' @seealso e1071: \url{https://cran.r-project.org/web/packages/e1071/}
Contributor

do we want to hide the url?

Contributor Author

roxygen2 only supports raw url links.

#' @export
#' @examples
#' \dontrun{
#' df <- createDataFrame(infert)
#'
#' # fit a Bernoulli naive Bayes model
#' model <- spark.naiveBayes(df, education ~ ., smoothing = 0)
#'}
#'
#' # get the summary of the model
#' summary(model)
#'
#' # make predictions
#' predictions <- predict(model, df)
#'
#' # save and load the model
#' path <- "path/to/model"
#' write.ml(model, path)
#' savedModel <- read.ml(path)
#' summary(savedModel)
#' }
#' @note spark.naiveBayes since 2.0.0
setMethod("spark.naiveBayes", signature(data = "SparkDataFrame", formula = "formula"),
function(data, formula, smoothing = 1.0, ...) {
@@ -416,25 +420,15 @@ setMethod("spark.naiveBayes", signature(data = "SparkDataFrame", formula = "form
return(new("NaiveBayesModel", jobj = jobj))
})

#' Save fitted MLlib model to the input path
#'
#' Save the Bernoulli naive Bayes model to the input path.
#'
#' @param object A fitted Bernoulli naive Bayes model
# Saves the Bernoulli naive Bayes model to the input path.

#' @param path The directory where the model is saved
#' @param overwrite Overwrites or not if the output path already exists. Default is FALSE
#' which means throw exception if the output path exists.
#'
#' @rdname write.ml
#' @name write.ml
#' @rdname spark.naiveBayes
#' @export
#' @examples
#' \dontrun{
#' df <- createDataFrame(infert)
#' model <- spark.naiveBayes(df, education ~ ., smoothing = 0)
#' path <- "path/to/model"
#' write.ml(model, path)
#' }
#' @seealso \link{read.ml}
#' @note write.ml(NaiveBayesModel, character) since 2.0.0
setMethod("write.ml", signature(object = "NaiveBayesModel", path = "character"),
function(object, path, overwrite = FALSE) {