[SPARK-16053][R] Add spark_partition_id in SparkR
#13768
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1179,6 +1179,27 @@ setMethod("soundex", | |
| column(jc) | ||
| }) | ||
|
|
||
| #' spark_partition_id | ||
| #' | ||
| #' Return the column for partition ID of the Spark task. | ||
|
||
| #' Note that this is indeterministic because it depends on data partitioning and | ||
| #' task scheduling. | ||
| #' | ||
| #' This is equivalent to the SPARK_PARTITION_ID function in SQL. | ||
| #' | ||
| #' @rdname spark_partition_id | ||
| #' @name spark_partition_id | ||
| #' @export | ||
| #' @examples | ||
| #' \dontrun{select(df, spark_partition_id())} | ||
| #' @note spark_partition_id since 2.0.0 | ||
| setMethod("spark_partition_id", | ||
| signature(x = "missing"), | ||
| function() { | ||
| jc <- callJStatic("org.apache.spark.sql.functions", "spark_partition_id") | ||
| column(jc) | ||
| }) | ||
|
|
||
| #' @rdname sd | ||
| #' @name stddev | ||
| setMethod("stddev", | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1126,6 +1126,10 @@ setGeneric("sort_array", function(x, asc = TRUE) { standardGeneric("sort_array") | |
| #' @export | ||
| setGeneric("soundex", function(x) { standardGeneric("soundex") }) | ||
|
|
||
| #' @rdname spark_partition_id | ||
| #' @export | ||
| setGeneric("spark_partition_id", function(x) { standardGeneric("spark_partition_id") }) | ||
|
|
||
|
Member
Shouldn't this go to L1080? This should be sorted.
Member
Author
Do you mean before |
||
| #' @rdname sd | ||
| #' @export | ||
| setGeneric("stddev", function(x) { standardGeneric("stddev") }) | ||
|
|
||
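The generic above declares one formal argument `x`, while the method in the first diff is registered for `signature(x = "missing")` and defined with no arguments. A minimal sketch of that S4 pattern in plain R, with no Spark required, might look like the following. The name `partition_label` and its return value are hypothetical and exist only for illustration.

```r
library(methods)

# Hypothetical generic with a single formal `x`, the same shape as the
# spark_partition_id generic in the diff.
setGeneric("partition_label", function(x) { standardGeneric("partition_label") })

# Method selected when the caller supplies no argument at all.
setMethod("partition_label",
          signature(x = "missing"),
          function() {
            "no-argument method"
          })

# Dispatches to the "missing" method.
result <- partition_label()

# Any supplied argument has no matching method, so dispatch fails.
with_arg <- tryCatch(partition_label(1), error = function(e) "no method")
```

This is why `spark_partition_id()` can be called with empty parentheses even though its generic names an argument.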
Minor nit: The convention we are using in SparkR is to have a descriptive title for the function. So in this case it would be something like
Return the partition ID as a column. (There might be other places which need to be fixed to match this convention as well. We discussed this in #13394.)
Oh, I see. I'll fix them.
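For context, a usage sketch of the new function, going slightly beyond the one-line `@examples` entry in the diff. It assumes SparkR 2.0.0 or later and a local Spark installation that `sparkR.session()` can start. The column name `pid` and the partition count are illustrative choices, not part of the patch.

```r
library(SparkR)

# Assumes a local Spark installation is available to SparkR.
sparkR.session(master = "local[2]")

# mtcars is a built-in R dataset, used here only as sample data.
df <- createDataFrame(mtcars)

# Spread the rows over 4 partitions so that more than one ID can appear.
df <- repartition(df, numPartitions = 4L)

# Attach the partition ID of each row as a column named "pid" (illustrative name).
with_pid <- withColumn(df, "pid", spark_partition_id())

# Count rows per partition. The split is not fixed, because it depends on
# data partitioning and task scheduling.
head(count(groupBy(with_pid, "pid")))

sparkR.session.stop()
```

Because the result depends on partitioning and scheduling, the counts may differ between runs, so the sketch does not state an expected output.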