Commit a83d8d5

[SPARK-17902][R] Revive stringsAsFactors option for collect() in SparkR
## What changes were proposed in this pull request?

This PR revives the `stringsAsFactors` option in the `collect` API, which was mistakenly removed in 71a138c. During primitive type conversion, it casts a `character` vector to a `factor` when the condition `stringsAsFactors && is.character(vec)` holds.

## How was this patch tested?

Unit test in `R/pkg/tests/fulltests/test_sparkSQL.R`.

Author: hyukjinkwon <[email protected]>

Closes #19551 from HyukjinKwon/SPARK-17902.
1 parent 3073344 commit a83d8d5

2 files changed: 9 additions, 0 deletions

R/pkg/R/DataFrame.R

Lines changed: 3 additions & 0 deletions
@@ -1191,6 +1191,9 @@ setMethod("collect",
   vec <- do.call(c, col)
   stopifnot(class(vec) != "list")
   class(vec) <- PRIMITIVE_TYPES[[colType]]
+  if (is.character(vec) && stringsAsFactors) {
+    vec <- as.factor(vec)
+  }
   df[[colIndex]] <- vec
 } else {
   df[[colIndex]] <- col
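
The conversion added in this hunk can be illustrated in plain R, without a Spark session. The following is a minimal sketch: `convert_column` is a hypothetical helper introduced only for illustration and is not part of SparkR; it applies the same condition to a single vector.

```r
# Hypothetical helper (not part of SparkR): mirrors the branch added to collect().
convert_column <- function(vec, stringsAsFactors = FALSE) {
  # Only character vectors are converted, and only when the option is enabled.
  if (is.character(vec) && stringsAsFactors) {
    vec <- as.factor(vec)
  }
  vec
}

species <- c("setosa", "virginica", "setosa")

class(convert_column(species))                           # "character"
class(convert_column(species, stringsAsFactors = TRUE))  # "factor"
class(convert_column(1:3, stringsAsFactors = TRUE))      # "integer"
```

Non-character columns pass through unchanged, so enabling the option affects only string columns.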

R/pkg/tests/fulltests/test_sparkSQL.R

Lines changed: 6 additions & 0 deletions
@@ -499,6 +499,12 @@ test_that("create DataFrame with different data types", {
   expect_equal(collect(df), data.frame(l, stringsAsFactors = FALSE))
 })
 
+test_that("SPARK-17902: collect() with stringsAsFactors enabled", {
+  df <- suppressWarnings(collect(createDataFrame(iris), stringsAsFactors = TRUE))
+  expect_equal(class(iris$Species), class(df$Species))
+  expect_equal(iris$Species, df$Species)
+})
+
 test_that("SPARK-17811: can create DataFrame containing NA as date and time", {
   df <- data.frame(
     id = 1:2,
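
The new test compares the collected `Species` column with the original factor. A base R sketch of that round trip, assuming the strings come back unchanged and are re-converted with `as.factor` as in the patch, suggests why `iris` works well here: `as.factor` derives levels in sorted order, and the levels of `iris$Species` are already alphabetical. The `sizes` factor below is a made-up example and does not come from the patch.

```r
# Simulate the round trip: strings come back, then as.factor() rebuilds the factor.
species_chr <- as.character(iris$Species)
species_fct <- as.factor(species_chr)

# Levels are rebuilt in sorted order, which matches iris$Species.
identical(levels(species_fct), levels(iris$Species))
isTRUE(all.equal(species_fct, iris$Species))

# Made-up counterexample: a custom level order is not preserved by this round trip.
sizes <- factor(c("small", "large"), levels = c("small", "large"))
levels(as.factor(as.character(sizes)))
```

A factor with a custom level order would therefore come back with sorted levels under this sketch, which the `iris` test does not exercise.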
