
Commit 173aa94

[SPARK-12546][SQL] Change default number of open parquet files
A common problem that users encounter with Spark 1.6.0 is that writing to a partitioned parquet table OOMs. The root cause is that parquet allocates a significant amount of memory that is not accounted for by our own mechanisms. As a workaround, we can ensure that only a single file is open per task unless the user explicitly asks for more.

Author: Michael Armbrust <[email protected]>

Closes #11308 from marmbrus/parquetWriteOOM.
Parent: 4a91806

File tree: 1 file changed (+1, -1 lines)


sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala

Lines changed: 1 addition & 1 deletion
@@ -430,7 +430,7 @@ private[spark] object SQLConf {
 
   val PARTITION_MAX_FILES =
     intConf("spark.sql.sources.maxConcurrentWrites",
-      defaultValue = Some(5),
+      defaultValue = Some(1),
       doc = "The maximum number of concurrent files to open before falling back on sorting when " +
         "writing out files using dynamic partitioning.")
 
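The setting that moved, spark.sql.sources.maxConcurrentWrites, can still be raised by users who know their executors have headroom for several open parquet writers per task. A minimal usage sketch, assuming a Spark 1.6-style SQLContext named sqlContext and an illustrative DataFrame named df (neither name is part of this commit):

// Restore the previous default of 5 concurrent open files per task.
// Only worth doing when executors can afford the extra memory that
// parquet writers allocate outside Spark's own accounting.
sqlContext.setConf("spark.sql.sources.maxConcurrentWrites", "5")

// The setting governs dynamic-partition writes like this one: with the
// new default of 1, each task keeps a single parquet file open and
// falls back on sorting for the remaining partitions.
df.write.partitionBy("date").parquet("/path/to/output")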
