Commit 70b606f
[SPARK-35045][SQL][FOLLOW-UP] Add a configuration for CSV input buffer size
### What changes were proposed in this pull request?
This PR makes the CSV parser's input buffer size configurable through an internal configuration. This is mainly to work around the regression in uniVocity/univocity-parsers#449.
It is particularly useful for SQL workloads, which would otherwise have to rewrite their `CREATE TABLE` statements with options.
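As a rough illustration of the session-level setting described above, here is a minimal sketch. The configuration key and the value `128` are taken from the test snippet in this description; the object name, the helper `configure`, the temporary directory, and the sample rows are hypothetical and only serve the example. It assumes a Spark build that includes this change, and it deliberately uses short records instead of reproducing the regression.

```scala
import java.nio.file.Files

import org.apache.spark.sql.SparkSession

// Hypothetical example object; not part of the patch.
object CsvInputBufferSizeSketch {
  // Key as it appears in the manual test of this PR.
  val BufferSizeKey = "spark.sql.csv.parser.inputBufferSize"

  // Sets the internal option for the current session.
  def configure(spark: SparkSession, size: Long): Unit =
    spark.conf.set(BufferSizeKey, size)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("csv-input-buffer-size-sketch")
      .getOrCreate()
    import spark.implicits._

    // Programmatic form, mirroring the line added to the test.
    configure(spark, 128L)

    // SQL form: sets the same key for the session, so existing
    // CREATE TABLE statements do not need to be rewritten.
    spark.sql(s"SET $BufferSizeKey=128")

    // Write a tiny pipe-delimited file to a fresh temporary location.
    val dir = Files.createTempDirectory("csv-buffer-sketch").resolve("data").toString
    Seq("a|1", "b|2").toDF().write.text(dir)

    // Read it back through the CSV data source.
    val rows = spark.read.format("csv")
      .option("delimiter", "|")
      .load(dir)
      .count()
    println(s"rows read: $rows")

    spark.stop()
  }
}
```

Because the option is internal, treating it as a temporary workaround rather than a tuning knob seems the safer reading of this PR.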
### Why are the changes needed?
To work around uniVocity/univocity-parsers#449.
### Does this PR introduce _any_ user-facing change?
No, it is only an internal option.
### How was this patch tested?
Manually tested by modifying the unit test added in #31858 as below:
```diff
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
index fd25a79..705f38dbfbd 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
@@ -2456,6 +2456,7 @@ abstract class CSVSuite
   test("SPARK-34768: counting a long record with ignoreTrailingWhiteSpace set to true") {
     val bufSize = 128
     val line = "X" * (bufSize - 1) + "| |"
+    spark.conf.set("spark.sql.csv.parser.inputBufferSize", 128)
     withTempPath { path =>
       Seq(line).toDF.write.text(path.getAbsolutePath)
       assert(spark.read.format("csv")
```
Closes #32231 from HyukjinKwon/SPARK-35045-followup.
Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>

1 parent a74f601 commit 70b606f
File tree: 2 files changed, +11 -0 lines changed
- sql/catalyst/src/main/scala/org/apache/spark/sql
  - catalyst/csv
  - internal
Lines changed: 1 addition & 0 deletions (added line 215)
Lines changed: 10 additions & 0 deletions (added lines 2456-2465)