[SPARK-30554][SQL] Return Iterable from FailureSafeParser.rawParser#27264
MaxGekk wants to merge 8 commits into apache:master
@HyukjinKwon @srowen Please, review this PR.
Seems OK; is it just some code polish or does it make an impact on performance?
@srowen Main goal is code polish. Potentially, it could improve performance slightly because
Test build #116956 has finished for PR 27264 at commit
Test build #116964 has finished for PR 27264 at commit
…-parser-seq # Conflicts: # sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala
jenkins, retest this, please
Test build #117002 has finished for PR 27264 at commit
jenkins, retest this, please
Test build #117011 has finished for PR 27264 at commit
jenkins, retest this, please
Test build #117014 has finished for PR 27264 at commit
Test build #117015 has finished for PR 27264 at commit
Thanks @MaxGekk for addressing my comment. Merged to master.
What changes were proposed in this pull request?
Changed the signature of the rawParser passed to FailureSafeParser: I propose changing its return type from Seq to Iterable. I chose Iterable to make it easier to port the changes to the Scala 2.13 collections. I also replaced Seq with Option in the CSV datasource (UnivocityParser) and in the JSON parser, except for one place: the case when the specified schema is StructType and the JSON input is an array.

Why are the changes needed?
Seq is an unnecessary requirement for the return type of rawParser, which may not produce multiple rows per input, as in the CSV datasource.

Does this PR introduce any user-facing change?
No

How was this patch tested?
By existing test suites: JsonSuite, UnivocityParserSuite, JsonFunctionsSuite, JsonExpressionsSuite, CsvSuite, and CsvFunctionsSuite.
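
The idea behind the signature change can be illustrated with a minimal, self-contained sketch. It is not Spark's actual code: SafeParser, parseCsvLine, and parseNumbers are hypothetical stand-ins, and the error handling is reduced to dropping malformed input. It only shows that a wrapper accepting IN => Iterable[OUT] can take both a parser returning Option (at most one row per input) and one returning Seq (possibly several rows).

```scala
object RawParserSketch {
  import scala.util.control.NonFatal

  // Hypothetical stand-in for a failure-safe wrapper; NOT Spark's FailureSafeParser.
  // The raw parser only needs to return something iterable, not a Seq.
  final class SafeParser[IN, OUT](rawParser: IN => Iterable[OUT]) {
    // Simplified policy (an assumption for this sketch): drop input that fails to parse.
    def parse(input: IN): Iterator[OUT] =
      try rawParser(input).iterator
      catch { case NonFatal(_) => Iterator.empty }
  }

  // CSV-like parser: at most one row per input line, so Option fits naturally.
  def parseCsvLine(line: String): Option[Array[String]] =
    if (line.trim.isEmpty) None else Some(line.split(",", -1).map(_.trim))

  // Array-like input: one input may yield several rows, so Seq is returned.
  def parseNumbers(s: String): Seq[Int] =
    s.split(";").toSeq.map(_.trim.toInt)

  // Option is adapted to Iterable by the standard implicit Option.option2Iterable.
  val csv = new SafeParser[String, Array[String]](line => parseCsvLine(line))

  // Seq is already an Iterable, so no conversion is needed.
  val multi = new SafeParser[String, Int](s => parseNumbers(s))
}
```

With a Seq-typed parameter, the single-row parser would have to wrap each result in a collection; with Iterable it can return its Option as is.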