Commit 1dea574
[SPARK-39496][SQL] Handle null struct in
### What changes were proposed in this pull request?
Change `Inline.eval` to return a row of null values rather than a null row in the case of a null input struct.
### Why are the changes needed?
Consider the following query:
```
set spark.sql.codegen.wholeStage=false;
select inline(array(named_struct('a', 1, 'b', 2), null));
```
This query fails with a `NullPointerException`:
```
22/06/16 15:10:06 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.NullPointerException
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
at org.apache.spark.sql.execution.GenerateExec.$anonfun$doExecute$11(GenerateExec.scala:122)
```
(In Spark 3.1.3, you don't need to set `spark.sql.codegen.wholeStage` to false to reproduce the error, since Spark 3.1.3 has no codegen path for `Inline`).
This query fails regardless of the setting of `spark.sql.codegen.wholeStage`:
```
val dfWide = (Seq((1))
.toDF("col0")
.selectExpr(Seq.tabulate(99)(x => s"$x as col${x + 1}"): _*))
val df = (dfWide
.selectExpr("*", "array(named_struct('a', 1, 'b', 2), null) as struct_array"))
df.selectExpr("*", "inline(struct_array)").collect
```
It fails with
```
22/06/16 15:18:55 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)/ 1]
java.lang.NullPointerException
at org.apache.spark.sql.catalyst.expressions.JoinedRow.isNullAt(JoinedRow.scala:80)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.writeFields_0_8$(Unknown Source)
```
When `Inline.eval` returns a null row in the collection, GenerateExec gets a NullPointerException either when joining the null row with required child output, or projecting the null row.
This PR avoids producing the null row and produces a row of null values instead:
```
spark-sql> set spark.sql.codegen.wholeStage=false;
spark.sql.codegen.wholeStage false
Time taken: 3.095 seconds, Fetched 1 row(s)
spark-sql> select inline(array(named_struct('a', 1, 'b', 2), null));
1 2
NULL NULL
Time taken: 1.214 seconds, Fetched 2 row(s)
spark-sql>
```
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
New unit test.
Closes #36903 from bersprockets/inline_eval_null_struct_issue.
Authored-by: Bruce Robbins <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
(cherry picked from commit c4d5390)
Signed-off-by: Hyukjin Kwon <[email protected]>Inline.eval
1 parent ad90195 commit 1dea574
File tree
2 files changed
+18
-3
lines changed- sql
- catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
- core/src/test/scala/org/apache/spark/sql
2 files changed
+18
-3
lines changedLines changed: 6 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
452 | 452 | | |
453 | 453 | | |
454 | 454 | | |
| 455 | + | |
| 456 | + | |
455 | 457 | | |
456 | 458 | | |
457 | 459 | | |
458 | 460 | | |
459 | 461 | | |
460 | | - | |
461 | | - | |
| 462 | + | |
| 463 | + | |
| 464 | + | |
| 465 | + | |
462 | 466 | | |
463 | 467 | | |
464 | 468 | | |
| |||
Lines changed: 12 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
23 | 23 | | |
24 | 24 | | |
25 | 25 | | |
| 26 | + | |
26 | 27 | | |
27 | 28 | | |
28 | 29 | | |
| |||
389 | 390 | | |
390 | 391 | | |
391 | 392 | | |
392 | | - | |
| 393 | + | |
393 | 394 | | |
394 | 395 | | |
395 | 396 | | |
| |||
413 | 414 | | |
414 | 415 | | |
415 | 416 | | |
| 417 | + | |
| 418 | + | |
| 419 | + | |
| 420 | + | |
| 421 | + | |
| 422 | + | |
| 423 | + | |
| 424 | + | |
| 425 | + | |
| 426 | + | |
416 | 427 | | |
417 | 428 | | |
418 | 429 | | |
| |||
0 commit comments