Commit 2e2875b
[SPARK-47646][SQL] Make try_to_number return NULL for malformed input (apache#378)
### What changes were proposed in this pull request?
This PR adds a NULL check after parsing the number so that the `try_to_number` expression can safely return NULL. Without the check, the following code fails with the `NullPointerException` shown after it:
```scala
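import org.apache.spark.sql.functions._
import spark.implicits._ // provides $"value" and the String encoder outside spark-shell

// "11" does not match the format "$99.99", so the input is malformed.
val df = spark.createDataset(spark.sparkContext.parallelize(Seq("11")))
df.select(try_to_number($"value", lit("$99.99"))).show()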
```
```
java.lang.NullPointerException: Cannot invoke "org.apache.spark.sql.types.Decimal.toPlainString()" because "<local7>" is null
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.serializefromobject_doConsume_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:50)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:388)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:894)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:894)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:368)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:332)
```
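The null check described above can be illustrated with a minimal, Spark-free sketch. It is not the patch itself: `parseOrNull`, `Result`, `evalWithGuard`, and `render` are hypothetical names, and `java.math.BigDecimal` stands in for Spark's `Decimal`. The idea is only that a parser which signals malformed input by returning null must have that null recorded in the result's null flag before any method is called on the value.

```scala
import java.math.BigDecimal

object NullGuardSketch {
  // Hypothetical "try"-style parser: returns null for malformed input instead of throwing.
  def parseOrNull(input: String): BigDecimal =
    try new BigDecimal(input.trim)
    catch { case _: NumberFormatException => null }

  // A value paired with an explicit null flag, loosely mirroring how generated code tracks nulls.
  final case class Result(isNull: Boolean, value: BigDecimal)

  def evalWithGuard(input: String): Result =
    if (input == null) {
      Result(isNull = true, value = null)
    } else {
      val parsed = parseOrNull(input)
      // The guard: a null parse result must set the null flag.
      Result(isNull = parsed == null, value = parsed)
    }

  // Downstream consumer: only dereferences the value when the flag says it is safe.
  def render(r: Result): String =
    if (r.isNull) "NULL" else r.value.toPlainString

  def main(args: Array[String]): Unit = {
    println(render(evalWithGuard("11.50"))) // well-formed input
    println(render(evalWithGuard("abc")))   // malformed input
  }
}
```

If the flag were derived from the input alone (non-null input implies non-null output), `render` would call `toPlainString` on a null value for malformed input, which is the kind of failure the stack trace above reports.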
### Why are the changes needed?
To fix the bug, and let `try_to_number` return `NULL` for malformed input as designed.
### Does this PR introduce _any_ user-facing change?
Yes, it fixes a bug. Previously, `try_to_number` failed with a `NullPointerException` on malformed input instead of returning `NULL`.
### How was this patch tested?
A unit test was added.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes apache#45771 from HyukjinKwon/SPARK-47646.
Authored-by: Hyukjin Kwon <[email protected]>
(cherry picked from commit d709e20)
Signed-off-by: Hyukjin Kwon <[email protected]>
Co-authored-by: Hyukjin Kwon <[email protected]>
1 parent 6644a9d commit 2e2875b
File tree
2 files changed: 6 additions, 0 deletions
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
- sql/core/src/test/scala/org/apache/spark/sql
Lines changed: 1 addition & 0 deletions
Line 89 added; original lines 86-91 shown as context.
Lines changed: 5 additions & 0 deletions
Lines 1176-1180 added; original lines 1173-1178 shown as context.