Commit 30574fd
[SPARK-23054][SQL] Fix incorrect results of casting UserDefinedType to String
## What changes were proposed in this pull request?
This pr fixed the issue when casting `UserDefinedType`s into strings;
```
>>> from pyspark.ml.classification import MultilayerPerceptronClassifier
>>> from pyspark.ml.linalg import Vectors
>>> df = spark.createDataFrame([(0.0, Vectors.dense([0.0, 0.0])), (1.0, Vectors.dense([0.0, 1.0]))], ["label", "features"])
>>> df.selectExpr("CAST(features AS STRING)").show(truncate = False)
+-------------------------------------------+
|features |
+-------------------------------------------+
|[6,1,0,0,2800000020,2,0,0,0] |
|[6,1,0,0,2800000020,2,0,0,3ff0000000000000]|
+-------------------------------------------+
```
The root cause is that `Cast` handles input data as `UserDefinedType.sqlType`(this is underlying storage type), so we should pass data into `UserDefinedType.deserialize` then `toString`.
This pr modified the result into;
```
+---------+
|features |
+---------+
|[0.0,0.0]|
|[0.0,1.0]|
+---------+
```
## How was this patch tested?
Added tests in `UserDefinedTypeSuite `.
Author: Takeshi Yamamuro <[email protected]>
Closes #20246 from maropu/SPARK-23054.
(cherry picked from commit b98ffa4)
Signed-off-by: Wenchen Fan <[email protected]>1 parent 2879236 commit 30574fd
File tree
2 files changed
+20
-2
lines changed- sql
- catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
- core/src/test/scala/org/apache/spark/sql
2 files changed
+20
-2
lines changedLines changed: 7 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
282 | 282 | | |
283 | 283 | | |
284 | 284 | | |
| 285 | + | |
| 286 | + | |
285 | 287 | | |
286 | 288 | | |
287 | 289 | | |
| |||
836 | 838 | | |
837 | 839 | | |
838 | 840 | | |
| 841 | + | |
| 842 | + | |
| 843 | + | |
| 844 | + | |
| 845 | + | |
839 | 846 | | |
840 | 847 | | |
841 | 848 | | |
| |||
Lines changed: 13 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
21 | 21 | | |
22 | 22 | | |
23 | 23 | | |
24 | | - | |
| 24 | + | |
25 | 25 | | |
26 | 26 | | |
27 | 27 | | |
| |||
44 | 44 | | |
45 | 45 | | |
46 | 46 | | |
| 47 | + | |
| 48 | + | |
47 | 49 | | |
48 | 50 | | |
49 | 51 | | |
| |||
143 | 145 | | |
144 | 146 | | |
145 | 147 | | |
146 | | - | |
| 148 | + | |
| 149 | + | |
147 | 150 | | |
148 | 151 | | |
149 | 152 | | |
| |||
304 | 307 | | |
305 | 308 | | |
306 | 309 | | |
| 310 | + | |
| 311 | + | |
| 312 | + | |
| 313 | + | |
| 314 | + | |
| 315 | + | |
| 316 | + | |
| 317 | + | |
307 | 318 | | |
0 commit comments