Commit 0603913
[SPARK-33593][SQL] Vector reader got incorrect data with binary partition value
### What changes were proposed in this pull request?
Currently when enable parquet vectorized reader, use binary type as partition col will return incorrect value as below UT
```scala
test("Parquet vector reader incorrect with binary partition value") {
Seq(false, true).foreach(tag => {
withSQLConf("spark.sql.parquet.enableVectorizedReader" -> tag.toString) {
withTable("t1") {
sql(
"""CREATE TABLE t1(name STRING, id BINARY, part BINARY)
| USING PARQUET PARTITIONED BY (part)""".stripMargin)
sql(s"INSERT INTO t1 PARTITION(part = 'Spark SQL') VALUES('a', X'537061726B2053514C')")
if (tag) {
checkAnswer(sql("SELECT name, cast(id as string), cast(part as string) FROM t1"),
Row("a", "Spark SQL", ""))
} else {
checkAnswer(sql("SELECT name, cast(id as string), cast(part as string) FROM t1"),
Row("a", "Spark SQL", "Spark SQL"))
}
}
}
})
}
```
### Why are the changes needed?
Fix data incorrect issue
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added UT
Closes #30824 from AngersZhuuuu/SPARK-33593.
Authored-by: angerszhu <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>1 parent b0da2bc commit 0603913
File tree
4 files changed
+114
-3
lines changed- sql/core/src
- main/java/org/apache/spark/sql/execution/vectorized
- test/scala/org/apache/spark/sql
- execution/datasources
- orc
- parquet
4 files changed
+114
-3
lines changedLines changed: 5 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
54 | 54 | | |
55 | 55 | | |
56 | 56 | | |
| 57 | + | |
| 58 | + | |
57 | 59 | | |
58 | 60 | | |
59 | 61 | | |
| |||
94 | 96 | | |
95 | 97 | | |
96 | 98 | | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
97 | 102 | | |
98 | 103 | | |
99 | 104 | | |
| |||
Lines changed: 26 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
3745 | 3745 | | |
3746 | 3746 | | |
3747 | 3747 | | |
| 3748 | + | |
| 3749 | + | |
| 3750 | + | |
| 3751 | + | |
| 3752 | + | |
| 3753 | + | |
| 3754 | + | |
| 3755 | + | |
| 3756 | + | |
| 3757 | + | |
| 3758 | + | |
| 3759 | + | |
| 3760 | + | |
| 3761 | + | |
| 3762 | + | |
| 3763 | + | |
| 3764 | + | |
| 3765 | + | |
| 3766 | + | |
| 3767 | + | |
| 3768 | + | |
| 3769 | + | |
| 3770 | + | |
| 3771 | + | |
| 3772 | + | |
| 3773 | + | |
3748 | 3774 | | |
3749 | 3775 | | |
3750 | 3776 | | |
Lines changed: 76 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
17 | 17 | | |
18 | 18 | | |
19 | 19 | | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
20 | 26 | | |
21 | 27 | | |
22 | 28 | | |
23 | 29 | | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
24 | 33 | | |
25 | 34 | | |
26 | | - | |
| 35 | + | |
| 36 | + | |
27 | 37 | | |
28 | 38 | | |
29 | 39 | | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
30 | 43 | | |
31 | 44 | | |
32 | 45 | | |
| |||
77 | 90 | | |
78 | 91 | | |
79 | 92 | | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
80 | 155 | | |
Lines changed: 7 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
790 | 790 | | |
791 | 791 | | |
792 | 792 | | |
793 | | - | |
| 793 | + | |
794 | 794 | | |
795 | 795 | | |
796 | 796 | | |
797 | 797 | | |
798 | 798 | | |
799 | 799 | | |
800 | 800 | | |
| 801 | + | |
801 | 802 | | |
802 | 803 | | |
803 | 804 | | |
| |||
825 | 826 | | |
826 | 827 | | |
827 | 828 | | |
828 | | - | |
| 829 | + | |
| 830 | + | |
| 831 | + | |
| 832 | + | |
| 833 | + | |
829 | 834 | | |
830 | 835 | | |
831 | 836 | | |
| |||
0 commit comments