Commit 3bb8cd9
[SPARK-36568][SQL] Better FileScan statistics estimation
### What changes were proposed in this pull request?
This PR modifies `FileScan.estimateStatistics()` to take the read schema into account.
### Why are the changes needed?
`V2ScanRelationPushDown` can column prune `DataSourceV2ScanRelation`s and change read schema of `Scan` operations. The better statistics returned by `FileScan.estimateStatistics()` can mean better query plans. For example, with this change the broadcast issue in SPARK-36568 can be avoided.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Added new UT.
Closes #33825 from peter-toth/SPARK-36568-scan-statistics-estimation.
Authored-by: Peter Toth <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>1 parent 159ff9f commit 3bb8cd9
File tree
6 files changed
+35
-4
lines changed- sql/core/src
- main/scala/org/apache/spark/sql/execution/datasources/v2
- text
- test/scala/org/apache/spark/sql
- test
6 files changed
+35
-4
lines changedLines changed: 6 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
49 | 49 | | |
50 | 50 | | |
51 | 51 | | |
| 52 | + | |
| 53 | + | |
52 | 54 | | |
53 | 55 | | |
54 | 56 | | |
| |||
187 | 189 | | |
188 | 190 | | |
189 | 191 | | |
190 | | - | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
191 | 196 | | |
192 | 197 | | |
193 | 198 | | |
| |||
Lines changed: 1 addition & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
33 | 33 | | |
34 | 34 | | |
35 | 35 | | |
| 36 | + | |
36 | 37 | | |
37 | 38 | | |
38 | 39 | | |
| |||
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
33 | 33 | | |
34 | 34 | | |
35 | 35 | | |
36 | | - | |
| 36 | + | |
37 | 37 | | |
38 | 38 | | |
Lines changed: 22 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
731 | 731 | | |
732 | 732 | | |
733 | 733 | | |
| 734 | + | |
| 735 | + | |
| 736 | + | |
| 737 | + | |
| 738 | + | |
| 739 | + | |
| 740 | + | |
| 741 | + | |
| 742 | + | |
| 743 | + | |
| 744 | + | |
| 745 | + | |
| 746 | + | |
| 747 | + | |
| 748 | + | |
| 749 | + | |
| 750 | + | |
| 751 | + | |
| 752 | + | |
| 753 | + | |
| 754 | + | |
| 755 | + | |
734 | 756 | | |
735 | 757 | | |
736 | 758 | | |
| |||
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
367 | 367 | | |
368 | 368 | | |
369 | 369 | | |
370 | | - | |
| 370 | + | |
371 | 371 | | |
372 | 372 | | |
373 | 373 | | |
| |||
Lines changed: 4 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
22 | 22 | | |
23 | 23 | | |
24 | 24 | | |
| 25 | + | |
25 | 26 | | |
26 | 27 | | |
27 | 28 | | |
| |||
459 | 460 | | |
460 | 461 | | |
461 | 462 | | |
462 | | - | |
| 463 | + | |
| 464 | + | |
| 465 | + | |
463 | 466 | | |
464 | 467 | | |
465 | 468 | | |
| |||
0 commit comments