Commit 65e896a
[SPARK-18679][SQL] Fix regression in file listing performance for non-catalog tables
## What changes were proposed in this pull request?
In Spark 2.1 ListingFileCatalog was significantly refactored (and renamed to InMemoryFileIndex). This introduced a regression where parallelism could only be introduced at the very top of the tree. However, in many cases (e.g. `spark.read.parquet(topLevelDir)`), the top of the tree is only a single directory.
This PR simplifies and fixes the parallel recursive listing code to allow parallelism to be introduced at any level during recursive descent (though note that once we decide to list a sub-tree in parallel, the sub-tree is listed in serial on executors).
cc mallman cloud-fan
## How was this patch tested?
Checked metrics in unit tests.
Author: Eric Liang <[email protected]>
Closes #16112 from ericl/spark-18679.
(cherry picked from commit 294163e)
Signed-off-by: Wenchen Fan <[email protected]>1 parent a7f8ebb commit 65e896a
File tree
3 files changed
+106
-34
lines changed- core/src/main/scala/org/apache/spark/metrics/source
- sql/core/src
- main/scala/org/apache/spark/sql/execution/datasources
- test/scala/org/apache/spark/sql/execution/datasources
3 files changed
+106
-34
lines changedLines changed: 8 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
90 | 90 | | |
91 | 91 | | |
92 | 92 | | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
93 | 99 | | |
94 | 100 | | |
95 | 101 | | |
| |||
98 | 104 | | |
99 | 105 | | |
100 | 106 | | |
| 107 | + | |
101 | 108 | | |
102 | 109 | | |
103 | 110 | | |
104 | 111 | | |
105 | 112 | | |
106 | 113 | | |
107 | 114 | | |
| 115 | + | |
108 | 116 | | |
Lines changed: 45 additions & 34 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
249 | 249 | | |
250 | 250 | | |
251 | 251 | | |
252 | | - | |
253 | | - | |
254 | | - | |
255 | | - | |
256 | | - | |
257 | | - | |
| 252 | + | |
| 253 | + | |
| 254 | + | |
258 | 255 | | |
259 | 256 | | |
260 | 257 | | |
| |||
286 | 283 | | |
287 | 284 | | |
288 | 285 | | |
289 | | - | |
290 | | - | |
291 | | - | |
292 | | - | |
293 | | - | |
294 | | - | |
295 | | - | |
296 | | - | |
297 | | - | |
298 | | - | |
299 | | - | |
300 | | - | |
301 | | - | |
302 | | - | |
303 | | - | |
304 | | - | |
305 | | - | |
306 | | - | |
| 286 | + | |
| 287 | + | |
| 288 | + | |
| 289 | + | |
| 290 | + | |
| 291 | + | |
307 | 292 | | |
308 | | - | |
| 293 | + | |
309 | 294 | | |
310 | 295 | | |
| 296 | + | |
311 | 297 | | |
312 | | - | |
| 298 | + | |
| 299 | + | |
| 300 | + | |
| 301 | + | |
| 302 | + | |
| 303 | + | |
| 304 | + | |
| 305 | + | |
313 | 306 | | |
| 307 | + | |
314 | 308 | | |
315 | 309 | | |
316 | 310 | | |
| |||
322 | 316 | | |
323 | 317 | | |
324 | 318 | | |
325 | | - | |
| 319 | + | |
326 | 320 | | |
327 | | - | |
| 321 | + | |
| 322 | + | |
| 323 | + | |
328 | 324 | | |
329 | 325 | | |
330 | 326 | | |
| |||
372 | 368 | | |
373 | 369 | | |
374 | 370 | | |
375 | | - | |
| 371 | + | |
| 372 | + | |
| 373 | + | |
| 374 | + | |
| 375 | + | |
| 376 | + | |
376 | 377 | | |
377 | | - | |
378 | | - | |
| 378 | + | |
| 379 | + | |
| 380 | + | |
| 381 | + | |
| 382 | + | |
379 | 383 | | |
| 384 | + | |
380 | 385 | | |
381 | 386 | | |
382 | 387 | | |
| |||
391 | 396 | | |
392 | 397 | | |
393 | 398 | | |
394 | | - | |
395 | | - | |
396 | | - | |
| 399 | + | |
| 400 | + | |
| 401 | + | |
| 402 | + | |
| 403 | + | |
| 404 | + | |
| 405 | + | |
| 406 | + | |
| 407 | + | |
397 | 408 | | |
398 | 409 | | |
399 | 410 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
25 | 25 | | |
26 | 26 | | |
27 | 27 | | |
| 28 | + | |
28 | 29 | | |
29 | 30 | | |
30 | 31 | | |
| |||
81 | 82 | | |
82 | 83 | | |
83 | 84 | | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
84 | 137 | | |
85 | 138 | | |
86 | 139 | | |
| |||
0 commit comments