Commit 3bce43f
[SPARK-4818][Core] Add 'iterator' to reduce memory consumed by join
In Scala, `map` and `flatMap` of `Iterable` will copy the contents of `Iterable` to a new `Seq`. Such as,
```Scala
val iterable = Seq(1, 2, 3).map(v => {
println(v)
v
})
println("Iterable map done")
val iterator = Seq(1, 2, 3).iterator.map(v => {
println(v)
v
})
println("Iterator map done")
```
outputed
```
1
2
3
Iterable map done
Iterator map done
```
So we should use 'iterator' to reduce memory consumed by join.
Found by Johannes Simon in http://mail-archives.apache.org/mod_mbox/spark-user/201412.mbox/%3C5BE70814-9D03-4F61-AE2C-0D63F2DE4446%40mail.de%3E
Author: zsxwing <[email protected]>
Closes #3671 from zsxwing/SPARK-4824 and squashes the following commits:
48ee7b9 [zsxwing] Remove the explicit types
95d59d6 [zsxwing] Add 'iterator' to reduce memory consumed by join
(cherry picked from commit c233ab3)
Signed-off-by: Josh Rosen <[email protected]>
Conflicts:
core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala1 parent e5f2752 commit 3bce43f
File tree
1 file changed
+5
-5
lines changed- core/src/main/scala/org/apache/spark/rdd
1 file changed
+5
-5
lines changedLines changed: 5 additions & 5 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
472 | 472 | | |
473 | 473 | | |
474 | 474 | | |
475 | | - | |
| 475 | + | |
476 | 476 | | |
477 | 477 | | |
478 | 478 | | |
| |||
485 | 485 | | |
486 | 486 | | |
487 | 487 | | |
488 | | - | |
| 488 | + | |
489 | 489 | | |
490 | | - | |
| 490 | + | |
491 | 491 | | |
492 | 492 | | |
493 | 493 | | |
| |||
502 | 502 | | |
503 | 503 | | |
504 | 504 | | |
505 | | - | |
| 505 | + | |
506 | 506 | | |
507 | | - | |
| 507 | + | |
508 | 508 | | |
509 | 509 | | |
510 | 510 | | |
| |||
0 commit comments