Commit ce6ed2a
[SPARK-3954][Streaming] Optimization to FileInputDStream
about convert files to RDDS there are 3 loops with files sequence in spark source.
loops files sequence:
1.files.map(...)
2.files.zip(fileRDDs)
3.files-size.foreach
It's will very time consuming when lots of files.So I do the following correction:
3 loops with files sequence => only one loop
Author: surq <[email protected]>
Closes apache#2811 from surq/SPARK-3954 and squashes the following commits:
321bbe8 [surq] updated the code style.The style from [for...yield]to [files.map(file=>{})]
88a2c20 [surq] Merge branch 'master' of https://github.com/apache/spark into SPARK-3954
178066f [surq] modify code's style. [Exceeds 100 columns]
626ef97 [surq] remove redundant import(ArrayBuffer)
739341f [surq] promote the speed of convert files to RDDS1 parent a1fc059 commit ce6ed2a
File tree
1 file changed
+4
-3
lines changed- streaming/src/main/scala/org/apache/spark/streaming/dstream
1 file changed
+4
-3
lines changedLines changed: 4 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
120 | 120 | | |
121 | 121 | | |
122 | 122 | | |
123 | | - | |
124 | | - | |
| 123 | + | |
| 124 | + | |
125 | 125 | | |
126 | 126 | | |
127 | 127 | | |
128 | 128 | | |
129 | 129 | | |
130 | | - | |
| 130 | + | |
| 131 | + | |
131 | 132 | | |
132 | 133 | | |
133 | 134 | | |
| |||
0 commit comments