Commit 3f4bda7
[SPARK-24578][CORE] Cap sub-region's size of returned nio buffer
## What changes were proposed in this pull request?
This PR tries to fix the performance regression introduced by SPARK-21517.
In our production job, we performed many parallel computations, with high possibility, some task could be scheduled to a host-2 where it needs to read the cache block data from host-1. Often, this big transfer makes the cluster suffer time out issue (it will retry 3 times, each with 120s timeout, and then do recompute to put the cache block into the local MemoryStore).
The root cause is that we don't do `consolidateIfNeeded` anymore as we are using
```
Unpooled.wrappedBuffer(chunks.length, getChunks(): _*)
```
in ChunkedByteBuffer. If we have many small chunks, it could cause the `buf.notBuffer(...)` have very bad performance in the case that we have to call `copyByteBuf(...)` many times.
## How was this patch tested?
Existing unit tests and also test in production
Author: Wenbo Zhao <[email protected]>
Closes #21593 from WenboZhao/spark-24578.1 parent c5a0d11 commit 3f4bda7
File tree
1 file changed
+5
-20
lines changed- common/network-common/src/main/java/org/apache/spark/network/protocol
1 file changed
+5
-20
lines changedLines changed: 5 additions & 20 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
137 | 137 | | |
138 | 138 | | |
139 | 139 | | |
140 | | - | |
141 | | - | |
142 | | - | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
143 | 145 | | |
144 | 146 | | |
145 | 147 | | |
146 | 148 | | |
147 | | - | |
148 | | - | |
149 | | - | |
150 | | - | |
151 | | - | |
152 | | - | |
153 | | - | |
154 | | - | |
155 | | - | |
156 | | - | |
157 | | - | |
158 | | - | |
159 | | - | |
160 | | - | |
161 | | - | |
162 | | - | |
163 | | - | |
164 | 149 | | |
165 | 150 | | |
166 | 151 | | |
| |||
0 commit comments