Skip to content

Conversation

@mccheah
Copy link

@mccheah mccheah commented Apr 16, 2019

Ported from bloomberg#9. Credits to @ifilonenko!

* initial push

* closing input channel error

* added certain fixes to make all tests except transferTo pass

* small fix to make transferTo properly work

* Always close the serializer stream. (#10)

* But don't close the partition stream returned by the partition writer.

* fix issues that are blocking tests

* style

* further style

* resolve comments

* remove unecessary file creation

* remove .toStream

* hot fix to remove log statement

* fix contract
@mccheah mccheah changed the title [SPARK-25299] Make UnsafeShuffleWriter use the new API (#9) [SPARK-25299] Make UnsafeShuffleWriter use the new API Apr 16, 2019
@bulldozer-bot bulldozer-bot bot merged commit 8b021b3 into spark-25299 Apr 17, 2019
@bulldozer-bot bulldozer-bot bot deleted the unsafe-shuffle-writer branch April 17, 2019 02:13
Copy link

@svc-spark-25299 svc-spark-25299 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

================================================================================================
BlockStoreShuffleReader reader
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_131-b11 on Linux 4.15.0-1014-gcp
Intel(R) Xeon(R) CPU @ 2.30GHz
no aggregation or sorting:                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
local fetch                                       10580          10808          90          0.9        1058.0       1.0X
remote rpc fetch                                  10704          10794          63          0.9        1070.4       1.0X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_131-b11 on Linux 4.15.0-1014-gcp
Intel(R) Xeon(R) CPU @ 2.30GHz
with aggregation:                         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
local fetch                                       30625          31354         580          0.1       15312.7       1.0X
remote rpc fetch                                  30177          30681         480          0.1       15088.5       1.0X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_131-b11 on Linux 4.15.0-1014-gcp
Intel(R) Xeon(R) CPU @ 2.30GHz
with sorting:                             Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
local fetch                                       30856          31248         331          0.1       15427.8       1.0X
remote rpc fetch                                  31226          31750         546          0.1       15613.1       1.0X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_131-b11 on Linux 4.15.0-1014-gcp
Intel(R) Xeon(R) CPU @ 2.30GHz
with seek:                                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
seek to last record                                   1              1           0       2695.7           0.4       1.0X

================================================================================================
BypassMergeSortShuffleWriter write
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_131-b11 on Linux 4.15.0-1014-gcp
Intel(R) Xeon(R) CPU @ 2.30GHz
BypassMergeSortShuffleWrite without spill:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
small dataset without disk spill                      2              4           3          0.5        2077.5       1.0X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_131-b11 on Linux 4.15.0-1014-gcp
Intel(R) Xeon(R) CPU @ 2.30GHz
BypassMergeSortShuffleWrite with spill:   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
without transferTo                                 7513           7631          57          0.9        1119.5       1.0X
with transferTo                                    7585           7661         103          0.9        1130.2       1.0X

================================================================================================
SortShuffleWriter writer
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_131-b11 on Linux 4.15.0-1014-gcp
Intel(R) Xeon(R) CPU @ 2.30GHz
SortShuffleWriter without spills:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
small dataset without spills                         11             19           7          0.1       10601.4       1.0X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_131-b11 on Linux 4.15.0-1014-gcp
Intel(R) Xeon(R) CPU @ 2.30GHz
SortShuffleWriter with spills:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
no map side combine                               14645          14870         175          0.5        2182.2       1.0X
with map side aggregation                         14487          14611         134          0.5        2158.7       1.0X
with map side sort                                14578          14651          67          0.5        2172.3       1.0X

================================================================================================
UnsafeShuffleWriter write
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_131-b11 on Linux 4.15.0-1014-gcp
Intel(R) Xeon(R) CPU @ 2.30GHz
UnsafeShuffleWriter without spills:       Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
small dataset without spills                         21             25           3          0.0       21084.2       1.0X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_131-b11 on Linux 4.15.0-1014-gcp
Intel(R) Xeon(R) CPU @ 2.30GHz
UnsafeShuffleWriter with spills:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
without transferTo                                15643          15768         105          0.9        1165.5       1.0X
with transferTo                                   15557          15673         114          0.9        1159.1       1.0X



mccheah added a commit that referenced this pull request Jun 27, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants