[SPARK-27070] Fix performance bug in DefaultPartitionCoalescer#23986

Closed
fitermay wants to merge 4 commits into apache:master from fitermay:SPARK-27070

Conversation

Contributor

@fitermay fitermay commented Mar 6, 2019

When trying to coalesce a UnionRDD of two large FileScanRDDs
(each with a few million partitions) into around 8k partitions
the driver can stall for over an hour.

Profiler shows that over 90% of the time is spent in TimSort
which is invoked by pickBin.  This patch replaces sorting with a more
efficient min for the purpose of finding the least occupied
PartitionGroup
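In sketch form, the fix replaces a full sort with a single linear `min` scan. The snippet below is a hypothetical simplified model (the `Group` type and function names are illustrative stand-ins; the real code operates on Spark's `PartitionGroup` inside `pickBin`):

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for Spark's PartitionGroup; only numPartitions matters here.
case class Group(name: String, numPartitions: Int)

object PickBinSketch {
  // Before the patch: sort the whole group list just to read its first
  // element -- O(g log g) work on every pickBin call.
  def leastLoadedBySort(groups: ArrayBuffer[Group]): Group =
    groups.sortBy(_.numPartitions).head

  // After the patch: one linear scan to find the least occupied group -- O(g).
  def leastLoadedByMin(groups: ArrayBuffer[Group]): Group =
    groups.min(Ordering.by((g: Group) => g.numPartitions))
}
```

Both functions return the same group; only the amount of work per call differs.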

@fitermay fitermay changed the title SPARK-27070 [SPARK-27070] Fix performance bug in DefaultPartitionCoalescer Mar 6, 2019
Member


The indent is off here. Returning the result of a subtraction in compare can overflow, though I don't think it can happen in practice here. Still, see below, I think we can just remove this.

Comment thread core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala Outdated
Contributor Author

fitermay commented Mar 6, 2019

@srowen Hi, Thanks for the prompt review. I've amended the code to address your comments.


SparkQA commented Mar 6, 2019

Test build #4597 has finished for PR 23986 at commit 4ffb772.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Comment thread core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala Outdated
@attilapiros
Contributor

@fitermay It would be very nice to see some rough numbers on the improvement from this PR. Could you please share how it behaved before and after for the experiment mentioned in the description?

Member

srowen commented Mar 7, 2019

There's a little detail in the JIRA that's pretty suggestive that this is the bottleneck; if there's a stack trace or more numbers to show, that's great. Regardless, I think this is a clean 'win'; it's just a question of how big.

Contributor Author

fitermay commented Mar 7, 2019

@srowen @attilapiros
After digging into this further I found out what exactly is happening and why sorting here causes a major issue.

It turns out EMRFS returns the string '*' as the host of each block. This triggers the worst case of the algorithm, where it tries to jam everything into the same preferred partition. In turn, that means running a sort over hundreds of thousands of records on each iteration just to find the minimum. I've contacted the EMR team to suggest changing the host to 'localhost', but apparently that would break MR performance on YARN.

I still think this patch is a win because:

  1. It's actually simpler and shorter than the pre-patch code
  2. There are lots of EMR users who would benefit from this until a strategic solution is found
  3. It improves performance in less extreme cases as well

I will try to make the suggested changes and also generate some performance numbers for the extreme case tonight.

Thanks!
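The asymptotic gap in that degenerate single-host case can be made concrete with a back-of-the-envelope cost model (a sketch, not Spark code): `pickBin` runs roughly once per input partition, and before the patch each call sorted all partition groups.

```scala
object CoalesceCostModel {
  // Hypothetical cost model: p pickBin calls over g partition groups that
  // all share one preferred host.
  // Sorting on every call costs about p * g * log2(g) comparisons;
  // a single linear min scan costs about p * (g - 1).
  def sortCost(p: Long, g: Long): Double =
    p.toDouble * g * (math.log(g.toDouble) / math.log(2.0))

  def minCost(p: Long, g: Long): Double =
    p.toDouble * (g - 1)
}
```

The model predicts a gap of roughly log2(g); the measured speedups below are smaller, presumably because TimSort is close to linear on the nearly-sorted input it sees on each iteration.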

Contributor Author

fitermay commented Mar 8, 2019

Benchmark with 100K blocks instead of several million. Number of hosts = 1 is clearly the worst case

Before patch:
Java HotSpot(TM) 64-Bit Server VM 1.8.0_112-b15 on Windows 10 10.0
Intel64 Family 6 Model 63 Stepping 2, GenuineIntel
Coalesced RDD:                            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Coalesce Num Partitions: 100 Num Hosts: 1            492            520          33          0.2        4919.9       1.0X
Coalesce Num Partitions: 100 Num Hosts: 5            310            328          22          0.3        3103.2       1.6X
Coalesce Num Partitions: 100 Num Hosts: 10            247            267          19          0.4        2468.2       2.0X
Coalesce Num Partitions: 100 Num Hosts: 20            240            252          15          0.4        2399.7       2.1X
Coalesce Num Partitions: 100 Num Hosts: 40            229            244          16          0.4        2290.8       2.1X
Coalesce Num Partitions: 100 Num Hosts: 80            212            225          13          0.5        2123.6       2.3X
Coalesce Num Partitions: 500 Num Hosts: 1           1149           1177          26          0.1       11492.7       0.4X
Coalesce Num Partitions: 500 Num Hosts: 5            464            500          34          0.2        4643.8       1.1X
Coalesce Num Partitions: 500 Num Hosts: 10            386            397          19          0.3        3862.2       1.3X
Coalesce Num Partitions: 500 Num Hosts: 20            336            340           7          0.3        3358.1       1.5X
Coalesce Num Partitions: 500 Num Hosts: 40            269            283          17          0.4        2686.0       1.8X
Coalesce Num Partitions: 500 Num Hosts: 80            239            245           9          0.4        2391.0       2.1X
Coalesce Num Partitions: 1000 Num Hosts: 1           2213           2258          39          0.0       22131.2       0.2X
Coalesce Num Partitions: 1000 Num Hosts: 5            645            650           9          0.2        6448.8       0.8X
Coalesce Num Partitions: 1000 Num Hosts: 10            467            473           7          0.2        4673.8       1.1X
Coalesce Num Partitions: 1000 Num Hosts: 20            413            425          17          0.2        4133.7       1.2X
Coalesce Num Partitions: 1000 Num Hosts: 40            341            347          10          0.3        3412.4       1.4X
Coalesce Num Partitions: 1000 Num Hosts: 80            269            276          11          0.4        2688.8       1.8X
Coalesce Num Partitions: 5000 Num Hosts: 1          11048          11100          46          0.0      110484.2       0.0X
Coalesce Num Partitions: 5000 Num Hosts: 5           2396           2457          55          0.0       23959.0       0.2X
Coalesce Num Partitions: 5000 Num Hosts: 10           1390           1397           9          0.1       13899.1       0.4X
Coalesce Num Partitions: 5000 Num Hosts: 20            852            858           6          0.1        8516.9       0.6X
Coalesce Num Partitions: 5000 Num Hosts: 40            569            586          21          0.2        5692.7       0.9X
Coalesce Num Partitions: 5000 Num Hosts: 80            432            440           9          0.2        4322.7       1.1X
Coalesce Num Partitions: 10000 Num Hosts: 1          19685          19779          83          0.0      196853.8       0.0X
Coalesce Num Partitions: 10000 Num Hosts: 5           4044           4144          87          0.0       40437.9       0.1X
Coalesce Num Partitions: 10000 Num Hosts: 10           2393           2483          88          0.0       23931.6       0.2X
Coalesce Num Partitions: 10000 Num Hosts: 20           1242           1338          84          0.1       12419.6       0.4X
Coalesce Num Partitions: 10000 Num Hosts: 40            816            821           9          0.1        8158.7       0.6X
Coalesce Num Partitions: 10000 Num Hosts: 80            555            571          23          0.2        5554.2       0.9X

After patch:

Java HotSpot(TM) 64-Bit Server VM 1.8.0_112-b15 on Windows 10 10.0
Intel64 Family 6 Model 63 Stepping 2, GenuineIntel
Coalesced RDD:                            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Coalesce Num Partitions: 100 Num Hosts: 1            394            433          37          0.3        3941.6       1.0X
Coalesce Num Partitions: 100 Num Hosts: 5            275            279           7          0.4        2748.4       1.4X
Coalesce Num Partitions: 100 Num Hosts: 10            236            241           9          0.4        2355.8       1.7X
Coalesce Num Partitions: 100 Num Hosts: 20            226            239          12          0.4        2259.1       1.7X
Coalesce Num Partitions: 100 Num Hosts: 40            220            233          14          0.5        2199.3       1.8X
Coalesce Num Partitions: 100 Num Hosts: 80            212            227          14          0.5        2120.3       1.9X
Coalesce Num Partitions: 500 Num Hosts: 1            961            976          24          0.1        9606.9       0.4X
Coalesce Num Partitions: 500 Num Hosts: 5            358            367          10          0.3        3580.5       1.1X
Coalesce Num Partitions: 500 Num Hosts: 10            288            299          19          0.3        2877.5       1.4X
Coalesce Num Partitions: 500 Num Hosts: 20            251            257           9          0.4        2508.5       1.6X
Coalesce Num Partitions: 500 Num Hosts: 40            248            252           4          0.4        2478.1       1.6X
Coalesce Num Partitions: 500 Num Hosts: 80            225            234          13          0.4        2247.3       1.8X
Coalesce Num Partitions: 1000 Num Hosts: 1           1575           1581           9          0.1       15747.8       0.3X
Coalesce Num Partitions: 1000 Num Hosts: 5            515            524          10          0.2        5154.8       0.8X
Coalesce Num Partitions: 1000 Num Hosts: 10            363            384          20          0.3        3633.5       1.1X
Coalesce Num Partitions: 1000 Num Hosts: 20            294            300           6          0.3        2943.6       1.3X
Coalesce Num Partitions: 1000 Num Hosts: 40            255            259           4          0.4        2549.3       1.5X
Coalesce Num Partitions: 1000 Num Hosts: 80            240            252          11          0.4        2398.7       1.6X
Coalesce Num Partitions: 5000 Num Hosts: 1           6904           6948          64          0.0       69038.0       0.1X
Coalesce Num Partitions: 5000 Num Hosts: 5           2070           2109          33          0.0       20702.0       0.2X
Coalesce Num Partitions: 5000 Num Hosts: 10           1136           1153          27          0.1       11362.4       0.3X
Coalesce Num Partitions: 5000 Num Hosts: 20            696            752          49          0.1        6964.3       0.6X
Coalesce Num Partitions: 5000 Num Hosts: 40            456            483          39          0.2        4555.8       0.9X
Coalesce Num Partitions: 5000 Num Hosts: 80            334            353          17          0.3        3340.8       1.2X
Coalesce Num Partitions: 10000 Num Hosts: 1          12789          12875         123          0.0      127889.3       0.0X
Coalesce Num Partitions: 10000 Num Hosts: 5           4040           4117          67          0.0       40402.9       0.1X
Coalesce Num Partitions: 10000 Num Hosts: 10           2141           2185          61          0.0       21414.0       0.2X
Coalesce Num Partitions: 10000 Num Hosts: 20           1152           1153           2          0.1       11516.1       0.3X
Coalesce Num Partitions: 10000 Num Hosts: 40            687            695          10          0.1        6869.5       0.6X
Coalesce Num Partitions: 10000 Num Hosts: 80            451            458           7          0.2        4505.0       0.9X

Contributor Author

fitermay commented Mar 8, 2019

@fitermay @attilapiros
I've pushed the benchmark and the suggestion of using filter, map.

From the benchmark it seems that hosts = 1 is the absolute worst case, and this change improves it by a decent margin. It improves the other cases slightly.

Contributor

@attilapiros attilapiros left a comment

Impressive results.

Thanks for the improvement!

Contributor Author

fitermay commented Mar 8, 2019

By the way, these are the results from the original PR, before replacing `min` with `minBy`. It seems to be twice as fast. I'm guessing it's because of the reduced indirection when passing an implicit ordering instead of a `minBy` lambda.

Intel64 Family 6 Model 63 Stepping 2, GenuineIntel
Coalesced RDD:                            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Coalesce Num Partitions: 100 Num Hosts: 1            264            289          26          0.4        2644.9       1.0X
Coalesce Num Partitions: 100 Num Hosts: 5            211            220           8          0.5        2110.6       1.3X
Coalesce Num Partitions: 100 Num Hosts: 10            215            225          10          0.5        2149.9       1.2X
Coalesce Num Partitions: 100 Num Hosts: 20            200            203           6          0.5        1996.6       1.3X
Coalesce Num Partitions: 100 Num Hosts: 40            198            205          11          0.5        1983.7       1.3X
Coalesce Num Partitions: 100 Num Hosts: 80            199            203           4          0.5        1992.8       1.3X
Coalesce Num Partitions: 500 Num Hosts: 1            465            477          15          0.2        4654.2       0.6X
Coalesce Num Partitions: 500 Num Hosts: 5            271            280          11          0.4        2707.9       1.0X
Coalesce Num Partitions: 500 Num Hosts: 10            232            250          18          0.4        2320.5       1.1X
Coalesce Num Partitions: 500 Num Hosts: 20            213            222          14          0.5        2130.8       1.2X
Coalesce Num Partitions: 500 Num Hosts: 40            210            215           9          0.5        2102.9       1.3X
Coalesce Num Partitions: 500 Num Hosts: 80            206            206           0          0.5        2062.4       1.3X
Coalesce Num Partitions: 1000 Num Hosts: 1            715            716           1          0.1        7149.7       0.4X
Coalesce Num Partitions: 1000 Num Hosts: 5            310            311           1          0.3        3098.5       0.9X
Coalesce Num Partitions: 1000 Num Hosts: 10            255            266          17          0.4        2553.8       1.0X
Coalesce Num Partitions: 1000 Num Hosts: 20            230            238          12          0.4        2304.1       1.1X
Coalesce Num Partitions: 1000 Num Hosts: 40            227            242          21          0.4        2271.1       1.2X
Coalesce Num Partitions: 1000 Num Hosts: 80            211            217          10          0.5        2114.5       1.3X
Coalesce Num Partitions: 5000 Num Hosts: 1           3043           3616         634          0.0       30428.0       0.1X
Coalesce Num Partitions: 5000 Num Hosts: 5           1035           1069          52          0.1       10353.4       0.3X
Coalesce Num Partitions: 5000 Num Hosts: 10            613            617           3          0.2        6134.6       0.4X
Coalesce Num Partitions: 5000 Num Hosts: 20            408            419          11          0.2        4082.8       0.6X
Coalesce Num Partitions: 5000 Num Hosts: 40            315            340          24          0.3        3153.3       0.8X
Coalesce Num Partitions: 5000 Num Hosts: 80            258            262           5          0.4        2577.7       1.0X
Coalesce Num Partitions: 10000 Num Hosts: 1           5385           5470         124          0.0       53848.7       0.0X
Coalesce Num Partitions: 10000 Num Hosts: 5           1856           1861           7          0.1       18561.0       0.1X
Coalesce Num Partitions: 10000 Num Hosts: 10           1022           1075          48          0.1       10223.2       0.3X
Coalesce Num Partitions: 10000 Num Hosts: 20            619            626           8          0.2        6185.8       0.4X
Coalesce Num Partitions: 10000 Num Hosts: 40            417            422           5          0.2        4168.2       0.6X
Coalesce Num Partitions: 10000 Num Hosts: 80            312            316           4          0.3        3119.6       0.8X

Member

srowen commented Mar 8, 2019

Hm! That's surprising. Looking at min vs minBy, it even seems like min has more indirection (calls foldLeft). The implicit still involves calling a function to compare and get num partitions in both cases. If you're pretty sure this is accurate I'm OK returning to the implicit.

@attilapiros
Contributor

Could I look into the reason for this difference on Monday/Tuesday? That is, can we wait with the merge?
I'm interested in what is causing it and would like to investigate.

Comment thread core/src/test/scala/org/apache/spark/rdd/CoalescedRDDBenchmark.scala Outdated
Member

@dongjoon-hyun dongjoon-hyun left a comment

Hi, @fitermay. Thank you for your first contribution. I saw the good previous comments above, and I left a few comments, too.

After fixing that, we can trigger Jenkins for your PR. Otherwise, it will fail to build.

@dongjoon-hyun
Member

ok to test


SparkQA commented Mar 8, 2019

Test build #103221 has finished for PR 23986 at commit 0803c63.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor Author

fitermay commented Mar 9, 2019

> Hm! That's surprising. Looking at min vs minBy, it even seems like min has more indirection (calls foldLeft). The implicit still involves calling a function to compare and get num partitions in both cases. If you're pretty sure this is accurate I'm OK returning to the implicit.

@srowen
There is some non-obvious indirection here. However, I believe most of the overhead is attributable to boxing/unboxing. Below is the relevant bytecode that ends up being generated.

This sets up the lambda that's passed into `minBy`. Notice that the return type of the closure must be `Ljava/lang/Object;`, so it can't return a primitive int:

    LINENUMBER 223 L0
    ALOAD 0
    INVOKEDYNAMIC apply()Lscala/Function1; [
      // handle kind 0x6 : INVOKESTATIC
      java/lang/invoke/LambdaMetafactory.altMetafactory(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
      // arguments:
      (Ljava/lang/Object;)Ljava/lang/Object;, 
      // handle kind 0x6 : INVOKESTATIC
      org/apache/spark/rdd/DefaultPartitionCoalescer.$anonfun$getLeastGroupHash$3$adapted(Lorg/apache/spark/rdd/PartitionGroup;)Ljava/lang/Object;, 
      (Lorg/apache/spark/rdd/PartitionGroup;)Ljava/lang/Object;, 
      7, 
      1, 
      scala.Serializable.class, 
      1, 
      (Lorg/apache/spark/rdd/PartitionGroup;)Ljava/lang/Object;
    ]
    GETSTATIC scala/math/Ordering$Int$.MODULE$ : Lscala/math/Ordering$Int$;
    INVOKEVIRTUAL scala/collection/mutable/ArrayBuffer.minBy (Lscala/Function1;Lscala/math/Ordering;)Ljava/lang/Object;

The lambda first invokes the function below, whose only job is to box the primitive int:

  // access flags 0x1019
  public final static synthetic $anonfun$getLeastGroupHash$3$adapted(Lorg/apache/spark/rdd/PartitionGroup;)Ljava/lang/Object;
    // parameter final  x$7
   L0
    LINENUMBER 223 L0
    ALOAD 0
    INVOKESTATIC org/apache/spark/rdd/DefaultPartitionCoalescer.$anonfun$getLeastGroupHash$3 (Lorg/apache/spark/rdd/PartitionGroup;)I
    INVOKESTATIC scala/runtime/BoxesRunTime.boxToInteger (I)Ljava/lang/Integer;
    ARETURN
   L1
    LOCALVARIABLE x$7 Lorg/apache/spark/rdd/PartitionGroup; L0 L1 0
    MAXSTACK = 1
    MAXLOCALS = 1

Then the actual method that returns `numPartitions` for the comparison gets invoked:

 public final static synthetic $anonfun$getLeastGroupHash$3(Lorg/apache/spark/rdd/PartitionGroup;)I
    // parameter final  x$7
   L0
    LINENUMBER 223 L0
    ALOAD 0
    INVOKEVIRTUAL org/apache/spark/rdd/PartitionGroup.numPartitions ()I
    IRETURN
   L1
    LOCALVARIABLE x$7 Lorg/apache/spark/rdd/PartitionGroup; L0 L1 0
    MAXSTACK = 1
    MAXLOCALS = 1
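A way to sidestep the boxed key entirely is to hand `min` an `Ordering` whose `compare` stays on primitive ints. This is a sketch with a hypothetical `PG` stand-in, not the exact Spark code:

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for Spark's PartitionGroup.
final class PG(val numPartitions: Int)

object MinVsMinBy {
  // minBy routes each key through Function1[PG, Object], so the Int key is
  // boxed to java.lang.Integer on every comparison (as in the bytecode above).
  def viaMinBy(groups: ArrayBuffer[PG]): PG =
    groups.minBy(_.numPartitions)

  // min with an Ordering that compares primitive ints directly:
  // no boxing on the comparison path.
  private val byNumPartitions: Ordering[PG] = new Ordering[PG] {
    override def compare(a: PG, b: PG): Int =
      java.lang.Integer.compare(a.numPartitions, b.numPartitions)
  }
  def viaMin(groups: ArrayBuffer[PG]): PG =
    groups.min(byNumPartitions)
}
```

`Ordering.by(_.numPartitions)` would also compile here, but it extracts the key through a `Function1` and so reintroduces the boxed `Integer`, much like `minBy`.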

Contributor Author

@dongjoon-hyun @srowen
I've experimented enough today to be very confident that the reason behind the performance difference is the overhead introduced by boxing/unboxing.

Pushed these changes:

  • Used min with Ordering instead of minBy to avoid boxing overhead
  • Added benchmark results
  • Added a benchmark description
  • Got rid of DebugFilesystem in the benchmark
  • Fixed scalastyle
  • Fixed some minor pre-existing code-style issues in CoalescedRDD.scala

@SparkQA

SparkQA commented Mar 10, 2019

Test build #103271 has finished for PR 23986 at commit c8424af.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 10, 2019

Test build #103275 has finished for PR 23986 at commit 2566639.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Member

kiszk commented Mar 10, 2019

retest this please

@kiszk
Member

kiszk commented Mar 10, 2019

sounds good to me

@SparkQA

SparkQA commented Mar 10, 2019

Test build #103278 has finished for PR 23986 at commit 2566639.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor Author

@dongjoon-hyun
I've applied your suggestions. Do you think you can give this another look?

@srowen
Member

srowen commented Mar 15, 2019

Merged to master

@srowen srowen closed this in 21db433 Mar 15, 2019
implicit val partitionGroupOrdering: Ordering[PartitionGroup] =
  (o1: PartitionGroup, o2: PartitionGroup) =>
    java.lang.Integer.compare(o1.numPartitions, o2.numPartitions)

Member

Hi, All.
This seems to break the Scala 2.11 build.

[error] ../core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala:161: type mismatch;
[error]  found   : (org.apache.spark.rdd.PartitionGroup, org.apache.spark.rdd.PartitionGroup) => Int
[error]  required: Ordering[org.apache.spark.rdd.PartitionGroup]
[error]     (o1: PartitionGroup, o2: PartitionGroup) =>
[error]          ^

Contributor Author

Thanks. That’s unfortunate. I’ll fix it later tonight

Member

Then, let me revert this first to recover 2.11 build for the other PRs. Since this PR is already approved, I believe that the next PR will be easily accepted, @fitermay .

Contributor

@dongjoon-hyun @srowen: Would it be a good idea to extend the PR builder to run a compile with Scala 2.11 (without any test run)?

I know it is an extra 10-15 minutes, but next to the 4-hour test run it might be worth it to prevent such situations; on the other hand, this must be very rare. What is your opinion?

Member

I agree with you, @attilapiros. But, IIRC, there was a discussion on that issue, and the decision at that time was that the current cost is not high enough for that.
The committers have a responsibility to monitor their commits, and we are usually able to do a HOTFIX or revert in a short time.

Contributor

Ok, thanks.
Yes it must be very rare.

Member

We're going to drop 2.11 support soonish anyway, so I think for now we accept the occasional breaks and fix after the fact rather than double the PR builders.

@dongjoon-hyun
Member

This is reverted via 4bab69b .
Please make a PR after testing both Scala 2.11 and 2.12. Thanks!


Member

srowen commented Mar 16, 2019

@fitermay I guess it has to be more explicitly constructed as an Ordering?

[error] /home/jenkins/workspace/spark-master-test-maven-hadoop-2.7-ubuntu-scala-2.11/core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala:161: type mismatch;
[error]  found   : (org.apache.spark.rdd.PartitionGroup, org.apache.spark.rdd.PartitionGroup) => Int
[error]  required: Ordering[org.apache.spark.rdd.PartitionGroup]
[error]     (o1: PartitionGroup, o2: PartitionGroup) =>
[error]                                              ^
[info] (org.apache.spark.rdd.PartitionGroup, org.apache.spark.rdd.PartitionGroup) => Int <: Ordering[org.apache.spark.rdd.PartitionGroup]?
[info] false
[error] one error found
[error] Compile failed at Mar 14, 2019 6:17:42 PM [22.876s]

Contributor Author

@srowen Yes, Scala 2.11 won't do the SAM conversion to `Ordering`.
Sent another PR tested against both Scala versions #24116
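The required fix is to construct the `Ordering` instance explicitly rather than relying on the 2.12 SAM conversion. A sketch of a form that compiles on both 2.11 and 2.12 (hypothetical stand-in types; not necessarily the exact change in the follow-up PR):

```scala
object OrderingFix {
  // Hypothetical stand-in for Spark's PartitionGroup.
  final class PartitionGroup(val numPartitions: Int)

  // Scala 2.11 will not SAM-convert a (A, A) => Int lambda to Ordering[A],
  // so the instance has to be spelled out explicitly.
  implicit val partitionGroupOrdering: Ordering[PartitionGroup] =
    new Ordering[PartitionGroup] {
      override def compare(o1: PartitionGroup, o2: PartitionGroup): Int =
        java.lang.Integer.compare(o1.numPartitions, o2.numPartitions)
    }
}
```

An alternative is `Ordering.by(_.numPartitions)`, which also compiles on 2.11, at the cost of the boxed key discussed earlier in the thread.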
