Skip to content

Conversation

@LuciferYang
Copy link
Contributor

@LuciferYang LuciferYang commented Sep 9, 2020

What changes were proposed in this pull request?

-- method of AttributeSet behave differently under Scala 2.12 and 2.13 because -- method of LinkedHashSet in Scala 2.13 can't maintains the insertion order.

This pr use a Scala 2.12 based code to ensure -- method of AttributeSet have same behavior under Scala 2.12 and 2.13.

Why are the changes needed?

The behavior of AttributeSet needs to be compatible with Scala 2.12 and 2.13

Does this PR introduce any user-facing change?

No

How was this patch tested?

Scala 2.12: Pass the Jenkins or GitHub Action

Scala 2.13: Manual test sub-suites of PlanStabilitySuite

  • Before :293 TESTS FAILED

  • After:13 TESTS FAILED(The remaining failures are not associated with the current issue)

// use a Scala 2.12 based code to maintains the insertion order in Scala 2.13
case otherSet: AttributeSet =>
new AttributeSet(baseSet -- otherSet.baseSet)
new AttributeSet(baseSet.clone() --= otherSet.baseSet)
Copy link
Contributor Author

@LuciferYang LuciferYang Sep 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Scala 2.12 the implementation of -- is

 override def --(xs: GenTraversableOnce[A]): This = clone() --= xs.seq

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diff method is defined as Computes the difference of this set and another set, I haven't tested overall behavior of diff, but I think the current changes don't look complicated either

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diff returns a set, so it can potentially mess up the ordering.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image
@hvanhovell Actually, diff returns LinkedHashSet[AttributeEquals] here...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Scala 2.13

def diff(that: collection.Set[A]): C =
    toIterable.foldLeft(empty)((result, elem) => if (that contains elem) result else result += elem)

In Scala 2.12

def diff(that: GenSet[A]): This = this -- that

It looks like they're maintains ordered, but the input parameter needs to be a Set, so

other.map(a => new AttributeEquals(a.toAttribute))

part need call toSet。。。

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then diff looks better.

Copy link
Contributor

@hvanhovell hvanhovell Sep 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can avoid the conversion it would be nice. The current thing would be fine in that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This conversion seems inevitable if use diff :(

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the current method then :).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok ~

@LuciferYang
Copy link
Contributor Author

LuciferYang commented Sep 9, 2020

cc @cloud-fan @hvanhovell @dbaliafroozeh to help review this ~

@SparkQA
Copy link

SparkQA commented Sep 9, 2020

Test build #128436 has finished for PR 29689 at commit a595600.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@hvanhovell hvanhovell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@srowen
Copy link
Member

srowen commented Sep 9, 2020

Jenkins retest this please

@cloud-fan
Copy link
Contributor

github action passed, merging to master, thanks!

@cloud-fan cloud-fan closed this in fc10511 Sep 9, 2020
@LuciferYang
Copy link
Contributor Author

@SparkQA
Copy link

SparkQA commented Sep 9, 2020

Test build #128457 has finished for PR 29689 at commit a595600.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

srowen pushed a commit that referenced this pull request Sep 18, 2020
### What changes were proposed in this pull request?

After #29660 and #29689 there are 13 remaining  failed cases of sql core module with Scala 2.13.

The reason for the remaining failed cases is the optimization result of `CostBasedJoinReorder` maybe different with same input in Scala 2.12 and Scala 2.13 if there are more than one same cost candidate plans.

In this pr give a way to make the  optimization result deterministic as much as possible to pass all remaining failed cases of `sql/core` module in Scala 2.13, the main change of this pr as follow:

- Change to use `LinkedHashMap` instead of `Map` to store `foundPlans` in `JoinReorderDP.search` method to ensure same iteration order with same insert order because iteration order of `Map` behave differently under Scala 2.12 and 2.13

- Fixed `StarJoinCostBasedReorderSuite` affected by the above change

- Regenerate golden files affected by the above change.

### Why are the changes needed?
We need to support a Scala 2.13 build.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

- Scala 2.12: Pass the Jenkins or GitHub Action

- Scala 2.13: All tests passed.

Do the following:

```
dev/change-scala-version.sh 2.13
mvn clean install -DskipTests  -pl sql/core -Pscala-2.13 -am
mvn test -pl sql/core -Pscala-2.13
```

**Before**
```
Tests: succeeded 8485, failed 13, canceled 1, ignored 52, pending 0
*** 13 TESTS FAILED ***

```

**After**

```
Tests: succeeded 8498, failed 0, canceled 1, ignored 52, pending 0
All tests passed.
```

Closes #29711 from LuciferYang/SPARK-32808-3.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
a0x8o added a commit to a0x8o/spark that referenced this pull request Sep 18, 2020
### What changes were proposed in this pull request?

After apache/spark#29660 and apache/spark#29689 there are 13 remaining  failed cases of sql core module with Scala 2.13.

The reason for the remaining failed cases is the optimization result of `CostBasedJoinReorder` maybe different with same input in Scala 2.12 and Scala 2.13 if there are more than one same cost candidate plans.

In this pr give a way to make the  optimization result deterministic as much as possible to pass all remaining failed cases of `sql/core` module in Scala 2.13, the main change of this pr as follow:

- Change to use `LinkedHashMap` instead of `Map` to store `foundPlans` in `JoinReorderDP.search` method to ensure same iteration order with same insert order because iteration order of `Map` behave differently under Scala 2.12 and 2.13

- Fixed `StarJoinCostBasedReorderSuite` affected by the above change

- Regenerate golden files affected by the above change.

### Why are the changes needed?
We need to support a Scala 2.13 build.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

- Scala 2.12: Pass the Jenkins or GitHub Action

- Scala 2.13: All tests passed.

Do the following:

```
dev/change-scala-version.sh 2.13
mvn clean install -DskipTests  -pl sql/core -Pscala-2.13 -am
mvn test -pl sql/core -Pscala-2.13
```

**Before**
```
Tests: succeeded 8485, failed 13, canceled 1, ignored 52, pending 0
*** 13 TESTS FAILED ***

```

**After**

```
Tests: succeeded 8498, failed 0, canceled 1, ignored 52, pending 0
All tests passed.
```

Closes #29711 from LuciferYang/SPARK-32808-3.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
@LuciferYang LuciferYang deleted the SPARK-32755-FOLLOWUP branch June 6, 2022 03:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants