[SPARK-32755][SQL][FOLLOWUP] Ensure -- method of AttributeSet have same behavior under Scala 2.12 and 2.13
#29689
  // use a Scala 2.12 based code to maintains the insertion order in Scala 2.13
  case otherSet: AttributeSet =>
-   new AttributeSet(baseSet -- otherSet.baseSet)
+   new AttributeSet(baseSet.clone() --= otherSet.baseSet)
In Scala 2.12, the implementation of `--` is:
override def --(xs: GenTraversableOnce[A]): This = clone() --= xs.seq
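The same `clone()` then `--=` pattern can be sketched on a plain `mutable.LinkedHashSet`. This is only an illustration: `SubtractDemo` and `subtractKeepingOrder` are hypothetical names, not part of Spark or of this PR.

```scala
import scala.collection.mutable

object SubtractDemo {
  // Hypothetical helper: returns a copy of `base` without the elements of
  // `other`. The copy is modified in place, so `base` itself is untouched and
  // the surviving elements keep their original insertion order.
  def subtractKeepingOrder[A](
      base: mutable.LinkedHashSet[A],
      other: Iterable[A]): mutable.LinkedHashSet[A] =
    base.clone() --= other
}
```

For example, subtracting `Seq("a", "b")` from a set built as `LinkedHashSet("c", "a", "d", "b")` should leave `c` and `d`, in that order.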
The `diff` method is documented as "Computes the difference of this set and another set". I haven't tested the overall behavior of `diff`, but I don't think the current change is complicated either.
diff returns a set, so it can potentially mess up the ordering.
@hvanhovell Actually, diff returns LinkedHashSet[AttributeEquals] here...
In Scala 2.13:
def diff(that: collection.Set[A]): C =
  toIterable.foldLeft(empty)((result, elem) => if (that contains elem) result else result += elem)
In Scala 2.12:
def diff(that: GenSet[A]): This = this -- that
It looks like both maintain the insertion order, but the input parameter needs to be a Set, so the
other.map(a => new AttributeEquals(a.toAttribute))
part would need to call toSet...
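The trade-off discussed here can be sketched with plain collections. `DiffVersusClone`, `viaDiff` and `viaClone` are hypothetical names used only for this illustration; the element type stands in for `AttributeEquals`.

```scala
import scala.collection.mutable

object DiffVersusClone {
  // diff only accepts a Set, so a Seq argument needs an extra toSet conversion.
  def viaDiff[A](
      base: mutable.LinkedHashSet[A],
      other: Seq[A]): mutable.LinkedHashSet[A] =
    base.diff(other.toSet[A])

  // --= accepts any collection, so the Seq can be passed as it is.
  def viaClone[A](
      base: mutable.LinkedHashSet[A],
      other: Seq[A]): mutable.LinkedHashSet[A] =
    base.clone() --= other
}
```

Both helpers remove the same elements; the second one simply avoids building an intermediate `Set`.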
Ok, then diff looks better.
If we can avoid the conversion it would be nice. The current thing would be fine in that case.
This conversion seems inevitable if we use `diff` :(
Let's use the current method then :).
ok ~
cc @cloud-fan @hvanhovell @dbaliafroozeh to help review this ~
Test build #128436 has finished for PR 29689 at commit
hvanhovell left a comment
LGTM
Jenkins retest this please
github action passed, merging to master, thanks!
Test build #128457 has finished for PR 29689 at commit
### What changes were proposed in this pull request?

After #29660 and #29689 there are 13 remaining failed cases in the `sql/core` module with Scala 2.13.

The reason for the remaining failures is that the optimization result of `CostBasedJoinReorder` may differ between Scala 2.12 and Scala 2.13 for the same input when there is more than one candidate plan with the same cost.

This PR makes the optimization result as deterministic as possible so that all remaining failed cases of the `sql/core` module pass in Scala 2.13. The main changes of this PR are as follows:

- Use `LinkedHashMap` instead of `Map` to store `foundPlans` in the `JoinReorderDP.search` method, so that the same insertion order gives the same iteration order, because the iteration order of `Map` behaves differently under Scala 2.12 and 2.13
- Fix `StarJoinCostBasedReorderSuite`, which is affected by the above change
- Regenerate the golden files affected by the above change

### Why are the changes needed?

We need to support a Scala 2.13 build.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

- Scala 2.12: Pass the Jenkins or GitHub Action
- Scala 2.13: All tests passed.

Do the following:

```
dev/change-scala-version.sh 2.13
mvn clean install -DskipTests -pl sql/core -Pscala-2.13 -am
mvn test -pl sql/core -Pscala-2.13
```

**Before**

```
Tests: succeeded 8485, failed 13, canceled 1, ignored 52, pending 0
*** 13 TESTS FAILED ***
```

**After**

```
Tests: succeeded 8498, failed 0, canceled 1, ignored 52, pending 0
All tests passed.
```

Closes #29711 from LuciferYang/SPARK-32808-3.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
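The ordering property that the referenced commit relies on can be sketched as follows. This is a minimal illustration, not the `JoinReorderDP` code: `InsertionOrderDemo`, `insertedKeys` and the string keys are hypothetical stand-ins for the real plan map.

```scala
import scala.collection.mutable

object InsertionOrderDemo {
  // Inserts the keys one by one and returns them in iteration order.
  // With mutable.LinkedHashMap the iteration order follows the insertion
  // order, so the result does not depend on how the keys hash.
  def insertedKeys(keys: Seq[String]): List[String] = {
    val found = mutable.LinkedHashMap.empty[String, Int]
    keys.zipWithIndex.foreach { case (key, index) => found(key) = index }
    found.keysIterator.toList
  }
}
```

Calling `insertedKeys(Seq("t3", "t1", "t2"))` should return the keys in exactly that order.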
### What changes were proposed in this pull request?

The `--` method of `AttributeSet` behaves differently under Scala 2.12 and 2.13 because the `--` method of `LinkedHashSet` in Scala 2.13 does not maintain the insertion order.

This PR uses code based on the Scala 2.12 implementation to ensure that the `--` method of `AttributeSet` has the same behavior under Scala 2.12 and 2.13.

### Why are the changes needed?

The behavior of `AttributeSet` needs to be compatible with Scala 2.12 and 2.13.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

- Scala 2.12: Pass the Jenkins or GitHub Action
- Scala 2.13: Manual test of sub-suites of `PlanStabilitySuite`

Before: 293 TESTS FAILED

After: 13 TESTS FAILED (the remaining failures are not associated with the current issue)