
Conversation

@LuciferYang (Contributor) commented Sep 7, 2020

What changes were proposed in this pull request?

The purpose of this PR is to partially resolve SPARK-32808. A total of 26 failed test cases are fixed; the related suites are as follows:

  • StreamingAggregationSuite related test cases (2 FAILED -> Pass)

  • GeneratorFunctionSuite related test cases (2 FAILED -> Pass)

  • UDFSuite related test cases (2 FAILED -> Pass)

  • SQLQueryTestSuite related test cases (5 FAILED -> Pass)

  • WholeStageCodegenSuite related test cases (1 FAILED -> Pass)

  • DataFrameSuite related test cases (3 FAILED -> Pass)

  • OrcV1QuerySuite / OrcV2QuerySuite related test cases (4 FAILED -> Pass)

  • ExpressionsSchemaSuite related test cases (1 FAILED -> Pass)

  • DataFrameStatSuite related test cases (1 FAILED -> Pass)

  • JsonV1Suite / JsonV2Suite / JsonLegacyTimeParserSuite related test cases (6 FAILED -> Pass)

The main changes in this PR are as follows:

  • Fix Scala 2.13 compilation problems in ShuffleBlockFetcherIterator and Analyzer

  • Specify Seq as scala.collection.Seq in objects.scala and GenericArrayData, because the Seq used internally may be a mutable.ArraySeq on which it is not easy to call .toSeq

  • Specify Seq as scala.collection.Seq when calling Row.getAs[Seq] and Row.get(i).asInstanceOf[Seq], because the data may be a mutable.ArraySeq while Seq means immutable.Seq in Scala 2.13 (see the sketch after this list)

  • Use a compatible approach so that the + and - methods of Decimal behave the same in Scala 2.12 and Scala 2.13

  • Call toList in the RelationalGroupedDataset.toDF method when groupingExprs is a Stream, because Stream cannot be serialized in Scala 2.13

  • Add a manual sort to classFunsMap in ExpressionsSchemaSuite, because Iterable.groupBy in Scala 2.13 returns a different result than TraversableLike.groupBy in Scala 2.12
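
As a minimal sketch of the Row-related change above (my own illustration, not code from this PR; firstElement is a hypothetical helper), callers reading a sequence-valued column should request scala.collection.Seq so the code works whether the backing value is a mutable.ArraySeq (Scala 2.13) or a WrappedArray (Scala 2.12):

```
import org.apache.spark.sql.Row

def firstElement(row: Row): Any = {
  // Ask for the common supertype scala.collection.Seq: under Scala 2.13 the
  // backing value may be a mutable.ArraySeq, which is not an immutable.Seq,
  // so row.getAs[Seq[Any]](0) would throw a ClassCastException instead.
  val values = row.getAs[scala.collection.Seq[Any]](0)
  values.head
}
```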

Why are the changes needed?

We need to support a Scala 2.13 build.

Does this PR introduce any user-facing change?

Yes. Callers should specify Seq as scala.collection.Seq when calling Row.getAs[Seq] or Row.get(i).asInstanceOf[Seq], because the data may be a mutable.ArraySeq while Seq means immutable.Seq in Scala 2.13.

How was this patch tested?

  • Scala 2.12: Pass the Jenkins or GitHub Action

  • Scala 2.13: Do the following:

dev/change-scala-version.sh 2.13
mvn clean install -DskipTests  -pl sql/core -Pscala-2.13 -am
mvn test -pl sql/core -Pscala-2.13

Before

Tests: succeeded 8166, failed 319, canceled 1, ignored 52, pending 0
*** 319 TESTS FAILED ***

After

Tests: succeeded 8204, failed 286, canceled 1, ignored 52, pending 0
*** 286 TESTS FAILED ***

@LuciferYang (Author)

cc @srowen, this PR tries to pass some test cases of the sql/core module in Scala 2.13; the other failures may be related to CostBasedJoinReorder. I will check it and report back later.

- case seq: Seq[Any] => seq.toArray
+ // Specify this as `scala.collection.Seq` because seqOrArray can be
+ // `mutable.ArraySeq` in Scala 2.13
+ case seq: scala.collection.Seq[Any] => seq.toArray
Contributor Author

The entry point is

object ArrayData {
  def toArrayData(input: Any): ArrayData = input match {
    case a: Array[Boolean] => UnsafeArrayData.fromPrimitiveArray(a)
    case a: Array[Byte] => UnsafeArrayData.fromPrimitiveArray(a)
    case a: Array[Short] => UnsafeArrayData.fromPrimitiveArray(a)
    case a: Array[Int] => UnsafeArrayData.fromPrimitiveArray(a)
    case a: Array[Long] => UnsafeArrayData.fromPrimitiveArray(a)
    case a: Array[Float] => UnsafeArrayData.fromPrimitiveArray(a)
    case a: Array[Double] => UnsafeArrayData.fromPrimitiveArray(a)
    case other => new GenericArrayData(other)
  }
}

It is not easy to call toSeq here, so I changed it at this point.
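
A small standalone sketch (my own demonstration, not PR code) of why the broader type is needed in the match under Scala 2.13:

```
import scala.collection.mutable

object SeqMatchDemo extends App {
  val input: Any = mutable.ArraySeq(1, 2, 3)

  input match {
    // In Scala 2.13 the default Seq is scala.collection.immutable.Seq,
    // so a mutable.ArraySeq does not match this case.
    case s: Seq[_] => println(s"immutable Seq: $s")
    // scala.collection.Seq covers mutable and immutable sequences alike.
    case s: scala.collection.Seq[_] => println(s"collection.Seq: $s")
  }
  // Under Scala 2.13 this prints: collection.Seq: ArraySeq(1, 2, 3)
}
```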

if (decimalVal.eq(null) && that.decimalVal.eq(null) && scale == that.scale) {
  Decimal(longVal + that.longVal, Math.max(precision, that.precision), scale)
} else {
  Decimal(toBigDecimal + that.toBigDecimal)
}
Contributor Author

In Scala 2.13, the + method is

def +  (that: BigDecimal): BigDecimal = new BigDecimal(this.bigDecimal.add(that.bigDecimal, mc), mc)

and in Scala 2.12 the + method is

def +  (that: BigDecimal): BigDecimal = new BigDecimal(this.bigDecimal add that.bigDecimal, mc)

There are some differences in precision: the 2.13 version passes the MathContext to java.math.BigDecimal.add, so the result may be rounded, while the 2.12 version computes the exact sum.
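
To make the difference concrete, here is a small sketch (my own example, not from the PR) using java.math.BigDecimal directly; MathContext(4) stands in for the mc carried by the Scala wrapper:

```
import java.math.{BigDecimal => JBigDecimal, MathContext}

object BigDecimalAddDemo extends App {
  val a = new JBigDecimal("1.2345")
  val b = new JBigDecimal("6.7891")

  // Exact sum, like Scala 2.12's + (no MathContext passed to add):
  println(a.add(b))                     // 8.0236
  // Sum rounded to 4 significant digits, like Scala 2.13's +:
  println(a.add(b, new MathContext(4))) // 8.024
}
```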

Member

I don't think we want to set a MathContext here anyway?

Contributor Author

Do you mean we should change to use the methods that take a MathContext, like BigDecimal.add(BigDecimal augend, MathContext mc)?

Contributor Author

Sorry, I don't think I fully understand this comment...

Member

I think the change is OK here, because we actually do not want to modify the rounding, right?

Contributor Author

Yes, you are right ~

@LuciferYang (Author)

add executors default profile *** FAILED *** (82 milliseconds) failed in GitHub Actions, but passed in a local test.

@SparkQA commented Sep 7, 2020

Test build #128352 has finished for PR 29660 at commit 1fa24b9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@LuciferYang (Author)

> cc @srowen, this PR tries to pass some test cases of the sql/core module in Scala 2.13; the other failures may be related to CostBasedJoinReorder. I will check it and report back later.

There are other reasons. I'm working on it.


@srowen (Member) commented Sep 7, 2020

Do you want to add more changes here? We can merge it whenever it gets big and continue in another PR if desired.

@LuciferYang (Author)

@srowen Maybe we can merge this first; the other failures are related to PlanStabilitySuite, and I will continue to fix them in another PR.

@LuciferYang (Author)

Commit 454b53c merges upstream master and resolves the conflicted file.

@SparkQA commented Sep 8, 2020

Test build #128381 has finished for PR 29660 at commit 454b53c.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class ExecutorDecommissionInfo(message: String, workerHost: Option[String] = None)
  • throw new AnalysisException(s"Can not load class '$className' when registering " +

@srowen (Member) commented Sep 8, 2020

Jenkins retest this please

@SparkQA commented Sep 8, 2020

Test build #128409 has finished for PR 29660 at commit 454b53c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class ExecutorDecommissionInfo(message: String, workerHost: Option[String] = None)
  • throw new AnalysisException(s"Can not load class '$className' when registering " +

@LuciferYang (Author) commented Sep 9, 2020

Local test of mvn clean test -pl core -DwildcardSuites=org.apache.spark.scheduler.BarrierTaskContextSuite -Dtest=none: all tests passed.

Discovery starting.
Discovery completed in 3 seconds, 740 milliseconds.
Run starting. Expected test count is: 11
BarrierTaskContextSuite:
- global sync by barrier() call
- share messages with allGather() call
- throw exception if we attempt to synchronize with different blocking calls
- successively sync with allGather and barrier
- support multiple barrier() call within a single task
- throw exception on barrier() call timeout
- throw exception if barrier() call doesn't happen on every task
- throw exception if the number of barrier() calls are not the same on every task
- barrier task killed, no interrupt
- barrier task killed, interrupt
- SPARK-31485: barrier stage should fail if only partial tasks are launched
Run completed in 4 minutes, 51 seconds.
Total number of tests run: 11
Suites: completed 2, aborted 0
Tests: succeeded 11, failed 0, canceled 0, ignored 0, pending 0

@LuciferYang (Author) commented Sep 9, 2020

Commit 9185a95 re-syncs master; I ran the core module tests locally and all passed.

@SparkQA commented Sep 9, 2020

Test build #128430 has finished for PR 29660 at commit 9185a95.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@LuciferYang (Author) commented Sep 9, 2020

org.apache.spark.sql.hive.thriftserver.CliSuite.* failed because "Database clitestdb already exists".

@LuciferYang (Author)

Local test of org.apache.spark.sql.hive.thriftserver.CliSuite: all 28 cases succeeded.

@xuanyuanking (Member)

retest this please

@SparkQA commented Sep 9, 2020

Test build #128446 has finished for PR 29660 at commit 9185a95.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@LuciferYang (Author)

@srowen Jenkins and GitHub Actions all passed ~

@srowen closed this in 513d51a Sep 9, 2020

@srowen (Member) commented Sep 9, 2020

Merged to master. I have left the JIRA open though.

@LuciferYang (Author)

thx @srowen @xuanyuanking

srowen pushed a commit that referenced this pull request Sep 18, 2020
### What changes were proposed in this pull request?

After #29660 and #29689 there are 13 remaining failed cases in the sql/core module with Scala 2.13.

The reason for the remaining failures is that the optimization result of `CostBasedJoinReorder` may differ for the same input in Scala 2.12 and Scala 2.13 when there is more than one candidate plan with the same cost.

This PR makes the optimization result as deterministic as possible in order to pass all remaining failed cases of the `sql/core` module in Scala 2.13. The main changes are as follows:

- Use `LinkedHashMap` instead of `Map` to store `foundPlans` in the `JoinReorderDP.search` method, so that the iteration order follows the insertion order, because the iteration order of `Map` behaves differently under Scala 2.12 and 2.13 (see the sketch after this list)

- Fix `StarJoinCostBasedReorderSuite`, which is affected by the above change

- Regenerate golden files affected by the above change.
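
As a toy illustration of the iteration-order point above (my own sketch, not the Spark code):

```
import scala.collection.mutable

object MapOrderDemo extends App {
  val entries = Seq("planA" -> 1, "planB" -> 2, "planC" -> 3)

  // LinkedHashMap iterates in insertion order on both 2.12 and 2.13 ...
  val linked = mutable.LinkedHashMap(entries: _*)
  println(linked.keys.mkString(", ")) // always: planA, planB, planC

  // ... while HashMap's iteration order is an implementation detail that
  // changed between Scala 2.12 and 2.13, so equal-cost plans could be
  // visited, and therefore chosen, in a different order.
  val hashed = mutable.HashMap(entries: _*)
  println(hashed.keys.mkString(", ")) // order not guaranteed
}
```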

### Why are the changes needed?
We need to support a Scala 2.13 build.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

- Scala 2.12: Pass the Jenkins or GitHub Action

- Scala 2.13: All tests passed.

Do the following:

```
dev/change-scala-version.sh 2.13
mvn clean install -DskipTests  -pl sql/core -Pscala-2.13 -am
mvn test -pl sql/core -Pscala-2.13
```

**Before**
```
Tests: succeeded 8485, failed 13, canceled 1, ignored 52, pending 0
*** 13 TESTS FAILED ***

```

**After**

```
Tests: succeeded 8498, failed 0, canceled 1, ignored 52, pending 0
All tests passed.
```

Closes #29711 from LuciferYang/SPARK-32808-3.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
a0x8o added a commit to a0x8o/spark that referenced this pull request Sep 18, 2020
@LuciferYang deleted the SPARK-32808 branch June 6, 2022 03:44