Upgrade to spark 3.2 #416
Conversation
<!-- https://mvnrepository.com/artifact/org.scala-lang/scala-reflect -->
<dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-reflect</artifactId>
    <version>${scala.version}</version>
</dependency>
This is to fix the following error, which happened on my local PC:
*** RUN ABORTED *** (1 minute, 15 seconds)
java.lang.VerifyError: class scala.tools.nsc.reporters.Reporter overrides final method echo.(Ljava/lang/String;)V
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at com.amazon.deequ.suggestions.ConstraintSuggestionRunnerTest$.verificationFnFromConstraintSrc(ConstraintSuggestionRunnerTest.scala:291)
at com.amazon.deequ.suggestions.ConstraintSuggestionRunnerTest.suggestHasDataTypeConstraintVerifyTest(ConstraintSuggestionRunnerTest.scala:259)
at com.amazon.deequ.suggestions.ConstraintSuggestionRunnerTest.$anonfun$new$22(ConstraintSuggestionRunnerTest.scala:221)
at com.amazon.deequ.suggestions.ConstraintSuggestionRunnerTest.$anonfun$new$22$adapted(ConstraintSuggestionRunnerTest.scala:215)
at com.amazon.deequ.SparkContextSpec.withSparkSession(SparkContextSpec.scala:33)
at com.amazon.deequ.SparkContextSpec.withSparkSession$(SparkContextSpec.scala:30)
at com.amazon.deequ.suggestions.ConstraintSuggestionRunnerTest.withSparkSession(ConstraintSuggestionRunnerTest.scala:36)
at com.amazon.deequ.suggestions.ConstraintSuggestionRunnerTest.$anonfun$new$21(ConstraintSuggestionRunnerTest.scala:215)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
at org.scalatest.Transformer.apply(Transformer.scala:22)
at org.scalatest.Transformer.apply(Transformer.scala:20)
at org.scalatest.wordspec.AnyWordSpecLike$$anon$3.apply(AnyWordSpecLike.scala:1076)
at org.scalatest.TestSuite.withFixture(TestSuite.scala:196)
at org.scalatest.TestSuite.withFixture$(TestSuite.scala:195)
at org.scalatest.wordspec.AnyWordSpec.withFixture(AnyWordSpec.scala:1879)
at org.scalatest.wordspec.AnyWordSpecLike.invokeWithFixture$1(AnyWordSpecLike.scala:1074)
at org.scalatest.wordspec.AnyWordSpecLike.$anonfun$runTest$1(AnyWordSpecLike.scala:1086)
at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
at org.scalatest.wordspec.AnyWordSpecLike.runTest(AnyWordSpecLike.scala:1086)
at org.scalatest.wordspec.AnyWordSpecLike.runTest$(AnyWordSpecLike.scala:1068)
at org.scalatest.wordspec.AnyWordSpec.runTest(AnyWordSpec.scala:1879)
at org.scalatest.wordspec.AnyWordSpecLike.$anonfun$runTests$1(AnyWordSpecLike.scala:1145)
at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:413)
at scala.collection.immutable.List.foreach(List.scala:392)
at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:390)
at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:427)
at scala.collection.immutable.List.foreach(List.scala:392)
at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:396)
at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:475)
at org.scalatest.wordspec.AnyWordSpecLike.runTests(AnyWordSpecLike.scala:1145)
at org.scalatest.wordspec.AnyWordSpecLike.runTests$(AnyWordSpecLike.scala:1144)
at org.scalatest.wordspec.AnyWordSpec.runTests(AnyWordSpec.scala:1879)
at org.scalatest.Suite.run(Suite.scala:1112)
at org.scalatest.Suite.run$(Suite.scala:1094)
at org.scalatest.wordspec.AnyWordSpec.org$scalatest$wordspec$AnyWordSpecLike$$super$run(AnyWordSpec.scala:1879)
at org.scalatest.wordspec.AnyWordSpecLike.$anonfun$run$1(AnyWordSpecLike.scala:1190)
at org.scalatest.SuperEngine.runImpl(Engine.scala:535)
at org.scalatest.wordspec.AnyWordSpecLike.run(AnyWordSpecLike.scala:1190)
at org.scalatest.wordspec.AnyWordSpecLike.run$(AnyWordSpecLike.scala:1188)
at org.scalatest.wordspec.AnyWordSpec.run(AnyWordSpec.scala:1879)
at org.scalatest.Suite.callExecuteOnSuite$1(Suite.scala:1175)
at org.scalatest.Suite.$anonfun$runNestedSuites$1(Suite.scala:1222)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
at org.scalatest.Suite.runNestedSuites(Suite.scala:1220)
at org.scalatest.Suite.runNestedSuites$(Suite.scala:1154)
at org.scalatest.tools.DiscoverySuite.runNestedSuites(DiscoverySuite.scala:30)
at org.scalatest.Suite.run(Suite.scala:1109)
at org.scalatest.Suite.run$(Suite.scala:1094)
at org.scalatest.tools.DiscoverySuite.run(DiscoverySuite.scala:30)
at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:45)
at org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$13(Runner.scala:1320)
at org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$13$adapted(Runner.scala:1314)
at scala.collection.immutable.List.foreach(List.scala:392)
at org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:1314)
at org.scalatest.tools.Runner$.$anonfun$runOptionallyWithPassFailReporter$24(Runner.scala:993)
at org.scalatest.tools.Runner$.$anonfun$runOptionallyWithPassFailReporter$24$adapted(Runner.scala:971)
at org.scalatest.tools.Runner$.withClassLoaderAndDispatchReporter(Runner.scala:1480)
at org.scalatest.tools.Runner$.runOptionallyWithPassFailReporter(Runner.scala:971)
at org.scalatest.tools.Runner$.main(Runner.scala:775)
at org.scalatest.tools.Runner.main(Runner.scala)
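For context, the failing test compiles suggested constraint code at runtime (hence scala.tools.nsc on the stack), so a scala-reflect artifact that does not match the scala-compiler on the classpath surfaces as this VerifyError. A minimal sketch of that runtime-compilation pattern, purely illustrative and not the test's exact code:

```scala
import scala.reflect.runtime.universe
import scala.tools.reflect.ToolBox

object ToolboxSketch {
  def main(args: Array[String]): Unit = {
    // Runtime compilation goes through scala-compiler and scala-reflect; if the two
    // resolve to different Scala patch versions, class loading can fail with a
    // VerifyError like the one above. Pinning scala-reflect to ${scala.version}
    // keeps them aligned.
    val toolbox = universe.runtimeMirror(getClass.getClassLoader).mkToolBox()

    // Parse and evaluate a tiny expression, analogous to compiling a suggested
    // constraint from its source string.
    val result = toolbox.eval(toolbox.parse("1 + 1"))
    println(result) // 2
  }
}
```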
 * Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 *
 */
Formatted using scalafmt to pass scalastyle check.
override protected def withNewChildrenInternal(newLeft: Expression,
    newRight: Expression): StatefulCorrelation =
  new StatefulCorrelation(newLeft, newRight, nullOnDivideByZero)
This is to fix a failure in baseCheck.hasCorrelation:
https://github.com/awslabs/deequ/blob/master/src/test/scala/com/amazon/deequ/checks/CheckTest.scala#L577
If we do not provide this withNewChildrenInternal override, Corr's implementation is used instead, which leads to incorrect results.
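For reference, a minimal sketch of a correlation check in the spirit of that test; the column names and assertion here are placeholders rather than the exact values from CheckTest:

```scala
import com.amazon.deequ.{VerificationResult, VerificationSuite}
import com.amazon.deequ.checks.{Check, CheckLevel}
import org.apache.spark.sql.DataFrame

// Without the withNewChildrenInternal override above, Spark 3.2's tree rewrites
// fall back to Corr's copy logic and the correlation metric comes out wrong.
def verifyCorrelation(data: DataFrame): VerificationResult = {
  val correlationCheck = Check(CheckLevel.Error, "correlation check")
    .hasCorrelation("att1", "att2", _ == 1.0)

  VerificationSuite()
    .onData(data)
    .addCheck(correlationCheck)
    .run()
}
```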
.appName("test")
.config("spark.ui.enabled", "false")
.config("spark.sql.shuffle.partitions", 2.toString)
.config("spark.sql.adaptive.enabled", value = false)
This is to fix failing tests that verify the submitted job count from SparkSessionStats.
Since Spark 3.2, adaptive query execution (AQE) is enabled by default, which considerably changes the plan Spark creates to execute tasks. Here we disable adaptive query execution so that Spark behaves the same as in 3.1.
This fixes failing tests such as:
https://github.com/awslabs/deequ/blob/master/src/test/scala/com/amazon/deequ/profiles/ColumnProfilerRunnerTest.scala#L63
(There were several failing tests like this one; the actual number of submitted jobs is 5 when AQE is enabled and 3 when AQE is disabled.)
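For illustration, one way to observe the submitted job count is a SparkListener; this is only a sketch of the idea and not necessarily how SparkSessionStats is implemented in the tests:

```scala
import java.util.concurrent.atomic.AtomicInteger

import org.apache.spark.scheduler.{SparkListener, SparkListenerJobStart}
import org.apache.spark.sql.SparkSession

// Counts jobs submitted to the session. With AQE enabled (the Spark 3.2 default)
// the same query can submit more jobs than on 3.1, which breaks fixed expected
// counts; disabling AQE restores the 3.1-like behavior.
object JobCountSketch {
  def main(args: Array[String]): Unit = {
    val jobCount = new AtomicInteger(0)

    val session = SparkSession.builder()
      .master("local")
      .appName("job-count-sketch")
      .config("spark.sql.adaptive.enabled", value = false)
      .getOrCreate()

    session.sparkContext.addSparkListener(new SparkListener {
      override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
        jobCount.incrementAndGet()
      }
    })

    // Run a small aggregation and report how many jobs it triggered.
    session.range(0, 100).groupBy().count().collect()
    println(s"jobs submitted: ${jobCount.get()}")

    session.stop()
  }
}
```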
The plan as explained on the master branch (Spark 3.1):
== Parsed Logical Plan ==
'Aggregate [sum(cast(isnotnull('item) as int)) AS sum(CAST((item IS NOT NULL) AS INT))#915, count(1) AS count(1)#916L, stateful_approx_count_distinct('item, 0, 0) AS stateful_approx_count_distinct(item)#1021, statefuldatatype('item, org.apache.spark.sql.StatefulDataType@4ba1c1a2, 0, 0) AS statefuldatatype(item)#1032, sum(cast(isnotnull('att1) as int)) AS sum(CAST((att1 IS NOT NULL) AS INT))#1033, count(1) AS count(1)#1034L, stateful_approx_count_distinct('att1, 0, 0) AS stateful_approx_count_distinct(att1)#1139, sum(cast(isnotnull('att2) as int)) AS sum(CAST((att2 IS NOT NULL) AS INT))#1140, count(1) AS count(1)#1141L, stateful_approx_count_distinct('att2, 0, 0) AS stateful_approx_count_distinct(att2)#1246, sum(cast(isnotnull('att3) as int)) AS sum(CAST((att3 IS NOT NULL) AS INT))#1247, count(1) AS count(1)#1248L, stateful_approx_count_distinct('att3, 0, 0) AS stateful_approx_count_distinct(att3)#1353, count(1) AS count(1)#1354L]
+- Project [_1#4 AS item#13, _2#5 AS att1#14, _3#6 AS att2#15, _4#7 AS att3#16]
+- LocalRelation [_1#4, _2#5, _3#6, _4#7]
== Analyzed Logical Plan ==
sum(CAST((item IS NOT NULL) AS INT)): bigint, count(1): bigint, stateful_approx_count_distinct(item): binary, statefuldatatype(item): binary, sum(CAST((att1 IS NOT NULL) AS INT)): bigint, count(1): bigint, stateful_approx_count_distinct(att1): binary, sum(CAST((att2 IS NOT NULL) AS INT)): bigint, count(1): bigint, stateful_approx_count_distinct(att2): binary, sum(CAST((att3 IS NOT NULL) AS INT)): bigint, count(1): bigint, stateful_approx_count_distinct(att3): binary, count(1): bigint
Aggregate [sum(cast(cast(isnotnull(item#13) as int) as bigint)) AS sum(CAST((item IS NOT NULL) AS INT))#915L, count(1) AS count(1)#916L, stateful_approx_count_distinct(item#13, 0, 0) AS stateful_approx_count_distinct(item)#1021, statefuldatatype(item#13, org.apache.spark.sql.StatefulDataType@4ba1c1a2, 0, 0) AS statefuldatatype(item)#1032, sum(cast(cast(isnotnull(att1#14) as int) as bigint)) AS sum(CAST((att1 IS NOT NULL) AS INT))#1033L, count(1) AS count(1)#1034L, stateful_approx_count_distinct(att1#14, 0, 0) AS stateful_approx_count_distinct(att1)#1139, sum(cast(cast(isnotnull(att2#15) as int) as bigint)) AS sum(CAST((att2 IS NOT NULL) AS INT))#1140L, count(1) AS count(1)#1141L, stateful_approx_count_distinct(att2#15, 0, 0) AS stateful_approx_count_distinct(att2)#1246, sum(cast(cast(isnotnull(att3#16) as int) as bigint)) AS sum(CAST((att3 IS NOT NULL) AS INT))#1247L, count(1) AS count(1)#1248L, stateful_approx_count_distinct(att3#16, 0, 0) AS stateful_approx_count_distinct(att3)#1353, count(1) AS count(1)#1354L]
+- Project [_1#4 AS item#13, _2#5 AS att1#14, _3#6 AS att2#15, _4#7 AS att3#16]
+- LocalRelation [_1#4, _2#5, _3#6, _4#7]
== Optimized Logical Plan ==
Aggregate [sum(cast(cast(isnotnull(item#13) as int) as bigint)) AS sum(CAST((item IS NOT NULL) AS INT))#915L, count(1) AS count(1)#916L, stateful_approx_count_distinct(item#13, 0, 0) AS stateful_approx_count_distinct(item)#1021, statefuldatatype(item#13, org.apache.spark.sql.StatefulDataType@4ba1c1a2, 0, 0) AS statefuldatatype(item)#1032, sum(1) AS sum(CAST((att1 IS NOT NULL) AS INT))#1033L, count(1) AS count(1)#1034L, stateful_approx_count_distinct(att1#14, 0, 0) AS stateful_approx_count_distinct(att1)#1139, sum(1) AS sum(CAST((att2 IS NOT NULL) AS INT))#1140L, count(1) AS count(1)#1141L, stateful_approx_count_distinct(att2#15, 0, 0) AS stateful_approx_count_distinct(att2)#1246, sum(1) AS sum(CAST((att3 IS NOT NULL) AS INT))#1247L, count(1) AS count(1)#1248L, stateful_approx_count_distinct(att3#16, 0, 0) AS stateful_approx_count_distinct(att3)#1353, count(1) AS count(1)#1354L]
+- LocalRelation [item#13, att1#14, att2#15, att3#16]
== Physical Plan ==
HashAggregate(keys=[], functions=[sum(cast(cast(isnotnull(item#13) as int) as bigint)), count(1), stateful_approx_count_distinct(item#13, 0, 0), statefuldatatype(item#13, org.apache.spark.sql.StatefulDataType@4ba1c1a2, 0, 0), sum(1), stateful_approx_count_distinct(att1#14, 0, 0), stateful_approx_count_distinct(att2#15, 0, 0), stateful_approx_count_distinct(att3#16, 0, 0)], output=[sum(CAST((item IS NOT NULL) AS INT))#915L, count(1)#916L, stateful_approx_count_distinct(item)#1021, statefuldatatype(item)#1032, sum(CAST((att1 IS NOT NULL) AS INT))#1033L, count(1)#1034L, stateful_approx_count_distinct(att1)#1139, sum(CAST((att2 IS NOT NULL) AS INT))#1140L, count(1)#1141L, stateful_approx_count_distinct(att2)#1246, sum(CAST((att3 IS NOT NULL) AS INT))#1247L, count(1)#1248L, stateful_approx_count_distinct(att3)#1353, count(1)#1354L])
+- Exchange SinglePartition, ENSURE_REQUIREMENTS, [id=#10]
+- HashAggregate(keys=[], functions=[partial_sum(cast(cast(isnotnull(item#13) as int) as bigint)), partial_count(1), partial_stateful_approx_count_distinct(item#13, 0, 0), partial_statefuldatatype(item#13, org.apache.spark.sql.StatefulDataType@4ba1c1a2, 0, 0), partial_sum(1), partial_stateful_approx_count_distinct(att1#14, 0, 0), partial_stateful_approx_count_distinct(att2#15, 0, 0), partial_stateful_approx_count_distinct(att3#16, 0, 0)], output=[sum#2224L, count#2225L, MS[0]#1407L, MS[1]#1408L, MS[2]#1409L, MS[3]#1410L, MS[4]#1411L, MS[5]#1412L, MS[6]#1413L, MS[7]#1414L, MS[8]#1415L, MS[9]#1416L, MS[10]#1417L, MS[11]#1418L, MS[12]#1419L, MS[13]#1420L, MS[14]#1421L, MS[15]#1422L, MS[16]#1423L, MS[17]#1424L, MS[18]#1425L, MS[19]#1426L, MS[20]#1427L, MS[21]#1428L, ... 192 more fields])
+- LocalTableScan [item#13, att1#14, att2#15, att3#16]
The plan as explained on this branch (Spark 3.2.1):
== Parsed Logical Plan ==
'Aggregate [sum(cast(isnotnull('item) as int)) AS sum(CAST((item IS NOT NULL) AS INT))#915, count(1) AS count(1)#916L, stateful_approx_count_distinct('item, 0, 0) AS stateful_approx_count_distinct(item)#1021, statefuldatatype('item, org.apache.spark.sql.StatefulDataType@253b1cbd, 0, 0, None) AS statefuldatatype(item)#1032, sum(cast(isnotnull('att1) as int)) AS sum(CAST((att1 IS NOT NULL) AS INT))#1033, count(1) AS count(1)#1034L, stateful_approx_count_distinct('att1, 0, 0) AS stateful_approx_count_distinct(att1)#1139, sum(cast(isnotnull('att2) as int)) AS sum(CAST((att2 IS NOT NULL) AS INT))#1140, count(1) AS count(1)#1141L, stateful_approx_count_distinct('att2, 0, 0) AS stateful_approx_count_distinct(att2)#1246, sum(cast(isnotnull('att3) as int)) AS sum(CAST((att3 IS NOT NULL) AS INT))#1247, count(1) AS count(1)#1248L, stateful_approx_count_distinct('att3, 0, 0) AS stateful_approx_count_distinct(att3)#1353, count(1) AS count(1)#1354L]
+- Project [_1#4 AS item#13, _2#5 AS att1#14, _3#6 AS att2#15, _4#7 AS att3#16]
+- LocalRelation [_1#4, _2#5, _3#6, _4#7]
== Analyzed Logical Plan ==
sum(CAST((item IS NOT NULL) AS INT)): bigint, count(1): bigint, stateful_approx_count_distinct(item): binary, statefuldatatype(item): binary, sum(CAST((att1 IS NOT NULL) AS INT)): bigint, count(1): bigint, stateful_approx_count_distinct(att1): binary, sum(CAST((att2 IS NOT NULL) AS INT)): bigint, count(1): bigint, stateful_approx_count_distinct(att2): binary, sum(CAST((att3 IS NOT NULL) AS INT)): bigint, count(1): bigint, stateful_approx_count_distinct(att3): binary, count(1): bigint
Aggregate [sum(cast(isnotnull(item#13) as int)) AS sum(CAST((item IS NOT NULL) AS INT))#915L, count(1) AS count(1)#916L, stateful_approx_count_distinct(item#13, 0, 0) AS stateful_approx_count_distinct(item)#1021, statefuldatatype(item#13, org.apache.spark.sql.StatefulDataType@253b1cbd, 0, 0, None) AS statefuldatatype(item)#1032, sum(cast(isnotnull(att1#14) as int)) AS sum(CAST((att1 IS NOT NULL) AS INT))#1033L, count(1) AS count(1)#1034L, stateful_approx_count_distinct(att1#14, 0, 0) AS stateful_approx_count_distinct(att1)#1139, sum(cast(isnotnull(att2#15) as int)) AS sum(CAST((att2 IS NOT NULL) AS INT))#1140L, count(1) AS count(1)#1141L, stateful_approx_count_distinct(att2#15, 0, 0) AS stateful_approx_count_distinct(att2)#1246, sum(cast(isnotnull(att3#16) as int)) AS sum(CAST((att3 IS NOT NULL) AS INT))#1247L, count(1) AS count(1)#1248L, stateful_approx_count_distinct(att3#16, 0, 0) AS stateful_approx_count_distinct(att3)#1353, count(1) AS count(1)#1354L]
+- Project [_1#4 AS item#13, _2#5 AS att1#14, _3#6 AS att2#15, _4#7 AS att3#16]
+- LocalRelation [_1#4, _2#5, _3#6, _4#7]
== Optimized Logical Plan ==
Aggregate [sum(cast(isnotnull(item#13) as int)) AS sum(CAST((item IS NOT NULL) AS INT))#915L, count(1) AS count(1)#916L, stateful_approx_count_distinct(item#13, 0, 0) AS stateful_approx_count_distinct(item)#1021, statefuldatatype(item#13, org.apache.spark.sql.StatefulDataType@253b1cbd, 0, 0, None) AS statefuldatatype(item)#1032, sum(1) AS sum(CAST((att1 IS NOT NULL) AS INT))#1033L, count(1) AS count(1)#1034L, stateful_approx_count_distinct(att1#14, 0, 0) AS stateful_approx_count_distinct(att1)#1139, sum(1) AS sum(CAST((att2 IS NOT NULL) AS INT))#1140L, count(1) AS count(1)#1141L, stateful_approx_count_distinct(att2#15, 0, 0) AS stateful_approx_count_distinct(att2)#1246, sum(1) AS sum(CAST((att3 IS NOT NULL) AS INT))#1247L, count(1) AS count(1)#1248L, stateful_approx_count_distinct(att3#16, 0, 0) AS stateful_approx_count_distinct(att3)#1353, count(1) AS count(1)#1354L]
+- LocalRelation [item#13, att1#14, att2#15, att3#16]
== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- HashAggregate(keys=[], functions=[sum(cast(isnotnull(item#13) as int)), count(1), stateful_approx_count_distinct(item#13, 0, 0), statefuldatatype(item#13, org.apache.spark.sql.StatefulDataType@253b1cbd, 0, 0, None), sum(1), stateful_approx_count_distinct(att1#14, 0, 0), stateful_approx_count_distinct(att2#15, 0, 0), stateful_approx_count_distinct(att3#16, 0, 0)], output=[sum(CAST((item IS NOT NULL) AS INT))#915L, count(1)#916L, stateful_approx_count_distinct(item)#1021, statefuldatatype(item)#1032, sum(CAST((att1 IS NOT NULL) AS INT))#1033L, count(1)#1034L, stateful_approx_count_distinct(att1)#1139, sum(CAST((att2 IS NOT NULL) AS INT))#1140L, count(1)#1141L, stateful_approx_count_distinct(att2)#1246, sum(CAST((att3 IS NOT NULL) AS INT))#1247L, count(1)#1248L, stateful_approx_count_distinct(att3)#1353, count(1)#1354L])
+- Exchange SinglePartition, ENSURE_REQUIREMENTS, [id=#11]
+- HashAggregate(keys=[], functions=[partial_sum(cast(isnotnull(item#13) as int)), partial_count(1), partial_stateful_approx_count_distinct(item#13, 0, 0), partial_statefuldatatype(item#13, org.apache.spark.sql.StatefulDataType@253b1cbd, 0, 0, None), partial_sum(1), partial_stateful_approx_count_distinct(att1#14, 0, 0), partial_stateful_approx_count_distinct(att2#15, 0, 0), partial_stateful_approx_count_distinct(att3#16, 0, 0)], output=[sum#2224L, count#2225L, MS[0]#1407L, MS[1]#1408L, MS[2]#1409L, MS[3]#1410L, MS[4]#1411L, MS[5]#1412L, MS[6]#1413L, MS[7]#1414L, MS[8]#1415L, MS[9]#1416L, MS[10]#1417L, MS[11]#1418L, MS[12]#1419L, MS[13]#1420L, MS[14]#1421L, MS[15]#1422L, MS[16]#1423L, MS[17]#1424L, MS[18]#1425L, MS[19]#1426L, MS[20]#1427L, MS[21]#1428L, ... 192 more fields])
+- LocalTableScan [item#13, att1#14, att2#15, att3#16]
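Plans like the two above can be reproduced with Dataset.explain; a minimal sketch, where df stands in for the aggregation DataFrame that deequ builds internally:

```scala
import org.apache.spark.sql.DataFrame

// Prints the parsed, analyzed, and optimized logical plans plus the physical plan,
// i.e. the output shown above. On Spark 3.2 with AQE enabled, the physical plan is
// wrapped in an AdaptiveSparkPlan node.
def printPlans(df: DataFrame): Unit = {
  df.explain(extended = true)
}
```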
* Use spark 3.2.1 and fix hasCorrelation Check fail
* Fix scalastyle fail
* Disable spark.sql.adaptive.enabled

Co-authored-by: tan.vu <tan.vu@linecorp.com>
Commits included in this branch:
* Fix documentation travis-ci.org to travis-ci.com link
* Update anomaly_detection_example.md: there is a 60 multiplication missing in the example (1000 ms = 1 s, 60 s = 1 min, 60 min = 1 h, 24 h = 1 d)
* add constructor option to sort suggested categories
* update to spark3.1: AFAIU the only requirement is the update for <apache/spark#29983>. In order to be consistent with the previous behavior and pass the existing test suite, this PR is essentially equivalent to setting `spark.sql.legacy.statisticalAggregate` to `true`. Now the code is incompatible with spark-2.x or spark-3.0, so I'd recommend only supporting Spark 3.1 and higher and Scala 2.12 from now on.
* Update README.md
* Update README.md
* fix pattern match hashcode bug
* restore empty line
* fix style
* change version number for release 2.0.0-spark-3.1
* update pom.xml and some analyzers to compile with spark 3.2.0 - tests failing
* Upgrade to spark 3.2 (awslabs#416): use spark 3.2.1 and fix hasCorrelation Check fail, fix scalastyle fail, disable spark.sql.adaptive.enabled (Co-authored-by: tan.vu <tan.vu@linecorp.com>)
* devcontainer
* Referential Integrity check and test, with Data Synchronization Check and Test
* remove .DS_Store files
* Cleaner versions of Referential Integrity and Data Synchronization checks and tests.
* save
* save
* Newest version of my three checks
* Version for code review, for all of my checks
* Final code review
* Pull request version of my code
* Pull request version of my code
* Final Version Pull Request
* remove .DS_Store files
* Duplicate .DS_Store banished!
* Removing
* Removings
* Delete DS_Stores
Issue #, if available:
#380
Description of changes:
- Fix a scala.reflect-related error.
- mvn clean build has been executed successfully on my local PC.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.