
Conversation

@LuciferYang
Contributor

@LuciferYang LuciferYang commented Jun 9, 2023

What changes were proposed in this pull request?

TBD

Why are the changes needed?

TBD

Does this PR introduce any user-facing change?

No, just for testing.

How was this patch tested?

  • Pass GitHub Actions

@github-actions github-actions bot added the INFRA label Jun 9, 2023
@LuciferYang
Contributor Author

Wait for #41487.

run: |
# Fix for TTY related issues when launching the Ammonite REPL in tests.
export TERM=vt100 && script -qfc 'echo exit | amm -s' && rm typescript
# `set -e` to make the exit status behave as expected, since `script -q -e -c` is used to run the commands
Contributor Author


Another way is to add a new script, maybe named dev/run-connect-maven-tests, as follows:

#!/usr/bin/env bash

set -e


# Go to the Spark project root directory
FWDIR="$(cd "`dirname "$0"`"/..; pwd)"
cd "$FWDIR"
export SPARK_HOME=$FWDIR
echo "$SPARK_HOME"


if [[ -z "$JAVA_VERSION" ]]; then
  JAVA_VERSION=8
fi


export MAVEN_OPTS="-Xss64m -Xmx2g -XX:ReservedCodeCacheSize=1g -Dorg.slf4j.simpleLogger.defaultLogLevel=WARN"
export MAVEN_CLI_OPTS="--no-transfer-progress"


# 1. Test with -Phive
# It uses Maven's 'install' intentionally, see https://github.com/apache/spark/pull/26414.
build/mvn $MAVEN_CLI_OPTS -DskipTests -Djava.version=${JAVA_VERSION/-ea} install -Phive
build/mvn $MAVEN_CLI_OPTS -Djava.version=${JAVA_VERSION/-ea} test -pl connector/connect/client/jvm -Phive

# 2. Test without -Phive
build/mvn $MAVEN_CLI_OPTS -DskipTests -Djava.version=${JAVA_VERSION/-ea} install -pl assembly
build/mvn $MAVEN_CLI_OPTS -Djava.version=${JAVA_VERSION/-ea} test -pl connector/connect/client/jvm
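As an aside, here is a minimal sketch (the version values are made up) of the ${JAVA_VERSION/-ea} parameter expansion used in the Maven commands above; it strips an "-ea" early-access suffix so Maven sees a plain version number:

JAVA_VERSION="21-ea"
echo "${JAVA_VERSION/-ea}"   # prints "21"
JAVA_VERSION="8"
echo "${JAVA_VERSION/-ea}"   # prints "8" (a no-op when the suffix is absent)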

and the run: step can be changed to:

          # Fix for TTY related issues when launching the Ammonite REPL in tests.
          export TERM=vt100 && script -qfc 'echo exit | amm -s' && rm typescript
          export JAVA_VERSION=${{ matrix.java }}
          ./dev/run-connect-maven-tests
          TEST_RETCODE=$?
          rm -rf ~/.m2/repository/org/apache/spark
          exit $TEST_RETCODE

I tested it and it works, but since this script does not exist in branch-3.4, we would need to add "connect-maven" : "false" to build_branch34.yml when using the script.
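As a side note, here is a minimal standalone sketch (not from the PR; the cleanup path is hypothetical, and it assumes the shell is not running with set -e) of the exit-status pattern in the run step above. Capturing the test result before cleanup matters because rm -rf would otherwise overwrite $? and the job would report success even when the Maven tests failed:

false                               # stand-in for a failing Maven test command
TEST_RETCODE=$?                     # capture the real test status immediately
rm -rf /tmp/hypothetical-m2-cache   # cleanup always runs and resets $?
exit $TEST_RETCODE                  # propagate the original failure to CI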

@LuciferYang
Contributor Author

7c0409b rebase to test

@LuciferYang
Contributor Author

LuciferYang commented Jun 11, 2023

@HyukjinKwon @dongjoon-hyun Do you think this is necessary?

For the connect server module, testing can be added after resolving SPARK-43646

@LuciferYang LuciferYang requested review from HyukjinKwon and dongjoon-hyun and removed request for HyukjinKwon June 11, 2023 07:42
@LuciferYang
Contributor Author

friendly ping @HyukjinKwon @dongjoon-hyun

@LuciferYang
Contributor Author

also cc @hvanhovell, I would like to add a Maven test GA task for the connect modules, because I think sbt testing always masks some issues (due to the visibility of the classpath).
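For example (the sbt project name here is from memory and may be off), comparing the two builds' views of the test classpath shows the difference: sbt puts sibling modules' compiled classes on the classpath directly, while Maven resolves packaged, possibly shaded, jars from the local repository:

build/sbt "show connect-client-jvm/Test/fullClasspath"                  # sbt view
build/mvn -pl connector/connect/client/jvm dependency:build-classpath   # Maven view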

@HyukjinKwon
Member

I don't feel strongly, but I think we had better set up a regular job ... with a broader scope ... doing this for every PR is, I think, a bit of overhead.

@LuciferYang
Contributor Author

LuciferYang commented Jun 13, 2023

I don't feel strongly, but I think we had better set up a regular job ... with a broader scope ... doing this for every PR is, I think, a bit of overhead.

A daily job that tests all modules using Maven?

@LuciferYang LuciferYang marked this pull request as draft June 13, 2023 14:19
@LuciferYang
Contributor Author

Set to draft for now; it will be updated to add a daily Maven test later.

\"lint\" : \"true\",
\"k8s-integration-tests\" : \"true\",
\"breaking-changes-buf\" : \"true\",
\"maven-build\" : \"true\",
Contributor Author


Will change to false after testing.

- ${{ inputs.hadoop }}
hive:
- hive2.3
modules:
Member


Do you use `modules` in this PR?

Contributor Author


Yes

export JAVA_VERSION=${{ matrix.java }}
./build/mvn $MAVEN_CLI_OPTS -DskipTests -Pyarn -Pmesos -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Djava.version=${JAVA_VERSION/-ea} clean install
if [[ "$INCLUDED_TAGS" != "" ]]; then
./build/mvn $MAVEN_CLI_OPTS -pl "$MODULES_TO_TEST" -Pyarn -Pmesos -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Djava.version=${JAVA_VERSION/-ea} -Dtest.include.tags="$INCLUDED_TAGS" test
Member

@dongjoon-hyun dongjoon-hyun Jun 14, 2023


Hmm. Got it. It uses MODULES_TO_TEST. May I ask in what sense this is "for connect client module"?

Member


resource-managers/yarn, resource-managers/mesos, and resource-managers/kubernetes are irrelevant to this, aren't they?

Contributor Author

@LuciferYang LuciferYang Jun 15, 2023


Yes. The topic of this PR will be changed: I want to add a daily Maven test that covers all modules.

Due to the ongoing testing, the PR title and description have not been updated yet, which caused this misunderstanding. Sorry for any inconvenience, @dongjoon-hyun
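For reference, a minimal sketch of how the MODULES_TO_TEST / INCLUDED_TAGS pattern quoted above scopes a Maven run; the module path and tag name below are only examples:

MODULES_TO_TEST="connector/connect/client/jvm"          # example module selection
INCLUDED_TAGS="org.apache.spark.tags.ExtendedSQLTest"   # example test tag
./build/mvn --no-transfer-progress -pl "$MODULES_TO_TEST" \
  -Dtest.include.tags="$INCLUDED_TAGS" test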

@LuciferYang LuciferYang changed the title [SPARK-43988][INFRA] Add maven testing GitHub Action task for connect client module [SPARK-43988][INFRA] Add a daily maven testing GitHub Action job Jun 15, 2023
@LuciferYang
Contributor Author

LuciferYang commented Jun 15, 2023

ProductAggSuite in the catalyst module aborts when testing with Maven:

ProductAggSuite:
*** RUN ABORTED ***
  java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$
  at org.apache.spark.sql.catalyst.expressions.codegen.JavaCode$.variable(javaCode.scala:64)
  at org.apache.spark.sql.catalyst.expressions.codegen.JavaCode$.isNullVariable(javaCode.scala:77)
  at org.apache.spark.sql.catalyst.expressions.Expression.$anonfun$genCode$3(Expression.scala:200)
  at scala.Option.getOrElse(Option.scala:189)
  at org.apache.spark.sql.catalyst.expressions.Expression.genCode(Expression.scala:196)
  at org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.$anonfun$create$1(GenerateSafeProjection.scala:156)
  at scala.collection.immutable.List.map(List.scala:293)
  at org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.create(GenerateSafeProjection.scala:153)
  at org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.create(GenerateSafeProjection.scala:39)
  at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:1369)

Created SPARK-44064 to track this.
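A hypothetical local reproduction (the suite pattern is illustrative; Spark's Maven build selects individual suites via the scalatest plugin's wildcardSuites property):

build/mvn -pl sql/catalyst test -DwildcardSuites=ProductAggSuite -Dtest=none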

@LuciferYang
Contributor Author

ReplSuite in the repl module fails when testing with Maven:

ReplSuite:
Spark context available as 'sc' (master = local, app id = local-1686829049116).
Spark session available as 'spark'.
- SPARK-15236: use Hive catalog *** FAILED ***
  isContain was true Interpreter output contained 'Exception':
  Welcome to
        ____              __
       / __/__  ___ _____/ /__
      _\ \/ _ \/ _ `/ __/  '_/
     /___/ .__/\_,_/_/ /_/\_\   version 3.5.0-SNAPSHOT
        /_/

  Using Scala version 2.12.17 (OpenJDK 64-Bit Server VM, Java 1.8.0_372)
  Type in expressions to have them evaluated.
  Type :help for more information.

  scala>
  scala> java.lang.NoClassDefFoundError: org/sparkproject/guava/cache/CacheBuilder
    at org.apache.spark.sql.catalyst.catalog.SessionCatalog.<init>(SessionCatalog.scala:197)
    at org.apache.spark.sql.internal.BaseSessionStateBuilder.catalog$lzycompute(BaseSessionStateBuilder.scala:153)
    at org.apache.spark.sql.internal.BaseSessionStateBuilder.catalog(BaseSessionStateBuilder.scala:152)
    at org.apache.spark.sql.internal.BaseSessionStateBuilder.v2SessionCatalog$lzycompute(BaseSessionStateBuilder.scala:166)
    at org.apache.spark.sql.internal.BaseSessionStateBuilder.v2SessionCatalog(BaseSessionStateBuilder.scala:166)
    at org.apache.spark.sql.internal.BaseSessionStateBuilder.catalogManager$lzycompute(BaseSessionStateBuilder.scala:168)
    at org.apache.spark.sql.internal.BaseSessionStateBuilder.catalogManager(BaseSessionStateBuilder.scala:168)
    at org.apache.spark.sql.internal.BaseSessionStateBuilder$$anon$1.<init>(BaseSessionStateBuilder.scala:185)
    at org.apache.spark.sql.internal.BaseSessionStateBuilder.analyzer(BaseSessionStateBuilder.scala:185)
    at org.apache.spark.sql.internal.BaseSessionStateBuilder.$anonfun$build$2(BaseSessionStateBuilder.scala:373)
    at org.apache.spark.sql.internal.SessionState.analyzer$lzycompute(SessionState.scala:92)
    at org.apache.spark.sql.internal.SessionState.analyzer(SessionState.scala:92)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:76)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:202)
    at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:529)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:202)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
    at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:201)
    at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:76)
    at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:74)
    at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:66)
    at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
    at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:640)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:630)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:671)
    ... 94 elided
  Caused by: java.lang.ClassNotFoundException: org.sparkproject.guava.cache.CacheBuilder
    at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    ... 123 more

  scala>      |
  scala> :quit (ReplSuite.scala:83)
Spark context available as 'sc' (master = local, app id = local-1686829054261).
Spark session available as 'spark'.
- SPARK-15236: use in-memory catalog
Spark context available as 'sc' (master = local, app id = local-1686829056083).
Spark session available as 'spark'.
- broadcast vars
Spark context available as 'sc' (master = local, app id = local-1686829059606).
Spark session available as 'spark'.
- line wrapper only initialized once when used as encoder outer scope
Spark context available as 'sc' (master = local-cluster[1,1,1024], app id = app-20230615043742-0000).
Spark session available as 'spark'.

// Exiting paste mode, now interpreting.

- define case class and create Dataset together with paste mode *** FAILED ***
  isContain was true Interpreter output contained 'Exception':
  Welcome to
        ____              __
       / __/__  ___ _____/ /__
      _\ \/ _ \/ _ `/ __/  '_/
     /___/ .__/\_,_/_/ /_/\_\   version 3.5.0-SNAPSHOT
        /_/

  Using Scala version 2.12.17 (OpenJDK 64-Bit Server VM, Java 1.8.0_372)
  Type in expressions to have them evaluated.
  Type :help for more information.

  scala> // Entering paste mode (ctrl-D to finish)

  java.lang.NoClassDefFoundError: org/sparkproject/guava/util/concurrent/AtomicLongMap
    at org.apache.spark.sql.catalyst.rules.QueryExecutionMetering.<init>(QueryExecutionMetering.scala:27)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$.<init>(RuleExecutor.scala:31)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$.<clinit>(RuleExecutor.scala)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:192)
    at org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.$anonfun$canonicalize$1(GenerateUnsafeProjection.scala:319)
    at scala.collection.immutable.List.map(List.scala:293)
    at org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.canonicalize(GenerateUnsafeProjection.scala:319)
    at org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.generate(GenerateUnsafeProjection.scala:327)
    at org.apache.spark.sql.catalyst.expressions.UnsafeProjection$.createCodeGeneratedObject(Projection.scala:124)
    at org.apache.spark.sql.catalyst.expressions.UnsafeProjection$.createCodeGeneratedObject(Projection.scala:120)
    at org.apache.spark.sql.catalyst.expressions.CodeGeneratorWithInterpretedFallback.createObject(CodeGeneratorWithInterpretedFallback.scala:51)
    at org.apache.spark.sql.catalyst.expressions.UnsafeProjection$.create(Projection.scala:151)
    at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$Serializer.apply(ExpressionEncoder.scala:198)
    at org.apache.spark.sql.SparkSession.$anonfun$createDataset$1(SparkSession.scala:483)
  ...
- SPARK-2576 importing implicits
- Datasets and encoders *** FAILED ***
  isContain was true Interpreter output contained 'error:':

  scala> import org.apache.spark.sql.functions._

  scala> import org.apache.spark.sql.{Encoder, Encoders}

  scala> import org.apache.spark.sql.expressions.Aggregator

  scala> import org.apache.spark.sql.TypedColumn

  scala>      |      |      |      |      |      |      | simpleSum: org.apache.spark.sql.TypedColumn[Int,Int] = $anon$1(boundreference() AS value, value, unresolveddeserializer(assertnotnull(upcast(getcolumnbyordinal(0, IntegerType), IntegerType, - root class: "int")), value#9), boundreference() AS value)

  scala>
  scala> java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.catalyst.rules.RuleExecutor$
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:192)
    at org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.$anonfun$canonicalize$1(GenerateUnsafeProjection.scala:319)
    at scala.collection.immutable.List.map(List.scala:293)
    at org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.canonicalize(GenerateUnsafeProjection.scala:319)
    at org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.generate(GenerateUnsafeProjection.scala:327)
    at org.apache.spark.sql.catalyst.expressions.UnsafeProjection$.createCodeGeneratedObject(Projection.scala:124)
    at org.apache.spark.sql.catalyst.expressions.UnsafeProjection$.createCodeGeneratedObject(Projection.scala:120)
    at org.apache.spark.sql.catalyst.expressions.CodeGeneratorWithInterpretedFallback.createObject(CodeGeneratorWithInterpretedFallback.scala:51)
    at org.apache.spark.sql.catalyst.expressions.UnsafeProjection$.create(Projection.scala:151)
    at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$Serializer.apply(ExpressionEncoder.scala:198)
    at org.apache.spark.sql.SparkSession.$anonfun$createDataset$1(SparkSession.scala:483)
    at scala.collection.immutable.List.map(List.scala:293)
    at org.apache.spark.sql.SparkSession.createDataset(SparkSession.scala:483)
    at org.apache.spark.sql.SQLContext.createDataset(SQLContext.scala:354)
    at org.apache.spark.sql.SQLImplicits.localSeqToDatasetHolder(SQLImplicits.scala:244)
    ... 39 elided

  scala> <console>:33: error: not found: value ds
         ds.select(simpleSum).collect
         ^

  scala>      | _result_1686829100269: Int = 1

  scala> (SingletonReplSuite.scala:106)

Created SPARK-44069 to track this.

@github-actions github-actions bot added the CORE label Jun 15, 2023
@github-actions github-actions bot removed the CORE label Jun 28, 2023