Conversation

@rdblue rdblue commented May 26, 2020

This replaces `gradle-consistent-versions` with `nebula.dependency-recommender` and `nebula.dependency-lock`. The purpose of this change is to enable having separate Spark 2.x and Spark 3.x modules in the build. An empty `spark3` project is included.

The dependency recommender plugin is used to get versions from `versions.props` or the per-project dependency lock files.
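For illustration, a `versions.props` file in this style is a flat list of module coordinates mapped to recommended versions. The coordinates and version numbers below are made-up examples, not the project's actual pins:

```properties
# Each line recommends a version for a group:artifact pair
com.google.guava:guava = 28.0-jre
org.apache.avro:avro = 1.9.2
org.slf4j:slf4j-api = 1.7.25
```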

The dependency lock plugin is used to lock versions. Locks are now generated in the `build/` folders using `./gradlew generateLock`, and used in the build after running `./gradlew saveLock`. The JSON lock files are added in this commit. These are large because they include transitive information and a set of dependencies per configuration.
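The locking workflow described above boils down to two Gradle invocations, run from the repository root (a sketch of the intended usage; `generateLock` only writes under `build/`, and nothing takes effect until `saveLock` copies the locks into place):

```shell
# Generate fresh lock state under each project's build/ directory
./gradlew generateLock

# Copy the generated locks into the project directories so
# subsequent builds resolve against them
./gradlew saveLock
```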

This also fixes a problem where JMH dependencies needed to be declared in the `compile` configuration. Now JMH dependencies can use the `jmh` configuration.
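As a hypothetical sketch (the JMH version and artifact list are illustrative, not taken from this PR), a benchmark-bearing module's `build.gradle` can now declare its benchmark dependencies like this:

```groovy
// Hypothetical module build.gradle: before this change, these had to
// leak into the compile configuration; now they can live in jmh.
dependencies {
  jmh 'org.openjdk.jmh:jmh-core:1.21'
  jmh 'org.openjdk.jmh:jmh-generator-annprocess:1.21'
}
```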

rdblue commented May 26, 2020

@jerryshao, you may be interested in this change because it allows multiple Spark versions in the build. @massdosage, this should help fix your dependency issues as well.

@mccheah, you may be interested in reviewing this because it updates how dependencies are locked.

@rdblue rdblue force-pushed the use-nebula-version-plugins branch from 8ce3ee9 to 4ebb73f on May 26, 2020 23:48
@massdosage

OK, this does indeed look interesting. So we should be able to use this to specify a different version of Guava for a certain subproject? We'll give it a go on a branch we're working on to demonstrate the Guava version issue for Hive tests. Thanks for pointing this out.

rdblue commented May 27, 2020

Looks like Jackson 2.10.2 breaks Spark 2.4.4:

org.apache.iceberg.spark.TestSparkDataFile > testValueConversion FAILED
    java.lang.ExceptionInInitializerError
        at org.apache.spark.SparkContext.withScope(SparkContext.scala:699)
        at org.apache.spark.SparkContext.parallelize(SparkContext.scala:716)
        at org.apache.spark.api.java.JavaSparkContext.parallelize(JavaSparkContext.scala:134)
        at org.apache.spark.api.java.JavaSparkContext.parallelize(JavaSparkContext.scala:146)
        at org.apache.iceberg.spark.TestSparkDataFile.checkSparkDataFile(TestSparkDataFile.java:149)
        at org.apache.iceberg.spark.TestSparkDataFile.testValueConversion(TestSparkDataFile.java:130)
        Caused by:
        com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.10.2
            at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:64)
            at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
            at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:808)
            at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:82)
            at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
            ... 6 more

I don't think that this plugin locks transitive dependencies by default, so we will probably need to turn that on to fix this.
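Turning on transitive locking would be a small configuration change. A sketch, assuming the Nebula lock plugin's `dependencyLock` extension and its `includeTransitives` flag (check the plugin's documentation for the exact property name and default):

```groovy
// Root build.gradle sketch: record transitive dependencies in the
// lock files as well, so a pinned Jackson version also pins what
// Spark pulls in transitively.
allprojects {
  dependencyLock {
    includeTransitives = true
  }
}
```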

@rdblue rdblue closed this May 27, 2020
@rdblue rdblue reopened this May 27, 2020

@jzhuge jzhuge left a comment

+1 LGTM
We have used Nebula in our internal fork at Netflix for more than six months.

@massdosage

I tried this out in a branch of ours that demonstrates the version conflicts we were getting with Hive and Guava (https://github.com/ExpediaGroup/iceberg/tree/add-hiverunner-test), and it works!

@rdblue rdblue requested a review from danielcweeks June 1, 2020 21:05

@aokolnychyi aokolnychyi left a comment

+1

@aokolnychyi aokolnychyi merged commit 65ef122 into apache:master Jun 2, 2020

rdblue commented Jun 2, 2020

Thanks for reviewing, everyone! Good to have this in to unblock the Hive and Spark 3 work!

@rdblue rdblue mentioned this pull request Jun 2, 2020
cmathiesen pushed a commit to ExpediaGroup/iceberg that referenced this pull request Aug 19, 2020