[HUDI-6212] Hudi Spark 3.0.x integration #8714
trait HoodieSpark3CatalystExpressionUtils extends HoodieCatalystExpressionUtils
  with PredicateHelper {

  override def normalizeExprs(exprs: Seq[Expression], attributes: Seq[Attribute]): Seq[Expression] =
DataSourceStrategy.normalizeExprs(exprs, attributes) is not available in Spark 3.0, so we need to remove this specific implementation from the spark3-common module.
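To illustrate the refactoring this comment asks for, here is a minimal, Spark-free sketch. Every name in it (CommonExpressionUtils, NewerUtils, OlderUtils, libraryNormalize) is hypothetical and expressions are modeled as plain strings. It is not Hudi's actual code, only one way a shared trait can leave a method abstract so that each version-specific module supplies its own implementation.

```scala
// Hypothetical sketch: none of these names come from Hudi or Spark.

// Shared trait: declares the operation but provides no implementation,
// because the helper it would delegate to is not present in every release.
trait CommonExpressionUtils {
  def normalizeExprs(exprs: Seq[String], attributes: Seq[String]): Seq[String]
}

// Module for releases that ship the helper: simply delegate to it.
object NewerUtils extends CommonExpressionUtils {
  // Stand-in for a library helper that matches attribute names case-insensitively.
  private def libraryNormalize(exprs: Seq[String], attributes: Seq[String]): Seq[String] =
    exprs.map(e => attributes.find(_.equalsIgnoreCase(e)).getOrElse(e))

  override def normalizeExprs(exprs: Seq[String], attributes: Seq[String]): Seq[String] =
    libraryNormalize(exprs, attributes)
}

// Module for the older release: carries its own local implementation.
object OlderUtils extends CommonExpressionUtils {
  override def normalizeExprs(exprs: Seq[String], attributes: Seq[String]): Seq[String] = {
    val byLowerName = attributes.map(a => a.toLowerCase -> a).toMap
    exprs.map(e => byLowerName.getOrElse(e.toLowerCase, e))
  }
}
```

Callers depend only on the shared trait, so the common module no longer references an API that one of the supported versions lacks.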
  override def normalizeExprs(exprs: Seq[Expression], attributes: Seq[Attribute]): Seq[Expression] =
    DataSourceStrategy.normalizeExprs(exprs, attributes)

  override def extractPredicatesWithinOutputSet(condition: Expression,
extractPredicatesWithinOutputSet is not available in Spark 3.0, so we need to remove this specific implementation from the spark3-common module.
case plan if !plan.resolved => None
// NOTE: When resolving Hudi table we allow [[Filter]]s and [[Project]]s be applied
//       on top of it
case PhysicalOperation(_, _, DataSourceV2Relation(v2: V2TableWithV1Fallback, _, _, _, _)) if isHoodieTable(v2.v1Table) =>
V2TableWithV1Fallback is not available in Spark 3.0, so we need to remove this specific implementation from the spark3-common module.
case NONE => "NONE"
case DISK_ONLY => "DISK_ONLY"
case DISK_ONLY_2 => "DISK_ONLY_2"
case DISK_ONLY_3 => "DISK_ONLY_3"
DISK_ONLY_3 is not available in Spark 3.0, so we need to remove this specific implementation from the spark3-common module.
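One possible way to split such a match is sketched below. It assumes storage levels are modeled as local case objects rather than Spark's StorageLevel, and the names Level, CommonLevels, Spark30Levels and NewerLevels are all hypothetical. In real Spark 3.0 a reference to the missing constant would fail to compile; this sketch models the gap at runtime only to show how the shared cases and the version-specific case can live in different modules.

```scala
// Hypothetical sketch: a local stand-in for storage levels, not Spark's API.
sealed trait Level
case object NONE extends Level
case object DISK_ONLY extends Level
case object DISK_ONLY_2 extends Level
case object DISK_ONLY_3 extends Level // treated here as unknown to the 3.0 module

object CommonLevels {
  // Cases that every supported version understands.
  val toName: PartialFunction[Level, String] = {
    case NONE => "NONE"
    case DISK_ONLY => "DISK_ONLY"
    case DISK_ONLY_2 => "DISK_ONLY_2"
  }
}

// Older-version module: only the shared cases, anything else is rejected.
object Spark30Levels {
  def name(level: Level): String =
    CommonLevels.toName.applyOrElse(
      level,
      (l: Level) => throw new IllegalArgumentException(s"Unsupported storage level: $l"))
}

// Newer-version module: shared cases plus the extra level.
object NewerLevels {
  private val extra: PartialFunction[Level, String] = {
    case DISK_ONLY_3 => "DISK_ONLY_3"
  }

  def name(level: Level): String = (CommonLevels.toName orElse extra)(level)
}
```

Composing partial functions with orElse keeps the shared cases in one place while letting each version-specific module add or omit levels.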
yihua
left a comment
Overall LGTM
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/avro/TestAvroSerDe.scala (outdated)
...park-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestMergeIntoTable.scala (outdated)
...-spark3-common/src/main/scala/org/apache/spark/sql/HoodieSpark3CatalystExpressionUtils.scala
...-spark3-common/src/main/scala/org/apache/spark/sql/HoodieSpark3CatalystExpressionUtils.scala
...x/src/main/scala/org/apache/spark/sql/execution/datasources/Spark30NestedSchemaPruning.scala (outdated)
...park3.0.x/src/main/scala/org/apache/spark/sql/hudi/Spark30ResolveHudiAlterTableCommand.scala (outdated)
...i-spark3.0.x/src/main/scala/org/apache/spark/sql/hudi/command/Spark30AlterTableCommand.scala (outdated)
...datasource/hudi-spark3.0.x/src/main/scala/org/apache/spark/sql/adapter/Spark3_0Adapter.scala
This reverts commit 11aee96.
  CONFLUENT_VERSION=5.5.12
  KAFKA_CONNECT_HDFS_VERSION=10.1.13
  IMAGE_TAG=flink1146hive313spark302
elif [[ ${SPARK_RUNTIME} == 'spark3.1.3' ]]; then
Does the image tagged flink1146hive313spark302 need to be prebuilt? It seems that no script actually builds this image.
Please refer to this PR: #8767

Change Logs
Hudi Spark 3.0.x adoption. It looks like we had support for Spark 3.0 in 0.9.0 and later removed it. Due to interest from community users, we are adding it back.
Impact
Hudi spark 3.0.x adoption
Risk level (write none, low, medium or high below)
low
Documentation Update
Describe any necessary documentation update if there is any new feature, config, or user-facing change
ticket number here and follow the instruction to make
changes to the website.
Contributor's checklist