[HUDI-5785] Enhance Spark Datasource tests #7938
9223732 to
4e7da70
Compare
xushiyan
left a comment
LGTM
alexeykudinkin
left a comment
Let's start unifying all of the utilities so that we don't get bitten by the same thing again:
https://github.com/apache/hudi/pull/7702/files#diff-93d5c78a2db3470cef4a643a3b41b8b97876f411310a5653d232525c87a6d749
I created this to unify all APIs to construct Spark configs: HUDI-5788
4e7da70 to
1bf2b8c
Compare
Since this has been approved already, I will go ahead and merge once CI is green.
We got 2 approvals, so we are moving ahead.
- Enhance Spark datasource tests so that the metadata table (MDT) Spark Datasource read tests are robust
Change Logs
Previously, we found that the Spark Datasource read of the metadata table was broken; the issue is fixed by #7924. However, the test TestMetadataTableWithSparkDataSource, which guards the exact same functionality, did not fail in CI or with the local mvn command below. Investigation showed that the Hudi Spark configs (spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog, spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension) were not properly added to the Spark session in the test environment.
This PR sets the proper Hudi Spark configs for Spark Datasource tests and adds one more test on reading the metadata table through Spark Datasource.
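The failure mode described above (a test passing because the session silently lacks the Hudi configs) can be guarded against with a small precondition check. The following is a minimal sketch, not code from this PR: the class HudiSparkConfCheck and the method missingHudiConfs are hypothetical names, and the session configuration is modeled as a plain Map so that the example runs without Spark on the classpath. Only the two config keys and values are taken from the description above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Hudi): reports which of the Hudi Spark
// configs named in the PR description are absent from a session config.
public class HudiSparkConfCheck {

    // The two settings quoted in the PR description.
    static final Map<String, String> REQUIRED_HUDI_CONFS = new LinkedHashMap<>();
    static {
        REQUIRED_HUDI_CONFS.put("spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.hudi.catalog.HoodieCatalog");
        REQUIRED_HUDI_CONFS.put("spark.sql.extensions",
                "org.apache.spark.sql.hudi.HoodieSparkSessionExtension");
    }

    // Returns the required keys whose value is missing or different.
    // Simplifying assumption: each key holds exactly the expected value
    // (a comma-separated list of several extensions is not handled here).
    static List<String> missingHudiConfs(Map<String, String> sessionConf) {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, String> required : REQUIRED_HUDI_CONFS.entrySet()) {
            String actual = sessionConf.get(required.getKey());
            if (!required.getValue().equals(actual)) {
                missing.add(required.getKey());
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // A session config without the Hudi settings, as in the broken test setup.
        Map<String, String> bare = new LinkedHashMap<>();
        bare.put("spark.master", "local[2]");
        System.out.println("bare session is missing: " + missingHudiConfs(bare));

        // The same config after adding the Hudi settings.
        Map<String, String> configured = new LinkedHashMap<>(bare);
        configured.putAll(REQUIRED_HUDI_CONFS);
        System.out.println("configured session is missing: " + missingHudiConfs(configured));
    }
}
```

In a real test, the map would be populated from the Spark session's runtime configuration, and the test setup could fail fast when the returned list is non-empty instead of running against a misconfigured session.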
Impact
After this change, without the fix #7924, the following test fails, which is consistent with the behavior of spark-shell (previously it passed without raising the alarm).
Risk level
low
Documentation Update
N/A
Contributor's checklist