[HUDI-5637] Add Kryo for hive sync #7781
@alexeykudinkin @codope don't we want to include kryo in the aws, gcp, datahub-sync, and timeline-server bundles too? These can all run as standalone apps. In #7702, we mainly wanted to exclude kryo from the spark and utilities bundles, right?
Again it is deltastreamer multiwriter
I verified hudi-aws-bundle by running glue sync in standalone mode. I was able to run the sync and query the table through Athena. This suggests that
@xushiyan we should not add it to bundles that are mixed in with bundles that do NOT shade it (Spark).

Change Logs
After a70355f, kryo was explicitly added and shaded in a few bundles, but hudi-hive-sync-bundle was missed. As a result, hive sync using run_sync_tool would fail with
java.lang.NoClassDefFoundError: com/esotericsoftware/kryo/KryoSerializable. This PR fixes it. Note that kryo-shaded must be added explicitly in the bundle pom because the parent pom declares it in provided scope.
Impact
Fix hive sync.
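One rough way to confirm whether a given classpath would hit the NoClassDefFoundError described in the change log is to probe for the class directly. The sketch below is illustrative only: the class name KryoClasspathCheck is hypothetical, and the relocated package prefix org.apache.hudi. is an assumption about how the bundle shades kryo, not something stated in this PR.

```java
public class KryoClasspathCheck {

    // Returns true if the class can be resolved by this class's loader.
    // The class is looked up without running its static initializers.
    public static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, KryoClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised while resolving the class.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] candidates = {
            // Unshaded name, as it appears in the reported error.
            "com.esotericsoftware.kryo.KryoSerializable",
            // ASSUMPTION: relocated name if the bundle shades kryo under org.apache.hudi.
            "org.apache.hudi.com.esotericsoftware.kryo.KryoSerializable"
        };
        for (String name : candidates) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "MISSING"));
        }
    }
}
```

Running it with the same classpath that the sync tool uses should show whether either name resolves. If both are reported as MISSING, the bundle on that classpath most likely does not package kryo.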
Risk level (write none, low, medium or high below)
low