-
Notifications
You must be signed in to change notification settings - Fork 2.5k
[HUDI-1200] Fix NullPointerException, CustomKeyGenerator does not work #1984
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| private static final String NAMESPACE = "hoodieRow"; | ||
|
|
||
| protected transient TypedProperties config; | ||
| protected TypedProperties config; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intended to be transient so that we don't serialize config in KeyGenerator s. Lets keep it that way. Since the issue is specifically with CustomKeyGenerator we should resolve there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you let me know what you think? Can you describe it in detail @bhasudha
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intended to be transient so that we don't serialize config in KeyGenerator s. Lets keep it that way. Since the issue is specifically with CustomKeyGenerator we should resolve there.
Hi @bhasudha , CustomKeyGenerator extends KeyGenerator, maybe this NPE is caused by the config(erased in the process of serialization) in KeyGenerator. @liujinhui1994 is that what you mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so too. @liujinhui1994 can you try with Hudi 0.6.0 and see if that helps? We fixed some serialization issues there wrt KeyGenerators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so too. @liujinhui1994 can you try with Hudi 0.6.0 and see if that helps? We fixed some serialization issues there wrt KeyGenerators.
I tried both TimestampBasedKeyGenerator and CustomKeyGenerator with 0.6.0 (in my IDEA) TimestampBasedKeyGenerator is ok, while CustomKeyGenerator still has NPE.
Remove transient can resolve this problem.
Caused by: java.lang.NullPointerException at org.apache.hudi.keygen.SimpleKeyGenerator.<init>(SimpleKeyGenerator.java:35) at org.apache.hudi.keygen.CustomKeyGenerator.getRecordKey(CustomKeyGenerator.java:128) at org.apache.hudi.keygen.BuiltinKeyGenerator.getKey(BuiltinKeyGenerator.java:75) at org.apache.hudi.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:143) at org.apache.hudi.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:139) at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bhasudha @vinothchandar what was the purpose behind keeping this config object here as transient? I see it has been like this from the very beginning. There is a workaround solution also which only affects CustomKeyGenerator class but removing transient keyword from here actually seems the easier solution. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do not want to make config object as transient, we have 2 choices -
- introduce a new TypedProperties instance in CustomKeyGenerator
- introduce a new TimestampBasedKeyGenerator instance in CustomKeyGenerator, which gets initialised only if needed.
Please suggest.
| private static final String NAMESPACE = "hoodieRow"; | ||
|
|
||
| protected transient TypedProperties config; | ||
| protected TypedProperties config; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so too. @liujinhui1994 can you try with Hudi 0.6.0 and see if that helps? We fixed some serialization issues there wrt KeyGenerators.
I tried both TimestampBasedKeyGenerator and CustomKeyGenerator with 0.6.0 (in my IDEA) TimestampBasedKeyGenerator is ok, while CustomKeyGenerator still has NPE.
Remove transient can resolve this problem.
Caused by: java.lang.NullPointerException at org.apache.hudi.keygen.SimpleKeyGenerator.<init>(SimpleKeyGenerator.java:35) at org.apache.hudi.keygen.CustomKeyGenerator.getRecordKey(CustomKeyGenerator.java:128) at org.apache.hudi.keygen.BuiltinKeyGenerator.getKey(BuiltinKeyGenerator.java:75) at org.apache.hudi.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:143) at org.apache.hudi.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:139) at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
|
@pratyakshsharma can you please take a look at this? |
|
Ack, will check this today. |
|
Hi Team, any updates on this. Wanted this fix in 0.6.0 as Im having this as part of a POC . @pratyakshsharma @bhasudha |
|
@ashishmgofficial Apologies for the delay, will check this today and circle back. |
|
@pratyakshsharma I would like to get this into 0.6.1 if possible. please prioritize accordingly |
Checking this. |
|
@pratyakshsharma Do you have any suggestions for this PR? According to the content of this PR, I resolved the error |
|
@liujinhui1994 Making the TypedProperties variable non transient works. Another work around can be #2093 |
What is the purpose of the pull request
Fix NullPointerException, KeyGenerator does not work
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.