Skip to content

Conversation

@liujinhui1994
Copy link
Contributor

What is the purpose of the pull request

Fix NullPointerException, KeyGenerator does not work

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

Fix NullPointerException caused TypedProperties
@liujinhui1994
Copy link
Contributor Author

please help review
@bhasudha @yanghua

@liujinhui1994 liujinhui1994 changed the title [HUDI-1200] Fix NullPointerException, KeyGenerator does not work [HUDI-1200] Fix NullPointerException, CustomKeyGenerator does not work Aug 19, 2020
private static final String NAMESPACE = "hoodieRow";

protected transient TypedProperties config;
protected TypedProperties config;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intended to be transient so that we don't serialize config in KeyGenerator s. Lets keep it that way. Since the issue is specifically with CustomKeyGenerator we should resolve there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you let me know what you think? Can you describe it in detail @bhasudha

Copy link
Contributor

@wangxianghu wangxianghu Aug 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intended to be transient so that we don't serialize config in KeyGenerator s. Lets keep it that way. Since the issue is specifically with CustomKeyGenerator we should resolve there.

Hi @bhasudha , CustomKeyGenerator extends KeyGenerator, maybe this NPE is caused by the config(erased in the process of serialization) in KeyGenerator. @liujinhui1994 is that what you mean?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so too. @liujinhui1994 can you try with Hudi 0.6.0 and see if that helps? We fixed some serialization issues there wrt KeyGenerators.

Copy link
Contributor

@wangxianghu wangxianghu Aug 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so too. @liujinhui1994 can you try with Hudi 0.6.0 and see if that helps? We fixed some serialization issues there wrt KeyGenerators.

I tried both TimestampBasedKeyGenerator and CustomKeyGenerator with 0.6.0 (in my IDEA) TimestampBasedKeyGenerator is ok, while CustomKeyGenerator still has NPE.
Remove transient can resolve this problem.

Caused by: java.lang.NullPointerException at org.apache.hudi.keygen.SimpleKeyGenerator.<init>(SimpleKeyGenerator.java:35) at org.apache.hudi.keygen.CustomKeyGenerator.getRecordKey(CustomKeyGenerator.java:128) at org.apache.hudi.keygen.BuiltinKeyGenerator.getKey(BuiltinKeyGenerator.java:75) at org.apache.hudi.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:143) at org.apache.hudi.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:139) at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)

Copy link
Contributor

@pratyakshsharma pratyakshsharma Sep 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bhasudha @vinothchandar what was the purpose behind keeping this config object here as transient? I see it has been like this from the very beginning. There is a workaround solution also which only affects CustomKeyGenerator class but removing transient keyword from here actually seems the easier solution. :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do not want to make config object as transient, we have 2 choices -

  1. introduce a new TypedProperties instance in CustomKeyGenerator
  2. introduce a new TimestampBasedKeyGenerator instance in CustomKeyGenerator, which gets initialised only if needed.

Please suggest.

private static final String NAMESPACE = "hoodieRow";

protected transient TypedProperties config;
protected TypedProperties config;
Copy link
Contributor

@wangxianghu wangxianghu Aug 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so too. @liujinhui1994 can you try with Hudi 0.6.0 and see if that helps? We fixed some serialization issues there wrt KeyGenerators.

I tried both TimestampBasedKeyGenerator and CustomKeyGenerator with 0.6.0 (in my IDEA) TimestampBasedKeyGenerator is ok, while CustomKeyGenerator still has NPE.
Remove transient can resolve this problem.

Caused by: java.lang.NullPointerException at org.apache.hudi.keygen.SimpleKeyGenerator.<init>(SimpleKeyGenerator.java:35) at org.apache.hudi.keygen.CustomKeyGenerator.getRecordKey(CustomKeyGenerator.java:128) at org.apache.hudi.keygen.BuiltinKeyGenerator.getKey(BuiltinKeyGenerator.java:75) at org.apache.hudi.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:143) at org.apache.hudi.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:139) at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)

@vinothchandar
Copy link
Member

@pratyakshsharma can you please take a look at this?

@pratyakshsharma
Copy link
Contributor

Ack, will check this today.

@ashishmgofficial
Copy link

Hi Team, any updates on this. Wanted this fix in 0.6.0 as Im having this as part of a POC . @pratyakshsharma @bhasudha

@pratyakshsharma
Copy link
Contributor

@ashishmgofficial Apologies for the delay, will check this today and circle back.

@vinothchandar
Copy link
Member

@pratyakshsharma I would like to get this into 0.6.1 if possible. please prioritize accordingly

@pratyakshsharma
Copy link
Contributor

@pratyakshsharma I would like to get this into 0.6.1 if possible. please prioritize accordingly

Checking this.

@liujinhui1994
Copy link
Contributor Author

@pratyakshsharma Do you have any suggestions for this PR? According to the content of this PR, I resolved the error

@pratyakshsharma
Copy link
Contributor

@liujinhui1994 Making the TypedProperties variable non transient works. Another work around can be #2093

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants