HIVE-24436: Fix Avro NULL_DEFAULT_VALUE compatibility issue #1715

Closed
wangyum wants to merge 2 commits into apache:branch-2.3 from wangyum:HIVE-24436

@wangyum wangyum commented Nov 27, 2020

What changes were proposed in this pull request?

This PR replaces null with JsonProperties.NULL_VALUE to fix the following compatibility issues:

  1. java.lang.NoSuchMethodError: 'void org.apache.avro.Schema$Field.<init>(java.lang.String, org.apache.avro.Schema, java.lang.String, org.codehaus.jackson.JsonNode)'
    - create hive serde table with Catalog
    *** RUN ABORTED ***
      java.lang.NoSuchMethodError: 'void org.apache.avro.Schema$Field.<init>(java.lang.String, org.apache.avro.Schema, 
    java.lang.String, org.codehaus.jackson.JsonNode)'
      at org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.createAvroField(TypeInfoToSchema.java:76)
      at org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.convert(TypeInfoToSchema.java:61)
      at org.apache.hadoop.hive.serde2.avro.AvroSerDe.getSchemaFromCols(AvroSerDe.java:170)
      at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:114)
      at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:83)
      at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:533)
      at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:450)
      at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:437)
      at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:281)
      at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:263)
    
  2. org.apache.avro.AvroRuntimeException: Unknown datum class: class org.codehaus.jackson.node.NullNode
    - alter hive serde table add columns -- partitioned - AVRO *** FAILED ***
      org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: 
    org.apache.avro.AvroRuntimeException: Unknown datum class: class org.codehaus.jackson.node.NullNode;
      at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:112)
      at org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:245)
      at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94)
      at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:346)
      at org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:166)
      at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
      at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
      at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
      at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228)
      at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3680)
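
The first failure is a linkage problem: the compiled Hive class asks for a Schema.Field constructor whose last parameter is org.codehaus.jackson.JsonNode, and the Avro jar on the classpath no longer provides it. A minimal sketch of how one might probe a classpath for that signature is below. ConstructorProbe and hasConstructor are hypothetical names for illustration and are not part of Hive or Avro. The java.lang.Object overload checked in main is an assumption about which constructor the JsonProperties.NULL_VALUE call resolves to.

```java
final class ConstructorProbe {

    /**
     * Returns true if the named class is loadable and declares a public
     * constructor with exactly the given parameter types. Returns false if
     * the class, any parameter type, or the constructor itself is missing.
     */
    public static boolean hasConstructor(String className, String... paramTypeNames) {
        ClassLoader loader = ConstructorProbe.class.getClassLoader();
        try {
            // 'false' avoids running static initializers while probing.
            Class<?> target = Class.forName(className, false, loader);
            Class<?>[] params = new Class<?>[paramTypeNames.length];
            for (int i = 0; i < paramTypeNames.length; i++) {
                params[i] = Class.forName(paramTypeNames[i], false, loader);
            }
            target.getConstructor(params);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String field = "org.apache.avro.Schema$Field";

        // Signature taken from the NoSuchMethodError message above.
        boolean jacksonOverload = hasConstructor(field,
                "java.lang.String", "org.apache.avro.Schema",
                "java.lang.String", "org.codehaus.jackson.JsonNode");

        // Assumed target of the patched call (last parameter typed as Object).
        boolean objectOverload = hasConstructor(field,
                "java.lang.String", "org.apache.avro.Schema",
                "java.lang.String", "java.lang.Object");

        System.out.println("JsonNode overload present: " + jacksonOverload);
        System.out.println("Object overload present:   " + objectOverload);
    }
}
```

Run with the same classpath as the failing job. The output depends entirely on which Avro jar is present; without Avro on the classpath both checks report false.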
    

Why are the changes needed?

For compatibility with Avro 1.9.x and Avro 1.10.0.
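
The second failure above ("Unknown datum class") is consistent with a default value being passed as a plain Object that the receiving library does not recognize, whereas a library-owned marker such as JsonProperties.NULL_VALUE is recognized. The sketch below illustrates that general pattern only. It is not Avro's implementation; NullDefaultSketch, NULL_VALUE, ForeignNullNode, and describeDefault are hypothetical stand-ins, and the treatment of Java null as "no default" is an assumption made for the illustration.

```java
final class NullDefaultSketch {

    /** Hypothetical stand-in for a library-owned "JSON null" marker. */
    public static final Object NULL_VALUE = new Object();

    /** Hypothetical stand-in for a null node from an unrelated JSON library. */
    public static final class ForeignNullNode { }

    /**
     * Describes a field default. Java null means "no default", the marker
     * means "the default is JSON null", and unknown classes are rejected.
     */
    public static String describeDefault(Object defaultValue) {
        if (defaultValue == null) {
            return "no default";
        }
        if (defaultValue == NULL_VALUE) {
            return "default = null";
        }
        if (defaultValue instanceof String
                || defaultValue instanceof Number
                || defaultValue instanceof Boolean) {
            return "default = " + defaultValue;
        }
        // A foreign node type lands here, much like the NullNode in the trace.
        throw new IllegalArgumentException(
                "Unknown datum class: " + defaultValue.getClass());
    }

    public static void main(String[] args) {
        System.out.println(describeDefault(null));
        System.out.println(describeDefault(NULL_VALUE));
        try {
            describeDefault(new ForeignNullNode());
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The marker keeps "no default" and "default is null" distinguishable without exposing any JSON library type in the caller's code, which is why swapping it in can avoid both failures at once.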

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Build and run Spark test:

mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.hive.execution.HiveDDLSuite test -pl sql/hive

} else {
fields.add(new Schema.Field(schemaField.name(), schemaField.schema(), schemaField.doc(),
nullDefault));

wangyum commented Nov 27, 2020

@sunchao sunchao left a comment

LGTM. Thanks @wangyum !

Comment on lines +77 to 78
return new Schema.Field(name, createAvroSchema(typeInfo), comment, JsonProperties.NULL_VALUE);
}

@dongjoon-hyun dongjoon-hyun left a comment

Thank you for the fix, @wangyum and all. The patch looks reasonable and safe to me too.

BTW, are all 25 failures irrelevant?

Existing failures - 25

sunchao commented Nov 30, 2020

Yes @dongjoon-hyun, these test failures have been there since the 2.3.7 release. I do plan to take a look at them later.

@wangyum I believe the issue exists in the master branch as well? If so, can we open this PR against master and backport it to branch-2.3/branch-3.1 once that is merged?

wangyum commented Dec 1, 2020

This is the PR for the master branch: #1722

@sunchao sunchao closed this Dec 1, 2020

sunchao commented Dec 1, 2020

Closing this one since #1722 has been merged and backported to branch-2.3.

@wangyum wangyum deleted the HIVE-24436 branch December 2, 2020 00:17