Uppercased schemas are not readable in iceberg-mr/hive #1445

@HotSushi

I wrote a simple test for reading a schema with uppercased field names in iceberg-mr, but it fails.

The schema is as follows:

Schema schema = new Schema(
    required(1, "Data", Types.StructType.of(
        required(2, "Case1", Types.BooleanType.get())
    ))
);

If you run a simple SELECT * FROM table query with HiveRunner, it fails with the following error:

java.lang.RuntimeException: cannot find field data from [org.apache.iceberg.mr.hive.serde.objectinspector.IcebergRecordObjectInspector$IcebergRecordStructField@f45265b5]
	at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:523)
	at org.apache.iceberg.mr.hive.serde.objectinspector.IcebergRecordObjectInspector.getStructFieldRef(IcebergRecordObjectInspector.java:68)
	at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:56)
	at org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:1033)
	at org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:1059)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:75)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:366)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:556)
	at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:508)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
	at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:88)

The reason is that ObjectInspectorUtils.getStandardStructFieldRef always looks the field up by its lowercased name (i.e. data), whereas IcebergRecordObjectInspector reports the field name with its original casing (i.e. Data), so the two never match.
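
To make the mismatch concrete, here is a minimal sketch that does not depend on Hive or Iceberg. The class FieldLookupSketch and the method findField are hypothetical stand-ins, and the assumption (based on the error message above) is that the lookup lowercases the requested name and then compares it exactly against each reported field name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class FieldLookupSketch {

  // Hypothetical stand-in for the Hive lookup: lowercase the requested name,
  // then compare it with equals() against each name the inspector reports.
  public static int findField(String requested, List<String> reportedNames) {
    String lowered = requested.toLowerCase(Locale.ROOT);
    for (int i = 0; i < reportedNames.size(); i++) {
      if (reportedNames.get(i).equals(lowered)) {
        return i;
      }
    }
    throw new RuntimeException("cannot find field " + lowered + " from " + reportedNames);
  }

  public static void main(String[] args) {
    // Names as the Iceberg inspector reports them: original casing preserved.
    List<String> reported = Arrays.asList("Data");
    try {
      findField("Data", reported);
      System.out.println("found");
    } catch (RuntimeException e) {
      // Prints: cannot find field data from [Data]
      System.out.println(e.getMessage());
    }
  }
}
```

With a reported name of "data" instead of "Data", the same call returns index 0, which is why lowercasing the reported name makes the query pass.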

The following workaround works, but I am not sure it is worth pursuing, since all field names in structs would then be lowercased.

--- a/mr/src/main/java/org/apache/iceberg/mr/hive/serde/objectinspector/IcebergRecordObjectInspector.java
+++ b/mr/src/main/java/org/apache/iceberg/mr/hive/serde/objectinspector/IcebergRecordObjectInspector.java
@@ -125,7 +125,7 @@ public final class IcebergRecordObjectInspector extends StructObjectInspector {
 
     @Override
     public String getFieldName() {
-      return field.name();
+      return field.name().toLowerCase();
     }
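
The trade-off of this change can be sketched in isolation. The class below is a hypothetical stand-in for the struct field wrapper touched by the diff, not the actual Iceberg class; it only assumes that getFieldName() is the name Hive sees.

```java
import java.util.Locale;

public class LowercasedFieldNameSketch {

  // Hypothetical stand-in for the struct field wrapper in the diff.
  public static final class Field {
    private final String schemaName;

    public Field(String schemaName) {
      this.schemaName = schemaName;
    }

    // Mirrors the patched getFieldName(): always report the lowercased name.
    public String getFieldName() {
      return schemaName.toLowerCase(Locale.ROOT);
    }
  }

  public static void main(String[] args) {
    Field field = new Field("Case1");
    // A lowercased lookup key now matches the reported name.
    System.out.println(field.getFieldName().equals("case1")); // true
    // The original casing is no longer visible through getFieldName().
    System.out.println(field.getFieldName().equals("Case1")); // false
  }
}
```

So the lookup succeeds, but anything that reads the name through getFieldName() sees case1 rather than Case1, which is the concern raised above.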

Here's the complete test: link
