Merged
4 changes: 3 additions & 1 deletion orc/src/main/java/org/apache/iceberg/orc/OrcIterable.java
@@ -97,7 +97,9 @@ public CloseableIterator<T> iterator() {
       if (nameMapping == null) {
         nameMapping = MappingUtil.create(schema);
       }
-      fileSchemaWithIds = ORCSchemaUtil.applyNameMapping(fileSchema, nameMapping);
+      // Since in the above branch, if the ignoreFileFieldIds is true, fileSchema can still have ids,
+      // here we should first sanitize it by removing the existing ids.
+      fileSchemaWithIds = ORCSchemaUtil.applyNameMapping(ORCSchemaUtil.removeIds(fileSchema), nameMapping);
     }
     readOrcSchema = ORCSchemaUtil.buildOrcProjection(schema, fileSchemaWithIds);
     // If the projected ORC schema is an empty struct, it means we are only projecting columns
7 changes: 7 additions & 0 deletions orc/src/main/java/org/apache/iceberg/orc/RemoveIds.java
@@ -46,6 +46,13 @@ public TypeDescription map(TypeDescription map, TypeDescription key, TypeDescrip
     return TypeDescription.createMap(key, value);
   }

+  @Override
+  public TypeDescription union(TypeDescription union, List<TypeDescription> options) {

Could you clarify a little more on why this function is needed for this PR?

Member Author

We use the RemoveIds visitor to strip ids from an ORC schema. LI-Iceberg adds union type support, but the vanilla RemoveIds visitor has no union case (vanilla Iceberg has no union type), so this override is needed to reconstruct the union schema.

+    TypeDescription ret = TypeDescription.createUnion();
+    options.forEach(ret::addUnionChild);
+    return ret;
+  }
+
   @Override
   public TypeDescription primitive(TypeDescription primitive) {
     return removeIcebergAttributes(primitive.clone());
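The author's reply about the missing union case can be illustrated with a small, self-contained sketch. It is a toy model, not the Iceberg or ORC API: `Node`, `Kind`, `Visitor`, `visit`, `hasAnyId`, and `sample` are hypothetical names introduced here, and the base visitor rejecting unions by default is an assumption of the sketch, since the diff does not show what the real base class does. The point is only that an id-stripping visitor must rebuild every container kind it may encounter, including unions, from the already-cleaned children.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: Node is a hypothetical stand-in, not org.apache.orc.TypeDescription.
public class RemoveIdsSketch {
  enum Kind { STRUCT, UNION, INT, STRING }

  static final class Node {
    final Kind kind;
    final Integer id; // null means the node carries no id attribute
    final List<Node> children;

    Node(Kind kind, Integer id, List<Node> children) {
      this.kind = kind;
      this.id = id;
      this.children = children;
    }
  }

  // Hypothetical base visitor. Rejecting unions by default is an assumption of this sketch.
  abstract static class Visitor<T> {
    abstract T struct(Node struct, List<T> fields);

    T union(Node union, List<T> options) {
      throw new UnsupportedOperationException("union is not handled");
    }

    abstract T primitive(Node primitive);
  }

  // Visits children first, then hands their results to the callback for the parent.
  static <T> T visit(Node node, Visitor<T> visitor) {
    List<T> results = new ArrayList<>();
    for (Node child : node.children) {
      results.add(visit(child, visitor));
    }
    switch (node.kind) {
      case STRUCT:
        return visitor.struct(node, results);
      case UNION:
        return visitor.union(node, results);
      default:
        return visitor.primitive(node);
    }
  }

  static final class RemoveIds extends Visitor<Node> {
    @Override
    Node struct(Node struct, List<Node> fields) {
      return new Node(Kind.STRUCT, null, fields);
    }

    // Mirrors the override added in the PR: a fresh union with the cleaned options re-attached.
    @Override
    Node union(Node union, List<Node> options) {
      return new Node(Kind.UNION, null, options);
    }

    @Override
    Node primitive(Node primitive) {
      return new Node(primitive.kind, null, Collections.<Node>emptyList());
    }
  }

  // True if this node or any descendant still carries an id.
  static boolean hasAnyId(Node node) {
    if (node.id != null) {
      return true;
    }
    for (Node child : node.children) {
      if (hasAnyId(child)) {
        return true;
      }
    }
    return false;
  }

  // struct(id=1) containing union(id=2) of int(id=3) and string(id=4)
  static Node sample() {
    Node intOption = new Node(Kind.INT, 3, Collections.<Node>emptyList());
    Node stringOption = new Node(Kind.STRING, 4, Collections.<Node>emptyList());
    Node union = new Node(Kind.UNION, 2, Arrays.asList(intOption, stringOption));
    return new Node(Kind.STRUCT, 1, Collections.singletonList(union));
  }

  public static void main(String[] args) {
    Node root = sample();
    Node cleaned = visit(root, new RemoveIds());
    System.out.println(hasAnyId(root));                          // true
    System.out.println(hasAnyId(cleaned));                       // false
    System.out.println(cleaned.children.get(0).children.size()); // 2
  }
}
```

In this toy, deleting the `union` override from `RemoveIds` makes `visit` throw on the sample schema, which is the kind of gap the PR's override closes for schemas that contain unions.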