Skip to content

Conversation

@zhongyujiang
Copy link
Contributor

This PR makes ApplyNameMapping keep field's name after name mapping, which is for compatibility while reading the migrated Iceberg table with legacy parquet schema mentioned in this issue #3604.

@chenjunjiedada
Copy link
Collaborator

Looks correct, could you please add a unit test for this?

@zhongyujiang
Copy link
Contributor Author

Looks correct, could you please add a unit test for this?

Of course!

@zhongyujiang
Copy link
Contributor Author

Hi @chenjunjiedada, I found that RemoveIds haven't keep the original name of repeatedElement of list/map type too:

@Override
  public Type list(GroupType array, Type item) {
    return Types.list(array.getRepetition())
        .element(item)
        .named(array.getName());
  }

I think this also need a fix.
And speaking of unit test, I found that TestPruneColumns already did a similar one for PruneColumns, maybe I can reuse the fileSchema in it, so can I just rename TestPruneColumns to TestParquetTypeVisitor and put the unit tests for RemoveIds and ApplyNameMapping in it? What do you think?

@zhongyujiang
Copy link
Contributor Author

Hey @chenjunjiedada , I make RemoveIds keeps field's original name too and add unit tests for both RemoveIds and ApplyNameMapping in the latest commit, could you please give it a review?

@uncleGen
Copy link
Contributor

uncleGen commented Dec 20, 2021

In general, this pr makes sense to me, and is there same issue in ORC ff?

@uncleGen
Copy link
Contributor

Besides, could you please add an end-to-end unit test?

@zhongyujiang
Copy link
Contributor Author

In general, this pr makes sense to me, and is there same issue in ORC ff?

Sorry, I don't know much of ORC, and I think this issue is already solved by this pr, it's already merged. I'll close this one.

@zhongyujiang zhongyujiang deleted the ApplyNameMapping-without-alter-parquet-field-name branch April 26, 2022 02:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants