Parquet partition schema evolution on non-primitive columns#7305
Yaliang wants to merge 3 commits into prestodb:master
A general question: should this functionality have to be enabled with a config setting, or should backward-compatible schema evolution simply be supported natively? I would think we'd want the latter, which is how I thought Parquet schema evolution patches like #4714 were being handled.
Right, as I commented here. I am not sure whether other formats have similar functionality for handling name-based mapping of non-primitive fields. We may be able to combine this with other information to decide either
In the case of the Parquet format, we should use
I think Presto already has some support for primitive type evolution. If you want to support evolution for non-primitive types, it would be better to do it in
@geraint0923 Correct, that is an alternative approach and looks more structured.
ffed8b9 to 1482ec6
#4714 Rebased without a dependency on presto-main. Will update once it passes CI.
024d38c to a1f0e02
… on non-primitive type for Parquet so that the Parquet cursor can get the table schema
a1f0e02 to 7657954
Restructured commits.
@geraint0923 Ready for review.
Closing this PR and implementing the coercion in HiveCoercionRecordCursor and HivePageSource instead (#9131).
Combined with the flexible Parquet struct converter (#4714), this PR adds a lazy equality check on HiveType to allow partition schema evolution over non-primitive fields (especially Struct).
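
To make the idea concrete, here is a minimal Java sketch of what a lenient, name-based comparison of struct types could look like. It is not the code from this PR and does not use Presto's HiveType or the Parquet reader; the class `StructEvolutionSketch`, the methods `isCompatible` and `remapByName`, and the representation of a struct as an ordered map from field name to type name are all hypothetical, chosen only for illustration. The assumption is that a partition struct is acceptable when every field it shares with the table struct has the same type, and that values are then matched to table fields by name rather than by position.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; not Presto's HiveType or Parquet converter code.
// A struct is modelled as an ordered map: field name -> type name.
final class StructEvolutionSketch {
    private StructEvolutionSketch() {}

    // Lenient check: fields present in both structs must have the same type.
    // Fields that exist on only one side (added or dropped) are tolerated,
    // and field order is ignored.
    static boolean isCompatible(Map<String, String> tableStruct, Map<String, String> partitionStruct) {
        for (Map.Entry<String, String> field : partitionStruct.entrySet()) {
            String tableType = tableStruct.get(field.getKey());
            if (tableType != null && !tableType.equals(field.getValue())) {
                return false;
            }
        }
        return true;
    }

    // Name-based mapping: produce values in table field order, using null
    // for table fields that the partition's data does not contain.
    static List<Object> remapByName(Map<String, String> tableStruct, Map<String, Object> partitionRow) {
        List<Object> result = new ArrayList<>();
        for (String fieldName : tableStruct.keySet()) {
            result.add(partitionRow.get(fieldName));
        }
        return result;
    }

    public static void main(String[] args) {
        // Table schema after evolution: a field was added and the order differs.
        Map<String, String> tableStruct = new LinkedHashMap<>();
        tableStruct.put("id", "bigint");
        tableStruct.put("name", "string");
        tableStruct.put("email", "string");

        // Older partition schema: fewer fields, different order.
        Map<String, String> partitionStruct = new LinkedHashMap<>();
        partitionStruct.put("name", "string");
        partitionStruct.put("id", "bigint");

        Map<String, Object> partitionRow = new LinkedHashMap<>();
        partitionRow.put("name", "alice");
        partitionRow.put("id", 1L);

        System.out.println(isCompatible(tableStruct, partitionStruct));
        System.out.println(remapByName(tableStruct, partitionRow));
    }
}
```

A strict positional equality check would reject the older partition above because its fields are ordered differently and one is missing. The sketch accepts it and fills the missing field with null, which is roughly the behaviour the description refers to for Struct columns. Nested structs and case-insensitive field names are left out to keep the example short.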