Support adding fields to nested Parquet structs #4939
billonahill wants to merge 9 commits into prestodb:master
|
@billonahill , |
 * table struct fields are permitted.
 *
 * @throws PrestoException if the schema has not evolved in a supported way
 */
This check is used for all file formats, but it looks like only the Parquet record reader supports this. What will happen with the other formats?
|
I think it is better to have PR #4714 at first. If needed, then we can get this one. |
|
@billonahill @zhenxiao just to confirm with you guys, does #4714 supersede this PR? If yes please close this one and then we can focus on the other. |
|
@nezihyigitbasi I've confirmed that this issue still exists on master by re-submitting this PR with only the test changes included. I will check against #4714 as well. Is there a way to exercise the |
|
@billonahill The purpose of that test is to read data that was written by Hive. You can create the table in Hive, then run |
|
For IntelliJ, you set the parameters in the "Parameters" tab of the TestNG run configuration. I run those Hive tests all the time in the IDE. |
|
Thanks @electrum for jogging my memory re how to run these tests, it's been a while. :) Adding my own notes here, this is how I run from the command line: @nezihyigitbasi #4714 does not seem to address this issue. What's needed is for For my own notes, I've made a branch based on the patch in #4714 (with the current master merged in) at https://github.com/billonahill/presto/tree/jxiang_extra_parquet_struct_fields which includes the fix to |
|
@billonahill I'm not clear about this PR; could you please add more explanation? Is there anything missing in #4714 for schema evolution in Parquet? We use just that and it is working fine. |
|
@zhenxiao if you add only the unit tests from this patch to #4714, you can see that they fail. I didn't investigate the code in detail to see why, though. The patch in #4714 is based on a pretty old version of master and has conflicts, so I might not have resolved them the way you intended. If you could rebase it and try running it with the tests from this patch, that would show whether it works or not. |
|
@zhenxiao can you give @billonahill 's suggestion a try and let us know if that test works or not? |
|
@nezihyigitbasi @billonahill I rebased our Nested Schema Evolution work at: It becomes an independent PR for Nested Schema Evolution. I will close all other pending PRs for that. @billonahill could you please test against 6675 to see whether that fixes the problem? We've been using that in production for a long time to resolve all schema evolution issues. It is quite stable. |
|
This pull request has been automatically marked as stale because it has not had recent activity. If you'd still like this PR merged, please comment on the task, make sure you've addressed reviewer comments, and rebase on the latest master. Thank you for your contributions! |
Presto currently supports adding columns to a table, but not adding fields to a nested struct. This adds support for struct field additions.
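
To make the intended behavior concrete, here is a minimal sketch of what "struct field additions" could mean for a reader. It is not the code from this PR: the class `StructEvolutionSketch`, its methods, and the use of plain maps of field name to type name are all hypothetical, and the rule that anything other than added table fields is unsupported is an assumption made for illustration. The idea is that a file written with an older struct remains readable as long as every field it contains still exists in the table struct with the same type, and fields the file lacks are read as null.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Presto.
class StructEvolutionSketch {
    // Returns true when the table struct is the file struct plus zero or more
    // added fields. Assumption: a file field that is missing from the table
    // struct, or whose type differs, is treated as an unsupported evolution.
    static boolean isSupportedEvolution(Map<String, String> tableFields,
                                        Map<String, String> fileFields) {
        for (Map.Entry<String, String> fileField : fileFields.entrySet()) {
            String tableType = tableFields.get(fileField.getKey());
            if (!fileField.getValue().equals(tableType)) {
                return false;
            }
        }
        return true;
    }

    // Projects one struct value read from a file onto the table struct,
    // in table field order. Fields absent from the file come back as null.
    static Map<String, Object> readStruct(Map<String, String> tableFields,
                                          Map<String, Object> fileValue) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (String fieldName : tableFields.keySet()) {
            result.put(fieldName, fileValue.get(fieldName));
        }
        return result;
    }

    public static void main(String[] args) {
        // Table struct after the field "email" was added.
        Map<String, String> tableFields = new LinkedHashMap<>();
        tableFields.put("id", "bigint");
        tableFields.put("name", "varchar");
        tableFields.put("email", "varchar");

        // Older file that was written before "email" existed.
        Map<String, String> fileFields = new LinkedHashMap<>();
        fileFields.put("id", "bigint");
        fileFields.put("name", "varchar");

        Map<String, Object> fileValue = new LinkedHashMap<>();
        fileValue.put("id", 1L);
        fileValue.put("name", "alice");

        System.out.println(isSupportedEvolution(tableFields, fileFields));
        System.out.println(readStruct(tableFields, fileValue));
    }
}
```

A real implementation would also need to recurse into nested structs and decide how each file format's record reader reports missing fields, which is the concern raised in the review comment about non-Parquet formats.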