avro.schema.literal support and some related patches#5101
lxynov wants to merge 5 commits into trinodb:master
@lxynov, do you have anything to add to this PR? Thanks.
Force-pushed from eabdc5e to 75400d3.
@findepi Could you help review this? Some commits seem hacky (due to Hive's lack of spec and docs). Feel free to Slack me if you'd like to discuss in DMs.
Can you please add a test (product test?) for partitioned tables where the table and partition schemas mismatch?
For example:
- table is ORC, partition is Parquet
- table is ORC, partition is Avro with schema url
- table is ORC, partition is Avro with schema literal
- table is ORC, partition is CSV
I was trying to add such a product test (using HQLs) which fails before the commit and succeeds after the commit, but failed to do so. But I think this commit logically makes sense and it has been deployed at our company for a long time. Please let me know if you have concerns.
presto-hive/src/main/java/io/prestosql/plugin/hive/HiveMetadata.java
Can there be any other exception flying here?
Looking at the implementation of Schema.Parser::parse(), I think catching SchemaParseException is sufficient here.
I think the reason not to support CTAS with a schema url is that we need the table to be in the metastore in order to find out what the schema actually is (at least in the current implementation).
For a schema literal we could obtain this information locally. Even if we do not do this (which I am fine with), we should keep those cases separate here, as separate ifs.
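To illustrate the "separate ifs" suggestion, here is a minimal sketch and not the code in this PR. The class `AvroCtasCheck`, the method `ctasRejectionReason`, and the message texts are hypothetical; only the property names `avro.schema.url` and `avro.schema.literal` come from Hive's Avro SerDe.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration only; not taken from HiveMetadata.
final class AvroCtasCheck
{
    // Table property names used by Hive's Avro SerDe.
    static final String AVRO_SCHEMA_URL_KEY = "avro.schema.url";
    static final String AVRO_SCHEMA_LITERAL_KEY = "avro.schema.literal";

    private AvroCtasCheck() {}

    // Returns a rejection message for CTAS, or empty if neither property is set.
    static Optional<String> ctasRejectionReason(Map<String, String> tableProperties)
    {
        // Case 1: schema url. Per the review comment, resolving the schema
        // currently requires the table to already be in the metastore.
        if (tableProperties.containsKey(AVRO_SCHEMA_URL_KEY)) {
            return Optional.of("CREATE TABLE AS is not supported when the Avro schema url is set");
        }
        // Case 2: schema literal. The schema is available locally, so this
        // branch is kept separate and could be relaxed without touching case 1.
        if (tableProperties.containsKey(AVRO_SCHEMA_LITERAL_KEY)) {
            return Optional.of("CREATE TABLE AS is not supported when the Avro schema literal is set");
        }
        return Optional.empty();
    }
}
```

With two branches, each case gets its own error message, and supporting CTAS for the literal case later would mean changing only the second `if`.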
The table/partition schema handling changes require some improvement.
Force-pushed from c68ea1c to 96ce3ef.
It would be better to limit onHive() usage to Hive-specific operations because it's slow. The same applies in other places.
This test is meant to test Hive created tables, so onHive() is used here.
presto-product-tests/src/main/java/io/prestosql/tests/hive/TestAvroSchemaLiteral.java
presto-hive/src/main/java/io/prestosql/plugin/hive/metastore/MetastoreUtil.java
Force-pushed from c175748 to 7c82037.
@ebyhr All tests have passed. This is ready for review.
@ebyhr @findepi Could you please take a look? Thanks in advance!
@findepi I thought about this before but found that commit |
👋 @lxynov - this PR has become inactive. If you're still interested in working on it, please let us know, and we can try to get reviewers to help with that. We're working on closing out old and inactive PRs, so if you're too busy or this has too many merge conflicts to be worth picking back up, we'll be making another pass to close it out in a few weeks.
Closes #5001
A related issue: prestodb/presto#9116