Fix avro test failures in prestodb project #51
Yeah, I agree. In that case, I guess we have to shade. Any suggestions on this solution?
Any chance we could bump up the Hive version in this project? Then I think we would only need to copy one class (the one you changed in Hive). But bumping Hive would cause lots of other test failures; I actually tried that.
@beinan What tests are still failing on your side? I just incorporated your change along with my change in Presto and saw that testAvro[1000] passed. I also tried the other tests that failed before, and they passed as well. Some tests in my local run still fail (test context not set for current thread), but they failed even without upgrading anything, so I don't think they are related. Do you think we can give your current change a try? If yes, you can go ahead and land it and create the jar, and I can see whether my PR in the Presto repo succeeds.
Really? I was so frustrated last week because the tests kept failing. Let me try to ping some other contributors who are more familiar with Hive and Avro to review this PR. Many thanks for your help!
@zhenxiao could you help take a look? Thanks!
    return prunedSchemas;
  }
}
No newline at end of file
Did you just copy this code? Any changes on top of that?
Hi @beinan, @zhenxiao might be very busy. Do you think we can release a SNAPSHOT version so that we can try it out on Presto? If no issues are found, we can go ahead and create a formal release. What do you think? I would like to move a little faster. Our Presto team has pushed me several times to upstream the changes I made internally for Parquet Column Index and Column Encryption, which have been running in production for quite a while. Our Presto team suffers a lot from maintaining those changes privately. It has been 1+ years since I opened PR 14960, and we still cannot upgrade the Parquet version as the first step. We need your help and the community's to relieve their pain. Thanks a lot @beinan and the community! cc @chliang71 Xinli Shang
@tdcmeehan @zhenxiao could you help take a look? This one has been blocking Xinli for more than a year. Thanks! @aweisberg is it possible for us to make a SNAPSHOT release? And do you happen to know whether the Presto CI would pick up SNAPSHOT dependencies? We want to run the Presto CI (including the Facebook integration tests) to make sure there are no other failures. Thanks!
Presto CI will build with anything that is in Maven because it is just running Maven commands. I will try to upload a SNAPSHOT artifact with this change.
Published a snapshot artifact off this branch https://pastebin.com/nJWfrAm8
Just tried it out and the issue still exists. prestodb/presto#16545 Caused by: java.lang.NoSuchMethodError: org.apache.avro.Schema$Field.<init>(Ljava/lang/String;Lorg/apache/avro/Schema;Ljava/lang/String;Lcom/facebook/presto/hive/$internal/org/codehaus/jackson/JsonNode;)V
Looks like the line number is not aligned with the code change in this PR. Any thoughts?
I think this is an issue with the Avro version. In Avro 1.8.2, the signature of the Field constructor was public Field(String name, Schema schema, String doc, JsonNode defaultValue), which is what the code in TypeInfoToSchema.java is looking for. In Avro 1.9, this signature changed to public Field(String name, Schema schema, String doc, Object defaultValue). So we need the change in TypeInfoToSchema.java, as done in Trino, to cast the last argument to type Object. cc @beinan, @shangxinli
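To make the signature mismatch concrete, here is a minimal, self-contained sketch of why the two constructors are not interchangeable at link time. The JVM resolves a constructor by its exact descriptor, so bytecode compiled against the JsonNode overload cannot bind to the Object overload and fails with NoSuchMethodError. The nested `Schema` and `JsonNode` classes and the `DescriptorDemo` name are hypothetical stand-ins so the example runs without Avro or Jackson on the classpath; they are not the real library types.

```java
import java.lang.invoke.MethodType;

// Sketch only: Schema and JsonNode below are hypothetical stand-ins,
// not org.apache.avro.Schema or the (shaded) Jackson JsonNode.
final class DescriptorDemo {
    static class Schema {}
    static class JsonNode {}

    // Descriptor shape that code compiled against the Avro 1.8.2 style
    // constructor (String, Schema, String, JsonNode) would link to.
    static String oldDescriptor() {
        return MethodType.methodType(
                void.class, String.class, Schema.class, String.class, JsonNode.class)
                .toMethodDescriptorString();
    }

    // Descriptor shape of the Avro 1.9 style constructor
    // (String, Schema, String, Object).
    static String newDescriptor() {
        return MethodType.methodType(
                void.class, String.class, Schema.class, String.class, Object.class)
                .toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println("old: " + oldDescriptor());
        System.out.println("new: " + newDescriptor());
        // The strings differ in the last parameter type, which is why the
        // old call site has nothing to link against once the overload is gone.
        System.out.println("same descriptor? " + oldDescriptor().equals(newDescriptor()));
    }
}
```

Casting the last argument to Object at the call site makes the compiler emit the second descriptor, which is the shape of the fix described above; the code still has to be recompiled against the newer Avro for that to take effect.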


This PR is a cherry-pick of https://github.com/apache/hive/pull/1715/commits
We fixed the compatibility issue caused by Avro's NULL_DEFAULT_VALUE. Many thanks to @shangxinli for his contribution to the Hive project.
This PR will fix the Avro test failures that occur when we use a newer version of Avro. See the discussion in prestodb/presto#16545. Quite a few new features of the Iceberg connector are blocked by this one.