Conversation
|
If there are test failures introduced by a commit it's better to have the fixes in the commit and not separate. Otherwise there are failing commits on master that interfere with bisecting. This seems to have introduced another dependency related failure. It's caught in the FB integration builds. |
|
I'll try and get Tempto through since it is one of the dependencies. |
|
@aweisberg Yes once the tempto changes are through I will update this PR(squash commits as well) and make sure all the test cases pass before the code is reviewed. Regarding the Facebook Integration build failure, it looks like there are some additional plugins in FB like presto-namespace which will also have to be modified for Hive 3 Upgrade. Currently, this additional plugin is using some dependencies which have been upgraded in this PR due to which we are seeing the errors. I am not sure if this plugin (presto-namespace) is open-sourced. |
|
Thanks I missed that. I thought presto-namespace was one of the OSS repos. Still trying to figure out how to release Tempto. |
|
So the issue is that we can't release Tempto because it's under We either need to find out who has the credentials for updating I can ask around about updating If you want to do the refactor of Tempto it would unblock this PR otherwise we have to wait on trying to get the credentials back. Sorry we are getting blocked on something so basic. |
|
@aweisberg Thanks for looking into this. I can take up the refactoring task and raise another PR once it's ready. |
|
Great, so the package we want is |
|
I was able to upload the snapshot artifacts. Can you rebase your PR and update it to use the 1.51-SNAPSHOT artifact and we will see if the tests complete? |
57acf75 to
ce9dfbc
Compare
|
@aweisberg I had made the respective changes. The product tests have passed but the tests for presto-main module seem to have failed. It doesn't look like that is because of the code changes in this PR, looks like it was because of some crash while the tests were running. Maybe if you can re-trigger the module, it would pass successfully. |
aweisberg
left a comment
There was a problem hiding this comment.
Only two minor change requests. The next step is for me to deploy this on a cluster and shadow some traffic to sanity check. I don't want this to get reverted during the release process.
There was a problem hiding this comment.
Maybe keep the refactor from the Trino PR? trinodb/trino@238f4b6#diff-23fd4cecda2fe420dc9a4fea750e785439046225ed9f013dee3bc123f1c9c2e5R1202
Just to keep the two code bases more consistent.
There was a problem hiding this comment.
@aweisberg I have made the refactor consistent with Trino as you requested.
There was a problem hiding this comment.
Is there a reason we don't take the latest version of BloomFilter from Trino?
There was a problem hiding this comment.
@aweisberg The latest version was not taken with the idea of making the changes gradually to understand the impact. The class being used in trino is forked from org.apache.orc.util.BloomFilter and we had been referring to org.apache.hive.common.util.BloomFilter.
Let me check the impact and the changes required and get back.
There was a problem hiding this comment.
@aweisberg I reviewed the BloomFilter Class being used in Trino. As I mentioned earlier, one of the major change is they have forked from org.apache.orc.util.BloomFilter and post that they have made some optimisations on top of that. A couple of enhancements and optimisations include changes in ORC Bloom filter hash, added support for ORC UTF8 bloom filters, real type in ORC bloom filter and a few more.
Would it be better to have separate PRs for these enhancements?
There was a problem hiding this comment.
I would have been fine with that as well :-) Thanks for doing it now. I am still trying to figure out why I can't publish Tempto. It really does seem to be uploading without error.
ce9dfbc to
115ac54
Compare
|
It's released! At long last our national nightmare is over. Bad news, there are some shading issues I think. FB integration is failing on Still looking into this to make sure we aren't pulling in org.apache.parquet directly by mistake. |
That's great news.
Sure I will also try to have a look into the same. |
7f64dcc to
8456314
Compare
aweisberg
left a comment
There was a problem hiding this comment.
LGTM. I verified this on a cluster. Turns out the issue was that the Tempto 1.51-SNAPSHOT was probably just unmodified 1.50 since I was failing it update the artifacts. Switching to the 1.51 release artifact made the dependency issues go away.
rschlussel
left a comment
There was a problem hiding this comment.
Just 2 questions, otherwise looks good.
There was a problem hiding this comment.
What's the reason for the changes for decimal?
There was a problem hiding this comment.
Constructor argument order is reversed in Hive 3. Trino also just switches the parameters around.
There was a problem hiding this comment.
why was this test removed?
There was a problem hiding this comment.
Short answer, because that is what Trino did trinodb/trino@238f4b6#diff-82a51d3cf239ee341a4eab63c8c88c82367b389a9652cbaa067631d131a60930
Long answer... I don't know. I'll bring it back and see what happens when I run it.
There was a problem hiding this comment.
The test still passes. So I guess we should keep it!
aweisberg
left a comment
There was a problem hiding this comment.
@imjalpreet can you bring back TestShardWriter.testWriterZeroRows? I am not sure why it was removed in Trino, but I brought it back and the test still passes.
There was a problem hiding this comment.
The test still passes. So I guess we should keep it!
Fix TestOrcBatchPageSourceMemoryTracking unit test failures Fork BloomFilter class from org.apache.hive.common.util.BloomFilter Fix presto-orc plugin unit test failures Co-authored-by: David Phillips <david@acz.org>
8456314 to
64fb3db
Compare
|
@aweisberg I have reverted that test removal as requested. I will try to understand why was it removed in Trino. |
|
Ignoring travis failure since we're migrating from travis. Thanks for the contribution! |
|
I was mistaken the joda upgrade was in #15649, hah I should know! |
Ah okay, no issues! |
|
This did end up failing on at least one different issue. CONTAINS on an array was returning false and never true when it should and the # of input rows counted by the aggregation for that query also didn't match 0.247. I am trying to factor out a reproducer now. |
|
Narrowing it down some, the issue with Hive 3 is that arrays of varchar are coming back in queries as an array containing a single varchar which is a , separated list of the array contents. I haven't factored that out into a reproducer yet. You can tell from the CLI because instead of seeing a space after the , between each entry it's just one long run on. |
|
Still can't reproduce, but I noticed that the format of the table that creates this issue is TEXTFILE. I can create my own TEXTFILE format table with an array it in, and select it out, but I don't see that the values in the array being merged :-( |
|
@aweisberg thanks for the detailed investigation. A bug due to correcting a misspelled property that's a first. I will try to get in the fix before the release cut most probably by today evening. |
|
Thanks @imjalpreet putting this up. While working on the fix, please take a look at #15589. It depends which PR merges first. We will need to add materialized view support for Hive 3.0 |
|
@highker Sure, I will have a look at the materialized view PR. |
|
@highker I am not 100% following. Do we need to deliver both at the same time? If we only release the Hive 3 upgrade is that ok? |
|
@aweisberg, @imjalpreet, for example, we need to model materialized views as managed tables today (#15589 (comment)). This can be changed after Hive 3.0 |
This PR includes the base changes required for upgrading to Hive 3. This will help in supporting and bringing a number of interesting features in Hive 3 to Presto in the future.
This depends on https://github.com/prestodb/presto-hive-apache/releases/tag/3.0.0-2 and the following PR in tempto (prestodb/tempto#272)
depends on https://github.com/facebookexternal/presto-facebook/pull/1414