Skip to content

Parquet Cleanup#10737

Merged
nezihyigitbasi merged 4 commits intoprestodb:masterfrom
zhenxiao:parquet-clean
Oct 15, 2018
Merged

Parquet Cleanup#10737
nezihyigitbasi merged 4 commits intoprestodb:masterfrom
zhenxiao:parquet-clean

Conversation

@zhenxiao
Copy link
Collaborator

@zhenxiao zhenxiao commented Jun 1, 2018

@dain @electrum @nezihyigitbasi now New Parquet Reader is production ready
this patch is to remove the old Parquet reader, and a cleanup of Parquet code

Copy link
Contributor

@nezihyigitbasi nezihyigitbasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zhenxiao thanks for cleaning this up. There are at least two changes in this single huge commit:

  • remove the old Parquet reader
  • cleanup/refactor

It would be great if you can split this commit up to reflect that (hope that would be easy).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why you moved these methods from the PredicateUtils class to here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is mostly just called here. Try to not have cross module dependency

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This module already depends on presto-parquet, also this class already imports PredicateUtils methods. I think we can keep these methods in their original class, the PredicateUtils.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, just found, HiveColumnHandle could not include in presto-parquet module, so I leave it here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for removing these tests?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these tests are for old Parquet reader only

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this copied from the hive module? If so, please add a comment that this is copied.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, will do

@nezihyigitbasi nezihyigitbasi self-assigned this Jun 15, 2018
Copy link
Collaborator Author

@zhenxiao zhenxiao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you, @nezihyigitbasi I will take a try

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these tests are for old Parquet reader only

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, will do

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is mostly just called here. Try to not have cross module dependency

@zhenxiao
Copy link
Collaborator Author

@nezihyigitbasi separate into 2 commits: one for remove old parquet reader, one for new Parquet reader code refactor

Copy link
Contributor

@nezihyigitbasi nezihyigitbasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove old Parquet Reader LGTM. Please update the commit title as Remove the old Parquet reader

@zhenxiao
Copy link
Collaborator Author

thank you @nezihyigitbasi
get it updated

@zhenxiao
Copy link
Collaborator Author

kindly ping @nezihyigitbasi

@nezihyigitbasi
Copy link
Contributor

nezihyigitbasi commented Jul 11, 2018

@zhenxiao can you please rebase? I plan to continue reviewing this one in the next few days.

@zhenxiao
Copy link
Collaborator Author

get it rebased. @nezihyigitbasi

Copy link
Contributor

@nezihyigitbasi nezihyigitbasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the old Parquet Reader looks good.

  • Please update the commit message as ... reader.

Copy link
Contributor

@nezihyigitbasi nezihyigitbasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor Parquet code

  • We can update the commit title as Refactor the Parquet reader, and we can put more details in the commit message body, e.g., we can say This change creates the presto-parquet module and updates presto-hive to use it.

  • Now that we have removed the old reader, we can remove the parquet_optimized_reader_enabled/hive.parquet-optimized-reader.enabled property.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This module already depends on presto-parquet, also this class already imports PredicateUtils methods. I think we can keep these methods in their original class, the PredicateUtils.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the top level project pom we use 1.1.1.7 version, why do we override that here?

@zhenxiao
Copy link
Collaborator Author

thank you, @nezihyigitbasi
comments addressed

@zhenxiao zhenxiao force-pushed the parquet-clean branch 3 times, most recently from a603af9 to fd2e9cd Compare July 19, 2018 22:49
@zhenxiao
Copy link
Collaborator Author

kindly ping

@nezihyigitbasi
Copy link
Contributor

nezihyigitbasi commented Jul 23, 2018

LGTM. But, before we can merge this I want to resolve all the open issues with the new Parquet reader as once we remove the old reader users won't have a fallback option:

@zhenxiao
Copy link
Collaborator Author

get it. Just added one more commit to remove predicate pushdown property. It is default as true for some releases.

@zhenxiao
Copy link
Collaborator Author

kindly ping

@zhenxiao
Copy link
Collaborator Author

zhenxiao commented Sep 5, 2018

kindly ping

@zhenxiao
Copy link
Collaborator Author

@nezihyigitbasi kindly ping :)

@haozhun
Copy link
Contributor

haozhun commented Sep 11, 2018

@nezihyigitbasi is out for an extended period of time.

@zhenxiao
Copy link
Collaborator Author

oh, get it. thank you @haozhun @dain are you free to review?

Copy link
Contributor

@dain dain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor typo in comments, but otherwise looks good.

Also, can you verify that #10844 is no longer an issue?

pom.xml Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you retitle this commit "Move Parquet reader to new presto-parquet module"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The javadocs here have two spaces where Parquet was removed.

@zhenxiao
Copy link
Collaborator Author

@dain thank you, I get comments addressed
#10844 should be fixed in 203+ version

@zhenxiao
Copy link
Collaborator Author

zhenxiao commented Oct 2, 2018

kindly ping @nezihyigitbasi @dain

@dain dain self-assigned this Oct 2, 2018
@zhenxiao
Copy link
Collaborator Author

kindly ping @nezihyigitbasi @dain

@dain
Copy link
Contributor

dain commented Oct 12, 2018

It looks good to me. @nezihyigitbasi do you have any additional concerns?

Copy link
Contributor

@nezihyigitbasi nezihyigitbasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM too.

@nezihyigitbasi nezihyigitbasi merged commit 7a22609 into prestodb:master Oct 15, 2018
@nezihyigitbasi
Copy link
Contributor

Merged, thanks @zhenxiao.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants