Skip to content

Pushdown nested fields#3

Merged
JamesRTaylor merged 2 commits intolyft-309from
pushdown-nested-fields
May 29, 2019
Merged

Pushdown nested fields#3
JamesRTaylor merged 2 commits intolyft-309from
pushdown-nested-fields

Conversation

@JamesRTaylor
Copy link

Please review, @billonahill. I've adapted the patch from qqibrow here to be on top of Presto 309. I'm hoping this will address the perf issue for pruning of nested fields and close the gap between Presto & BQ.

cc @puneetjaiswal

@JamesRTaylor JamesRTaylor requested a review from billonahill May 29, 2019 17:03
@billonahill
Copy link

@JamesRTaylor were you able to verify the effectiveness of this patch in staging?

@JamesRTaylor
Copy link
Author

Next step post-merge is to deploy this to next-staging on top of the branch I've already been working with so that we can test perf.

@billonahill
Copy link

billonahill commented May 29, 2019

Got it. This is a big patch and I haven't been totally up to speed with all the upstream PRs and discussions, but looking into this. Can you give a tl;dr; of some of the key areas of the code to focus on?

And which parts have we considerably changed from the upstream PR?

@billonahill
Copy link

billonahill commented May 29, 2019

👍

I gave this a cursory look but would need quite a bit more time to familiarize with the code and I don't want to block on that. Some of the comments I have would be better suited for the OSS patch, like missing tests and questions about the API/object model, so they might not be that constructive here.

I recommend putting it in staging and we can test it.

@JamesRTaylor
Copy link
Author

The main part of the patch is threading through a map of NestedColumn objects to the Hive connector and then using that in ParquetPageSource to only read what's projected. I'll get another PR together for prestoinfra to push this to next staging so we can evaluate the performance.

@JamesRTaylor JamesRTaylor merged commit 398cef2 into lyft-309 May 29, 2019
@JamesRTaylor JamesRTaylor deleted the pushdown-nested-fields branch May 29, 2019 21:16
puneetjaiswal pushed a commit that referenced this pull request Jul 31, 2019
* Pushdown dereference expression to paruqet reader (qqibrow)

* Fix TestMergeNestedColumns
bearcage pushed a commit that referenced this pull request Aug 30, 2019
* Pushdown dereference expression to paruqet reader (qqibrow)

* Fix TestMergeNestedColumns
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants