[native] Implement bucket conversion for Hive splits #23028

Merged
xiaoxmeng merged 2 commits into prestodb:master from Yuhta:tasks/T192188366/0
Jun 19, 2024
@Yuhta (Contributor) commented Jun 18, 2024

When the bucket count of a table changes over time, a single file can legitimately contain rows from multiple buckets. In such cases the query planner should set a bucket conversion on these splits, and in Velox we apply an extra filter to keep only the rows that belong to the requested bucket number.
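
To illustrate the idea, here is a minimal Python sketch of the row filter that a bucket conversion implies. It is not the Velox or Presto code from this PR: `toy_bucket`, `keep_rows_for_bucket`, and the `id` column are hypothetical names, and the modulo-based bucket function is a stand-in assumption, not Hive's actual bucketing hash. The scenario assumes a file written when the table had 4 buckets and read after the table was changed to 8 buckets.

```python
from typing import Dict, List

Row = Dict[str, int]


def toy_bucket(key: int, bucket_count: int) -> int:
    """Stand-in bucket function (assumption): NOT the real Hive hash."""
    return (key & 0x7FFFFFFF) % bucket_count


def keep_rows_for_bucket(
    rows: List[Row],
    bucket_column: str,
    table_bucket_count: int,
    bucket_to_keep: int,
) -> List[Row]:
    """Keep only rows whose bucket, under the table's current bucket
    count, equals the bucket number requested by the split."""
    return [
        row
        for row in rows
        if toy_bucket(row[bucket_column], table_bucket_count) == bucket_to_keep
    ]


# A file written as bucket 1 when the table had 4 buckets.
old_bucket_count = 4
file_rows = [{"id": k} for k in range(32) if toy_bucket(k, old_bucket_count) == 1]

# The table now has 8 buckets, so this one file mixes rows of buckets 1 and 5.
new_bucket_count = 8
rows_bucket_1 = keep_rows_for_bucket(file_rows, "id", new_bucket_count, 1)
rows_bucket_5 = keep_rows_for_bucket(file_rows, "id", new_bucket_count, 5)

print([r["id"] for r in rows_bucket_1])  # [1, 9, 17, 25]
print([r["id"] for r in rows_bucket_5])  # [5, 13, 21, 29]
```

In this toy setup, two splits read the same file with different requested buckets, and each sees a disjoint half of its rows.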

@Yuhta marked this pull request as ready for review June 18, 2024 14:35
@Yuhta requested a review from a team as a code owner June 18, 2024 14:35

@aditi-pandit (Contributor) left a comment

Thanks @Yuhta. Overall looks very good minus some documentation comments.

@aditi-pandit (Contributor)

@Yuhta: I feel we should have a release note for this PR. Could you please add details in the PR description?

xiaoxmeng previously approved these changes Jun 18, 2024

@xiaoxmeng (Contributor) left a comment

@Yuhta thanks for the fix!

aditi-pandit previously approved these changes Jun 18, 2024

@aditi-pandit (Contributor) left a comment

Thanks @Yuhta

@xiaoxmeng merged commit 0168e16 into prestodb:master Jun 19, 2024
@tdcmeehan mentioned this pull request Aug 23, 2024
34 tasks