-
Notifications
You must be signed in to change notification settings - Fork 3k
Add set-based equality and position filters #1309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
JingsongLi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @rdblue , Looks very nice.
| public static <T> CloseableIterable<T> positionSetFilter(CloseableIterable<T> rows, | ||
| Function<T, Long> rowToPosition, | ||
| CloseableIterable<Long> posDeletes) { | ||
| try (CloseableIterable<Long> deletes = posDeletes) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have a static method toLongSet(We can optimize it to primitive long set in future) in CloseableIterable(Or some other places)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the set methods as you suggested, which should make this a bit cleaner. Thanks!
| try (CloseableIterable<StructLike> deletes = eqDeletes) { | ||
| CloseableIterator<StructLike> eqDeleteIterator = deletes.iterator(); | ||
| if (eqDeleteIterator.hasNext()) { | ||
| StructLikeSet deleteSet = StructLikeSet.create(eqType); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this set can be reused? A set gets bigger and bigger in merging?
So can we have a method like addAll(CloseableIterable<StructLike>) in StructLikeSet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactored this so that the filter method accepts a set. That will make them reusable.
| CloseableIterator<StructLike> eqDeleteIterator = deletes.iterator(); | ||
| if (eqDeleteIterator.hasNext()) { | ||
| StructLikeSet deleteSet = StructLikeSet.create(eqType); | ||
| Iterators.addAll(deleteSet, eqDeleteIterator); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I get this part. Do we create StructLikeSet for every entry in eqDeletes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. The outer hasNext check is used to see whether we need to filter at all. If there are no equality deletes then we just return rows in the else case. This could happen if we filter the deletes using the scan predicates. If you're looking for a specific ID that is also used for a delete, then you only need to merge in the deletes with that ID.
To fill the delete set, we call Iterators.addAll that will add all of the remaining items from an iterator to a collection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it now.
|
Thanks for reviewing, @JingsongLi and @aokolnychyi! |
This adds set-based filter implementations for equality and position deletes. Equality deletes use the
StructLikeSetadded in #1307.