-
Notifications
You must be signed in to change notification settings - Fork 1.5k
[WIP] Merge NestedLoopJoin Build vectors #10048
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -673,7 +673,7 @@ void RowVector::resize(vector_size_t newSize, bool setNotNull) { | |
| // to skip uniqueness check since effectively we are just changing | ||
| // the length. | ||
| if (newSize > oldSize) { | ||
| VELOX_CHECK(child.unique(), "Resizing shared child vector"); | ||
| // VELOX_CHECK(child.unique(), "Resizing shared child vector"); | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why have you commented this line ?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This check makes sure if the vectors are shared we don't resize them. I will try an alternate approach like you and @Yuhta suggested. |
||
| child->resize(newSize, setNotNull); | ||
| } | ||
| } | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to merge the vector, you just need to change the loop order in probe to do the probe side first
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@karteekmurthys : Agree with Jimmy. We need to change the probe loop to generate the indices differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One approach I tried was generating all the probe side indices for total build vector size (sum of all build vectors at the time of probing). Since we compute
buildSizeas sum of all build vectors, this also takes care of generating the build side indices. Now, all we need to handle is projections which expects a single RowVector for probe and build side. It seems like at this point we are forced to merge all our build vectors to be projected at once. That is why I chose to merge vectors in NestedLoopBuild side. Please CMIW there could be a better way to map indices to vectors.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Synced with @Yuhta will try another approach and update this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline, merging build vectors help avoiding small output batches so might worth doing here. But
BaseVector::appendis not the right implementation here as it's too slow. Can you create a static method inBaseVectorlike this?Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think about it again, merging will produce batches larger than we currently use, so have a risk memory-wise. @mbasmanova Do you have any opinion here? Will merging the build side in nested loop causing any potential issue?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Yuhta, @mbasmanova : There is another option here. This grouping behavior from NestedLoopJoinNode is needed because of the use of StreamingAggregation in the original query.
We could change the LocalPlanner to use HashAggregation instead of the StreamingAggregation when used over a NestedLoopJoin instead for the correctness.
To elaborate, the original issue came from incorrect results in the following query:
SELECT count(*) FROM orders o WHERE EXISTS(SELECT 1 FROM orders i WHERE o.orderkey < i.orderkey AND i.orderkey % 1000 = 0);count(*) needed aggregation and the subquery needed NestedLoopJoin in Prestissimo.
In Presto Java -- the same JoinNode translates to LookupJoin which imposes a sorting on the HashBuild side in this code https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/sql/planner/LocalExecutionPlanner.java#L2436. So it justified the usage of StreamingAggregation.
We don't have an equivalent for such a LookupJoin in Velox and used NestedLoopJoin.
Since StreamingAggregation above NestedLoopJoin requires pre-grouping, it didn't give correct results.
As demonstrated in @karteekmurthys example
This double counts the groups for both 1 and 2 keys.
We could change in local planning to use HashAggregation instead of StreamingAggregation above NestedLoopJoin to avoid the mis-counting.
wdyt ?
@amitkdutta