-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-9990][SQL]Create local hash join operator #8535
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to use a new method name because the default parameter is in conflict with overloading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add some java doc to explain what this method's doing?
|
Test build #41824 has finished for PR 8535 at commit
|
|
@rxin do we need to make these local classes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we just use tungstenEnabled here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and the ProjectNode always uses unsafe projection, should we control that by this config?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and the ProjectNode always uses unsafe projection, should we control that by this config?
Agreed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we just use tungstenEnabled here?
Just followed SparkPlan.
|
Test build #41902 has finished for PR 8535 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think implementing the wrapper is better since it's not very complicated. Duplicate code in general is really bad and hard to maintain. We can have something like the following in LocalNode
def asIterator: Iterator[InternalRow] = new LocalNodeIterator(this)
then provide the dummy SQLMetrics.nullLongMetric
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote some code for this. Feel free to steal or come up with something better. (not tested!)
/**
* An thin wrapper around a [[LocalNode]] that provides an iterator interface.
*/
private class LocalNodeIterator(localNode: LocalNode) extends Iterator[InternalRow] {
private var nextRow: InternalRow = _
override def hasNext: Boolean = {
if (nextRow == null) {
val res = localNode.next()
if (res) {
nextRow = localNode.fetch()
}
res
} else {
true
}
}
override def next(): InternalRow = {
if (hasNext) {
val res = nextRow
nextRow = null
res
} else {
throw new NoSuchElementException
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Added LocalNodeIterator to this PR :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can you move this above the defs?
|
@zsxwing this looks pretty good. The tests don't compile for me locally so I'm guessing there's a logical merge conflict that went in since this patch last ran tests. You may have to import some implicits in your |
|
Thanks @andrewor14 for reviewing this one. I have already updated this PR to address your comments. |
|
Test build #42269 has finished for PR 8535 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: I would prefer for LocalNodeIterator to be hidden outside LocalNode so the separation is cleaner. I'll submit a follow-up patch to do this.
|
LGTM thanks for addressing all the comments quickly. I'm merging this into master. Let's address the rest in a follow-up patch. |
…, sample and intersect operators This PR is in conflict with #8535. I will update this one when #8535 gets merged. Author: zsxwing <[email protected]> Closes #8573 from zsxwing/more-local-operators.
…perators This PR is in conflict with #8535 and #8573. Will update this one when they are merged. Author: zsxwing <[email protected]> Closes #8642 from zsxwing/expand-nest-join.
This PR includes the following changes: