@zsxwing
Member

zsxwing commented Aug 31, 2015

This PR includes the following changes:

  • Add SQLConf to LocalNode
  • Add HashJoinNode
  • Add ConvertToUnsafeNode and ConvertToSafeNode to test unsafe hash join.
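
For readers unfamiliar with the operator, the general build-and-probe idea behind a hash join can be sketched as below. This is only an illustration over plain Scala collections: `HashJoinSketch`, `hashJoin`, and the sample data are hypothetical names, and nothing here is taken from the `HashJoinNode` added in this PR.

```scala
object HashJoinSketch {
  // Inner equi-join: build a hash table on the (ideally smaller) build side,
  // then stream the probe side and emit one output pair per matching build row.
  def hashJoin[K, L, R](
      build: Seq[L],
      probe: Iterator[R])(
      buildKey: L => K,
      probeKey: R => K): Iterator[(L, R)] = {
    // Build phase: group the build rows by their join key.
    val table: Map[K, Seq[L]] = build.groupBy(buildKey)
    // Probe phase: look up each probe row's key in the table.
    probe.flatMap { r =>
      table.getOrElse(probeKey(r), Seq.empty).iterator.map(l => (l, r))
    }
  }

  def main(args: Array[String]): Unit = {
    val users = Seq(1 -> "ann", 2 -> "bob")
    val orders = Iterator(2 -> "book", 1 -> "pen", 3 -> "ink", 2 -> "cup")
    // The order with key 3 has no matching user and is dropped.
    hashJoin(users, orders)(_._1, _._1).foreach(println)
  }
}
```

The probe side is consumed lazily, so only the build side has to fit in memory.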

Member Author

Need to use a new method name because the default parameter is in conflict with overloading.
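
The restriction mentioned here is a general Scala one: only one alternative of an overloaded method may declare default arguments. A minimal sketch of the workaround, assuming hypothetical method names `check` and `checkSet` (the actual method in this PR is not shown in the conversation):

```scala
object DefaultArgsVsOverloading {
  // Hypothetical helpers, not taken from the PR.
  //
  // The compiler rejects the following pair, because both overloaded
  // alternatives of `check` would define default arguments:
  //
  //   def check(expected: Seq[Int], sorted: Boolean = true): String = ...
  //   def check(expected: Set[Int], sorted: Boolean = true): String = ...
  //
  // Giving the second variant its own name sidesteps the restriction.
  def check(expected: Seq[Int], sorted: Boolean = true): String =
    s"seq(${expected.size}, sorted=$sorted)"

  def checkSet(expected: Set[Int], sorted: Boolean = true): String =
    s"set(${expected.size}, sorted=$sorted)"

  def main(args: Array[String]): Unit = {
    println(check(Seq(1, 2, 3)))
    println(checkSet(Set(1, 2), sorted = false))
  }
}
```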

Contributor

Can you add some Javadoc to explain what this method is doing?

@SparkQA

SparkQA commented Aug 31, 2015

Test build #41824 has finished for PR 8535 at commit 2ca5778.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class ConvertToSafeNode(conf: SQLConf, child: LocalNode) extends UnaryLocalNode(conf)
    • case class ConvertToUnsafeNode(conf: SQLConf, child: LocalNode) extends UnaryLocalNode(conf)
    • case class FilterNode(conf: SQLConf, condition: Expression, child: LocalNode)
    • case class HashJoinNode (
    • case class LimitNode(conf: SQLConf, limit: Int, child: LocalNode) extends UnaryLocalNode(conf)
    • abstract class LocalNode(conf: SQLConf) extends TreeNode[LocalNode] with Logging
    • abstract class LeafLocalNode(conf: SQLConf) extends LocalNode(conf)
    • abstract class UnaryLocalNode(conf: SQLConf) extends LocalNode(conf)
    • abstract class BinaryLocalNode(conf: SQLConf) extends LocalNode(conf)
    • case class ProjectNode(conf: SQLConf, projectList: Seq[NamedExpression], child: LocalNode)
    • case class SeqScanNode(conf: SQLConf, output: Seq[Attribute], data: Seq[InternalRow])
    • case class UnionNode(conf: SQLConf, children: Seq[LocalNode]) extends LocalNode(conf)

@zsxwing
Member Author

zsxwing commented Sep 1, 2015

@rxin do we need to make these local classes private[sql]?

Contributor

shall we just use tungstenEnabled here?

Contributor

and the ProjectNode always uses unsafe projection, should we control that by this config?

Member Author

> and the ProjectNode always uses unsafe projection, should we control that by this config?

Agreed.

Member Author

> shall we just use tungstenEnabled here?

Just followed SparkPlan.

@SparkQA

SparkQA commented Sep 2, 2015

Test build #41902 has finished for PR 8535 at commit aa928fd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class ConvertToSafeNode(conf: SQLConf, child: LocalNode) extends UnaryLocalNode(conf)
    • case class ConvertToUnsafeNode(conf: SQLConf, child: LocalNode) extends UnaryLocalNode(conf)
    • case class FilterNode(conf: SQLConf, condition: Expression, child: LocalNode)
    • case class HashJoinNode (
    • case class LimitNode(conf: SQLConf, limit: Int, child: LocalNode) extends UnaryLocalNode(conf)
    • abstract class LocalNode(conf: SQLConf) extends TreeNode[LocalNode] with Logging
    • abstract class LeafLocalNode(conf: SQLConf) extends LocalNode(conf)
    • abstract class UnaryLocalNode(conf: SQLConf) extends LocalNode(conf)
    • abstract class BinaryLocalNode(conf: SQLConf) extends LocalNode(conf)
    • case class ProjectNode(conf: SQLConf, projectList: Seq[NamedExpression], child: LocalNode)
    • case class SeqScanNode(conf: SQLConf, output: Seq[Attribute], data: Seq[InternalRow])
    • case class UnionNode(conf: SQLConf, children: Seq[LocalNode]) extends LocalNode(conf)

Contributor

I actually think implementing the wrapper is better since it's not very complicated. Duplicate code in general is really bad and hard to maintain. We can have something like the following in LocalNode:

def asIterator: Iterator[InternalRow] = new LocalNodeIterator(this)

then provide the dummy SQLMetrics.nullLongMetric

Contributor

I wrote some code for this. Feel free to steal or come up with something better. (not tested!)

/**
 * A thin wrapper around a [[LocalNode]] that provides an iterator interface.
 */
private class LocalNodeIterator(localNode: LocalNode) extends Iterator[InternalRow] {
  // Row fetched ahead by hasNext; null means nothing is buffered yet.
  private var nextRow: InternalRow = _

  override def hasNext: Boolean = {
    if (nextRow == null) {
      // Advance the underlying node and buffer its row if one is available.
      val res = localNode.next()
      if (res) {
        nextRow = localNode.fetch()
      }
      res
    } else {
      true
    }
  }

  override def next(): InternalRow = {
    if (hasNext) {
      // Hand out the buffered row and clear the buffer for the next call.
      val res = nextRow
      nextRow = null
      res
    } else {
      throw new NoSuchElementException
    }
  }
}
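
To see the wrapper pattern run without Spark on the classpath, here is a self-contained sketch. It assumes a toy `ToyNode` trait that models only the `next()`/`fetch()` pair used above, with `Int` rows instead of `InternalRow`; `ToyNode`, `ToySeqScan`, and `ToyNodeIterator` are hypothetical names, not classes from this PR. Because the rows are primitives, it tracks buffering with a flag rather than `null`.

```scala
object LocalNodeIteratorSketch {
  // Hypothetical stand-in for LocalNode: only next()/fetch() are modeled.
  trait ToyNode {
    def next(): Boolean // advance; true if a row is now available
    def fetch(): Int    // return the current row
  }

  // Toy leaf node that scans an in-memory sequence.
  final class ToySeqScan(data: Seq[Int]) extends ToyNode {
    private var pos = -1
    override def next(): Boolean = { pos += 1; pos < data.length }
    override def fetch(): Int = data(pos)
  }

  // Same shape as the wrapper above, with flags instead of a null check.
  final class ToyNodeIterator(node: ToyNode) extends Iterator[Int] {
    private var buffered = false
    private var exhausted = false
    private var current = 0

    override def hasNext: Boolean = {
      if (!buffered && !exhausted) {
        // Pull one row ahead so that repeated hasNext calls stay idempotent.
        if (node.next()) { current = node.fetch(); buffered = true }
        else exhausted = true
      }
      buffered
    }

    override def next(): Int = {
      if (!hasNext) throw new NoSuchElementException
      buffered = false
      current
    }
  }

  def main(args: Array[String]): Unit = {
    val it = new ToyNodeIterator(new ToySeqScan(Seq(1, 2, 3)))
    println(it.hasNext)
    println(it.toList)
  }
}
```

The `exhausted` flag also keeps the iterator from calling `next()` on the node again after it has reported the end of its input, which the null-based version does not guard against.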

Member Author

Thanks. Added LocalNodeIterator to this PR :)

Contributor

nit: can you move this above the defs?

@andrewor14
Contributor

@zsxwing this looks pretty good. The tests don't compile for me locally so I'm guessing there's a logical merge conflict that went in since this patch last ran tests. You may have to import some implicits in your HashJoinNodeSuite.

@zsxwing
Copy link
Member Author

zsxwing commented Sep 10, 2015

Thanks @andrewor14 for reviewing this one. I have already updated this PR to address your comments.

@SparkQA

SparkQA commented Sep 10, 2015

Test build #42269 has finished for PR 8535 at commit fcec297.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class ConvertToSafeNode(conf: SQLConf, child: LocalNode) extends UnaryLocalNode(conf)
    • case class ConvertToUnsafeNode(conf: SQLConf, child: LocalNode) extends UnaryLocalNode(conf)
    • case class FilterNode(conf: SQLConf, condition: Expression, child: LocalNode)
    • case class HashJoinNode(
    • case class LimitNode(conf: SQLConf, limit: Int, child: LocalNode) extends UnaryLocalNode(conf)
    • abstract class LocalNode(conf: SQLConf) extends TreeNode[LocalNode] with Logging
    • abstract class LeafLocalNode(conf: SQLConf) extends LocalNode(conf)
    • abstract class UnaryLocalNode(conf: SQLConf) extends LocalNode(conf)
    • abstract class BinaryLocalNode(conf: SQLConf) extends LocalNode(conf)
    • case class ProjectNode(conf: SQLConf, projectList: Seq[NamedExpression], child: LocalNode)
    • case class SeqScanNode(conf: SQLConf, output: Seq[Attribute], data: Seq[InternalRow])
    • case class UnionNode(conf: SQLConf, children: Seq[LocalNode]) extends LocalNode(conf)

Contributor

minor: I would prefer LocalNodeIterator to be hidden from code outside LocalNode so the separation is cleaner. I'll submit a follow-up patch to do this.

@andrewor14
Contributor

LGTM, thanks for addressing all the comments quickly. I'm merging this into master. Let's address the rest in a follow-up patch.

@asfgit asfgit closed this in d88abb7 Sep 10, 2015
@zsxwing zsxwing deleted the SPARK-9990 branch September 11, 2015 01:34
asfgit pushed a commit that referenced this pull request Sep 11, 2015
…, sample and intersect operators

This PR is in conflict with #8535. I will update this one when #8535 gets merged.

Author: zsxwing

Closes #8573 from zsxwing/more-local-operators.
asfgit pushed a commit that referenced this pull request Sep 14, 2015
…perators

This PR is in conflict with #8535 and #8573. Will update this one when they are merged.

Author: zsxwing

Closes #8642 from zsxwing/expand-nest-join.