Skip to content
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -242,7 +242,13 @@ abstract class Expression extends TreeNode[Expression] {
* This means that the lazy `cannonicalized` is called and computed only on the root of the
* adjacent expressions.
*/
lazy val canonicalized: Expression = {
lazy val canonicalized: Expression = withCanonicalizedChildren

/**
* The default process of canonicalization. It is a one pass, bottum-up expression tree
* computation based oncanonicalizing children before canonicalizing the current node.
*/
final protected def withCanonicalizedChildren: Expression = {
val canonicalizedChildren = children.map(_.canonicalized)
withNewChildren(canonicalizedChildren)
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -477,7 +477,10 @@ case class Add(
override protected def withNewChildrenInternal(newLeft: Expression, newRight: Expression): Add =
copy(left = newLeft, right = newRight)

override lazy val canonicalized: Expression = {
override lazy val canonicalized: Expression = dataType match {
case _: DecimalType =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add some comments to explain the reason?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you @gengliangwang for the catching.
can we make it more fine-grained ? Not all decimal add will fail, so we can check if we can reorder them safely. e.g., precision and scale in all left and right are same.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan comment added

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ulysses-you Are you sure about that? My concern is that if both left and right contains integer contains decimal Adds, the result may still be different after sorting all the sub Adds

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding an extra Cast into the canonicalized form if needed like:

  override lazy val canonicalized: Expression = {
    // TODO: do not reorder consecutive `Add`s with different `evalMode`
    val reordered = orderCommutative({ case Add(l, r, _) => Seq(l, r) }).reduce(Add(_, _, evalMode))
    if (dataType != reordered.dataType) {
      Cast(reordered, dataType)
    } else {
      reordered
    }
  }

Copy link
Contributor

@peter-toth peter-toth Oct 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, maybe adding an extra Cast is not a good idea as the 2 expressions with different dataTypes shouldn't be considered equal, but if reordered's data type matches the original then why can't we reorder?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peter-toth The ideal solution would be adding extra casts in all the canonicalization of the children ofComplexTypeMergingExpression if the data type is changed. However, there are also overriding in some of the ComplexTypeMergingExpression.
So I would take your suggestion to reorder the Add if the result data type is not changed. Thank you.

withCanonicalizedChildren
case _ =>
// TODO: do not reorder consecutive `Add`s with different `evalMode`
orderCommutative({ case Add(l, r, _) => Seq(l, r) }).reduce(Add(_, _, evalMode))
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4518,6 +4518,14 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
}
}
}

test("SPARK-40903: Don't reorder Add for canonicalize if it is decimal type") {
val tableName = "decimalTable"
withTable(tableName) {
sql(s"create table $tableName(a decimal(12, 5), b decimal(12, 6)) using orc")
checkAnswer(sql(s"select sum(coalesce(a+b+ 1.75, a)) from $tableName"), Row(null))
}
}
}

case class Foo(bar: Option[String])