@xy-xin xy-xin commented Jun 20, 2020

What changes were proposed in this pull request?

This PR adds support for an unlimited number of MATCHED and NOT MATCHED clauses in the MERGE INTO statement.

Why are the changes needed?

Currently, the MERGE INTO syntax is:

MERGE INTO [db_name.]target_table [AS target_alias]
 USING [db_name.]source_table [<time_travel_version>] [AS source_alias]
 ON <merge_condition>
 [ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ]
 [ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ]
 [ WHEN NOT MATCHED [ AND <condition> ] THEN <not_matched_action> ]

It would be nice to support an unlimited number of MATCHED and NOT MATCHED clauses in the MERGE INTO statement, because users may want to handle several different "AND <condition>" cases; the result behaves like a series of "CASE WHEN" branches. The expected syntax looks like:

MERGE INTO [db_name.]target_table [AS target_alias]
 USING [db_name.]source_table [<time_travel_version>] [AS source_alias]
 ON <merge_condition>
 [when_matched_clause [, ...]]
 [when_not_matched_clause [, ...]]

where when_matched_clause is

WHEN MATCHED [ AND <condition> ] THEN <matched_action>

and when_not_matched_clause is

WHEN NOT MATCHED [ AND <condition> ] THEN <not_matched_action>

matched_action can be one of

DELETE
UPDATE SET *
UPDATE SET col1 = value1 [, col2 = value2, ...]

and not_matched_action can be one of

INSERT *
INSERT (col1 [, col2, ...]) VALUES (value1 [, value2, ...])
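
As an illustration of the proposed syntax, the following is a minimal sketch of a statement with three MATCHED and two NOT MATCHED clauses, held in a Scala string. The tables `target` and `source` and their columns are hypothetical, and the sketch assumes the clauses are simply written one after another, as in the existing syntax. Only the last clause of each kind omits its condition.

```scala
// Hypothetical example: `target`, `source` and their columns are made up
// for illustration and do not come from the PR.
object MergeIntoExample {
  val mergeSql: String =
    """MERGE INTO target AS t
      |USING source AS s
      |ON t.id = s.id
      |WHEN MATCHED AND s.op = 'delete' THEN DELETE
      |WHEN MATCHED AND s.op = 'rename' THEN UPDATE SET t.name = s.name
      |WHEN MATCHED THEN UPDATE SET *
      |WHEN NOT MATCHED AND s.name IS NULL THEN INSERT (id) VALUES (s.id)
      |WHEN NOT MATCHED THEN INSERT *
      |""".stripMargin
}
```

Running such a statement would additionally need a data source that implements MERGE INTO, which is outside the scope of this sketch.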

Does this PR introduce any user-facing change?

Yes. The SQL command changes, but it is backward compatible.

How was this patch tested?

New tests added.

Author

xy-xin commented Jun 20, 2020

@cloud-fan @brkyvz , pls take a look.

SparkQA commented Jun 20, 2020

Test build #124306 has finished for PR 28875 at commit e18a7a5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class DeleteAction(override val condition: Option[Expression]) extends MergeAction(condition)

}
-if (matchedActions.groupBy(_.getClass).mapValues(_.size).exists(_._2 > 1)) {
+val matchedActionSize = matchedActions.length
+if (matchedActionSize >= 2 && !matchedActions.init.forall(_.condition.nonEmpty)) {
Contributor

we can still write .children.isEmpty, then we don't need to change v2Commands.scala

Author

I don't think so, because the children of InsertAction and UpdateAction include both the condition and the assignments. An action can have assignments but no condition, in which case children is non-empty even though the condition is omitted.
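
The difference between the two checks can be sketched with simplified stand-ins for the Catalyst classes. Everything below is a model of the discussion, not Spark's actual code: `Expr`, `Literal` and `Assignment` are hypothetical minimal types, and `children` is assumed to be the condition followed by the assignments, as described in the comment above.

```scala
// Simplified stand-ins for Catalyst classes; the real ones live in Spark
// and carry much more behaviour. All names here only model the discussion.
object ChildrenVsCondition {
  sealed trait Expr { def children: Seq[Expr] }

  final case class Literal(value: Any) extends Expr {
    def children: Seq[Expr] = Nil
  }

  final case class Assignment(key: Expr, value: Expr) extends Expr {
    def children: Seq[Expr] = Seq(key, value)
  }

  sealed trait MergeAction extends Expr {
    def condition: Option[Expr]
  }

  // DELETE has no assignments, so its children are just the condition.
  final case class DeleteAction(condition: Option[Expr]) extends MergeAction {
    def children: Seq[Expr] = condition.toSeq
  }

  // UPDATE contributes its assignments to children as well.
  final case class UpdateAction(
      condition: Option[Expr],
      assignments: Seq[Assignment]) extends MergeAction {
    def children: Seq[Expr] = condition.toSeq ++ assignments
  }

  val unconditionalDelete: DeleteAction = DeleteAction(None)
  val unconditionalUpdate: UpdateAction =
    UpdateAction(None, Seq(Assignment(Literal("col1"), Literal(1))))
}
```

In this model `unconditionalDelete.children.isEmpty` is true, but `unconditionalUpdate.children.isEmpty` is false although its condition is also missing, so only `condition.isEmpty` detects an omitted condition for every action type.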

Contributor

then this is an existing bug?

Author

Yes, it was a bug.

Contributor

@cloud-fan cloud-fan Jun 29, 2020

can you send a new PR against branch 3.0 to fix this bug?

Author

Submitted a pr at #28943.

SparkQA commented Jun 23, 2020

Test build #124396 has finished for PR 28875 at commit ab97e31.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jun 23, 2020

Test build #124379 has finished for PR 28875 at commit a6ac363.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jun 23, 2020

Test build #124381 has finished for PR 28875 at commit 1d39c92.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

}

-test("merge into table: the first matched clause must have a condition if there's a second") {
+test("merge into table: only the last matched clause can omit the condition") {
Contributor

shall we test the same thing for not matched clause?

Author

done.
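
The rule these tests exercise, that only the last clause of a kind may omit its condition, can be sketched as a small standalone check. This is an illustration that mirrors the `init.forall(_.condition.nonEmpty)` expression from the diff; `Clause` and `onlyLastMayOmitCondition` are hypothetical names, and the sketch assumes the same rule applies to NOT MATCHED clauses.

```scala
object ClauseValidation {
  // Hypothetical stand-in for a parsed clause: only the presence of an
  // "AND <condition>" part matters for this check.
  final case class Clause(condition: Option[String])

  // True when every clause except the last one carries a condition.
  // `init` drops the last element and is only reached for two or more clauses.
  def onlyLastMayOmitCondition(clauses: Seq[Clause]): Boolean =
    clauses.length < 2 || clauses.init.forall(_.condition.nonEmpty)
}
```

For example, a list whose first clause has a condition and whose second does not passes the check, while the reverse order fails it. The same function could be applied once to the MATCHED clauses and once to the NOT MATCHED clauses.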


sealed abstract class MergeAction(
-condition: Option[Expression]) extends Expression with Unevaluable {
+val condition: Option[Expression]) extends Expression with Unevaluable {
Contributor

to simplify the code a little bit:

sealed trait MergeAction extends Expression with Unevaluable {
  def condition: Option[Expression]
  ...
}

Then we can just write

case class DeleteAction(condition: Option[Expression]) extends MergeAction

Author

done.
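
For comparison, the two shapes discussed in this thread can be put side by side in a self-contained sketch. `Expression` is stubbed as an empty trait here (the real Catalyst class is far richer, and `Unevaluable` is left out), and the `Before`/`After` wrapper objects are hypothetical names used only to keep both variants in one file.

```scala
object MergeActionShapes {
  // Stub for Catalyst's Expression; an assumption made to keep this runnable.
  trait Expression

  // Constructor-parameter shape: each subclass overrides `condition`
  // and also forwards it to the parent constructor.
  object Before {
    sealed abstract class MergeAction(val condition: Option[Expression])
      extends Expression
    case class DeleteAction(override val condition: Option[Expression])
      extends MergeAction(condition)
  }

  // Abstract-member shape: the case class field implements `condition` directly.
  object After {
    sealed trait MergeAction extends Expression {
      def condition: Option[Expression]
    }
    case class DeleteAction(condition: Option[Expression]) extends MergeAction
  }
}
```

Both variants expose `condition` to callers in the same way; the second one only removes the `override val` and the forwarding to the parent constructor.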

SparkQA commented Jun 28, 2020

Test build #124585 has finished for PR 28875 at commit d5edef3.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • sealed abstract class MergeAction extends Expression with Unevaluable
  • case class DeleteAction(condition: Option[Expression]) extends MergeAction

Author

xy-xin commented Jun 28, 2020

retest it please.

@cloud-fan
Contributor

retest it please

@cloud-fan
Contributor

ok to test

SparkQA commented Jun 29, 2020

Test build #5046 has finished for PR 28875 at commit d5edef3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • sealed abstract class MergeAction extends Expression with Unevaluable
  • case class DeleteAction(condition: Option[Expression]) extends MergeAction

SparkQA commented Jun 29, 2020

Test build #5047 has finished for PR 28875 at commit d5edef3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • sealed abstract class MergeAction extends Expression with Unevaluable
  • case class DeleteAction(condition: Option[Expression]) extends MergeAction

SparkQA commented Jun 29, 2020

Test build #124625 has finished for PR 28875 at commit d5edef3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • sealed abstract class MergeAction extends Expression with Unevaluable
  • case class DeleteAction(condition: Option[Expression]) extends MergeAction

@cloud-fan
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 20cd47e Jun 29, 2020
@cloud-fan cloud-fan changed the title from "[SPARK-32030][SQL] Support unlimited MATCHED and NOT MATCHED clauses in MERGE INTO" to "[SPARK-32030][SPARK-32127][SQL] Support unlimited MATCHED and NOT MATCHED clauses in MERGE INTO" Jun 29, 2020
Author

xy-xin commented Jun 30, 2020

Thanks @cloud-fan !
