Skip to content

Conversation

@xuanyuanking
Copy link
Member

What changes were proposed in this pull request?

Define a new enumeration LegacyBehaviorPolicy in SQLConf, it will be used as the common value for result change configs.

Why are the changes needed?

During API auditing for the 3.0 release, we found several new approaches that will change the results silently. For these features, we need a common three-value config.

Does this PR introduce any user-facing change?

Yes, original config spark.sql.legacy.ctePrecedence.enabled change to spark.sql.legacy.ctePrecedencePolicy.

How was this patch tested?

Existing UT.

@xuanyuanking
Copy link
Member Author

cc @viirya @dongjoon-hyun @cloud-fan
After further thinking, it's strange in #27454 to use Option[Boolean] to present a three-value config, especially we have other similar works in the future(e.g. #27537)
In this PR, I propose LegacyBehaviorPolicy as the common config value for these similar issues. Please take a look. Thanks.

@cloud-fan
Copy link
Contributor

yea this looks clearer, +1


val LEGACY_CTE_PRECEDENCE_ENABLED = buildConf("spark.sql.legacy.ctePrecedence.enabled")
object LegacyBehaviorPolicy extends Enumeration {
val EXCEPTION, LEGACY, NEW_BEHAVIOR = Value
Copy link
Member

@viirya viirya Feb 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NEW_BEHAVIOR sounds not a good name. INNER_FIRST?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or INNER_PRECEDENCE?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the advice @viirya. Yes, both INNER_FIRST and INNER_PRECEDENCE are more accurate in this PR, but I also want to make this enum as a common config value, which can be reused in other similar issues. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about CORRECTED to indicate that it's the corrected behavior?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, change to CORRECTED in 47edd88

@viirya
Copy link
Member

viirya commented Feb 14, 2020

+1 this is better.

s"Please set ${LEGACY_CTE_PRECEDENCE_ENABLED.key} to false so that name defined " +
"in inner CTE takes precedence. See more details in SPARK-28228.")
s"Please set ${LEGACY_CTE_PRECEDENCE_POLICY.key} to NEW_BEHAVIOR so that name " +
"defined in inner CTE takes precedence. See more details in SPARK-28228.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since it's enum now, maybe also explain what will happen if set it to legacy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, done in 47edd88.

@SparkQA
Copy link

SparkQA commented Feb 14, 2020

Test build #118414 has finished for PR 27579 at commit af7827c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

"false, inner CTE definitions take precedence. The default value is empty, " +
.doc("When LEGACY, outer CTE definitions takes precedence over inner definitions. If set to " +
"CORRECTED, inner CTE definitions take precedence. The default value is EXCEPTION, " +
"AnalysisException is thrown while name conflict is detected in nested CTE.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's add one more sentence: This config will be removed in future versions and CORRECTED will be the only behavior.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, added in bc70f75.

@@ -1,2 +1,2 @@
--SET spark.sql.legacy.ctePrecedence.enabled = false
--SET spark.sql.legacy.ctePrecedencePolicy = new_behavior
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

corrected

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, done in bc70f75.

@SparkQA
Copy link

SparkQA commented Feb 17, 2020

Test build #118542 has finished for PR 27579 at commit 47edd88.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 17, 2020

Test build #118565 has finished for PR 27579 at commit bc70f75.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan cloud-fan closed this in e4a541b Feb 17, 2020
@cloud-fan
Copy link
Contributor

thanks, merging to master/3.0!

cloud-fan pushed a commit that referenced this pull request Feb 17, 2020
…mon value for result change configs

### What changes were proposed in this pull request?
Define a new enumeration `LegacyBehaviorPolicy` in SQLConf, it will be used as the common value for result change configs.

### Why are the changes needed?
During API auditing for the 3.0 release, we found several new approaches that will change the results silently. For these features, we need a common three-value config.

### Does this PR introduce any user-facing change?
Yes, original config `spark.sql.legacy.ctePrecedence.enabled` change to `spark.sql.legacy.ctePrecedencePolicy`.

### How was this patch tested?
Existing UT.

Closes #27579 from xuanyuanking/SPARK-30829.

Authored-by: Yuanjian Li <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit e4a541b)
Signed-off-by: Wenchen Fan <[email protected]>
@xuanyuanking
Copy link
Member Author

Thanks for the review.

@xuanyuanking xuanyuanking deleted the SPARK-30829 branch February 18, 2020 02:09
HyukjinKwon pushed a commit that referenced this pull request Mar 4, 2020
…l.legacy.ctePrecedencePolicy`

### What changes were proposed in this pull request?
Fix the migration guide document for `spark.sql.legacy.ctePrecedence.enabled`, which is introduced in #27579.

### Why are the changes needed?
The config value changed.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Document only.

Closes #27782 from xuanyuanking/SPARK-30829-follow.

Authored-by: Yuanjian Li <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
HyukjinKwon pushed a commit that referenced this pull request Mar 4, 2020
…l.legacy.ctePrecedencePolicy`

### What changes were proposed in this pull request?
Fix the migration guide document for `spark.sql.legacy.ctePrecedence.enabled`, which is introduced in #27579.

### Why are the changes needed?
The config value changed.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Document only.

Closes #27782 from xuanyuanking/SPARK-30829-follow.

Authored-by: Yuanjian Li <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit f7f1948)
Signed-off-by: HyukjinKwon <[email protected]>
sjincho pushed a commit to sjincho/spark that referenced this pull request Apr 15, 2020
…mon value for result change configs

### What changes were proposed in this pull request?
Define a new enumeration `LegacyBehaviorPolicy` in SQLConf, it will be used as the common value for result change configs.

### Why are the changes needed?
During API auditing for the 3.0 release, we found several new approaches that will change the results silently. For these features, we need a common three-value config.

### Does this PR introduce any user-facing change?
Yes, original config `spark.sql.legacy.ctePrecedence.enabled` change to `spark.sql.legacy.ctePrecedencePolicy`.

### How was this patch tested?
Existing UT.

Closes apache#27579 from xuanyuanking/SPARK-30829.

Authored-by: Yuanjian Li <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
sjincho pushed a commit to sjincho/spark that referenced this pull request Apr 15, 2020
…l.legacy.ctePrecedencePolicy`

### What changes were proposed in this pull request?
Fix the migration guide document for `spark.sql.legacy.ctePrecedence.enabled`, which is introduced in apache#27579.

### Why are the changes needed?
The config value changed.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Document only.

Closes apache#27782 from xuanyuanking/SPARK-30829-follow.

Authored-by: Yuanjian Li <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants