
Conversation

@cloud-fan
Contributor

What changes were proposed in this pull request?

Re-arrange the parser rules to make it clear that multiple unit TO unit statements, such as SELECT INTERVAL '1-1' YEAR TO MONTH '2-2' YEAR TO MONTH, are not allowed.

Why are the changes needed?

Supporting such a weird syntax was clearly an accident. It is not supported by any other database, and I can't think of any use case for it. Also, no test in the current codebase covers this syntax.

Does this PR introduce any user-facing change?

Yes, and a migration guide item is added.

How was this patch tested?

New tests.
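
For context, here is a minimal sketch of the behavior change (assuming a local SparkSession named spark; this is illustrative, not code from the PR): a single unit TO unit clause still parses, while chaining two of them is rejected by the parser.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.catalyst.parser.ParseException

    val spark = SparkSession.builder().master("local[1]").appName("interval-demo").getOrCreate()

    // Still allowed: a single unit TO unit clause.
    spark.sql("SELECT INTERVAL '1-1' YEAR TO MONTH").show()

    // Rejected after this change: multiple unit TO unit clauses in one literal.
    try {
      spark.sql("SELECT INTERVAL '1-1' YEAR TO MONTH '2-2' YEAR TO MONTH").show()
    } catch {
      case e: ParseException => println("rejected: " + e.getMessage)
    }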

@cloud-fan
Contributor Author

cc @MaxGekk @dongjoon-hyun @maropu

@cloud-fan
Contributor Author

retest this please

@shaneknapp
Contributor

test this please

@SparkQA

SparkQA commented Oct 28, 2019

Test build #112787 has finished for PR 26285 at commit ca9073d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member

maropu commented Oct 28, 2019

retest this please


- Since Spark 3.0, the `size` function returns `NULL` for the `NULL` input. In Spark version 2.4 and earlier, this function gives `-1` for the same input. To restore the behavior before Spark 3.0, you can set `spark.sql.legacy.sizeOfNull` to `true`.

- Since Spark 3.0, the interval literal syntax does not allow multiple unit TO unit statements anymore. For example, `SELECT INTERVAL '1-1' YEAR TO MONTH '2-2' YEAR TO MONTH'` throws parser exception.
Member

super nit: I think it's a little difficult to see the word boundaries in "... multiple unit TO unit statements anymore.", so how about multiple unit-TO-unit statements anymore or multiple "unit TO unit" statements anymore?

-    : {ansi}? INTERVAL? intervalField+
-    | {!ansi}? INTERVAL intervalField*
+    : {ansi}? INTERVAL? (errorCapturingMultiUnitsInterval | errorCapturingUnitToUnitInterval)
+    | {!ansi}? INTERVAL (errorCapturingMultiUnitsInterval | errorCapturingUnitToUnitInterval)
@maropu
Member

maropu commented Oct 29, 2019

Just out of curiosity: is it bad to just drop + and * for a simpler fix?

interval
    : {ansi}? INTERVAL? intervalField
    | {!ansi}? INTERVAL intervalField

Contributor Author

intervalField can be a single-unit statement like 1 year, which is allowed to repeat.
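
For illustration (assuming a SparkSession named spark is in scope; the query is not from this PR), the repetition is what keeps multi-unit literals working:

    // Several single-unit fields can still be chained in one interval literal.
    spark.sql("SELECT INTERVAL 1 year 2 month 3 day").show()
    // Only the unit TO unit form is now limited to a single occurrence.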

Member

Oh, I see. Thanks!

@maropu
Member

maropu left a comment

+1 for dropping the support. LGTM except for minor comments.

@SparkQA

SparkQA commented Oct 29, 2019

Test build #112805 has finished for PR 26285 at commit ca9073d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member

maropu commented Oct 29, 2019

FYI: the test failure is not related to this PR: #26287

case _ =>
throw new ParseException(s"Intervals $from TO $to are not supported.", ctx)
}
validate(interval != null, "No interval can be constructed", ctx)
Member

Which of fromYearMonthString or fromDayTimeString can return null? They either throw an exception or return an interval.

"Can only have a single unit TO unit statement in the interval literal syntax",
innerCtx.unitToUnitInterval)
}
Literal(visitMultiUnitsInterval(innerCtx.multiUnitsInterval))
Member

Could you specify CalendarIntervalType to avoid the unnecessary pattern matching that infers the type from the provided value?
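
A rough sketch of what this suggestion means (illustrative only, not the code in the PR; the three-argument CalendarInterval constructor assumed here is the Spark 3.x form):

    import org.apache.spark.sql.catalyst.expressions.Literal
    import org.apache.spark.sql.types.CalendarIntervalType
    import org.apache.spark.unsafe.types.CalendarInterval

    val iv = new CalendarInterval(13, 0, 0L)  // 1 year 1 month, just for illustration

    // Inferred: Literal.apply pattern-matches on the runtime value to pick a type.
    val inferred = Literal(iv)

    // Explicit: the data type is given up front, skipping that inference.
    val explicit = Literal(iv, CalendarIntervalType)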

@SparkQA

SparkQA commented Oct 29, 2019

Test build #112861 has finished for PR 26285 at commit 2b79f9c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 30, 2019

Test build #112886 has finished for PR 26285 at commit 9ab9554.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 30, 2019

Test build #112885 has finished for PR 26285 at commit 940c0be.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class ResolveCoalesceHints(conf: SQLConf) extends Rule[LogicalPlan]

@cloud-fan
Contributor Author

retest this please

@SparkQA

SparkQA commented Oct 30, 2019

Test build #112896 has finished for PR 26285 at commit 9ab9554.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

-    : {ansi}? INTERVAL? intervalField+
-    | {!ansi}? INTERVAL intervalField*
+    : INTERVAL (errorCapturingMultiUnitsInterval | errorCapturingUnitToUnitInterval)?
+    | {ansi}? (errorCapturingMultiUnitsInterval | errorCapturingUnitToUnitInterval)
Contributor Author

{ansi}? (errorCapturingMultiUnitsInterval | errorCapturingUnitToUnitInterval)? is illegal as it can match anything. Note that we need to make (errorCapturingMultiUnitsInterval | errorCapturingUnitToUnitInterval) optional so that we can detect a bare select interval and give a precise error message.
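
As a sketch of the error-message point (assuming a SparkSession named spark; the exact message wording may differ):

    import org.apache.spark.sql.catalyst.parser.ParseException

    try {
      spark.sql("select interval")
    } catch {
      // Because the rule still matches a bare INTERVAL keyword, the parser can
      // report an interval-specific error (missing time units) rather than a
      // generic syntax error.
      case e: ParseException => println(e.getMessage)
    }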


 // Unknown FROM TO intervals
-intercept("interval 10 month to second",
+intercept("interval '10' month to second",
Contributor Author

The from-to unit syntax can only handle string values ('1-2' year to month, '1 12' day to hour).

@cloud-fan
Contributor Author

cloud-fan commented Oct 30, 2019

If you do write interval 10 month to second, we will ask users to use a string instead; see https://github.com/apache/spark/pull/26285/files#diff-4f9e28af8e9fcb40a8a99b4e49f3b9b2R612
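
A quick sketch of the two forms (assuming a SparkSession named spark; queries are illustrative):

    // String values work with the from-to syntax.
    spark.sql("SELECT INTERVAL '1-2' YEAR TO MONTH").show()
    spark.sql("SELECT INTERVAL '1 12' DAY TO HOUR").show()

    // A numeric value followed by a from-to unit is rejected; as noted above,
    // the error asks for the string form instead.
    // spark.sql("SELECT INTERVAL 10 MONTH TO SECOND")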

@SparkQA

SparkQA commented Oct 30, 2019

Test build #112918 has finished for PR 26285 at commit 51d9b45.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 30, 2019

Test build #112916 has finished for PR 26285 at commit 356d19b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 30, 2019

Test build #112917 has finished for PR 26285 at commit 454b673.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

})

val e1 = intercept[AnalysisException] {
sql("select interval")
Contributor Author

moved to literals.sql

@SparkQA

SparkQA commented Oct 30, 2019

Test build #112928 has finished for PR 26285 at commit 9ec8f63.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 30, 2019

Test build #112949 has finished for PR 26285 at commit e560cd8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor Author

retest this please

@SparkQA

SparkQA commented Oct 31, 2019

Test build #113005 has finished for PR 26285 at commit e560cd8.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member

maropu commented Oct 31, 2019

retest this please

@SparkQA

SparkQA commented Oct 31, 2019

Test build #113027 has finished for PR 26285 at commit e560cd8.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor Author

I believe the SparkR failure is not related. Any more comments before merging?

@SparkQA

SparkQA commented Nov 1, 2019

Test build #113088 has finished for PR 26285 at commit 044257d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • trait HasRelativeError extends Params
  • class _ImputerParams(HasInputCol, HasInputCols, HasOutputCol, HasOutputCols, HasRelativeError):
  • class _RobustScalerParams(HasInputCol, HasOutputCol, HasRelativeError):
  • class HasRelativeError(Params):
  • case class AlterTableRenamePartitionStatement(
  • case class AlterTableDropPartitionStatement(
  • case class ShowColumnsStatement(
  • case class DataSourceV2ScanRelation(
  • case class OptimizeLocalShuffleReader(conf: SQLConf) extends Rule[SparkPlan]

@cloud-fan
Contributor Author

thanks for the review, merging to master!

@cloud-fan cloud-fan closed this in 31ae446 Nov 2, 2019