Skip to content

Conversation

@MaxGekk
Copy link
Member

@MaxGekk MaxGekk commented Oct 30, 2019

What changes were proposed in this pull request?

In the PR, I propose to refactor code related to the IntervalUtils.fromDayTimeString() method by:

  • making it more generic and independent from the requested range of interval units
  • introducing enumeration for unit names. This should allow to avoid typos in unit names.
  • extracting unit value properties from fromDayTimeString() like min/max values and a function for converting parsed values to microseconds using exact math methods.

This PR includes code from #26313

Why are the changes needed?

  • To improve maintainability of fromDayTimeString() and related code in IntervalUtils
  • Avoid adhoc code

Does this PR introduce any user-facing change?

No

How was this patch tested?

By existing test in IntervalUtilsSuite and DDLParserSuite.

}

// Parses a string with nanoseconds, truncates the result and returns microseconds
private def parseNanos(nanosStr: String, isNegative: Boolean): Long = {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code has been borrowed from #26313

Long.MinValue / DateTimeUtils.MICROS_PER_SECOND,
Long.MaxValue / DateTimeUtils.MICROS_PER_SECOND) * DateTimeUtils.MICROS_PER_SECOND
}
def parseNanos(nanosStr: String): Long = {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes from #26313

@MaxGekk
Copy link
Member Author

MaxGekk commented Oct 30, 2019

@dongjoon-hyun @srowen @cloud-fan @HyukjinKwon Could you take a look at this, please.

@SparkQA
Copy link

SparkQA commented Oct 30, 2019

Test build #112960 has finished for PR 26327 at commit c3fe563.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@MaxGekk
Copy link
Member Author

MaxGekk commented Oct 30, 2019

The build #26327 (comment) fails on: '1 2:03' day to hour because groups in the regexp are floating:

  1. hour can be bound to group 5
    hours = toLongWithRange("hour", m.group(5), 0, 23)
  2. or to group 6:
    hours = toLongWithRange("hour", m.group(6), 0, 23)

    The is a weird regexp :-E

…om-daytime-string

# Conflicts:
#	sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
@SparkQA
Copy link

SparkQA commented Oct 31, 2019

Test build #113044 has finished for PR 26327 at commit f381103.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • trait HasRelativeError extends Params
  • class _ImputerParams(HasInputCol, HasInputCols, HasOutputCol, HasOutputCols, HasRelativeError):
  • class _RobustScalerParams(HasInputCol, HasOutputCol, HasRelativeError):
  • class HasRelativeError(Params):
  • case class ShowColumnsStatement(
  • case class DataSourceV2ScanRelation(
  • case class OptimizeLocalShuffleReader(conf: SQLConf) extends Rule[SparkPlan]

@SparkQA
Copy link

SparkQA commented Nov 1, 2019

Test build #113052 has finished for PR 26327 at commit 140a26f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants