@wangyum
Member

@wangyum wangyum commented Nov 5, 2018

What changes were proposed in this pull request?

Hive and Oracle trim the string when casting a string to a timestamp or a date. This PR adds the same behavior to stringToTimestamp and stringToDate:
image
image

How was this patch tested?

unit tests

Closes #22089
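
The trimming behavior described above can be illustrated with a small standalone sketch. It does not use Spark: `TrimBeforeCastSketch` and `toTimestamp` are hypothetical names, the input is a plain `String` rather than `UTF8String`, and `java.time.LocalDateTime.parse` stands in for the real parser, so the accepted formats differ from Spark's.

```scala
import java.time.LocalDateTime
import scala.util.Try

object TrimBeforeCastSketch {
  // Hypothetical helper for illustration only; Spark's real parser works on UTF8String.
  // String.trim strips leading and trailing characters with code point <= U+0020.
  def toTimestamp(s: String, trim: Boolean): Option[LocalDateTime] =
    Option(s)                                   // a null input yields None
      .map(v => if (trim) v.trim else v)        // optionally drop the padding
      .flatMap(v => Try(LocalDateTime.parse(v)).toOption) // parse failures yield None
}
```

With `trim = false`, a padded input such as `" 2018-11-05T10:15:30 "` fails to parse in this sketch; with `trim = true`, the same input parses.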

@SparkQA

SparkQA commented Nov 5, 2018

Test build #98460 has finished for PR 22943 at commit d297817.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Member Author

wangyum commented Nov 5, 2018

retest this please

@SparkQA

SparkQA commented Nov 5, 2018

Test build #98462 has finished for PR 22943 at commit d297817.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Member Author

wangyum commented Nov 5, 2018

retest this please

@SparkQA

SparkQA commented Nov 5, 2018

Test build #98467 has finished for PR 22943 at commit d297817.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

  private[this] def castToTimestamp(from: DataType): Any => Any = from match {
    case StringType =>
-     buildCast[UTF8String](_, utfs => DateTimeUtils.stringToTimestamp(utfs, timeZone).orNull)
+     buildCast[UTF8String](_, s => DateTimeUtils.stringToTimestamp(s.trim(), timeZone).orNull)
Member

What about changing stringToDate and stringToTimestamp instead?
Those functions are used only in Cast, and they already handle null cases, too.
I didn't look at the details of this PR, but the change looks a little less robust when s is null.

Member Author

How about renaming stringToDate to trimStringToDate and updating trimStringToDate to:
image

Member

Er, I'd rather not rename it. A one-line function comment will suffice.
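
The suggestion in this thread, trimming inside the parsing function instead of at the cast call site, can be sketched as follows. This is a minimal illustration under stated assumptions, not the merged code: `TrimInsideParserSketch` is a hypothetical name, the functions take a plain `String` rather than `UTF8String`, and `java.time.LocalDate.parse` replaces Spark's own parsing logic.

```scala
import java.time.LocalDate
import scala.util.Try

object TrimInsideParserSketch {
  // Hypothetical stand-in for stringToDate, using String instead of UTF8String.
  // The null check comes first, so trim is never invoked on a null reference.
  def stringToDate(s: String): Option[LocalDate] =
    if (s == null) None
    else Try(LocalDate.parse(s.trim)).toOption

  // The cast call site stays a plain delegation with no trimming of its own.
  def castToDate(s: String): LocalDate = stringToDate(s).orNull
}
```

Keeping the trim next to the existing null handling means the function name can stay unchanged, with only its doc comment noting that the input is trimmed.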

@SparkQA

SparkQA commented Nov 6, 2018

Test build #98511 has finished for PR 22943 at commit 5090d52.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


  /**
-  * Parses a given UTF8 date string to the corresponding a corresponding [[Long]] value.
+  * Parses a trimmed UTF8 date string to the corresponding a corresponding [[Long]] value.
Member

Parses a trimmed UTF8 -> Trim and parse a given UTF8?


  /**
-  * Parses a given UTF8 date string to a corresponding [[Int]] value.
+  * Parses a trimmed UTF8 date string to a corresponding [[Int]] value.
Member

Parses a trimmed UTF8 -> Trim and parse a given UTF8?

@SparkQA

SparkQA commented Nov 7, 2018

Test build #98540 has finished for PR 22943 at commit b866d65.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

Could you review this, @gatorsmile and @cloud-fan ?

-       millisToDays(c.getTimeInMillis))
-     assert(stringToDate(UTF8String.fromString("2015-03-18T")).get ===
-       millisToDays(c.getTimeInMillis))
+     Seq("2015-03-18", "2015-03-18 ", " 2015-03-18", " 2015-03-18 ", "2015-03-18 123142",
Contributor

the test result doesn't change?

Member

@dongjoon-hyun dongjoon-hyun Nov 7, 2018

New test cases (with space padding) are added; e.g. ' 2015-03-18' and ' 2015-03-18 '.

Contributor

ah i see
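
The space-padded cases discussed in this thread follow a table-driven pattern: one list of inputs, one expected value. A minimal standalone sketch of that pattern is below. `PaddedDateCasesSketch`, `parse`, and `allParseTo` are hypothetical names, and `java.time.LocalDate.parse` stands in for `stringToDate`, so only the plain and padded inputs are covered here, not the suffixed ones such as "2015-03-18 123142" that appear in the diff.

```scala
import java.time.LocalDate
import scala.util.Try

object PaddedDateCasesSketch {
  // Hypothetical parser standing in for stringToDate; trims, then parses an ISO date.
  def parse(s: String): Option[LocalDate] =
    Option(s).flatMap(v => Try(LocalDate.parse(v.trim)).toOption)

  // Unpadded, trailing-space, leading-space, and both-sides variants of one date.
  val inputs: Seq[String] = Seq("2015-03-18", "2015-03-18 ", " 2015-03-18", " 2015-03-18 ")

  // True when every input parses to the same expected date.
  def allParseTo(expected: LocalDate): Boolean =
    inputs.forall(s => parse(s).contains(expected))
}
```

Because the expected value is shared, adding another padded variant is a one-element change to `inputs`.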

@dongjoon-hyun
Member

Thank you, @wangyum and @cloud-fan .
Merged to master.

@asfgit asfgit closed this in 9e9fa2f Nov 7, 2018
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
…ringToDate

## What changes were proposed in this pull request?

**Hive** and **Oracle** trim the string when casting a string to a timestamp or a date. This PR adds the same behavior to `stringToTimestamp` and `stringToDate`:
![image](https://user-images.githubusercontent.com/5399861/47979721-793b1e80-e0ff-11e8-97c8-24b10950ee9e.png)
![image](https://user-images.githubusercontent.com/5399861/47979725-7dffd280-e0ff-11e8-87d4-5767a00ed46e.png)

## How was this patch tested?

unit tests

Closes apache#22089

Closes apache#22943 from wangyum/SPARK-25098.

Authored-by: Yuming Wang
Signed-off-by: Dongjoon Hyun
4 participants