-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-25098][SQL] Trim the string when cast stringToTimestamp and stringToDate #22943
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #98460 has finished for PR 22943 at commit
|
|
retest this please |
|
Test build #98462 has finished for PR 22943 at commit
|
|
retest this please |
|
Test build #98467 has finished for PR 22943 at commit
|
| private[this] def castToTimestamp(from: DataType): Any => Any = from match { | ||
| case StringType => | ||
| buildCast[UTF8String](_, utfs => DateTimeUtils.stringToTimestamp(utfs, timeZone).orNull) | ||
| buildCast[UTF8String](_, s => DateTimeUtils.stringToTimestamp(s.trim(), timeZone).orNull) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about changing stringToDate and stringToTimestamp instead?
Those functions are used only in Cast and they already handle nullcases, too.
I didn't look at the detail of this PR. The change looks a little less robust when s is null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @gatorsmile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ur, I'd like not to rename it. One line function document will suffice.
|
Test build #98511 has finished for PR 22943 at commit
|
|
|
||
| /** | ||
| * Parses a given UTF8 date string to the corresponding a corresponding [[Long]] value. | ||
| * Parses a trimmed UTF8 date string to the corresponding a corresponding [[Long]] value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parses a trimmed UTF8 -> Trim and parse a given UTF8?
|
|
||
| /** | ||
| * Parses a given UTF8 date string to a corresponding [[Int]] value. | ||
| * Parses a trimmed UTF8 date string to a corresponding [[Int]] value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parses a trimmed UTF8 -> Trim and parse a given UTF8?
|
Test build #98540 has finished for PR 22943 at commit
|
|
Could you review this, @gatorsmile and @cloud-fan ? |
| millisToDays(c.getTimeInMillis)) | ||
| assert(stringToDate(UTF8String.fromString("2015-03-18T")).get === | ||
| millisToDays(c.getTimeInMillis)) | ||
| Seq("2015-03-18", "2015-03-18 ", " 2015-03-18", " 2015-03-18 ", "2015-03-18 123142", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the test result doesn't change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New test cases (with space padding) are added; e.g. ' 2015-03-18' and ' 2015-03-18 '.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah i see
|
Thank you, @wangyum and @cloud-fan . |
…ringToDate ## What changes were proposed in this pull request? **Hive** and **Oracle** trim the string when cast `stringToTimestamp` and `stringToDate`. this PR support this feature:   ## How was this patch tested? unit tests Closes apache#22089 Closes apache#22943 from wangyum/SPARK-25098. Authored-by: Yuming Wang <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]>

What changes were proposed in this pull request?
Hive and Oracle trim the string when cast


stringToTimestampandstringToDate. this PR support this feature:How was this patch tested?
unit tests
Closes #22089