ARROW-4804: [Rust] Parse Date32 and Date64 in CSV reader #8913
Codecov Report
@@ Coverage Diff @@
## master #8913 +/- ##
==========================================
+ Coverage 75.35% 75.69% +0.33%
==========================================
Files 177 181 +4
Lines 40821 41062 +241
==========================================
+ Hits 30762 31083 +321
+ Misses 10059 9979 -80
alamb
left a comment
Looks good to me. Thank you @Dandandan
| 3||"3.33"|true | ||
| 4|4.4||false | ||
| 5|6.6|""|false | ||
| c_int|c_float|c_string|c_bool|c_date |
I suggest we also add a c_datetime column to cover schema inference for that type.
Added, along with some extra tests and checks.
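For readers following the schema inference discussion, here is a minimal sketch of how a single CSV cell could be classified so that date-only and date-time values map to different column types. It is not the reader's actual inference code: the `Inferred` enum, `matches_shape`, and `infer_field` are hypothetical names, and the shape check only looks at digit positions and separators without validating the calendar date.

```rust
/// Column types this sketch can infer (illustrative, not Arrow's `DataType`).
#[derive(Debug, PartialEq)]
enum Inferred {
    Boolean,
    Int64,
    Float64,
    Date32,
    Date64,
    Utf8,
}

/// Returns true when `s` has the same byte length as `shape`, every `d` in
/// `shape` lines up with an ASCII digit, and every other byte matches exactly.
fn matches_shape(s: &str, shape: &str) -> bool {
    s.len() == shape.len()
        && s.bytes().zip(shape.bytes()).all(|(c, p)| {
            if p == b'd' {
                c.is_ascii_digit()
            } else {
                c == p
            }
        })
}

/// Classifies one cell, trying the most specific types first.
fn infer_field(value: &str) -> Inferred {
    if value.eq_ignore_ascii_case("true") || value.eq_ignore_ascii_case("false") {
        Inferred::Boolean
    } else if value.parse::<i64>().is_ok() {
        Inferred::Int64
    } else if value.parse::<f64>().is_ok() {
        Inferred::Float64
    } else if matches_shape(value, "dddd-dd-ddTdd:dd:dd") {
        // Date plus time component, assumed here to map to Date64.
        Inferred::Date64
    } else if matches_shape(value, "dddd-dd-dd") {
        // Date only, assumed here to map to Date32.
        Inferred::Date32
    } else {
        Inferred::Utf8
    }
}
```

With this sketch, a `c_date` cell such as `2020-11-08` is classified as `Date32` and a `c_datetime` cell such as `2020-11-08T14:20:01` as `Date64`, which is the distinction the extra test column is meant to exercise.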
This does highlight a mistake made by me when implementing the
Pushed a speed improvement that avoids parsing the format string and just uses the default format.
This is really nice, @Dandandan. I will have to do some reading on how the chrono library implements its parse trait.
alamb
left a comment
Looks good -- thanks @Dandandan
This PR fixes the UTF8 -> Date64 conversion to use `%Y-%m-%dT%H:%M:%S` rather than `%Y-%m-%d` with a `00:00:00` time component. It aligns with #8913.

Closes #8918 from seddonm1/fix-date64-cast

Authored-by: Mike Seddon <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
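To make the format change concrete, the sketch below parses a `%Y-%m-%dT%H:%M:%S` string into milliseconds since the Unix epoch, which is how Date64 values are stored. It uses only the standard library rather than chrono, so it is an illustration of the expected input shape and not the cast's actual code; `days_from_civil`, `days_in_month`, `three_ints`, and `parse_date64` are hypothetical helpers, and the year range check is an arbitrary assumption to keep the arithmetic small.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date
/// (the well-known "days from civil" algorithm).
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y }; // count Jan/Feb with the previous year
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // year of era, 0..=399
    let mp = (m + 9) % 12; // March = 0, ..., February = 11
    let doy = (153 * mp + 2) / 5 + d - 1; // day of the March-based year
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

fn days_in_month(y: i64, m: i64) -> i64 {
    match m {
        4 | 6 | 9 | 11 => 30,
        2 if (y % 4 == 0 && y % 100 != 0) || y % 400 == 0 => 29,
        2 => 28,
        _ => 31,
    }
}

/// Parses exactly three `sep`-separated integer fields.
fn three_ints(s: &str, sep: char) -> Option<(i64, i64, i64)> {
    let mut it = s.split(sep).map(|p| p.parse::<i64>().ok());
    let out = (it.next()??, it.next()??, it.next()??);
    if it.next().is_some() {
        None
    } else {
        Some(out)
    }
}

/// Parses `YYYY-MM-DDTHH:MM:SS` into milliseconds since the Unix epoch.
/// A date-only string is rejected because the `T` separator is required.
fn parse_date64(s: &str) -> Option<i64> {
    let (date, time) = s.trim().split_once('T')?;
    let (y, m, d) = three_ints(date, '-')?;
    let (h, min, sec) = three_ints(time, ':')?;
    let valid = (0..=9999).contains(&y) // assumption: limit years to four digits
        && (1..=12).contains(&m)
        && (1..=days_in_month(y, m)).contains(&d)
        && (0..24).contains(&h)
        && (0..60).contains(&min)
        && (0..60).contains(&sec);
    if !valid {
        return None;
    }
    Some(days_from_civil(y, m, d) * 86_400_000 + (h * 3600 + min * 60 + sec) * 1000)
}
```

Under this sketch `parse_date64("2020-11-08T01:00:00")` yields a value with the time of day preserved, while `parse_date64("2020-11-08")` returns `None`, mirroring the move from a date-only format to a full date-time format.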
This is based on #8611 by @Jibbow and some suggestions by @alamb
Adds Date32 / Date64 support to the CSV reader. This also fixes the benchmark, which now includes the date types added by @seddonm1.
Some parts of the date format support are still missing (such as actual millisecond support), but I think those can be implemented in separate PRs.
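As a rough illustration of what Date32 support in a CSV reader involves, the sketch below turns a column of `%Y-%m-%d` strings into days since the Unix epoch, which is the Date32 representation. It is written against the standard library only and is not the code in this PR: `days_from_civil`, `days_in_month`, `parse_date32`, and `parse_date32_column` are hypothetical helpers, and treating an empty cell as null is an assumption about the reader's behavior.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date
/// (the well-known "days from civil" algorithm).
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y }; // count Jan/Feb with the previous year
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // year of era, 0..=399
    let mp = (m + 9) % 12; // March = 0, ..., February = 11
    let doy = (153 * mp + 2) / 5 + d - 1; // day of the March-based year
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

fn days_in_month(y: i64, m: i64) -> i64 {
    match m {
        4 | 6 | 9 | 11 => 30,
        2 if (y % 4 == 0 && y % 100 != 0) || y % 400 == 0 => 29,
        2 => 28,
        _ => 31,
    }
}

/// Parses `YYYY-MM-DD` into days since the Unix epoch.
fn parse_date32(s: &str) -> Option<i32> {
    let mut parts = s.split('-').map(|p| p.parse::<i64>().ok());
    let (y, m, d) = (parts.next()??, parts.next()??, parts.next()??);
    let valid = parts.next().is_none()
        && (0..=9999).contains(&y) // assumption: limit years to four digits
        && (1..=12).contains(&m)
        && (1..=days_in_month(y, m)).contains(&d);
    if !valid {
        return None;
    }
    // The year bound above keeps the day count well inside the i32 range.
    Some(days_from_civil(y, m, d) as i32)
}

/// Parses a whole column: empty cells become nulls (assumed behavior),
/// anything else must be a valid date or the column fails with an error.
fn parse_date32_column(cells: &[&str]) -> Result<Vec<Option<i32>>, String> {
    cells
        .iter()
        .map(|cell| {
            if cell.is_empty() {
                Ok(None)
            } else {
                parse_date32(cell)
                    .map(Some)
                    .ok_or_else(|| format!("cannot parse {:?} as Date32", cell))
            }
        })
        .collect()
}
```

A Date64 column would follow the same pattern with milliseconds instead of days; sub-second precision, which the description above notes is still missing, is not handled here either.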