Skip to content

Conversation

@Dandandan
Copy link
Contributor

@Dandandan Dandandan commented Dec 14, 2020

This is based on #8611 by @Jibbow and some suggestions by @alamb

Adds date32 / date64 to the csv reader. This also fixes the benchmark which now includes date types which were added by @seddonm1

There are some missing parts in the date format support (such as actual ms support) but those can be implemented I think as separate PRs.

@github-actions
Copy link

@codecov-io
Copy link

codecov-io commented Dec 14, 2020

Codecov Report

Merging #8913 (7fbd42a) into master (408e5be) will increase coverage by 0.33%.
The diff coverage is 83.33%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8913      +/-   ##
==========================================
+ Coverage   75.35%   75.69%   +0.33%     
==========================================
  Files         177      181       +4     
  Lines       40821    41062     +241     
==========================================
+ Hits        30762    31083     +321     
+ Misses      10059     9979      -80     
Impacted Files Coverage Δ
rust/arrow/src/datatypes.rs 76.42% <57.14%> (-0.41%) ⬇️
rust/arrow/src/csv/reader.rs 92.91% <91.30%> (-0.20%) ⬇️
rust/parquet/src/bin/parquet-schema.rs 0.00% <0.00%> (ø)
rust/parquet/src/bin/parquet-rowcount.rs 0.00% <0.00%> (ø)
rust/parquet_derive/src/lib.rs 0.00% <0.00%> (ø)
rust/parquet/src/bin/parquet-read.rs 0.00% <0.00%> (ø)
rust/parquet/tests/custom_writer.rs 79.31% <0.00%> (+79.31%) ⬆️
rust/parquet_derive/src/parquet_field.rs 83.66% <0.00%> (+83.66%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 408e5be...7fbd42a. Read the comment docs.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thank you @Dandandan

3||"3.33"|true
4|4.4||false
5|6.6|""|false
c_int|c_float|c_string|c_bool|c_date
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we also add a c_datetime column too to cover the schema inference of that type too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, added some extra tests / checks as well.

@seddonm1
Copy link
Contributor

This does highlight a mistake made by me when implementing the CAST to Date64. I will do a PR to address and align.

@Dandandan
Copy link
Contributor Author

Pushed a speed improvement by avoiding parsing of the format (and just using the default formatting).

@seddonm1
Copy link
Contributor

This is really nice @Dandandan . I will have to do some reading on the implementation of the parse trait by the chrono library.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good -- thanks @Dandandan

@alamb alamb closed this in 9a3a5b8 Dec 15, 2020
alamb pushed a commit that referenced this pull request Dec 15, 2020
This PR fixes the behavior of a UTF8 -> Date64 conversion process to use `%Y-%m-%dT%H:%M:%S` rather than `%Y-%m-%d` with `00:00:00` time component.

It aligns with #8913.

Closes #8918 from seddonm1/fix-date64-cast

Authored-by: Mike Seddon <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
alamb pushed a commit to apache/arrow-rs that referenced this pull request Apr 20, 2021
This PR fixes the behavior of a UTF8 -> Date64 conversion process to use `%Y-%m-%dT%H:%M:%S` rather than `%Y-%m-%d` with `00:00:00` time component.

It aligns with apache/arrow#8913.

Closes #8918 from seddonm1/fix-date64-cast

Authored-by: Mike Seddon <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants