Skip to content

Conversation

@alamb
Copy link
Contributor

@alamb alamb commented Jan 19, 2021

I found this while removing test::format_batches (PR to come shortly); The Record batch pretty printing code was printing numbers rather than dates.

Before this PR, when date/time columns were printed they were printed as numbers:

[
    "+----------+",
    "| f        |",
    "+----------+",
    "| 11111111 |",
    "|          |",
    "+----------+",
]

After this PR, they are printed (via chrono) as dates:

[
    "+---------------------+",
    "| f                   |",
    "+---------------------+",
    "| 1970-05-09 14:25:11 |",
    "|                     |",
    "+---------------------+",
]

}
DataType::Time32(unit) if *unit == TimeUnit::Microsecond => {
make_string!(array::Time64MicrosecondArray, column, row)
DataType::Time64(unit) if *unit == TimeUnit::Microsecond => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was actually a bug here -- there is no such type as using Time32 for storing Microseconds -- it is actually Time64 (as can be seen in the Time64MicrosecondArray type below it)

@github-actions
Copy link

let array = $column.as_any().downcast_ref::<$array_type>().unwrap();

let s = if array.is_null($row) {
"".to_string()
Copy link
Contributor

@Dandandan Dandandan Jan 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't nulls be better displayed as "NULL" and add a test for it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are tradeoffs between using empty string or some sentinel ("NULL") for display; I don't have a super strong preference though I do think the behavior should be consistent across types

This PR follows the same convention as the rest of this module and uses "" for NULL values: https://github.com/apache/arrow/pull/9263/files#diff-821be3a8a9239cdd12ab2b3c979e174f7b936aa6bfad4b9bc76489b9923cf747R38

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The NULL behavior is included in the tests below (and the empty string is visible)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me. I think the "problem" would be when displaying strings, as the empty string is displayed the same way as null, but I agree it is better to keep it consistent.

@alamb alamb closed this in 71572bd Jan 20, 2021
@alamb alamb deleted the alamb/pretty_print_datetimes branch January 20, 2021 11:51
kszucs pushed a commit that referenced this pull request Jan 25, 2021
…stamp types

I found this while removing `test::format_batches` (PR to come shortly); The Record batch pretty printing code was printing numbers rather than dates.

Before this PR, when date/time columns were printed they were printed as numbers:

```
[
    "+----------+",
    "| f        |",
    "+----------+",
    "| 11111111 |",
    "|          |",
    "+----------+",
]
```

After this PR, they are printed (via chrono) as dates:

```
[
    "+---------------------+",
    "| f                   |",
    "+---------------------+",
    "| 1970-05-09 14:25:11 |",
    "|                     |",
    "+---------------------+",
]
```

Closes #9263 from alamb/alamb/pretty_print_datetimes

Authored-by: Andrew Lamb <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants