Skip to content

Conversation

@tdcmeehan
Copy link

It seems that #5980 erroneously reverted the documentation for the day partition transformation function (originally added by #447). We can see from the code that day is special cased to return date.

@github-actions github-actions bot added the Specification Issues that may introduce spec changes. label Dec 19, 2023
@tdcmeehan tdcmeehan changed the title API: Fix Date partition transform result type API: Fix day partition transform result type Dec 19, 2023
@tdcmeehan
Copy link
Author

CC: @nastra @szehon-ho

| **`year`** | Extract a date or timestamp year, as years from 1970 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `int` |
| **`month`** | Extract a date or timestamp month, as months from 1970-01-01 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `int` |
| **`day`** | Extract a date or timestamp day, as days from 1970-01-01 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `int` |
| **`day`** | Extract a date or timestamp day, as days from 1970-01-01 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `date` |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A date is also an integer, as it represents the number of days since 1-1-1970. I think that int is correct here, similar to month, hour, and others.

Copy link
Author

@tdcmeehan tdcmeehan Dec 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the underlying type of date is an int, shouldn't it should still be documented to be date, as it's expected that metadata tables produce YYYY-mm-dd formatted values for days (I gathered this from #279)? As it's written now, it's surprising to me to see non-integer values when I look at partition metadata for a column that has been transformed with day.

Copy link
Member

@RussellSpitzer RussellSpitzer Jan 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with Fokko, this is a description of the transform which takes a date or timestamp and produces an int. If you are implementing this transform and return a date the transform would be incorrect and incompatible with other Iceberg implementations. Now if you implement this function and display the output as a different type (like the metadata table) that's fine.

@tdcmeehan
Copy link
Author

It seems the consensus is it's more straightforward to keep this as int, so I will close this. Thanks folks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Specification Issues that may introduce spec changes.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants