API: Fix day partition transform result type #9345
CC: @nastra @szehon-ho
  | **`year`** | Extract a date or timestamp year, as years from 1970 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `int` |
  | **`month`** | Extract a date or timestamp month, as months from 1970-01-01 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `int` |
- | **`day`** | Extract a date or timestamp day, as days from 1970-01-01 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `int` |
+ | **`day`** | Extract a date or timestamp day, as days from 1970-01-01 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `date` |
A date is also an integer, since it represents the number of days since 1970-01-01. I think that `int` is correct here, similar to `month`, `hour`, and the others.
While the underlying type of `date` is an int, shouldn't it still be documented as `date`, since metadata tables are expected to produce YYYY-mm-dd formatted values for days (I gathered this from #279)? As it's written now, it's surprising to me to see non-integer values when I look at partition metadata for a column that has been transformed with `day`.
Agree with Fokko: this is a description of the transform, which takes a date or timestamp and produces an int. If you implement this transform and return a date, the transform would be incorrect and incompatible with other Iceberg implementations. Now, if you implement this function and display the output as a different type (like the metadata table does), that's fine.
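To make the distinction in this thread concrete, here is a minimal sketch of the two steps being discussed: computing the transform result as an integer count of days since 1970-01-01, and separately rendering that integer as a YYYY-mm-dd string for display. The names `day_transform` and `render_day` are hypothetical and are not Iceberg's API, and treating naive timestamps as UTC is an assumption made only for this illustration.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH_DATE = date(1970, 1, 1)
EPOCH_TS = datetime(1970, 1, 1, tzinfo=timezone.utc)


def day_transform(value):
    """Return whole days since 1970-01-01 as an int.

    Hypothetical helper for illustration only, not Iceberg's implementation.
    """
    # datetime is a subclass of date, so check for it first.
    if isinstance(value, datetime):
        if value.tzinfo is None:
            # Assumption: treat naive timestamps as UTC.
            value = value.replace(tzinfo=timezone.utc)
        # Floor division keeps pre-epoch timestamps in the correct day.
        return (value - EPOCH_TS) // timedelta(days=1)
    return (value - EPOCH_DATE).days


def render_day(days):
    """Format a day ordinal as YYYY-mm-dd, as a display layer might."""
    return (EPOCH_DATE + timedelta(days=days)).isoformat()


ordinal = day_transform(date(1970, 1, 2))  # the stored value is an int
label = render_day(ordinal)                # the displayed value is a string
print(ordinal, label)
```

Under these assumptions the stored partition value stays an integer, while the date-formatted string exists only at the presentation layer.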
It seems the consensus is it's more straightforward to keep this as |
It seems that #5980 erroneously reverted the documentation for the `day` partition transform (originally added by #447). We can see from the code that `day` is special-cased to return `date`.