site/docs/spec.md
15 additions & 7 deletions
@@ -170,6 +170,7 @@ Partition specs capture the transform from table data to partition values. This
|**`month`**| Extract a date or timestamp month, as months from 1970-01-01 |`date`, `timestamp(tz)`|`int`|
|**`day`**| Extract a date or timestamp day, as days from 1970-01-01 |`date`, `timestamp(tz)`|`date`|
|**`hour`**| Extract a timestamp hour, as hours from 1970-01-01 00:00:00 |`timestamp(tz)`|`int`|
+|**`alwaysNull`**| Always produces `null` (the void transform) | Any |`void`|

All transforms must return `null` for a `null` input value.
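The transform rows in this hunk can be illustrated with a minimal Python sketch. This is not the reference implementation: the function names `month`, `hour`, and `always_null` are hypothetical, and `hour` assumes a timezone-aware timestamp. It only shows the epoch-based arithmetic described in the table and the rule that a `null` input yields `null`.

```python
from datetime import date, datetime, timedelta, timezone
from typing import Any, Optional

# Epoch used by the `hour` transform (assumption: timestamps are timezone-aware).
EPOCH_TS = datetime(1970, 1, 1, tzinfo=timezone.utc)


def month(value: Optional[date]) -> Optional[int]:
    """Months from 1970-01-01; works for date and datetime values."""
    if value is None:  # all transforms return null for a null input
        return None
    return (value.year - 1970) * 12 + (value.month - 1)


def hour(value: Optional[datetime]) -> Optional[int]:
    """Hours from 1970-01-01 00:00:00, floored for pre-epoch timestamps."""
    if value is None:
        return None
    return (value - EPOCH_TS) // timedelta(hours=1)


def always_null(value: Any) -> None:
    """The void transform: produces null for every input."""
    return None


print(month(date(2021, 3, 15)))  # 614
print(hour(datetime(1970, 1, 2, 1, 30, tzinfo=timezone.utc)))  # 25
print(always_null("anything"))  # None
```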
@@ -646,18 +647,25 @@ Each partition field in the fields list is stored as an object. See the table fo

In some cases partition specs are stored using only the field list instead of the object format that includes the spec ID, like the deprecated `partition-spec` field in table metadata. The object format should be used unless otherwise noted in this spec.

-#### Partition Field ID handling
+#### Partition Field ID Handling

-A partition field id is an integer (starting at 1000) used to identify a partition field.
+A partition field ID is an integer used to identify a partition field.
+These IDs should not conflict with the IDs required for the other fields in data files.
+To avoid such conflicts, Iceberg assigns partition field IDs starting at 1000.

-Since iceberg release 0.8.0, partition fields are present in every partition field of partition specs in a table metadata.
+The requirements below are for different versions of tables:

-* For backward compatibility, if field ids are missing in a table metadata, iceberg will sequentially generate ids for each field starting at 1000 based on its position in the list of fields.
-* For forward compatibility, if field ids are not supported, iceberg will ignore field ids.
+* For v1 tables, partition field metadata should include a field id for each partition field, but this is not required.
+* For v2 tables, partition field metadata must include a field id for each partition field. Partition field IDs are unique across partition specs to support the partition spec evolution for a given table.
+
+To remove partition fields from the partition spec of an existing v1 table, it is recommended to replace their transforms with `alwaysNull` rather than removing the fields.
+Otherwise, partition spec evolution will break because a partition field ID might be assigned to multiple different partition fields during partition spec evolution for a given table.

-Additionally, in table metadata format v2, partition fields are required to have unique field IDs to support partition spec evolution.
+For compatibility between v1 and v2 tables:
+
+* For backward compatibility, if field ids are missing from table metadata, Iceberg will sequentially generate ids for each field starting at 1000 based on its position in the list of fields.
+* For forward compatibility, if field ids are not supported but present in the metadata, old versions of the reference implementation will ignore those field ids and regenerate an auto-increment field id starting at 1000 for every partition field.

-For tables without partition field IDs, iceberg will generate an auto-increment unique field id starting at 1000 for every partition field.
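The position-based ID rule in this hunk explains why removing a field from a v1 spec is risky. The following Python sketch is illustrative only: `PartitionField`, `ids_by_position`, `drop_field`, and `void_field` are hypothetical names, and the field names and transforms in the usage are made up. It assumes IDs are regenerated purely from list position starting at 1000, as the compatibility bullets describe.

```python
from dataclasses import dataclass, replace
from typing import List, Optional

PARTITION_FIELD_ID_START = 1000  # starting value stated in the spec text


@dataclass(frozen=True)
class PartitionField:
    # Hypothetical model of one entry in a partition spec's field list.
    source_id: int
    name: str
    transform: str
    field_id: Optional[int] = None  # may be absent in v1 metadata


def ids_by_position(fields: List[PartitionField]) -> List[PartitionField]:
    """Assign IDs sequentially from 1000 based on list position."""
    return [
        replace(f, field_id=PARTITION_FIELD_ID_START + pos)
        for pos, f in enumerate(fields)
    ]


def drop_field(fields: List[PartitionField], name: str) -> List[PartitionField]:
    """Remove a field outright; later fields shift to earlier positions."""
    return [f for f in fields if f.name != name]


def void_field(fields: List[PartitionField], name: str) -> List[PartitionField]:
    """Keep the field's position but replace its transform with alwaysNull."""
    return [
        replace(f, transform="alwaysNull") if f.name == name else f
        for f in fields
    ]


spec = [
    PartitionField(source_id=1, name="ts_day", transform="day"),
    PartitionField(source_id=2, name="id_bucket", transform="bucket[16]"),
]

original = {f.name: f.field_id for f in ids_by_position(spec)}
dropped = {f.name: f.field_id for f in ids_by_position(drop_field(spec, "ts_day"))}
voided = {f.name: f.field_id for f in ids_by_position(void_field(spec, "ts_day"))}

print(original)  # {'ts_day': 1000, 'id_bucket': 1001}
print(dropped)   # {'id_bucket': 1000}  -> reuses the ID that ts_day had
print(voided)    # {'ts_day': 1000, 'id_bucket': 1001}
```

In the dropped case, ID 1000 ends up referring to two different partition fields across spec versions, which is the conflict the added text warns about. Voiding the field keeps every position, and therefore every generated ID, stable.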