Skip to content
Merged
Changes from 15 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
60 changes: 58 additions & 2 deletions format/spec.md
Original file line number Diff line number Diff line change
Expand Up @@ -158,7 +158,7 @@ For the representations of these types in Avro, ORC, and Parquet file formats, s

#### Nested Types

A **`struct`** is a tuple of typed values. Each field in the tuple is named and has an integer id that is unique in the table schema. Each field can be either optional or required, meaning that values can (or cannot) be null. Fields may be any type. Fields may have an optional comment or doc string.
A **`struct`** is a tuple of typed values. Each field in the tuple is named and has an integer id that is unique in the table schema. Each field can be either optional or required, meaning that values can (or cannot) be null. Fields may be any type. Fields may have an optional comment or doc string. Fields can have [default values](#default-values).

A **`list`** is a collection of values with some element type. The element field has an integer id that is unique in the table schema. Elements can be either optional or required. Element types may be any type.

Expand Down Expand Up @@ -194,6 +194,19 @@ Notes:
For details on how to serialize a schema to JSON, see Appendix C.


#### Default values

Default values can be tracked for struct fields (both nested structs and the top-level schema's struct). There can be two defaults with a field:
- `initial-default` is used to populate the field's value for all records that were written before the field was added to the schema
- `write-default` is used to populate the field's value for any records written after the field was added to the schema, if the writer does not supply the field's value

The `initial-default` is set only when a field is added to an existing schema. The `write-default` is initially set to the same value as `initial-default` and can be changed through schema evolution. If either default is not set for an optional field, then the default value is null for compatibility with older spec versions.

Together, the `initial-default` and `write-default` produce SQL default value behavior without rewriting data files. That is, default value changes may only affect future records and all known fields are written into data files. To produce this behavior, omitting a known field when writing a data file is not allowed. The write default for a field must be written if a field is not supplied to a write. If the write default for a required field is not set, the writer must fail.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bit confusing because of the ordering of sentences. Perhaps something like

"The ANSI(?) SQL default value behavior treats a new column with a default value as if all previous rows missing that field now have the default value. The default is allowed to be changed for new writes but changing the default does not effect earlier writes. To achieve this behavior in Iceberg, omitting a known field when writing a new data file is never allowed. The write default must be used when writing any new files if a value for the default field is not provided. If the field is required and it is not supplied and there are is no default available, the write must fail."

I'm a little confused on allowing the default for required fields and then allowing writers not to supply them. Isn't this supposed to be behavior for an Optional field which has not been set? Maybe an example on the difference between an Optional field with a write default and a required field with a write default? Sorry if I missed this discussion but I'm a little confused on the difference between Optional w/Default and Required w/Default

Copy link
Member

@RussellSpitzer RussellSpitzer Apr 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the evolution rules below I think I understand

An optional field may be skipped by a writer. When an optional field is skipped the write-default should be used in place of the missing value and this default may be null. A required field may only be skipped by a writer if a write-default exists for that field and this default must not be null.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An optional field may be skipped by a writer.

A write may not specify the value for a field with a default, but the write must only produce data files that contain the column set from the default value. I wouldn't say that you can ever "skip" a field.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated this based on your suggestions. Thank you!


Default values are attributes of fields in schemas and serialized with fields in the JSON format. See [Appendix C](#appendix-c-json-serialization).


#### Schema Evolution

Schemas may be evolved by type promotion or adding, deleting, renaming, or reordering fields in structs (both nested structs and the top-level schema’s struct).
Expand All @@ -210,6 +223,15 @@ Any struct, including a top-level schema, can evolve through deleting fields, ad

Grouping a subset of a struct’s fields into a nested struct is **not** allowed, nor is moving fields from a nested struct into its immediate parent struct (`struct<a, b, c> ↔ struct<a, struct<b, c>>`). Evolving primitive types to structs is **not** allowed, nor is evolving a single-field struct to a primitive (`map<string, int> ↔ map<string, struct<int>>`).

Struct evolution requires the following rules for default values:
* The `initial-default` must be set when a field is added and cannot change
* The `write-default` must be set when a field is added and may change
* When a required field is added, both defaults must be set to a non-null value
* When an optional field is added, the defaults may be null and should be explicitly set
* When a new field is added to a struct with a default value, the default should be updated to include the new field's default
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this cause a cascading effect? if the struct itself is nested inside another outer struct, should the outer struct's default regarding the inner struct also be updated, so on and so forth?

I thought we tried to use the downward traversing approach at runtime to resolve the default value consistency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was meant as a suggestion. It would be nice if defaults were updated. Or do you think we should not update a default and always recover defaults by traversing child types?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think from an implementation standpoint, propagating the default update to the whole schema tree could be a bit tricky to implement. I feel the approach of only updating the default at its own level, and delaying the recovery of defaults to access time should be fine? As long as we always traverse/honor the default value defined at deeper levels?

Or do you think updating the defaults of an outer struct should also overwrite the default of an inner child field if it has its own default defined?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this a bit more and changed the wording to this:

When a new field is added to a struct with a default value, updating the struct's default is optional

It is up to the writer whether to update the default. If the write default for the field changes, then not updating the value means not needing to change the struct's default as well. Not requiring this will probably give more expected behavior, but still allow people to set the default explicitly.

* If a field value is missing from a struct's `initial-default`, the field's `initial-default` must be used for the field
* If a field value is missing from a struct's `write-default`, the field's `write-default` must be used for the field


#### Column Projection

Expand Down Expand Up @@ -965,10 +987,12 @@ Types are serialized according to this table:
|**`fixed(L)`**|`JSON string: "fixed[<L>]"`|`"fixed[16]"`|
|**`binary`**|`JSON string: "binary"`|`"binary"`|
|**`decimal(P, S)`**|`JSON string: "decimal(<P>,<S>)"`|`"decimal(9,2)"`,<br />`"decimal(9, 2)"`|
|**`struct`**|`JSON object: {`<br />&nbsp;&nbsp;`"type": "struct",`<br />&nbsp;&nbsp;`"fields": [ {`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"id": <field id int>,`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"name": <name string>,`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"required": <boolean>,`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"type": <type JSON>,`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"doc": <comment string>`<br />&nbsp;&nbsp;&nbsp;&nbsp;`}, ...`<br />&nbsp;&nbsp;`] }`|`{`<br />&nbsp;&nbsp;`"type": "struct",`<br />&nbsp;&nbsp;`"fields": [ {`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"id": 1,`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"name": "id",`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"required": true,`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"type": "uuid"`<br />&nbsp;&nbsp;`}, {`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"id": 2,`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"name": "data",`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"required": false,`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"type": {`<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`"type": "list",`<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`...`<br />&nbsp;&nbsp;&nbsp;&nbsp;`}`<br />&nbsp;&nbsp;`} ]`<br />`}`|
|**`struct`**|`JSON object: {`<br />&nbsp;&nbsp;`"type": "struct",`<br />&nbsp;&nbsp;`"fields": [ {`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"id": <field id int>,`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"name": <name string>,`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"required": <boolean>,`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"type": <type JSON>,`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"doc": <comment string>,`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"initial-default": <JSON encoding of default value>,`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"write-default": <JSON encoding of default value>`<br />&nbsp;&nbsp;&nbsp;&nbsp;`}, ...`<br />&nbsp;&nbsp;`] }`|`{`<br />&nbsp;&nbsp;`"type": "struct",`<br />&nbsp;&nbsp;`"fields": [ {`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"id": 1,`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"name": "id",`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"required": true,`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"type": "uuid",`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"initial-default": "0db3e2a8-9d1d-42b9-aa7b-74ebe558dceb",`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"write-default": "ec5911be-b0a7-458c-8438-c9a3e53cffae"`<br />&nbsp;&nbsp;`}, {`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"id": 2,`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"name": "data",`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"required": false,`<br />&nbsp;&nbsp;&nbsp;&nbsp;`"type": {`<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`"type": "list",`<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`...`<br />&nbsp;&nbsp;&nbsp;&nbsp;`}`<br />&nbsp;&nbsp;`} ]`<br />`}`|
|**`list`**|`JSON object: {`<br />&nbsp;&nbsp;`"type": "list",`<br />&nbsp;&nbsp;`"element-id": <id int>,`<br />&nbsp;&nbsp;`"element-required": <bool>`<br />&nbsp;&nbsp;`"element": <type JSON>`<br />`}`|`{`<br />&nbsp;&nbsp;`"type": "list",`<br />&nbsp;&nbsp;`"element-id": 3,`<br />&nbsp;&nbsp;`"element-required": true,`<br />&nbsp;&nbsp;`"element": "string"`<br />`}`|
|**`map`**|`JSON object: {`<br />&nbsp;&nbsp;`"type": "map",`<br />&nbsp;&nbsp;`"key-id": <key id int>,`<br />&nbsp;&nbsp;`"key": <type JSON>,`<br />&nbsp;&nbsp;`"value-id": <val id int>,`<br />&nbsp;&nbsp;`"value-required": <bool>`<br />&nbsp;&nbsp;`"value": <type JSON>`<br />`}`|`{`<br />&nbsp;&nbsp;`"type": "map",`<br />&nbsp;&nbsp;`"key-id": 4,`<br />&nbsp;&nbsp;`"key": "string",`<br />&nbsp;&nbsp;`"value-id": 5,`<br />&nbsp;&nbsp;`"value-required": false,`<br />&nbsp;&nbsp;`"value": "double"`<br />`}`|

Note that default values are serialized using the JSON single-value serialization in [Appendix D](#appendix-d-single-value-serialization).


### Partition Specs

Expand Down Expand Up @@ -1069,6 +1093,8 @@ Example

## Appendix D: Single-value serialization

### Binary single-value serialization

This serialization scheme is for storing single values as individual binary values in the lower and upper bounds maps of manifest files.

| Type | Binary serialization |
Expand All @@ -1091,9 +1117,39 @@ This serialization scheme is for storing single values as individual binary valu
| **`list`** | Not supported |
| **`map`** | Not supported |

### JSON single-value serialization

Single values are serialized as JSON by type according to the following table:

| Type | JSON representation | Example | Description |
| ------------------ | ----------------------------------------- | ------------------------------------------ | -- |
| **`boolean`** | **`JSON boolean`** | `true` | |
| **`int`** | **`JSON int`** | `34` | |
| **`long`** | **`JSON long`** | `34` | |
| **`float`** | **`JSON number`** | `1.0` | |
| **`double`** | **`JSON number`** | `1.0` | |
| **`decimal(P,S)`** | **`JSON number`** | `14.20` | Stores the decimal as a number with S places after the decimal |
| **`date`** | **`JSON string`** | `"2017-11-16"` | Stores ISO-8601 standard date |
| **`time`** | **`JSON string`** | `"22:31:08.123456"` | Stores ISO-8601 standard time with microsecond precision |
| **`timestamp`** | **`JSON string`** | `"2017-11-16T22:31:08.123456"` | Stores ISO-8601 standard timestamp with microsecond precision; must not include a zone offset |
| **`timestamptz`** | **`JSON string`** | `"2017-11-16T22:31:08.123456-07:00"` | Stores ISO-8601 standard timestamp with microsecond precision; must include a zone offset |
| **`string`** | **`JSON string`** | `"iceberg"` | |
| **`uuid`** | **`JSON string`** | `"f79c3e09-677c-4bbd-a479-3f349cb785e7"` | Stores the lowercase uuid string |
| **`fixed(L)`** | **`JSON string`** | `"0x00010203"` | Stored as a hexadecimal string, prefixed by `0x` |
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also note that the hex string length should be less than or equal to 2*L

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be exactly 2L+2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I was talking about the part after the 0x, I was previously thinking if less than 2L + 2 we prepend 0s to it. But I also think making it strictly 2L+2 works too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's a need to specify the length. I just wanted to clarify that it will never be less than or equal to 2L. It will always be 2L+2.

| **`binary`** | **`JSON string`** | `"0x00010203"` | Stored as a hexadecimal string, prefixed by `0x` |
| **`struct`** | **`JSON object by field ID`** | `{"1": 1, "2": "bar"}` | Stores struct fields using the field ID as the JSON field name; field values are stored using this JSON single-value format |
| **`list`** | **`JSON array of values`** | `[1, 2, 3]` | Stores a JSON array of values that are serialized using this JSON single-value format |
| **`map`** | **`JSON object of key and value arrays`** | `{ "keys": ["a", "b"], "values": [1, 2] }` | Stores arrays of keys and values; individual keys and values are serialized using this JSON single-value format |


## Appendix E: Format version changes

### Version 3

Default values are added to struct fields in v3.
* The `write-default` is a forward-compatible change because it is only used at write time. Old writers will fail because the field is missing.
* Tables with `initial-default` will be read correctly by older readers if `initial-default` is always null for optional fields. Otherwise, old readers will default optional columns with null. Old readers will fail to read required fields that have an `initial-default` because the default is not supported, when reading data files that are missing the required fields.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last sentence I think can be reordered a bit
"Old readers will fail to read required fields which are populated by initial-default because that default is not supported." ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I updated to your wording.


### Version 2

Writing v1 metadata:
Expand Down