Docs: Default value support feature specification #4301
Conversation
Could you please update the PR description to give a high level description of the changes introduced in this PR (e.g., the fact that it updates the documentation to describe the default value spec, and which aspects of the spec are covered)?
format/spec.md
Outdated
| For details on how to serialize a schema to JSON, see Appendix C. | ||
| #### Default value | ||
| Default values can be assigned to top-level columns or nested fields. Default values are used during schema evolution when adding a new column. The default value is used to read rows belonging to the files that lack the column or nested field prior to the schema evolution. |
I think we have to hit a few points very clearly here.
1. Default values can be assigned to any newly created column in an Iceberg Table
2. Default values are only used for files which are missing the new column (Do we explicitly require the files to be written prior to the default value being set? If I import a new file directly into the iceberg table does it get the default? I think it's probably easiest to say yes)
3. New writes to the table do not use Default Values
   a. Required Columns are still required at write time
   b. Optional Columns are still persisted as null if they are missing at write time
4. Changing a default value is basically an undefined behavior
   a. Rows which have been rewritten will permanently keep the default value which was set when the row is rewritten
   b. Rows which have not been rewritten will adjust to the new default value
   c. Because of this we lock changing defaults behind "allowIncompatible changes"
5. Name Mappings are always used in place of defaults. If a file is added with Iceberg Column Ids (via import or what not) we will always use name mapping before attempting to use a column default value.
I think we are mostly missing 2,3 and 5. But we should be very clear on all these points
I think the current text is good but can probably be tightened up.
For example
"Default values can be assigned to any new top-level column added to an Iceberg table." instead of the first two sentences
"Default Values are only used when reading a data file which is missing the column with the default.
- New writes to the table do not use Default Values
a. Required Columns are still required at write time
b. Optional Columns are still persisted as null if they are missing at write time
I would skip "New" here. I think default values are not used on the write path at all.
1. Agreed
2. Yes, probably use the default for imported files. But we should state that fields must be written even if there is a default value.
3. I would state this slightly differently:
   - All columns must be written in new data files at write time
   - Optional columns may contain null, required columns must not
   - Optional columns with no default value have an implied default value, null
   - Writers are allowed to write the default value if there is no data for a column
4. Changing the default value should not be allowed by the spec, similar to how you cannot add required columns to the spec currently. I don't think that documenting "allow incompatible changes" is a good idea -- that's just an implementation detail if you know what you're doing. Also, we will need to clarify that you may now add an optional column, an optional column with a default value, or a required column with a default value. But you cannot add a required column with no default value.
5. Agreed, but I would phrase it differently again. Name mappings are used to apply IDs for data files. Default values are added after that point, when the data file has its IDs.
This just occurred to me and I want to note it somewhere: right now we assume that missing columns are null in a lot of places. We'll need to be careful about the implementation and make sure we catch things like Parquet file filters that assume a missing column is null.
- Writers are allowed to write the default value if there is no data for a column
@rdblue could you clarify what you meant by that?
I'm not sure what you mean that this "has the side effect of not distinguishing between organic 34 and 34 from schema evolution"
That was a caveat for the case where the user uses an UPDATE statement to update the default value (an alternative when dropping and recreating the column is no longer feasible because more values have been written). So after using code snippet (1), and inserting more data, users can also change the default value by issuing an UPDATE DML, but it will update not only the rows that were present before schema evolution, but also rows written after schema evolution that happen to store the same value as the default value.
That works because the use case is that you've made a mistake and need to correct it before writing any data.
Agreed. It was not initially clear if the suggestion also applied to the case when more data could have been written. I guess this is where the UPDATE route above can be helpful.
Sounds like we agree! I think you're right that you can use UPDATE to make changes as well.
Here are some more concrete evolution rules with defaults:
- When creating a required field, an initial default value must be set. Neither field values nor the default may be null.
- When creating an optional field, an initial default value can be set. If the initial default value is not set, it is null.
- The initial default value may only be set when adding a field (or through an incompatible change in the API)
- The write default value for a field starts as the initial default value, and is considered set if the field has an initial default
- The write default value may be changed
- When writing, the write default must be written into data files if a value for the column is not supplied
- When writing a field with no write default, the column must be supplied or the write must fail
Does that make it clear?
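As an illustration of these rules, a newly added required field carrying both defaults might be serialized in the schema JSON roughly as follows (a sketch only: the field id, name, and values are invented, and the `initial-default`/`write-default` property names follow the naming used in this discussion):

```json
{
  "id": 7,
  "name": "region",
  "required": true,
  "type": "string",
  "initial-default": "unknown",
  "write-default": "unknown"
}
```

Because the field is required, both defaults must be non-null, and the write default starts out equal to the initial default.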
@rdblue
I think I just want to ask a clarifying question to make sure I understand the above discussions correctly.
So to summarize, you mean there should be two underlying default value concepts: an initial default value and a write default value. The initial default value affects old rows already written without the column, and the write default value affects future rows.
And we only expose one default value API to users, but that API actually handles the two underlying default value concepts; the split is just hidden from users.
Say I have the following scenario: there is a dataset with some rows already written, call them R. A user later adds a new column with default value d1, and then changes the default value to d2 (I assume that becomes the write default value?). Is it true that, according to this spec, when the user reads the dataset at this time, the R rows must read d1 for that column, regardless of whether there is an actual full rewriting happening between the change from d1 to d2? Because implementation-wise, I think we want to do the rewriting lazily, not eagerly rewrite when we set a default value.
Is my above understanding correct?
Yes, the API only exposes one "default value". But we need to track two different values internally. One for existing data files (that cannot change) and one for future writes (that can be updated).
This supports lazy rewriting. I'm not sure what you mean by "regardless of whether there is an actual full rewriting happening". In Iceberg, work that is ongoing does not affect table state. Iceberg only has committed data and metadata.
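A sketch of the d1/d2 scenario above in terms of the two internal values (property names as discussed; the concrete defaults are placeholders): right after the column is added, the field carries

```json
{"initial-default": "d1", "write-default": "d1"}
```

and after the user changes the default to d2, only the write default moves:

```json
{"initial-default": "d1", "write-default": "d2"}
```

The R rows keep reading d1 from the unchanged initial default, new rows written without the column get d2, and no data files need to be rewritten.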
This supports lazy rewriting.
Nicely put 👍
format/spec.md
Outdated
| | **`long`** | **`number`** | 1 | | | ||
| | **`float`** | **`number`** | 1.1 | | | ||
| | **`double`** | **`number`** | 1.1 | | | ||
| | **`decimal(P,S)`** | **`string`** | "0x3162" | we use hexadecimal byte literals to encode bytes, with prefix `0x`, the byte array contain the two's-complement representation of the `unscaled` integer value in big-endian byte order, the actual value will be `unscaled * 10 ^ (-scale)` | |
In Iceberg, we avoid using personal pronouns ("we" or "I") in specs and docs. It is unclear who "we" is, and avoiding it ends up producing more direct documentation that is easily understood.
For example, here you should state that the value is "Stored as the two's-complement big-endian binary using the minimum number of bytes, converted to a hexadecimal string prefixed by 0x".
You should use wording mostly taken from Appendix D and adapted for JSON.
@RussellSpitzer, what do you think about base64 encoding or other representations? Do we care about the size here?
For decimals I feel like we probably aren't saving that much space here; even with dozens of decimal defaults we would still probably be dominated by other entries in metadata.json, and we have to string-parse the whole thing before we get this value anyway. I think if we actually compressed metadata.json we would probably do better than small tunings like this if we were worried about space.
I think storing decimal values as a string is probably still ok and easier to read for humans.
Sounds good to me! I was on the fence about this.
Do we need the 0x prefix, or is it just nice to have that?
format/spec.md
Outdated
| |**`list`**|`JSON object: {`<br /> `"type": "list",`<br /> `"element-id": <id int>,`<br /> `"element-required": <bool>`<br /> `"element": <type JSON>`<br />`}`|`{`<br /> `"type": "list",`<br /> `"element-id": 3,`<br /> `"element-required": true,`<br /> `"element": "string"`<br />`}`| | ||
| |**`map`**|`JSON object: {`<br /> `"type": "map",`<br /> `"key-id": <key id int>,`<br /> `"key": <type JSON>,`<br /> `"value-id": <val id int>,`<br /> `"value-required": <bool>`<br /> `"value": <type JSON>`<br />`}`|`{`<br /> `"type": "map",`<br /> `"key-id": 4,`<br /> `"key": "string",`<br /> `"value-id": 5,`<br /> `"value-required": false,`<br /> `"value": "double"`<br />`}`| | ||
| | **`boolean`** | `JSON string: "boolean"` | `"boolean"` | |
Please revert the changes to this table that are not required.
Reverted.
format/spec.md
Outdated
| | Type | Json type | Example | Note | | ||
| |--------------------|-------------------|----------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | ||
| | **`boolean`** | **`boolean`** | `true` | | |
Can you make this consistent? This uses boolean and json int.
Done. I changed it to `json boolean`.
format/spec.md
Outdated
| For default values, the serialization depends on the type of the corresponding column or nested field. The mapping of types and their corresponding default value JSON serialization is described in the following table: | ||
| | Type | Json type | Example | Note | |
I think it would be better to match the conventions used by the table above, including the column headings and type references, like JSON int.
Isn't it the case I'm already doing this?
format/spec.md
Outdated
| | **`timestamptz`** | **`json long`** | `1646277378000000` | Stores microseconds from 1970-01-01 00:00:00.000000 UTC | | ||
| | **`string`** | **`string`** | `"foo"` | | | ||
| | **`uuid`** | **`string`** | `"eb26bdb1-a1d8-4aa6-990e-da940875492c"` | Stores the lowercase uuid string | | ||
| | **`fixed(L)`** | **`string`** | `"0x3162"` | Stored as a hexadecimal byte literal string, prefixex by `0x` | |
Typo: prefixex.
Fixed.
format/spec.md
Outdated
| | **`uuid`** | **`string`** | `"eb26bdb1-a1d8-4aa6-990e-da940875492c"` | Stores the lowercase uuid string | | ||
| | **`fixed(L)`** | **`string`** | `"0x3162"` | Stored as a hexadecimal byte literal string, prefixex by `0x` | | ||
| | **`binary`** | **`string`** | `"0x3162"` | Stored as a hexadecimal byte literal string, prefixex by `0x` | | ||
| | **`struct`** | **`object`** | `{"a": 1, "foo": "bar"}` | Use a JSON map to represent struct data, the keys are the nested fields' names in the struct schema, and the values are value literals of corresponding fields' type | |
This requires a fairly complicated process to rewrite the default value when the schema changes because it uses field names instead of field IDs. Have you considered other representations, like a list of fields: [ {"field-id": 5, "value": "s"}, ... ]?
Also, will these inherit default values when the struct evolves? Is that explicit or implicit?
For example, if I have a default that is { "a": 1, "b": 12.34 } and I add an optional field c with the default value "s", will the struct default be { "a": 1, "b": 12.34, "c": null } or { "a": 1, "b": 12.34, "c": "s" }?
What happens if the default value is not updated at the same time and only stores the original, { "a": 1, "b": 12.34 }?
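For comparison, the same default from the example above keyed by field id instead of field name might look like this (the ids are invented; this is the list-of-fields shape suggested here, not a settled format):

```json
[
  {"field-id": 1, "value": 1},
  {"field-id": 2, "value": 12.34}
]
```

Keying by id means a later rename of `a` or `b` does not require rewriting the stored default.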
After thinking about this for a couple days, my preference is to use the default values from the struct fields.
If an implementation doesn't update the default struct value then the most reasonable choice is to use the default that was required to create the field. That also means that we don't need to worry about traversing up the schema tree and updating default values when a nested struct changes.
My preference is to encode the default value that the user supplied and add the defaults of child fields to that value when fields are not present. For initial default values, we would use initial defaults. And for write defaults we would use write defaults.
This is too long to go in the JSON encoding section. This should be covered in the default values section above.
@rdblue In the case of default values on files which have been name-mapped, would we still do name-mapping first and then use the field-id mapping correct? If so I think that sounds pretty reasonable although it would make things a bit more difficult to read in the JSON file.
This requires a fairly complicated process to rewrite the default value when the schema changes because it uses field names instead of field IDs. Have you considered other representations, like a list of fields:
[ {"field-id": 5, "value": "s"}, ... ]?
I think you are right about using field id to better track fields in terms of schema evolution. Do you think array and map also need to be encoded this way (using field-id instead of name)? My feeling is that it's unnecessary because array and map can't have schema evolution themselves anyway.
My preference is to encode the default value that the user supplied and add the defaults of child fields to that value when fields are not present. For initial default values, we would use initial defaults. And for write defaults we would use write defaults.
Are you saying that if a default value is missing for an outer struct, its child default values should be automatically propagated up to the struct's level?
I feel that adding this propagation logic (whether traversing up the tree or down the tree) will be too complicated. I have a different idea: how about we treat all the default values as independent?
Let's say there is a struct called foo with one child called foo.bar.
If the user queries and projects foo, i.e. select foo, we use the default value associated with foo.
If the user queries and projects foo.bar, i.e. select foo.bar, we use the default value associated with foo.bar.
Basically, I'm saying the default values of foo and foo.bar are independent, and users set and change those two values independently.
@rdblue In the case of default values on files which have been name-mapped, would we still do name-mapping first and then use the field-id mapping correct? If so I think that sounds pretty reasonable although it would make things a bit more difficult to read in the JSON file.
I don't think that name mapping would have any effect here. Name mapping is a way to produce an Iceberg schema, but Iceberg schemas require field IDs. So everything should have an ID by this point.
If we were to use names, then the ID/name pairings would come from the schema itself -- hence the requirement that we update the defaults as the names change. But that seems needlessly difficult to implement so I think it makes much more sense to track default values by field ID.
do you think array and map also need to be encoded this way (using field-id instead of name)?
No, I think that map and array defaults should be stored as JSON lists. The element or key/value IDs are fairly easy to match up.
Are you saying that if a default value is missing for an outer struct, its child default values should be automatically propagated up to the struct's level?
Yes. I think we have to do something because we need to define not only what to write, but how to read. It is likely that some implementation will fail to update a default struct when a child field is added. We can do two things in that case: fail because we don't know how to handle the child field, or look at the child field and use its default value.
I doubt that we will be able to simply "require" keeping the default updated, and clearly refusing to read the table because of a default value issue is not a good idea. So that leaves using the child's default value.
If we choose to do this now, then we can simplify the requirements because you don't have to update parent default values.
Basically, I'm saying the default values of foo and foo.bar are independent, and users set and change those two values independently.
How is this possible? If I have foo struct<required bap string> default struct("zap") and I add bar to get foo struct<required bap string, required bar int default 0> then the defaults cannot be completely independent because bar is required. You either have to update the foo default to struct("zap", 0) when adding bar (which is not reliable), use the default for bar, or fail because bar is required but has no value in the struct's default value.
I missed the case where you can add a required field with a default into an existing struct. I think I agree with you that we can use the down-traversing logic to look up default values. I've updated the default value section to mention this cascading logic.
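Sketching the agreed cascading lookup on the earlier `foo`/`bar` example (name-keyed JSON is used here only for readability; the values come from the example above): the stored default for `foo` may remain

```json
{"bap": "zap"}
```

while the newly added `bar` field carries its own default of 0, so a reader that resolves missing child fields from their own defaults reconstructs the effective struct default

```json
{"bap": "zap", "bar": 0}
```

without the stored default for `foo` ever being rewritten.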
format/spec.md
Outdated
| #### Default value | ||
| Default value can be assigned to top-level columns or nested fields when they are added to an Iceberg table as part of the schema evolution. The first time a default value is added, this default value is used to read rows belonging to the files that lack the column (i.e. the files were written before the column is added), this default value will also be the default value for the column if user later inserts new rows without specifying the column. |
This section needs more detail. A spec needs to state how something is tracked and requirements for working with the metadata. See the next section for a good example: "Schemas may be evolved by ... Valid type promotions are ...".
To start, this should have a short overview that describes exactly what is stored and how those are used. This is where you will introduce the initial-default and write-default properties of a struct field and explain what they are used for.
The spec also needs to clearly state all of the behavior requirements that have been discussed in this PR.
Currently, this is confusing because it refers to a single default, doesn't state how that default is stored, and allows the default to change. From a user's perspective it is okay to think of the two default values as a single default that changes, but that is too unclear for a specification. Instead, this should end with a description of how the two values are used to maintain SQL behavior.
Yeah, I specifically left out some details because I originally felt this was user-facing documentation. I can update it to expose the details.
I think you can think of this doc as if you were trying to implement Iceberg yourself. Whatever we write here should be all of the information required to write a brand new implementation.
Yeah, the spec is definitely not user-facing. It is a clear definition of what metadata we are tracking and how it must be used in tables.
Updated the details.
format/spec.md
Outdated
| | **`long`** | **`json long`** | `1` | | | ||
| | **`float`** | **`json float`** | `1.1` | | | ||
| | **`double`** | **`json double`** | `1.1` | | | ||
| | **`decimal(P,S)`** | **`string`** | `"0x3162"` | Stores the unscaled value, as the two's-complement big-endian binary using the minimum number of bytes, converted to a hexadecimal string prefixed by `0x` | |
I still think this could be fine as just a decimal string, seems like it would be clearer for a metadata.json reader.
I'm fine with either hex string or decimal string.
Agreed. The decimal string would be better and easier to work with. We should specify that it should have the correct number of digits after the decimal, even if that requires padding with 0s, like 34.00 for decimal(9,2).
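For example, under this suggestion a `decimal(9,2)` field with a default of 34 would store the value as a padded decimal string (a sketch; the field id, name, and layout are invented for illustration):

```json
{"id": 3, "name": "price", "required": false, "type": "decimal(9,2)", "initial-default": "34.00", "write-default": "34.00"}
```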
Wouldn't the precision mean the total number of decimal digits instead of the digits of the fraction part?
format/spec.md
Outdated
| | **`decimal(P,S)`** | **`string`** | `"0x3162"` | Stores the unscaled value, as the two's-complement big-endian binary using the minimum number of bytes, converted to a hexadecimal string prefixed by `0x` | | ||
| | **`date`** | **`json int`** | `19054` | Stores days from the 1970-01-01 | | ||
| | **`time`** | **`json long`** | `36000000000` | Stores microseconds from midnight | | ||
| | **`timestamp`** | **`json long`** | `1646277378000000` | Stores microseconds from 1970-01-01 00:00:00.000000 | |
I think we need to specify UTC here as well, we won't be storing the default with timezone information even if the timestamp type has it.
I feel we probably don't need this? Because the timestamp (without tz) type is like a relative (imaginary) time on a time axis that is ignorant of timezone; it doesn't specify a real instant.
One minor issue: I noticed that although this PR is making changes to the spec.md file, GitHub doesn't give this PR a
format/spec.md
Outdated
| | **`long`** | **`json long`** | `1` | | | ||
| | **`float`** | **`json float`** | `1.1` | | | ||
| | **`double`** | **`json double`** | `1.1` | | | ||
| | **`decimal(P,S)`** | **`string`** | `"0x3162"` | Stores the unscaled value, as the two's-complement big-endian binary using the minimum number of bytes, converted to a hexadecimal string prefixed by `0x` | |
I think @RussellSpitzer suggested that we store these values as natural strings and I agree. This should be "123.45" rather than the unscaled value as binary.
Sorry I missed this one, let me modify it!
Actually, it seems a JSON number can also encode arbitrary-length decimal numbers? Do you think we should use string or number?
format/spec.md
Outdated
| #### Nested Types | ||
| A **`struct`** is a tuple of typed values. Each field in the tuple is named and has an integer id that is unique in the table schema. Each field can be either optional or required, meaning that values can (or cannot) be null. Fields may be any type. Fields may have an optional comment or doc string. | ||
| A **`struct`** is a tuple of typed values. Each field in the tuple is named and has an integer id that is unique in the table schema. Each field can be either optional or required, meaning that values can (or cannot) be null. Fields may be any type. Fields may have an optional comment or doc string. Fields can also have a default value (see below). |
Instead of noting "see below", please make the "default value" text a link to the default value section.
Also, I would phrase this slightly differently: "Fields can have default values". That way it doesn't sound like there is just one default.
Fixed.
format/spec.md
Outdated
| #### Default value | ||
| Default value can be assigned to a column when the column is added to an Iceberg table as part of the schema evolution. They are tracked at the level of a nested field inside a struct, thus it can be used for both top-level columns and nested columns. Iceberg tracks two default values internally: `initial-default` and `write-default`. The `initial-default` is used to read rows belonging to files that lack the column (i.e. the files were written before the column is added); the `write-default` value will be used for the automatically populating the column if user later inserts new rows without specifying the column. |
Beginning when "assigned ... when ..." is not giving context about what field defaults are. This should begin with a high-level explanation and then go into detail. Also, this is the Iceberg spec so it doesn't make sense to refer to what "Iceberg tracks". Instead, this should state what is required.
Default values can be tracked for struct fields (both nested structs and the top-level schema's struct). There are two defaults for a field:
- `initial-default` is a value that must be projected when reading a data file that was written before the field was added to the schema
- `write-default` is a value that must be written for all rows when the field is missing from input data while writing a new data file

Note that all schema fields are required when writing data into a table. Omitting a known field from a data file is not allowed. The write default for a field should be written when a field is not supplied to a write. If the write default is not set, the writer must fail.
A couple of suggestions for the names: file-to-record default and record-to-file default, or read-time default and write-time default. I feel file-to-record default and record-to-file default are expressive, symmetric, and accurately capture the function of each default. We might explain that the former cannot be changed, while the latter can be changed throughout the lifecycle of a table. The former cannot be changed because it is used to read existing files while the latter can be changed because it is used to write new files.
We might also start by explaining the semantics of default values from a contract point of view:
A default value associated with a field is used to:
(1) populate the field's value for all records that were written before the field is introduced
(2) populate the field's value for any records that will be written after the field is introduced, when such records do not supply that field's value.
All fields introduced at table creation time (i.e., not part of schema evolution) only leverage the second use case. Fields introduced as part of a schema evolution can leverage both use cases.

Changes of default values apply to future records only (i.e., records written after the default value has changed). To prevent retroactive change of record values that are already written (before the field is introduced, where those records lack the field in the files), internally each field is associated with two default values ... (then we can continue with the section above).
@wmoustafa, I think that these should not be symmetric because they have very different behaviors. The initial default can only be set when the column is created. The write default can change any time. Also, the initial default is explicitly not a read-time default. While it is applied at read time, I think that file-to-record or read-time imply that the default can be "used" by not writing columns into data files, which is explicitly disallowed.
I'd keep the initial-default and write-default names for clarity.
@rdblue I am fine with that. Could you comment on the intro (the second paragraph in my comment)?
I really like the explanation of what default values are used for, although I think we need to be specific about which default value is used. The wording "initial-default is used to populate the field's value for all records that were written before the field was introduced" is great. I think that's more clear than my version.
I would be a bit more strict with the wording about changes. In the spec, there are two default values and we need to specify how each one behaves. We can also have a paragraph explaining how to make these appear like a single default to users, but we can't combine the two values in the spec.
I would update to this:
Default values can be tracked for struct fields (both nested structs and the top-level schema's struct). There can be two defaults associated with a field:
- `initial-default` is used to populate the field's value for all records that were written before the field was introduced
- `write-default` is used to populate the field's value for any records written after the field is introduced, when the writer does not supply the field's value

Note that all schema fields are required when writing data into a table. Omitting a known field from a data file is not allowed. The write default for a field should be written when a field is not supplied to a write. If the write default is not set, the writer must fail.

The `initial-default` is set only when a field is added to an existing schema. The `write-default` is initially set to the same value as `initial-default` and can be changed through schema evolution.

Together, the `initial-default` and `write-default` produce SQL default value behavior without rewriting data files. That is, changes to default values apply to future records only and all known fields are written into data files.
@rdblue Updated the spec according to the above discussions, could you take a look again?
Will default values be part of format version 2, or will this be part of a new format version 3? If it's format version 2, what happens when existing Iceberg versions read table metadata with default values in it?
format/spec.md
Outdated
| | **`date`** | **`string`** | `"2007-12-03"` | Stores ISO-8601 standard date | | ||
| | **`time`** | **`string`** | `"10:15:30"` | Stores ISO-8601 standard time | | ||
| | **`timestamp`** | **`string`** | `"2007-12-03T10:15:30"` | Stores ISO-8601 standard date-time without time-zone | | ||
| | **`timestamptz`** | **`string`** | `"2007-12-03T10:15:30"` | Stores ISO-8601 standard date-time with time-zone | |
This is an invalid timestamptz value. Timestamp with time zone literals are required to have a zone offset.
In addition, the examples should always include precision down to microseconds for time, timestamp, and timestamptz. Even if the parser is more lax and allows values with different precision, the spec should always represent values correctly.
@rzhang10, @wmoustafa, @RussellSpitzer: I've updated this and I think it's about ready to commit. Can you take a quick look to make sure it looks alright?
@sgcowell, version 2 is released and adopted, so these changes must go in v3. We will support these with a compatibility flag until then so you can opt into using them in v2 with known risks. I've documented the incompatibilities in Appendix E, which describes version changes.
format/spec.md
Outdated
| The `initial-default` is set only when a field is added to an existing schema. The `write-default` is initially set to the same value as `initial-default` and can be changed through schema evolution. If either default is not set for an optional field, then the default value is null for compatibility with older spec versions. | ||
| Together, the `initial-default` and `write-default` produce SQL default value behavior without rewriting data files. That is, default value changes may only affect future records and all known fields are written into data files. To produce this behavior, omitting a known field when writing a data file is not allowed. The write default for a field must be written if a field is not supplied to a write. If the write default for a required field is not set, the writer must fail. |
I think this is a bit confusing because of the ordering of sentences. Perhaps something like
"The ANSI(?) SQL default value behavior treats a new column with a default value as if all previous rows missing that field now have the default value. The default is allowed to be changed for new writes but changing the default does not effect earlier writes. To achieve this behavior in Iceberg, omitting a known field when writing a new data file is never allowed. The write default must be used when writing any new files if a value for the default field is not provided. If the field is required and it is not supplied and there are is no default available, the write must fail."
I'm a little confused on allowing the default for required fields and then allowing writers not to supply them. Isn't this supposed to be behavior for an Optional field which has not been set? Maybe an example on the difference between an Optional field with a write default and a required field with a write default? Sorry if I missed this discussion but I'm a little confused on the difference between Optional w/Default and Required w/Default
Based on the evolution rules below I think I understand
An optional field may be skipped by a writer. When an optional field is skipped the write-default should be used in place of the missing value and this default may be null. A required field may only be skipped by a writer if a write-default exists for that field and this default must not be null.
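A sketch of that difference (field names, ids, and values invented): an optional field may carry a null write default, while a required field added with a default must carry a non-null one.

```json
[
  {"id": 8, "name": "note", "required": false, "type": "string", "write-default": null},
  {"id": 9, "name": "status", "required": true, "type": "string", "initial-default": "new", "write-default": "new"}
]
```

When input data omits `note` the writer stores null, when it omits `status` the writer stores "new", and a required field with neither a supplied value nor a write default forces the write to fail.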
An optional field may be skipped by a writer.
A write may not specify the value for a field with a default, but the write must only produce data files that contain the column set from the default value. I wouldn't say that you can ever "skip" a field.
I updated this based on your suggestions. Thank you!
format/spec.md
Outdated
| Default values are added to struct fields in v3. | ||
| * The `write-default` is a forward-compatible change because it is only used at write time. Old writers will fail because the field is missing. | ||
| * Tables with `initial-default` will be read correctly by older readers if `initial-default` is always null for optional fields. Otherwise, old readers will default optional columns with null. Old readers will fail to read required fields that have an `initial-default` because the default is not supported, when reading data files that are missing the required fields. |
The last sentence I think can be reordered a bit
"Old readers will fail to read required fields which are populated by initial-default because that default is not supported." ?
Thanks, I updated to your wording.
RussellSpitzer left a comment
I'm on board with the semantics, I have a few suggestions on the wording but I think we are in agreement on the behaviors here.
| | **`binary`** | **`JSON string`** | `"0x3162"` | Stored as a hexadecimal string, prefixed by `0x` | | ||
| | **`string`** | **`JSON string`** | `"iceberg"` | | | ||
| | **`uuid`** | **`JSON string`** | `"f79c3e09-677c-4bbd-a479-3f349cb785e7"` | Stores the lowercase uuid string | | ||
| | **`fixed(L)`** | **`JSON string`** | `"0x00010203"` | Stored as a hexadecimal string, prefixed by `0x` | |
Should we also note that the hex string length should be less than or equal to 2*L?
Shouldn't it be exactly 2L+2?
Oh, I was talking about the part after the 0x; I was previously thinking that if it is less than 2L + 2 we prepend 0s to it. But I also think making it strictly 2L+2 works too.
I don't think there's a need to specify the length. I just wanted to clarify that it will never be less than or equal to 2L. It will always be 2L+2.
format/spec.md
Outdated
| * The `write-default` must be set when a field is added and may change | ||
| * When a required field is added, both defaults must be set to a non-null value | ||
| * When an optional field is added, the defaults may be null and should be explicitly set | ||
| * When a new field is added to a struct with a default value, the default should be updated to include the new field's default |
Wouldn't this cause a cascading effect? If the struct itself is nested inside another outer struct, should the outer struct's default regarding the inner struct also be updated, and so on and so forth?
I thought we tried to use the downward traversing approach at runtime to resolve the default value consistency.
This was meant as a suggestion. It would be nice if defaults were updated. Or do you think we should not update a default and always recover defaults by traversing child types?
I think from an implementation standpoint, propagating the default update to the whole schema tree could be a bit tricky to implement. I feel the approach of only updating the default at its own level, and delaying the recovery of defaults to access time should be fine? As long as we always traverse/honor the default value defined at deeper levels?
Or do you think updating the defaults of an outer struct should also overwrite the default of an inner child field if it has its own default defined?
I thought about this a bit more and changed the wording to this:
When a new field is added to a struct with a default value, updating the struct's default is optional
It is up to the writer whether to update the default. If the write default for the field changes, then not updating the value means not needing to change the struct's default as well. Not requiring this will probably give more expected behavior, but still allow people to set the default explicitly.
Thanks for all the reviews and work, @rzhang10, @wmoustafa, and @RussellSpitzer! I merged this.
Great work @rzhang10! Thank you @rdblue and @RussellSpitzer for the insightful discussions!
This is to continue the work of #2496, taking into account the discussions that happened there.
This spec change adds the definition and semantics of default values: it specifies what default values can be defined for which data types. It also details how default values are serialized inside/along with the table schema using JSON encoding.