[#8834] feat(catalogs): Add a table to store details information of Gravitino managed tables.#8847
Conversation
|
@mchades |
|
The schema and catalog, we need to add locatation information at least, and whether we can store the information in |
| `partition_info` CLOB DEFAULT NULL COMMENT 'table partition info', | ||
| `index_info` CLOB DEFAULT NULL COMMENT 'table index info', | ||
| `current_version` BIGINT(20) UNSIGNED COMMENT 'table current version', | ||
| `last_version` BIGINT(20) UNSIGNED COMMENT 'table last version', |
There was a problem hiding this comment.
What's the usage of last_version
There was a problem hiding this comment.
last_version marks the largest version that has been successfully committed to storage. In most cases, current_version equals last_version. However, if we want to roll back to a certain version, then we can change current_version to an old one. The following is an example:
current_version = 3
last_version = 3,
When we want to write a new version, then last_version++ and current_version = last_version(Now, they are both 4). All available versions are 1, 2, 3, and 4. We can change current_version to any value between 1 and 4 to use a specific version. After a series of operations, we can use last_version=4 to indicate that the next version is 5.
There was a problem hiding this comment.
What is the difference between last_version here and last_version in the table_meta table?
There was a problem hiding this comment.
They should be the same, let me check whether we can remove last_version as we have define it in table_meta.
There was a problem hiding this comment.
yeah, I think it's unecessary here
There was a problem hiding this comment.
"last_version" is quite misleading since started, actually it is the latest version. Besides, I agree with @mchades that this field is not so necessary.
| `format` VARCHAR(64) NOT NULL COMMENT 'table format, such as Lance, Iceberg and so on', | ||
| `location` VARCHAR(512) NOT NULL COMMENT 'table storage location', | ||
| `external` BOOLEAN NOT NULL DEFAULT FALSE COMMENT 'whether the table is external table', | ||
| `properties` CLOB DEFAULT NULL COMMENT 'table properties', |
There was a problem hiding this comment.
Why do you use CLOB for this field?
There was a problem hiding this comment.
CLOB is the for H2 and it is equivalent toTEXT type in MySQL
| `format` VARCHAR(64) NOT NULL COMMENT 'table format, such as Lance, Iceberg and so on', | ||
| `location` VARCHAR(512) NOT NULL COMMENT 'table storage location', | ||
| `external` BOOLEAN NOT NULL DEFAULT FALSE COMMENT 'whether the table is external table', |
There was a problem hiding this comment.
My feeling is that these fields can be property, what do you think? @mchades . IIRC, these fields are properties for Hive table, can you please check Iceberg and other tables? If so, I think we can follow the convention.
There was a problem hiding this comment.
Yes, I would also suggest putting location in the table property.
Considering that there may be a need to filter the table according to the format later, the format can be stored as a separate field.
There was a problem hiding this comment.
I will put location and external in properties and let format as a field.
There was a problem hiding this comment.
Location is important. If we need to check the locations whether is used or duplicated.
| `partitions` MEDIUMTEXT DEFAULT NULL COMMENT 'table partition info', | ||
| `distribution` MEDIUMTEXT DEFAULT NULL COMMENT 'table distribution info', | ||
| `sort_orders` MEDIUMTEXT DEFAULT NULL COMMENT 'table sort order info', | ||
| `indexes_info` MEDIUMTEXT DEFAULT NULL COMMENT 'table index info', |
There was a problem hiding this comment.
All these fields will be serialized into json string and write into the DB, right?
There was a problem hiding this comment.
Yeah, I intend to do so.
2. Rename `current_version` to `version`.
| `table_id` BIGINT(20) UNSIGNED NOT NULL COMMENT 'table id', | ||
| `format` VARCHAR(64) NOT NULL COMMENT 'table format, such as Lance, Iceberg and so on', | ||
| `properties` CLOB DEFAULT NULL COMMENT 'table properties', | ||
| `partitions` MEDIUMTEXT DEFAULT NULL COMMENT 'table partition info', |
…n of Gravitino managed tables. (apache#8847) ### What changes were proposed in this pull request? Add a new table to store table details for Gravitino-managed tables. ### Why are the changes needed? To support the managed catalog. Fix: apache#8834 ### Does this PR introduce _any_ user-facing change? N/A. ### How was this patch tested? Test the process of create table locally.
…n of Gravitino managed tables. (apache#8847) ### What changes were proposed in this pull request? Add a new table to store table details for Gravitino-managed tables. ### Why are the changes needed? To support the managed catalog. Fix: apache#8834 ### Does this PR introduce _any_ user-facing change? N/A. ### How was this patch tested? Test the process of create table locally.
What changes were proposed in this pull request?
Add a new table to store table details for Gravitino-managed tables.
Why are the changes needed?
To support the managed catalog.
Fix: #8834
Does this PR introduce any user-facing change?
N/A.
How was this patch tested?
Test the process of create table locally.