[#8834] feat(catalogs): Add a table to store details information of Gravitino managed tables. #8847

Merged
mchades merged 11 commits into apache:branch-lance-namepspace-dev from
yuqi1129:issue_8834
Oct 23, 2025

@yuqi1129
Contributor

What changes were proposed in this pull request?

Add a new table to store table details for Gravitino-managed tables.

Why are the changes needed?

To support the managed catalog.

Fix: #8834

Does this PR introduce any user-facing change?

N/A.

How was this patch tested?

Tested the create-table process locally.

@yuqi1129
Contributor Author

@mchades
Please check if we need to modify the structure of schema_meta and table_meta.

@yuqi1129
Contributor Author

yuqi1129 commented Oct 20, 2025

For the schema and catalog, we need to add at least location information; the open question is whether we can store it in properties.

Comment thread scripts/h2/schema-1.1.0-h2.sql Outdated
`partition_info` CLOB DEFAULT NULL COMMENT 'table partition info',
`index_info` CLOB DEFAULT NULL COMMENT 'table index info',
`current_version` BIGINT(20) UNSIGNED COMMENT 'table current version',
`last_version` BIGINT(20) UNSIGNED COMMENT 'table last version',
Contributor

What is last_version used for?

Contributor Author

last_version marks the largest version that has been successfully committed to storage. In most cases, current_version equals last_version. However, if we want to roll back to a certain version, we can set current_version to an older one. For example:

current_version = 3
last_version = 3

When we write a new version, last_version is incremented and current_version is set to last_version (now both are 4). The available versions are then 1, 2, 3, and 4, and we can set current_version to any value between 1 and 4 to use a specific version. Whatever current_version is after a series of such operations, last_version = 4 tells us that the next version to write is 5.
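
As a rough illustration of the semantics described in this comment (not code from this PR; the class and method names are hypothetical), the two counters could be modeled like this:

```python
from dataclasses import dataclass


@dataclass
class TableVersions:
    """Hypothetical in-memory model of the two counters discussed in the thread."""

    current_version: int = 0  # version currently in use
    last_version: int = 0     # largest version successfully committed

    def commit(self) -> int:
        """Write a new version: bump last_version and point current_version at it."""
        self.last_version += 1
        self.current_version = self.last_version
        return self.current_version

    def use_version(self, version: int) -> None:
        """Switch to any committed version without touching last_version."""
        if not 1 <= version <= self.last_version:
            raise ValueError(f"version {version} is outside 1..{self.last_version}")
        self.current_version = version


versions = TableVersions(current_version=3, last_version=3)
versions.commit()        # both counters become 4
versions.use_version(2)  # roll back; last_version stays 4
next_version = versions.last_version + 1  # the next version to write
```

The point of keeping two counters in this sketch is that a rollback changes only current_version, so the next write can never reuse a version number that was already committed.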

Contributor

@mchades Oct 20, 2025

What is the difference between last_version here and last_version in the table_meta table?

Contributor Author

They should be the same. Let me check whether we can remove last_version here, since we have already defined it in table_meta.

Contributor

Yeah, I think it's unnecessary here.

Contributor

"last_version" has been a misleading name from the start; it actually means the latest version. Besides, I agree with @mchades that this field is not really necessary.

Comment thread scripts/h2/schema-1.1.0-h2.sql
@yuqi1129 yuqi1129 closed this Oct 22, 2025
@yuqi1129 yuqi1129 reopened this Oct 22, 2025
`format` VARCHAR(64) NOT NULL COMMENT 'table format, such as Lance, Iceberg and so on',
`location` VARCHAR(512) NOT NULL COMMENT 'table storage location',
`external` BOOLEAN NOT NULL DEFAULT FALSE COMMENT 'whether the table is external table',
`properties` CLOB DEFAULT NULL COMMENT 'table properties',
Contributor

Why do you use CLOB for this field?

Contributor Author

CLOB is the type for H2; it is equivalent to the TEXT type in MySQL.

Comment on lines +22 to +24
`format` VARCHAR(64) NOT NULL COMMENT 'table format, such as Lance, Iceberg and so on',
`location` VARCHAR(512) NOT NULL COMMENT 'table storage location',
`external` BOOLEAN NOT NULL DEFAULT FALSE COMMENT 'whether the table is external table',
Contributor

@jerryshao Oct 22, 2025

My feeling is that these fields can be properties. What do you think, @mchades? IIRC, these fields are properties for Hive tables; can you please check Iceberg and other table formats? If so, I think we can follow the convention.

Contributor

Yes, I would also suggest putting location in the table property.

Considering that there may be a need to filter the table according to the format later, the format can be stored as a separate field.

Contributor Author

I will put location and external in properties and keep format as a field.

Contributor

Location is important if we need to check whether a location is already used or duplicated.
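
To make the trade-off in this thread concrete, here is a minimal sketch (not code from this PR; the row shapes, paths, and the `location` property key are assumptions) contrasting a filter on a dedicated `format` column with a duplicate-location check when `location` lives inside the serialized properties:

```python
import json
from collections import defaultdict

# Hypothetical rows: `format` is its own column, `properties` is a JSON string.
rows = [
    {"table_id": 1, "format": "lance", "properties": json.dumps({"location": "/data/t1"})},
    {"table_id": 2, "format": "iceberg", "properties": json.dumps({"location": "/data/t2"})},
    {"table_id": 3, "format": "lance", "properties": json.dumps({"location": "/data/t1"})},
]

# Filtering on a dedicated column needs no deserialization.
lance_ids = [r["table_id"] for r in rows if r["format"] == "lance"]

# Checking locations means parsing every properties blob first.
tables_by_location = defaultdict(list)
for r in rows:
    location = json.loads(r["properties"]).get("location")
    if location is not None:
        tables_by_location[location].append(r["table_id"])

# Any location claimed by more than one table is a duplicate.
duplicates = {loc: ids for loc, ids in tables_by_location.items() if len(ids) > 1}
```

In this sketch the format filter works directly on the column, while the location check has to deserialize each row's properties, which is the cost the reviewers are weighing.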

Comment on lines +26 to +29
`partitions` MEDIUMTEXT DEFAULT NULL COMMENT 'table partition info',
`distribution` MEDIUMTEXT DEFAULT NULL COMMENT 'table distribution info',
`sort_orders` MEDIUMTEXT DEFAULT NULL COMMENT 'table sort order info',
`indexes_info` MEDIUMTEXT DEFAULT NULL COMMENT 'table index info',
Contributor

All these fields will be serialized into JSON strings and written into the DB, right?

Contributor Author

Yeah, I intend to do so.
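
A minimal sketch of that approach, assuming an in-memory SQLite database as a stand-in for H2/MySQL. The table name `table_details` and the JSON shapes are hypothetical; only the column names come from the DDL under review:

```python
import json
import sqlite3

# SQLite stands in for H2/MySQL here; the table name is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE table_details (
        table_id     INTEGER NOT NULL,
        format       TEXT    NOT NULL,
        properties   TEXT,
        partitions   TEXT,
        distribution TEXT,
        sort_orders  TEXT,
        indexes_info TEXT
    )
    """
)

# Hypothetical structured values; the real JSON layout is not shown in this PR.
row = {
    "table_id": 1,
    "format": "lance",
    "properties": {"location": "/tmp/warehouse/t1", "external": "false"},
    "partitions": [{"strategy": "identity", "fieldName": ["dt"]}],
    "distribution": None,  # stays NULL, matching DEFAULT NULL in the DDL
    "sort_orders": [{"field": "id", "direction": "asc"}],
    "indexes_info": [],
}


def to_json_or_null(value):
    """Serialize structured values to a JSON string; keep None as SQL NULL."""
    return None if value is None else json.dumps(value)


conn.execute(
    "INSERT INTO table_details VALUES (?, ?, ?, ?, ?, ?, ?)",
    (
        row["table_id"],
        row["format"],
        to_json_or_null(row["properties"]),
        to_json_or_null(row["partitions"]),
        to_json_or_null(row["distribution"]),
        to_json_or_null(row["sort_orders"]),
        to_json_or_null(row["indexes_info"]),
    ),
)

# Read the row back and deserialize the JSON column.
stored = conn.execute(
    "SELECT partitions, distribution FROM table_details WHERE table_id = 1"
).fetchone()
restored_partitions = json.loads(stored[0])
```

Storing each structure as one JSON text column keeps the schema stable as the shapes evolve, at the cost of not being able to query inside them with plain SQL.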

`table_id` BIGINT(20) UNSIGNED NOT NULL COMMENT 'table id',
`format` VARCHAR(64) NOT NULL COMMENT 'table format, such as Lance, Iceberg and so on',
`properties` CLOB DEFAULT NULL COMMENT 'table properties',
`partitions` MEDIUMTEXT DEFAULT NULL COMMENT 'table partition info',
Contributor

Should this use Partitioning?

@mchades mchades merged commit 924acbb into apache:branch-lance-namepspace-dev Oct 23, 2025
28 checks passed
jerryshao pushed a commit to jerryshao/gravitino that referenced this pull request Nov 11, 2025
…n of Gravitino managed tables. (apache#8847)

youngyjd pushed a commit to youngyjd/gravitino that referenced this pull request Nov 13, 2025
…n of Gravitino managed tables. (apache#8847)

4 participants