Skip to content

Conversation

@amogh-jahagirdar
Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar commented Jan 17, 2023

This change makes schema ID a required field for SQL View Representations. My rationale is that the Iceberg View specification should always have a well defined schema for a SQL View representation and it should be stored in metadata, and during engine analysis/planning time, the schema should effectively be converted from Iceberg types into the engine data types.

cc @jzhuge @jackye1995 @rdblue @nastra

Copy link
Contributor

@nastra nastra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

making this a required field seems reasonable

Copy link
Contributor

@jackye1995 jackye1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, I think schema is always required for the use case I know. @jzhuge what was the reason it was marked optional? I will also check the original PR to see if it was discussed there.

@rdblue
Copy link
Contributor

rdblue commented Jan 18, 2023

I think the reason it was optional is to support cases where views are created via API rather than by a SQL parser. If you don't have an analyzer, you may not know the output of the SQL query.

@jackye1995
Copy link
Contributor

I think the reason it was optional is to support cases where views are created via API rather than by a SQL parser

Nice that we are clarifying this now. If we support this case of no schema, it means the view will "auto-update" itself when the source table schemas change. This might be a good thing or bad thing depending on the use case.

I think in most database systems at this moment, view schema is fixed at creation time. For example, if we support this in Trino, since we cannot derive the schema and cannot fulfill the List<ViewColumn> columns part of the ConnectorViewDefinition, we will fail atcheckViewStaleness. Core Trino logic needs to be updated to support this case, but it is still achieveable.

I also checked mySQL, https://dev.mysql.com/doc/refman/8.0/en/alter-table.html, the wording here for ALTER TABLE RENAME COLUMN:

Views and stored programs that refer to the renamed column. You must manually alter the definition of these objects to refer to the new column name.

seems to suggest the view schema definition is fixed.

I don't know if there is any ANSI definition or other places that specify a fixed schema is a requirement for view. Is anyone aware of that?

@jackye1995
Copy link
Contributor

Now I think about it, I guess if there is already an existing use case for schema not present, then we probably have to support that regardless of any existing standards. So I think we can close this PR for now. @amogh-jahagirdar

@amogh-jahagirdar
Copy link
Contributor Author

Thanks for the explanation @rdblue in that case yes I think it makes sense just to treat schema as optional in SQL view representation and close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants