@dramaticlly
Contributor

This PR adds data_sequence_number as a derived/virtual column on all files metadata tables, enabling queries like

SELECT data_sequence_number FROM iceberg.db.table.files

without changing the Avro schema for files.
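
To illustrate the idea of a derived column, here is a minimal, self-contained sketch. It is not the Iceberg API and not the code in this PR: `StoredFile`, `Entry`, and `toRow` are hypothetical names, and the sketch assumes the sequence number is available on a wrapper around each file record rather than in the stored file schema itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the Iceberg API: every name here is made up for illustration.
public class DerivedColumnSketch {

  // Stand-in for a stored file record whose on-disk schema must not change.
  static final class StoredFile {
    final String path;
    final long recordCount;

    StoredFile(String path, long recordCount) {
      this.path = path;
      this.recordCount = recordCount;
    }
  }

  // Stand-in for a wrapper that carries metadata alongside the stored file (assumption).
  static final class Entry {
    final StoredFile file;
    final Long dataSequenceNumber; // nullable: the value may be unknown

    Entry(StoredFile file, Long dataSequenceNumber) {
      this.file = file;
      this.dataSequenceNumber = dataSequenceNumber;
    }
  }

  // Builds a query-facing row: stored columns first, then the derived column appended at read time.
  static Map<String, Object> toRow(Entry entry) {
    Map<String, Object> row = new LinkedHashMap<>();
    row.put("file_path", entry.file.path);
    row.put("record_count", entry.file.recordCount);
    row.put("data_sequence_number", entry.dataSequenceNumber);
    return row;
  }

  public static void main(String[] args) {
    Entry entry = new Entry(new StoredFile("s3://bucket/a.parquet", 10L), 7L);
    System.out.println(toRow(entry));
  }
}
```

The point of the sketch is only that the extra column is attached when rows are produced, so the stored record type stays untouched.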

@dramaticlly dramaticlly changed the title core: Add data sequence number as derived column to files metadata table Core: Add data sequence number as derived column to files metadata table Feb 27, 2024
@dramaticlly
Contributor Author

@szehon-ho @aokolnychyi if you want to take a look?

Member

@szehon-ho szehon-ho left a comment

Thanks @dramaticlly for the work! I left some comments to see if we can make the code cleaner.

@aokolnychyi
Contributor

Let me take a quick look today as well.

@Override
public Schema schema() {
  StructType partitionType = Partitioning.partitionType(table());
  // avoid returning an empty struct, which is not always supported.
Contributor

I am a bit lost with all the schemas here. Let me see if we can simplify this block.

@dramaticlly
Contributor Author

closed in favor of #10203

@dramaticlly dramaticlly deleted the dataSeq branch January 26, 2025 04:06