@dramaticlly
Contributor

Statistics files help determine the number of distinct values (NDV) for each column in a table and can be collected by engines such as Trino or Spark.

This patch adds support for table statistics files to the rewrite table path Spark action.

@szehon-ho @flyrain if you want to take a look
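
For context, a minimal sketch of the prefix-replacement idea behind rewriting a table path, applied to statistics file locations. This is not the code from this patch: the class `StatsPathRewriteSketch`, its methods, and the example paths are hypothetical and only illustrate swapping a source location prefix for a target one, on the assumption that a statistics file is referenced by a full path string.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; it is not part of Iceberg.
final class StatsPathRewriteSketch {
  private StatsPathRewriteSketch() {}

  // Replaces sourcePrefix at the start of path with targetPrefix.
  // Fails fast if the path does not live under the source prefix.
  static String rewritePrefix(String path, String sourcePrefix, String targetPrefix) {
    if (!path.startsWith(sourcePrefix)) {
      throw new IllegalArgumentException(
          "Path " + path + " does not start with " + sourcePrefix);
    }
    return targetPrefix + path.substring(sourcePrefix.length());
  }

  // Applies the same prefix replacement to every statistics file path.
  static List<String> rewriteAll(
      List<String> paths, String sourcePrefix, String targetPrefix) {
    return paths.stream()
        .map(p -> rewritePrefix(p, sourcePrefix, targetPrefix))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Example paths are made up for this sketch.
    List<String> statsPaths =
        List.of("s3://old-bucket/db/tbl/metadata/stats-1.stats");
    rewriteAll(statsPaths, "s3://old-bucket/db/tbl", "s3://new-bucket/db/tbl")
        .forEach(System.out::println);
  }
}
```

Throwing on a non-matching prefix, rather than passing the path through unchanged, is a design choice of this sketch so that a file outside the source location is not silently left pointing at the old table.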

@github-actions

github-actions bot commented Feb 8, 2025

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.

github-actions bot added the stale label Feb 8, 2025
@dramaticlly
Contributor Author

Not stale, will rebase

Member

@szehon-ho szehon-ho left a comment

Sorry this slipped. This mostly looks good; I just have a few small, subjective code suggestions.

@szehon-ho szehon-ho merged commit d935460 into apache:main Feb 8, 2025
46 checks passed
@szehon-ho
Member

szehon-ho commented Feb 8, 2025

Merged. Thanks @dramaticlly, and @flyrain for the additional review!

@dramaticlly
Contributor Author

Thank you @szehon-ho!

slfan1989 pushed a commit to slfan1989/iceberg that referenced this pull request Mar 19, 2025
slfan1989 added a commit to slfan1989/iceberg that referenced this pull request Mar 19, 2025
slfan1989 added a commit to slfan1989/iceberg that referenced this pull request Mar 23, 2025
nastra pushed a commit that referenced this pull request Mar 24, 2025
… procedure (#12006 #12172 #11929 #12282 #12569) (#12568)

* [BackPort#12006] Core: Exclude deleted content file in RewriteTablePathUtil copy plan (#12006)

* [BackPort#12172] Core: Fix RewriteTablePath Incremental Replication (#12172)

* [BackPort#11929] Spark 3.5: Support Statistics Files in RewriteTablePath (#11929)

* [BackPort#12282] Spark 3.5: Fix job description of RewriteTablePathSparkAction (#12282)

* [BackPort#12569] Spark: Improve assertions for better debuggability (#12569)