Skip to content

Conversation

@ajantha-bhat
Copy link
Member

PartitionsTable.Partition will be used between Partitions metadata table and partition stats reader-writer.
Hence, move it to a separate class and extend it with Avro's IndexedRecord (for partition stats writing).

Derived from #8488

Fixes #8455

Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ajantha-bhat very exciting to see all the work being done on this front! Just had a few comments/questions.

Also if I'm not mistaken the major pending aspect spec-wise is a conclusion on whether we maintain accurate record counts as opposed to eq/position delete counts right?

import org.apache.iceberg.avro.AvroSchemaUtil;
import org.apache.iceberg.types.Types;

public class Partition implements IndexedRecord {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious does Partition actually need to be public? I was going through the overall PR if I'm not mistaken this would only be in core?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you see the PR #8503, partition stats reader and writer in the Util directly work with Partition as input and output.

So, when the query engine wants to utilize partition stats for query planning, partition stats reader gives an iterator of Partition. Hence, this has to be public for engine integration module to utilize these stats.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, in the end engine integrations would use Partition so it has to be public.

return lastUpdatedSnapshotId;
}

synchronized void update(ContentFile<?> file, Snapshot snapshot) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: why the change to make it synchronized, are we expecting to update the same Partition reference from multiple threads?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends on engine integration. Some engine like Dremio, writes manifest files and datafiles concurrently for a single snapshot operations. In that case we need a synchronization as there can be multiple data files belong to same partition.

Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar Sep 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I guess I was thinking that only after writing the data files (even if the data files are written concurrently) would we update the partition stats (serially, going through each of the written data files) but as you said this is really an engine implementation detail. I think it is fine to keep this synchronized.

@ajantha-bhat
Copy link
Member Author

Also if I'm not mistaken the major pending aspect spec-wise is a conclusion on whether we maintain accurate record counts as opposed to eq/position delete counts right?

Yes, that we will conclude soon and get things moving.

PartitionsTable.Partition will be used between Partitions metadata table
and partition stats reader-writer.
Hence, move it to a separate class and extend it with Avro's
IndexedRecord (for partition stats writing).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Development

Successfully merging this pull request may close these issues.

Introduce PartitionEntry class to represent stats per partition

2 participants