Api: Track partition statistics via TableMetadata #8502

ajantha-bhat · 2023-09-05T17:07:05Z

Introduce PartitionStatisticsFile as per the Spec.
Tracking PartitionStatisticsFile in the same way that StatisticsFile is already tracked.

ajantha-bhat · 2023-11-28T10:03:34Z

api/src/main/java/org/apache/iceberg/PartitionStatisticsFile.java

+ * <p>Statistics are informational. A reader can choose to ignore statistics information. Statistics
+ * support is not required to read the table correctly.
+ */
+public interface PartitionStatisticsFile extends Serializable {


As per the spec

https://github.com/apache/iceberg/blob/main/format/spec.md#partition-statistics

aokolnychyi · 2023-11-28T23:06:44Z

Ack, will review this week.

ajantha-bhat · 2023-11-30T16:58:16Z

> Ack, will review this week.

Thanks.
Also, please include #9170 in the review

aokolnychyi · 2023-12-05T18:04:29Z

I started looking on Friday but got distracted. I will try to finish by end of Wed.

aokolnychyi

I did one pass. Overall, this seems solid. I'll need to do another pass with fresh eyes.

Are we saving the rest of the engine changes that keep these files during cleanup for future PRs? Shall we test the core expiry of snapshots in this PR? It would be unfortunate to add partition stats files and let the expiry process remove them.

aokolnychyi · 2023-12-01T23:58:54Z

.palantir/revapi.yml

      justification: "Static utility class - should not have public constructor"
  "1.4.0":
+    org.apache.iceberg:iceberg-api:
+    - code: "java.method.addedToInterface"


I know we did a similar breaking change when adding table stats. I wonder whether that was the correct decision, however. Why not return an empty list by default?

Added as a default method.
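
To illustrate the suggestion above: adding the accessor as a default method that returns an empty list means existing implementers need no changes. This is only a minimal sketch; `TableLike`, `StatsFile`, and `LegacyTable` are hypothetical stand-ins and not the Iceberg interfaces touched by this PR.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for illustration; these are not the Iceberg interfaces.
final class DefaultMethodSketch {
  interface StatsFile {
    long snapshotId();
  }

  interface TableLike {
    String name();

    // Added after implementations already exist. Because it has a default body,
    // existing implementers compile and link without any changes.
    default List<StatsFile> partitionStatisticsFiles() {
      return Collections.emptyList();
    }
  }

  // An implementation written before the new method was introduced.
  static final class LegacyTable implements TableLike {
    @Override
    public String name() {
      return "db.table";
    }
  }

  // Callers can use the new accessor on any implementation.
  static int statsFileCount(TableLike table) {
    return table.partitionStatisticsFiles().size();
  }
}
```

An abstract method added to the interface would instead break every existing implementer, which is the kind of change the `java.method.addedToInterface` revapi entry has to justify.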

aokolnychyi · 2023-12-07T06:37:16Z

api/src/main/java/org/apache/iceberg/PartitionStatisticsFile.java

+import java.io.Serializable;
+
+/**
+ * Represents a partition statistics file in the table default format, that can be used to read


I wonder whether "in the table default format" will always be accurate. What about just this?

Represents a partition statistics file that can be used to read table data more efficiently.

aokolnychyi · 2023-12-07T06:39:30Z

api/src/main/java/org/apache/iceberg/PartitionStatisticsFile.java

+  long snapshotId();
+
+  /**
+   * Returns fully qualified path to the file, suitable for constructing a Hadoop Path. Never null.


I wouldn't encourage using Hadoop Path given that we have our own FileIO and InputFile.
What about dropping "suitable for constructing a Hadoop Path"?

aokolnychyi · 2023-12-07T06:47:48Z

api/src/main/java/org/apache/iceberg/UpdatePartitionStatistics.java

+   * @return this for method chaining
+   */
+  UpdatePartitionStatistics setPartitionStatistics(
+      long snapshotId, PartitionStatisticsFile partitionStatisticsFile);


Hm, PartitionStatisticsFile already returns snapshotId().
Why also pass it explicitly and validate they are equal in the implementation?
Seems to match what we do for UpdateStatistics, though.

aokolnychyi · 2023-12-07T06:52:26Z

core/src/main/java/org/apache/iceberg/IncrementalFileCleanup.java

    deleteFiles(manifestListsToDelete, "manifest list");

-    if (!beforeExpiration.statisticsFiles().isEmpty()) {
+    if (!beforeExpiration.statisticsFiles().isEmpty()


What about a helper method like hasStatsFiles or something to simplify the condition and stay on 1 line?
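
A helper along these lines might look like the sketch below. The diff hunk above is truncated, so the second half of the condition is an assumption, and `MetadataLike`, `metadataOf`, and `hasAnyStatisticsFiles` are hypothetical names used only for illustration.

```java
import java.util.List;

final class StatsFilesCheckSketch {
  // Hypothetical, simplified view of table metadata; file locations are plain strings here.
  interface MetadataLike {
    List<String> statisticsFiles();

    List<String> partitionStatisticsFiles();
  }

  // True if the metadata tracks any table-level or partition-level statistics files.
  // Assumption: the truncated condition in the diff also checks partition statistics files.
  static boolean hasAnyStatisticsFiles(MetadataLike metadata) {
    return !metadata.statisticsFiles().isEmpty()
        || !metadata.partitionStatisticsFiles().isEmpty();
  }

  // Small factory used only to exercise the helper.
  static MetadataLike metadataOf(List<String> stats, List<String> partitionStats) {
    return new MetadataLike() {
      @Override
      public List<String> statisticsFiles() {
        return stats;
      }

      @Override
      public List<String> partitionStatisticsFiles() {
        return partitionStats;
      }
    };
  }
}
```

With such a helper, the call site would shrink to a single-line condition such as `if (hasAnyStatisticsFiles(beforeExpiration)) { ... }`.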

aokolnychyi · 2023-12-07T07:00:57Z

core/src/main/java/org/apache/iceberg/ReachableFileCleanup.java

    deleteFiles(manifestListsToDelete, "manifest list");

-    if (!beforeExpiration.statisticsFiles().isEmpty()) {
+    if (!beforeExpiration.statisticsFiles().isEmpty()


A helper method here too?

aokolnychyi · 2023-12-07T07:03:48Z

core/src/main/java/org/apache/iceberg/ReachableFileUtil.java

   * @return the location of statistics files
+   * @deprecated use the {@code allStatisticsFilesLocations(table)} instead.
   */
+  @Deprecated


Given the purpose of this method and how it is used, what about simply extending it to return all statistics files? Then we won't need changes in consumers and won't need another method. We can clarify that this returns all stats files in Javadoc.

Removed the changes in this file, as the newly added APIs were not called from anywhere (they are needed only for the Spark actions).

I will raise a follow-up PR so that the Spark actions consider partition stats files in remove orphan files and expire snapshots.

Since this was needed only for the Spark actions, I moved these changes into a separate PR with tests for expire snapshots and remove orphan files.

#9284

aokolnychyi · 2023-12-07T07:12:43Z

core/src/main/java/org/apache/iceberg/SetPartitionStatistics.java

+
+public class SetPartitionStatistics implements UpdatePartitionStatistics {
+  private final TableOperations ops;
+  private final Map<Long, Optional<PartitionStatisticsFile>> partitionStatisticsToSet =


I am not sure how I feel about using a map and giving Optional a special meaning. Would having a map of stats to set and a separate set of snapshot IDs to remove be easier to read/understand? What do you think, @ajantha-bhat?

    private TableMetadata internalApply(TableMetadata base) {
      TableMetadata.Builder builder = TableMetadata.buildFrom(base);
      toSet.forEach(builder::setPartitionStatistics);
      toRemove.forEach(builder::removePartitionStatistics);
      return builder.build();
    }

aokolnychyi · 2023-12-07T07:17:33Z

core/src/main/java/org/apache/iceberg/TableMetadata.java

      this.statisticsFiles =
          base.statisticsFiles.stream().collect(Collectors.groupingBy(StatisticsFile::snapshotId));
+      this.partitionStatisticsFiles =
+          base.partitionStatisticsFiles.stream()


I wonder whether a helper method would make it easier to read.

aokolnychyi · 2023-12-07T07:18:36Z

core/src/test/resources/TableMetadataPartitionStatisticsFiles.json

+  ],
+  "snapshot-log": [],
+  "metadata-log": []
+}


Missing empty line?

ajantha-bhat · 2023-12-07T17:33:52Z

@aokolnychyi: Thanks for the review.

I see that most of the questions and comments come down to why the puffin stats code followed that style. We can fix it in this PR for partition stats and later backport the fixes to puffin too.
I didn't add test code for expire snapshots and remove orphan files (there is a test case for the RemoveSnapshots API, though), as I wanted to keep the scope of this PR non-Spark. I will open a PR that depends on this one and tests those functions, so this PR can be merged.

I've had a somewhat busy week. I will finish addressing the comments and the follow-up Spark PR for expire snapshots and remove orphan files by Monday.

Meanwhile you can also review the independent PR (Util for partition stats reading and writing) : #9170


ajantha-bhat · 2023-12-12T11:43:12Z

@aokolnychyi: Fixed all the comments and also opened a new Spark module PR (which is dependent on this) to ensure partition stats are considered for GC (expire snapshots and remove orphan files). I will rebase that PR once this is merged.

#9284

aokolnychyi

This looks close. I did another pass, I'll need to check the tests with fresh eyes.

aokolnychyi · 2023-12-14T10:44:00Z

api/src/main/java/org/apache/iceberg/PartitionStatisticsFile.java

+ * <p>Statistics are informational. A reader can choose to ignore statistics information. Statistics
+ * support is not required to read the table correctly.
+ */
+public interface PartitionStatisticsFile extends Serializable {


Are we sure it is a good idea to make PartitionStatisticsFile serializable? I would probably not do that unless there is a good reason right now. None of our existing files are serializable by contract (they may be in practice but not by API).

Ack. We can add it back if required during an end to end implementation.

aokolnychyi · 2023-12-14T19:31:31Z

core/src/main/java/org/apache/iceberg/TableMetadata.java

      this.previousFileLocation = base.metadataFileLocation;
      this.previousFiles = base.previousFiles;
      this.refs = Maps.newHashMap(base.refs);
-      this.statisticsFiles =


@ajantha-bhat, just to make sure I understand. We try to replace the partition stats for each snapshot but it is not required by the spec so it is technically possible to have multiple files for one snapshot?

Currently there is one stats file per snapshot, but I think it may track multiple files in the future.

I followed the same pattern as existing puffin files. I want to keep the interfaces consistent.
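
The exchange above can be sketched as follows: indexing by snapshot ID yields a list per snapshot, so the structure itself allows several files, while setting a file replaces the list with a single entry. This is a rough illustration under those assumptions; `FileRef`, `index`, and `set` are hypothetical names, not the builder code in this PR.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class StatsIndexSketch {
  // Hypothetical minimal descriptor of a stats file.
  static final class FileRef {
    final long snapshotId;
    final String path;

    FileRef(long snapshotId, String path) {
      this.snapshotId = snapshotId;
      this.path = path;
    }

    long snapshotId() {
      return snapshotId;
    }
  }

  // Groups files by snapshot ID. The value type is a list, so the index can hold
  // several files per snapshot. The result is copied into a HashMap so that it is
  // guaranteed to be mutable.
  static Map<Long, List<FileRef>> index(List<FileRef> files) {
    return new HashMap<>(files.stream().collect(Collectors.groupingBy(FileRef::snapshotId)));
  }

  // Setting a file replaces whatever was tracked for that snapshot with one entry.
  static void set(Map<Long, List<FileRef>> index, FileRef file) {
    index.put(file.snapshotId(), Collections.singletonList(file));
  }
}
```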

aokolnychyi · 2023-12-14T19:33:37Z

core/src/main/java/org/apache/iceberg/TableMetadata.java

+    }
+
+    public Builder removePartitionStatistics(long snapshotId) {
+      Preconditions.checkNotNull(snapshotId, "snapshotId is null");


How can it be null if it is primitive?

True. I copy-pasted this from the existing removeStatistics, which has the same problem. I overlooked it and assumed it was correct. I will be more careful next time.
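
To show why the check is a no-op: a primitive `long` is autoboxed to a non-null `Long` before the null check runs, so the check can never fail. The sketch below uses the JDK's `Objects.requireNonNull` as a stand-in for the Guava `Preconditions.checkNotNull` call in the diff; the class and method names are hypothetical.

```java
import java.util.Objects;

final class PrimitiveNullCheckSketch {
  // The primitive argument is boxed into a non-null Long, so this check always passes.
  static long checkedPrimitive(long snapshotId) {
    Objects.requireNonNull(snapshotId, "snapshotId is null");
    return snapshotId;
  }

  // With a boxed Long parameter the same check is meaningful, because null can be passed.
  static long checkedBoxed(Long snapshotId) {
    return Objects.requireNonNull(snapshotId, "snapshotId is null");
  }
}
```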

aokolnychyi · 2023-12-14T19:43:33Z

core/src/main/java/org/apache/iceberg/TableMetadata.java

      this.sortOrdersById = Maps.newHashMap(base.sortOrdersById);
    }

+    private static Map<Long, List<StatisticsFile>> statsFileBySnapshotID(TableMetadata base) {


Question: Is there a pattern in this class to have static methods at the end? If so, can we put these methods together with other static methods? If not, it is OK to keep them here.

aokolnychyi · 2023-12-14T19:45:38Z

core/src/main/java/org/apache/iceberg/TableMetadata.java

+          .collect(Collectors.groupingBy(StatisticsFile::snapshotId));
+    }
+
+    private static Map<Long, List<PartitionStatisticsFile>> partitionStatsFileBySnapshotID(


I think it is common for this class to call such methods indexSmth and pass actual elements. If so, what about something like below? We can use shorter variables to stay on one line, but that's totally optional. Up to you; the method names are just examples.

    private static Map<Long, List<StatisticsFile>> indexStatistics(List<StatisticsFile> files) {
      return files.stream().collect(Collectors.groupingBy(StatisticsFile::snapshotId));
    }

    private static Map<Long, List<PartitionStatisticsFile>> indexPartitionStatistics(
        List<PartitionStatisticsFile> files) {
      return files.stream().collect(Collectors.groupingBy(PartitionStatisticsFile::snapshotId));
    }

aokolnychyi · 2023-12-14T19:50:20Z

core/src/main/java/org/apache/iceberg/TableMetadata.java

      return this;
    }

+    public Builder setPartitionStatistics(PartitionStatisticsFile partitionStatisticsFile) {


Question: I wonder whether using just file or partitionStats to shorten the lines would make sense.
Completely up to you.

aokolnychyi · 2023-12-14T19:55:58Z

core/src/main/java/org/apache/iceberg/SetPartitionStatistics.java

+
+  @Override
+  public UpdatePartitionStatistics setPartitionStatistics(
+      PartitionStatisticsFile partitionStatisticsFile) {


I feel the context in this class is pretty clear and we can call it stats, partitionStats or file to shorten the lines. Then we can also call other variables partitionStatsToSet or statsToSet.

aokolnychyi · 2023-12-14T19:57:17Z

core/src/main/java/org/apache/iceberg/SetPartitionStatistics.java

+
+public class SetPartitionStatistics implements UpdatePartitionStatistics {
+  private final TableOperations ops;
+  private final Set<PartitionStatisticsFile> partitionStatisticsToSet = Sets.newHashSet();


What about using Map<Long, PartitionStatisticsFile> to make sure the files are overridden if multiple files are passed for the same snapshot? We can then simply call statsToSet.values().forEach(...) below?
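
A rough sketch of this idea, combined with the earlier suggestion of a separate set of snapshot IDs to remove: keying pending files by snapshot ID means a later file for the same snapshot overrides an earlier one. `PendingStatsSketch` and `FileRef` are hypothetical names, and the state is modeled as a plain map rather than as `TableMetadata`.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class PendingStatsSketch {
  // Hypothetical minimal descriptor of a partition stats file.
  static final class FileRef {
    final long snapshotId;
    final String path;

    FileRef(long snapshotId, String path) {
      this.snapshotId = snapshotId;
      this.path = path;
    }
  }

  // Keyed by snapshot ID, so a second file for the same snapshot overrides the first.
  private final Map<Long, FileRef> statsToSet = new LinkedHashMap<>();
  private final Set<Long> statsToRemove = new LinkedHashSet<>();

  PendingStatsSketch set(FileRef file) {
    statsToSet.put(file.snapshotId, file);
    return this;
  }

  PendingStatsSketch remove(long snapshotId) {
    statsToRemove.add(snapshotId);
    return this;
  }

  // Applies pending sets first, then pending removals, to a copy of the current state.
  Map<Long, FileRef> apply(Map<Long, FileRef> current) {
    Map<Long, FileRef> result = new LinkedHashMap<>(current);
    statsToSet.values().forEach(file -> result.put(file.snapshotId, file));
    statsToRemove.forEach(id -> result.remove(id));
    return result;
  }
}
```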

aokolnychyi · 2023-12-14T20:03:23Z

core/src/main/java/org/apache/iceberg/TableMetadataParser.java

  }
+
+  private static List<PartitionStatisticsFile> partitionStatisticsFilesFromJson(
+      JsonNode partitionStatisticsFilesList) {


I don't mind using longer variables but I wonder whether a shorter version and staying on one line in some places would be more readable in this method.

aokolnychyi · 2023-12-14T20:06:24Z

@ajantha-bhat, I'll take a look at other PRs once this is in. I feel this one is almost ready to go.

ajantha-bhat · 2023-12-15T13:05:09Z

@aokolnychyi: I have handled all the new comments. Thanks again for the review.

ajantha-bhat · 2023-12-15T16:23:21Z

Re-triggering build due to flaky test in Flink

https://github.com/apache/iceberg/actions/runs/7222400549/job/19679307743?pr=8502

aokolnychyi · 2023-12-18T19:59:39Z

core/src/main/java/org/apache/iceberg/SetPartitionStatistics.java

+import org.apache.iceberg.relocated.com.google.common.collect.Maps;
+import org.apache.iceberg.relocated.com.google.common.collect.Sets;
+
+public class SetPartitionStatistics implements UpdatePartitionStatistics {


Does this have to be public?

aokolnychyi

This looks good to me. I had only one comment about the access modifier for the base implementation class.
This is a big PR so I will not hold it because of that. However, can we please follow up, if needed?

Great work, @ajantha-bhat!

ajantha-bhat · 2023-12-19T01:18:22Z

@aokolnychyi: Thanks for the detailed review and merge.

The next small PR to review will be #9284 (Spark 3.5: Ensure that partition stats files are considered for GC procedures)

> I had only one comment about the access modifier for the base implementation class.

I will check this in a follow-up. Again, this follows the style of the puffin files code.
I will also apply the relevant comments from this PR to the stats files code (puffin files) in a follow-up PR.

advancedxy · 2024-06-24T09:43:24Z

core/src/main/java/org/apache/iceberg/TableMetadataParser.java

+    List<PartitionStatisticsFile> partitionStatisticsFiles;
+    if (node.has(PARTITION_STATISTICS)) {
+      partitionStatisticsFiles = partitionStatsFilesFromJson(node.get(PARTITION_STATISTICS));
+    } else {
+      partitionStatisticsFiles = ImmutableList.of();
+    }
+


Hi @ajantha-bhat and @aokolnychyi, I have a question about this implementation, as I'm exploring adding new fields to TableMetadata. Suppose the partition stats of table db.table are updated by a new version of Iceberg via UpdatePartitionStatistics. After that, an older version of the Iceberg library or a PyIceberg client produces a new commit to this table. As I understand it, that writer will produce TableMetadata without PARTITION_STATISTICS, since it knows nothing about that field, which effectively loses that information for the table.

Do you have any solutions or ideas on how to prevent such cases? I can think of some potential ideas, such as:

- Upgrade the format_version whenever we need to add new fields to table metadata; all old clients would then be rejected by the version check.
- Define a writer_version field: old clients can still read metadata produced by new clients, but writes from old versions are rejected.
- Move the check to the REST catalog service?

I feel it's too heavy to do a format upgrade when only adding new fields in TableMetadata.

Do you have any other ideas? Really appreciate your inputs.

The way partition stats are tracked and added to table metadata is the same as for puffin files right now.

The stats are optional, so even if we lose them, the planner won't return wrong query results.

However, these stats can help improve query performance. We are planning to provide a call procedure to compute partition stats, which will check the last snapshot that had partition stats and incrementally compute the stats for the remaining snapshots.

Thanks for your reply.

> The way partition stats are tracked and added to table metadata is the same as for puffin files right now.
> The stats are optional, so even if we lose them, the planner won't return wrong query results.

Yes, I know they're optional and don't affect the correctness of queries. My main concern is how we can prevent old writers from corrupting new writers' metadata in general. I think it's OK for now, as only statistics files have been added, but it's quite annoying. And it is possible that we will need to add some required fields to TableMetadata in the future.
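
The concern raised in this exchange can be modeled with a toy read-modify-write cycle: a writer that parses only the fields it knows and then serializes its own model silently drops everything else. The sketch below treats metadata as a plain map; the set of known fields and the `partition-statistics` key are assumptions chosen for illustration, and no real Iceberg or PyIceberg code path is reproduced.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class UnknownFieldLossSketch {
  // Fields that a hypothetical older writer knows how to parse and write back.
  static final Set<String> KNOWN_TO_OLD_WRITER =
      new LinkedHashSet<>(Arrays.asList("format-version", "location", "snapshots"));

  // Models a commit by a client that keeps only the fields it understands.
  // Any field outside KNOWN_TO_OLD_WRITER is lost in the rewritten metadata.
  static Map<String, Object> rewriteWithOldWriter(Map<String, Object> metadata) {
    Map<String, Object> rewritten = new LinkedHashMap<>();
    for (String field : KNOWN_TO_OLD_WRITER) {
      if (metadata.containsKey(field)) {
        rewritten.put(field, metadata.get(field));
      }
    }
    return rewritten;
  }
}
```

Because the statistics are optional, losing the field in this way costs performance rather than correctness, which matches the reply above. A required field would not have that safety margin.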


github-actions bot added API core labels Sep 5, 2023

ajantha-bhat mentioned this pull request Sep 5, 2023

Core: Write partition stats during write operation #8488

Closed

ajantha-bhat force-pushed the p_track branch 3 times, most recently from c1428bb to a9741f9 Compare November 3, 2023 12:17

ajantha-bhat commented Nov 28, 2023


ajantha-bhat force-pushed the p_track branch from a9741f9 to cc90a60 Compare November 28, 2023 11:31

ajantha-bhat requested review from RussellSpitzer, aokolnychyi, flyrain and nastra November 28, 2023 11:33

ajantha-bhat mentioned this pull request Nov 28, 2023

Spec: Add partition stats spec #7105

Merged

aokolnychyi reviewed Dec 7, 2023


Api: Track partition statistics via TableMetadata

8ed6cc1

Tracking `PartitionStatisticsFile` in a same way as how `StatisticsFile` is already tracked.

ajantha-bhat force-pushed the p_track branch from cc90a60 to 35855ce Compare December 11, 2023 14:10

Address comments

3124544

ajantha-bhat force-pushed the p_track branch from 35855ce to 3124544 Compare December 12, 2023 01:50

ajantha-bhat mentioned this pull request Dec 12, 2023

Spark: Ensure that partition stats files are considered for GC procedures #9284

Merged

aokolnychyi reviewed Dec 14, 2023


Address new comments

5bd45e1

ajantha-bhat closed this Dec 15, 2023

ajantha-bhat reopened this Dec 15, 2023

aokolnychyi reviewed Dec 18, 2023


aokolnychyi approved these changes Dec 18, 2023


aokolnychyi merged commit 6e21bbf into apache:main Dec 18, 2023

lisirrx pushed a commit to lisirrx/iceberg that referenced this pull request Jan 4, 2024

API, Core: Track partition statistics in TableMetadata (apache#8502)

89a1014

geruh pushed a commit to geruh/iceberg that referenced this pull request Jan 26, 2024

API, Core: Track partition statistics in TableMetadata (apache#8502)

869cd11

huyuanfeng2018 mentioned this pull request Apr 12, 2024

[Improvement]: Table partition files list performance issue apache/amoro#2635

Closed

3 tasks

devangjhabakh pushed a commit to cdouglas/iceberg that referenced this pull request Apr 22, 2024

API, Core: Track partition statistics in TableMetadata (apache#8502)

8e4fd23

advancedxy reviewed Jun 24, 2024


deniskuzZ mentioned this pull request Jul 11, 2024

HIVE-28268: Iceberg: Retrieve row count from iceberg SnapshotSummary in case of iceberg.hive.keep.stats=false apache/hive#5215

Merged

zhongyujiang pushed a commit to zhongyujiang/iceberg that referenced this pull request Apr 16, 2025

[Cherry-pick] API, Core: Track partition statistics in TableMetadata (a…

cff4ea6

…pache#8502)
