
Conversation

@chenjunjiedada (Collaborator) commented Mar 24, 2021

This is a sub-PR of #2216; it adds a Spark action to replace equality deletes with position deletes, which I think of as minor compaction. The logic is:

  1. Plan and group the tasks by partition. Currently it doesn't consider filters; we may add filters, such as a partition filter, later.
  2. Use the delete matcher to keep rows that match the equality delete set. The rows are projected to the file and pos fields.
  3. Write the matched rows via the position delete writer.
  4. Commit a rewrite-files operation to replace the equality deletes with the position deletes.

This adds an API in RewriteFiles to rewrite equality deletes to position deletes. It should keep the same semantics as the current API: the rows must be the same before and after the rewrite. This may need some changes when #2294 gets merged.
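For a quick mental model, here is a minimal sketch of the flow described above, assuming the #2294-style RewriteFiles overload that replaces and adds both data and delete files; the helpers readRowsMatchingEqDeletes, writePositionDeletes, and eqDeletesOf are illustrative names, not classes in this PR:

// Illustrative sketch only; helper names are hypothetical, not this PR's API.
Map<StructLikeWrapper, Collection<FileScanTask>> groupedTasks =
    groupTasksByPartition(tasksWithEqDelete);                 // step 1: group tasks by partition

Set<DeleteFile> eqDeletesToReplace = Sets.newHashSet();
Set<DeleteFile> posDeletesToAdd = Sets.newHashSet();
for (Collection<FileScanTask> partitionTasks : groupedTasks.values()) {
  // step 2: keep only rows matching the equality-delete set, projected to (file_path, pos)
  CloseableIterable<Record> matched = readRowsMatchingEqDeletes(partitionTasks);

  // step 3: write the matched (file_path, pos) rows through a position delete writer
  posDeletesToAdd.add(writePositionDeletes(matched));
  eqDeletesToReplace.addAll(eqDeletesOf(partitionTasks));
}

// step 4: replace the equality deletes with the new position deletes in one commit
table.newRewrite()
    .rewriteFiles(Collections.emptySet(), eqDeletesToReplace,
        Collections.emptySet(), posDeletesToAdd)
    .commit();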

There are two follow-ups:

  1. Extract the common part and implement a Flink action.
  2. Cluster the position deletes inside each partition.

@chenjunjiedada (Collaborator Author)
@rdblue @openinx @yyanyy, this is part of the equality delete rewrite.

return new RewriteDeleteActionResult(Collections.emptyList(), Collections.emptyList());
}

CloseableIterable<FileScanTask> tasksWithEqDelete = CloseableIterable.filter(fileScanTasks, scan ->
Member

Do we need to do a null check before filtering this fileScanTasks?

Collaborator Author

This is closeable, so I think we don't have to. There is an empty check after grouping the tasks, right?


@Override
protected RewriteDeletes self() {
return null;
Member

null? Here we should return this?
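If the intent is the usual fluent-action pattern, the fix would presumably be just:

@Override
protected RewriteDeletes self() {
  // return the action itself so chained calls keep operating on this instance
  return this;
}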

return encryptionManager;
}

private Map<StructLikeWrapper, Collection<FileScanTask>> groupTasksByPartition(
Member

Nit: I see there is another identical groupTasksByPartition in BaseRewriteDeletesSparkAction; maybe we could share the same method.

Collaborator Author

I think you mean BaseRewriteDataFilesSparkAction, right?

Comment on lines 72 to 79
this.spec = table.spec();
this.schema = table.schema();
this.locations = table.locationProvider();
this.caseSensitive = caseSensitive;
this.io = io;
this.encryptionManager = encryptionManager;
this.properties = table.properties();
this.nameMapping = table.properties().get(DEFAULT_NAME_MAPPING);

String formatString = table.properties().getOrDefault(
TableProperties.DEFAULT_FILE_FORMAT, TableProperties.DEFAULT_FILE_FORMAT_DEFAULT);
this.format = FileFormat.valueOf(formatString.toUpperCase(Locale.ENGLISH));
Member

Nit: could we align the assignment order with the field definition order? That helps a lot when checking all those assignments. Thanks.

@openinx (Member) commented Mar 29, 2021

@chenjunjiedada Thanks for updating this patch. I've got #2294 merged; that patch extends the RewriteFiles API to rewrite both insert data files and delete files in Iceberg. I think we could rebase this patch on the latest commit. I will take a look at the patch again once you've rebased. Thanks.

@chenjunjiedada force-pushed the replace-equality-deletes-action branch from 97d5bec to e6ebfc1 on March 29, 2021 14:53
@chenjunjiedada (Collaborator Author)
Thanks for the review, @openinx!

Spark: add position delete row reader

minor refactor

use subclass of stream filter for stream selector

allow delete row reader to read all deleted rows

remove reading all delete row once logic

implement alternative delete row reader

update keepRowsFromDeletes

minor refactors

remove useless code
@chenjunjiedada force-pushed the replace-equality-deletes-action branch from e6ebfc1 to f4a23c1 on July 28, 2021 03:19
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ConvertEqDeletesStrategy implements RewriteDeleteStrategy {
Contributor

I think we should have an abstract ConvertEqDeletesStrategy and a Spark3ConvertEqDeletesStrategy


@Override
public Iterable<DeleteFile> selectDeletes() {
CloseableIterable<FileScanTask> fileScanTasks = null;
Contributor

nit: can we simplify this block with something like the following?

try (CloseableIterable<FileScanTask> fileScanTasks = table.newScan().ignoreResiduals().planFiles()) {
   ...
} finally {
  ...
}
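Spelled out a little more, a sketch of that suggestion (reusing the scan and filter already shown in this diff) could look like:

try (CloseableIterable<FileScanTask> fileScanTasks =
         table.newScan().ignoreResiduals().planFiles()) {
  // keep only the tasks that carry at least one equality delete
  CloseableIterable<FileScanTask> tasksWithEqDelete = CloseableIterable.filter(fileScanTasks, scan ->
      scan.deletes().stream().anyMatch(delete -> delete.content().equals(FileContent.EQUALITY_DELETES)));
  // ... collect the equality delete files from tasksWithEqDelete ...
} catch (IOException e) {
  throw new UncheckedIOException("Failed to close the file scan iterable", e);
}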

scan.deletes().stream().anyMatch(delete -> delete.content().equals(FileContent.EQUALITY_DELETES))
);

Set<DeleteFile> eqDeletes = Sets.newHashSet();
Contributor

nit: I think we can do a flatMap from tasks to deletes, and then filter and use forEach(eqDeletes.add)
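One way to read that suggestion, as a sketch rather than the final code:

// flatten tasks to their delete files, keep only equality deletes, and collect them
Set<DeleteFile> eqDeletes = Sets.newHashSet();
StreamSupport.stream(tasksWithEqDelete.spliterator(), false)
    .flatMap(task -> task.deletes().stream())
    .filter(delete -> delete.content().equals(FileContent.EQUALITY_DELETES))
    .forEach(eqDeletes::add);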

Contributor

And it seems a bit redundant that we are iterating through the tasks twice, at L119 and here; there should be a way to simplify the whole logic.

BaseCombinedScanTask::new);
}

public static Map<StructLikeWrapper, Collection<FileScanTask>> groupTasksByPartition(
Contributor

Maybe I missed some other place, but I only see this used in the strategy class; why is it not a private method?

Member

Moving the groupTasksByPartition shared by BaseRewriteDataFilesAction and ConvertEqDeletesStrategy into TableScanUtil sounds OK to me.

TableScanUtil.groupTasksByPartition(table.spec(), tasksWithEqDelete.iterator());

// Split and combine tasks under each partition
List<Pair<StructLike, CombinedScanTask>> combinedScanTasks = groupedTasks.entrySet().stream()
Contributor

After reading this, I think we can make the RewriteDeleteStrategy interface closer to the RewriteStrategy interface. What we have here is basically the equivalent of planFileGroups plus rewriteFiles in RewriteStrategy. So I would propose we have the following methods in RewriteDeleteStrategy to be more aligned:

Iterable<DeleteFile> selectDeletesToRewrite(Iterable<FileScanTask> dataFiles);

Iterable<List<FileScanTask>> planDeleteGroups(Iterable<DeleteFile> deleteFiles);

Set<DeleteFile> rewriteDeletes(List<DeleteFile> deleteFilesToRewrite);

And we can get the partition StructLike directly from the list of scan tasks instead of passing it through the task pair in EqualityDeleteRewriter. In this way, we can also enable partial progress for commits.

Collaborator Author

I'm updating this PR according to the API changes; the selectDeletesToRewrite and rewriteDeletes changes are OK to me. But Iterable<List<FileScanTask>> planDeleteGroups(Iterable<DeleteFile> deleteFiles); is a bit weird since it returns groups of List<FileScanTask>, while a FileScanTask could contain several deletes that don't exist in deleteFiles. So I prefer to return Iterable<List<DeleteFile>>. It is worth noting that one data file could have several deletes, so we cannot directly use FileScanTask to carry the deletes. This is slightly different from the data file rewrite.

And we can get the partition StructLike directly from the list of scan tasks instead of passing it through the task pair in EqualityDeleteRewriter. In this way, we can also enable partial progress for commits.

The scan tasks in a group may belong to different partitions. So unless we group deletes by partition, it needs to know the partition values.
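Put together, the shape the author prefers would be roughly (a sketch of the proposal, not merged code):

// select the equality delete files that should be rewritten
Iterable<DeleteFile> selectDeletesToRewrite(Iterable<FileScanTask> dataFiles);

// group delete files rather than scan tasks, since one data file can reference several
// delete files and a FileScanTask cannot carry the deletes directly
Iterable<List<DeleteFile>> planDeleteGroups(Iterable<DeleteFile> deleteFiles);

// rewrite one group and return the replacement position delete files
Set<DeleteFile> rewriteDeletes(List<DeleteFile> deleteFilesToRewrite);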

}

public static Map<StructLikeWrapper, Collection<FileScanTask>> groupTasksByPartition(
PartitionSpec spec,
@openinx (Member) Jul 28, 2021

I don't think it's correct to use the table's latest partition spec to group the FileScanTasks, because different FileScanTasks may have different partition specs; the correct way is to use FileScanTask#spec to group the tasks. We should remove the spec argument, otherwise it's introducing a bug...
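A minimal sketch of the grouping the reviewer describes, keyed by each task's own spec instead of the table's latest one (the method name and key shape are illustrative):

private Map<Pair<Integer, StructLikeWrapper>, Collection<FileScanTask>> groupTasksByTaskPartition(
    Iterable<FileScanTask> tasks) {
  ListMultimap<Pair<Integer, StructLikeWrapper>, FileScanTask> grouped =
      Multimaps.newListMultimap(Maps.newHashMap(), Lists::newArrayList);
  for (FileScanTask task : tasks) {
    // wrap the partition with the task's own spec, since specs can differ across tasks
    StructLikeWrapper partition = StructLikeWrapper
        .forType(task.spec().partitionType())
        .set(task.file().partition());
    grouped.put(Pair.of(task.spec().specId(), partition), task);
  }
  return grouped.asMap();
}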


Predicate<T> isInDeleteSet = record -> deleteSet.contains(projectRow.wrap(asStructLike(record)));
isInDeleteSets.add(isInDeleteSet);
isDeleted = isDeleted == null ? record -> deleteSet.contains(projectRow.wrap(asStructLike(record))) :
Member

Initializing isDeleted as a predicate like t -> false will simplify this if-else to:

isDeleted = isDeleted.or(record -> deleteSet.contains(projectRow.wrap(asStructLike(record))));
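Expanded a little, the pattern would look something like this, where deleteSets stands in for whatever collection of equality-delete sets is built from the delete files:

// start from an always-false predicate so the null check and if-else disappear
Predicate<T> isDeleted = record -> false;
for (StructLikeSet deleteSet : deleteSets) {
  isDeleted = isDeleted.or(record -> deleteSet.contains(projectRow.wrap(asStructLike(record))));
}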

Collaborator Author

Done.

Collaborator Author

I found I wrote it this way first but changed to the current way because of the comment from Ryan, which sounds reasonable to me. FYI bfd0aeb#r603700530.

Just reverted back.


} else {
List<CloseableIterable<Record>> deletes = Lists.transform(posDeletes, this::openPosDeletes);
markedRecords = CloseableIterable.transform(Deletes.streamingDeletedRowMarker(records, this::pos,
Member

We will always load the pos-deletes into an in-memory HashSet even if the row count of the positional delete files exceeds the given threshold, because in this buildPosDeletePredicate we've loaded all the file offsets into memory. I think that's not the expected behavior.

Collaborator Author

Updated to open one by one.

return deleteMarkerIndex;
}

protected abstract Consumer<T> deleteMarker();
Member

How about introducing a new interface named Setter to set the is_deleted flag (similar to org.apache.iceberg.Accessor) so that we have a good abstraction to hide the delete marker logic:

  interface Setter<T> extends Serializable {
    T set(T reuse);
  }

Collaborator Author

I can try to make this a better abstraction in a following PR; this PR contains too many changes now. I think we will have some follow-up minor changes and optimizations. Does that sound OK to you?

.map(Predicate::negate)
.reduce(Predicate::and)
.orElse(t -> true);
Predicate<T> isDeleted = buildEqDeletePredicate();
Member

Looks like we are separating the RewriteDeletes path and the normal read path into two branches.
For the RewriteDeletes path, we introduced three new methods:

  • keepRowsFromDeletes
  • keepRowsFromEqualityDeletes
  • keepRowsFromPosDeletes

For the normal read path, we introduced another three methods:

  • applyEqDeletes
  • applyPosDeletes
  • filter

I remember there's an issue where we discussed introducing the is_deleted meta column because we want to unify the rewrite path and the normal read path? (I cannot find the specific PR now...)

Member

The related PR is #2372.

Collaborator Author

Thanks @openinx for the detailed review and the findings. I addressed the related changes in the separate PR #2372, which is an independent one for the delete row reader.

@chenjunjiedada (Collaborator Author)
Thanks @openinx and @jackye1995 for the detailed reviews. Let me update the related PRs; I will ping you guys soon.
