RussellSpitzer (Member) commented Jun 27, 2025

The fix in #11086 allows the base writer class to write manifests in parallel rather than one at a time, but it also made the final ordering of the manifest files in the manifest list nondeterministic.

To restore determinism and preserve the order of the passed-in files (if the collection has such an order), I added a change that regroups the written manifests based on their original positions.
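For readers who want the general shape of the idea, here is a minimal sketch, not the code in this PR. It assumes each input group is tagged with its index, written on a thread pool, and stored in the slot for that index, so the flattened result follows input order no matter which task finishes first. The names `OrderedParallelWrite`, `writeGroup`, and `writeAll` are hypothetical, and `writeGroup` is only a stand-in for writing real manifests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReferenceArray;

final class OrderedParallelWrite {

  // Hypothetical stand-in for writing one group of files into one or more manifests.
  static List<String> writeGroup(List<String> group) {
    List<String> manifests = new ArrayList<>();
    manifests.add("manifest(" + String.join(",", group) + ")");
    return manifests;
  }

  // Writes all groups in parallel and returns the results in input order.
  static List<String> writeAll(List<List<String>> groups, int threads)
      throws InterruptedException {
    // One slot per input group; each task writes only to its own slot.
    AtomicReferenceArray<List<String>> slots = new AtomicReferenceArray<>(groups.size());
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      for (int i = 0; i < groups.size(); i++) {
        final int index = i;
        pool.execute(() -> slots.set(index, writeGroup(groups.get(index))));
      }
    } finally {
      pool.shutdown();
    }
    if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
      throw new IllegalStateException("Timed out waiting for writes to finish");
    }

    // Flatten by slot index, which restores the original ordering.
    List<String> ordered = new ArrayList<>();
    for (int i = 0; i < slots.length(); i++) {
      ordered.addAll(slots.get(i));
    }
    return ordered;
  }
}
```

Completion order no longer matters here because the position in the output is decided by the input index, not by when a task finishes. A failed task would leave its slot empty, so real code would need error handling that this sketch omits.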

I don't think this has any performance implications (we store all the entries in memory both before and after the patch), but I ran the benchmarks just to check. The results look pretty much the same.

Pre-patch

Benchmark                    (fast)  (numFiles)  Mode  Cnt   Score   Error  Units
AppendBenchmark.appendFiles    true       50000    ss    5   0.468 ± 0.035   s/op
AppendBenchmark.appendFiles    true      100000    ss    5   0.573 ± 0.033   s/op
AppendBenchmark.appendFiles    true      500000    ss    5   1.577 ± 0.383   s/op
AppendBenchmark.appendFiles    true     1000000    ss    5   2.942 ± 0.669   s/op
AppendBenchmark.appendFiles    true     2500000    ss    5   9.758 ± 2.437   s/op
AppendBenchmark.appendFiles   false       50000    ss    5   0.475 ± 0.036   s/op
AppendBenchmark.appendFiles   false      100000    ss    5   0.613 ± 0.045   s/op
AppendBenchmark.appendFiles   false      500000    ss    5   1.652 ± 0.266   s/op
AppendBenchmark.appendFiles   false     1000000    ss    5   2.865 ± 0.436   s/op
AppendBenchmark.appendFiles   false     2500000    ss    5  10.175 ± 1.403   s/op

Post-patch

Benchmark                    (fast)  (numFiles)  Mode  Cnt   Score   Error  Units
AppendBenchmark.appendFiles    true       50000    ss    5   0.452 ± 0.029   s/op
AppendBenchmark.appendFiles    true      100000    ss    5   0.582 ± 0.084   s/op
AppendBenchmark.appendFiles    true      500000    ss    5   1.538 ± 0.161   s/op
AppendBenchmark.appendFiles    true     1000000    ss    5   2.703 ± 0.342   s/op
AppendBenchmark.appendFiles    true     2500000    ss    5  10.652 ± 5.955   s/op
AppendBenchmark.appendFiles   false       50000    ss    5   0.492 ± 0.064   s/op
AppendBenchmark.appendFiles   false      100000    ss    5   0.604 ± 0.042   s/op
AppendBenchmark.appendFiles   false      500000    ss    5   1.596 ± 0.184   s/op
AppendBenchmark.appendFiles   false     1000000    ss    5   2.762 ± 0.264   s/op
AppendBenchmark.appendFiles   false     2500000    ss    5  10.732 ± 5.357   s/op

RussellSpitzer (Member, Author) commented Jun 27, 2025

Theoretically, I think we can use a plain array instead of the AtomicReference array, but performance is basically the same again.

Perf with vanilla array

Benchmark                    (fast)  (numFiles)  Mode  Cnt   Score   Error  Units
AppendBenchmark.appendFiles    true       50000    ss    5   0.457 ± 0.026   s/op
AppendBenchmark.appendFiles    true      100000    ss    5   0.574 ± 0.037   s/op
AppendBenchmark.appendFiles    true      500000    ss    5   1.605 ± 0.269   s/op
AppendBenchmark.appendFiles    true     1000000    ss    5   2.788 ± 0.463   s/op
AppendBenchmark.appendFiles    true     2500000    ss    5   8.533 ± 2.381   s/op
AppendBenchmark.appendFiles   false       50000    ss    5   0.487 ± 0.041   s/op
AppendBenchmark.appendFiles   false      100000    ss    5   0.616 ± 0.107   s/op
AppendBenchmark.appendFiles   false      500000    ss    5   1.580 ± 0.267   s/op
AppendBenchmark.appendFiles   false     1000000    ss    5   2.757 ± 0.765   s/op
AppendBenchmark.appendFiles   false     2500000    ss    5  11.593 ± 6.862   s/op
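As an illustration of why a plain array can be enough, here is a hedged sketch that is not taken from the PR. It assumes every task writes to a distinct index and that the reading thread calls `Future.get()` on each task before touching the array; `java.util.concurrent` documents that actions of the asynchronous computation happen-before actions following a successful `get()`, so the writes are visible without atomics. `PlainArrayOrderedWrite`, `writeOne`, and `writeAll` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PlainArrayOrderedWrite {

  // Hypothetical stand-in for writing a single manifest.
  static String writeOne(String file) {
    return "manifest-for-" + file;
  }

  static List<String> writeAll(List<String> files, int threads)
      throws InterruptedException, ExecutionException {
    // Plain array: safe only because each task owns exactly one index
    // and the reader waits on every Future before reading.
    String[] slots = new String[files.size()];
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<?>> pending = new ArrayList<>();
      for (int i = 0; i < files.size(); i++) {
        final int index = i;
        pending.add(pool.submit(() -> {
          slots[index] = writeOne(files.get(index));
        }));
      }
      // get() rethrows task failures and establishes happens-before for the writes.
      for (Future<?> future : pending) {
        future.get();
      }
    } finally {
      pool.shutdown();
    }
    return new ArrayList<>(Arrays.asList(slots));
  }
}
```

The trade-off is that correctness now depends on the waiting discipline rather than on the container type, which may be a reason to keep the atomic variant even when the benchmark numbers are equivalent.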

@RussellSpitzer RussellSpitzer marked this pull request as ready for review June 27, 2025 21:26
singhpk234 (Contributor) left a comment

LGTM !

forceGC = true
includeTests = true
humanOutputFile = file(jmhOutputPath)
jvmArgs = ['-Xmx32g']
Contributor

[doubt] Is this required for the change?

Member Author

This was to make the benchmarks run without OOMing. The OOM had to do with caching the files to be added before committing. The other change fixes a "Table Already Exists" exception that gets thrown if you run the benchmark suite.


@Setup
public void setupBenchmark() {
dropTable();
Contributor

Isn't cleanup already called on line 96 in the tearDown method? In the other comment you mentioned the "Table Already Exists" exception; is it left over from a previous run that was interrupted and not cleaned up?

Member Author

I honestly didn't check; I think it was probably the previous run OOMing.

@stevenzwu stevenzwu merged commit be577ee into apache:main Jul 1, 2025
43 checks passed
stevenzwu (Contributor) commented

Thanks @RussellSpitzer for the change and @singhpk234 for the review.

dramaticlly added a commit to dramaticlly/iceberg that referenced this pull request Sep 19, 2025
…es in manifests

apache#13411 helps maintain the passed-in ordering of files in manifest lists, and this change helps ensure that ordering guarantee.
huaxingao pushed a commit that referenced this pull request Sep 19, 2025
…es in manifests (#14111)

#13411 helps maintain the passed-in ordering of files in manifest lists, and this change helps ensure that ordering guarantee.