-
Notifications
You must be signed in to change notification settings - Fork 3k
Core: Add ManifestWrite benchmark #8637
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
9f6f8aa to
5f40738
Compare
| manifestListFile = String.format("%s/%s.avro", baseDir, UUID.randomUUID()); | ||
|
|
||
| try (ManifestListWriter listWriter = | ||
| ManifestLists.write(1, org.apache.iceberg.Files.localOutput(manifestListFile), 0, 1L, 0)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we also add a parameter for format version? I guess if would be great if we can test both format version but not really needed in same PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, it would be nice to cover both formats. We can use @Param annotation.
Like in IcebergSourceParquetPosDeleteBenchmark, for instance.
core/src/jmh/java/org/apache/iceberg/ManifestWriteBenchmark.java
Outdated
Show resolved
Hide resolved
| @Benchmark | ||
| @Threads(1) | ||
| public void writeManifestFile() throws IOException { | ||
| baseDir = Files.createTempDir().getAbsolutePath(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: We frequently use this.varName prefix when setting (not getting) instance variable to highlight that it is an instance variable, not a local variable. It is optional, up to you @Fokko. If you decide to apply that, must be done to all places that assign instance variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a pythonista; explicit is better than implicit. I've added this.
| try (ManifestWriter<DataFile> finalWriter = writer) { | ||
| for (int j = 0; j < NUM_ROWS; j++) { | ||
| DataFile dataFile = | ||
| DataFiles.builder(PartitionSpec.unpartitioned()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to cover partitioned and delete manifests (could be a separate PR).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's do it in a separate PR 👍 It feels to me that the benchmark are very much on a need to benchmark basis
aokolnychyi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems really close. I agree with comments from @dramaticlly. Could you take a look, @Fokko? Let's get this in.
core/src/jmh/java/org/apache/iceberg/ManifestWriteBenchmark.java
Outdated
Show resolved
Hide resolved
Co-authored-by: Hongyue/Steve Zhang <[email protected]>
|
There was a style violation: |
* Core: Add ManifestWrite benchmark * Thanks for the review! * Cleanup * Set timeout * Remove public Co-authored-by: Hongyue/Steve Zhang <[email protected]> * Make ErrorProne happy --------- Co-authored-by: Hongyue/Steve Zhang <[email protected]>
No description provided.