feat(plugin-iceberg): Support rewrite_manifests procedure for iceberg#26888
Conversation
Reviewer's GuideAdds a new Iceberg system procedure system.rewrite_manifests to rewrite manifest files for a table, wires it into the Iceberg module, introduces a dedicated error code for invalid spec IDs, and provides test coverage and (partial) docs updates. Sequence diagram for executing system.rewrite_manifests proceduresequenceDiagram
actor User as User
participant PrestoCoordinator as PrestoCoordinator
participant IcebergConnector as IcebergConnector
participant RewriteManifestsProcedure as RewriteManifestsProcedure
participant IcebergMetadataFactory as IcebergMetadataFactory
participant ConnectorMetadata as ConnectorMetadata
participant IcebergTable as IcebergTable
participant RewriteManifests as RewriteManifests
User->>PrestoCoordinator: CALL system.rewrite_manifests(schema, table_name, spec_id)
PrestoCoordinator->>IcebergConnector: resolve procedure rewrite_manifests
IcebergConnector->>RewriteManifestsProcedure: invoke get()
RewriteManifestsProcedure-->>IcebergConnector: Procedure instance
IcebergConnector->>RewriteManifestsProcedure: rewriteManifests(session, schemaName, tableName, specId)
activate RewriteManifestsProcedure
RewriteManifestsProcedure->>IcebergMetadataFactory: create()
IcebergMetadataFactory-->>RewriteManifestsProcedure: ConnectorMetadata
RewriteManifestsProcedure->>ConnectorMetadata: getIcebergTable(session, schemaTableName)
ConnectorMetadata-->>RewriteManifestsProcedure: IcebergTable
RewriteManifestsProcedure->>IcebergTable: rewriteManifests()
IcebergTable-->>RewriteManifestsProcedure: RewriteManifests
alt specId is not null
RewriteManifestsProcedure->>IcebergTable: specs()
IcebergTable-->>RewriteManifestsProcedure: specs map
alt specs map does not contain specId
RewriteManifestsProcedure-->>IcebergConnector: throw PrestoException(ICEBERG_INVALID_SPEC_ID)
IcebergConnector-->>PrestoCoordinator: error ICEBERG_INVALID_SPEC_ID
PrestoCoordinator-->>User: failure response
else specs map contains specId
RewriteManifestsProcedure->>RewriteManifests: rewriteIf(manifest -> manifest.partitionSpecId() == specId)
RewriteManifestsProcedure->>RewriteManifests: commit()
RewriteManifests-->>RewriteManifestsProcedure: success
RewriteManifestsProcedure-->>IcebergConnector: return
IcebergConnector-->>PrestoCoordinator: success
PrestoCoordinator-->>User: success response
end
else specId is null
RewriteManifestsProcedure->>RewriteManifests: commit()
RewriteManifests-->>RewriteManifestsProcedure: success
RewriteManifestsProcedure-->>IcebergConnector: return
IcebergConnector-->>PrestoCoordinator: success
PrestoCoordinator-->>User: success response
end
deactivate RewriteManifestsProcedure
Class diagram for new RewriteManifestsProcedure and related typesclassDiagram
class RewriteManifestsProcedure {
- IcebergMetadataFactory metadataFactory
+ RewriteManifestsProcedure(IcebergMetadataFactory metadataFactory)
+ Procedure get()
+ void rewriteManifests(ConnectorSession clientSession, String schemaName, String tableName, Integer specId)
}
class IcebergMetadataFactory {
+ ConnectorMetadata create()
}
class Procedure {
+ Procedure(String schemaName, String procedureName, List arguments, MethodHandle methodHandle)
}
class ConnectorSession
class SchemaTableName {
+ SchemaTableName(String schemaName, String tableName)
}
class ConnectorMetadata
class Table {
+ RewriteManifests rewriteManifests()
+ Map specs()
}
class RewriteManifests {
+ RewriteManifests rewriteIf(ManifestFileFilter manifestFileFilter)
+ void commit()
}
class IcebergErrorCode {
<<enum>>
ICEBERG_INVALID_SPEC_ID
}
class PrestoException {
+ PrestoException(IcebergErrorCode errorCode, String message)
}
RewriteManifestsProcedure --> IcebergMetadataFactory : uses
RewriteManifestsProcedure ..> Procedure : provides
RewriteManifestsProcedure ..> ConnectorSession : parameter
RewriteManifestsProcedure ..> SchemaTableName : builds
RewriteManifestsProcedure ..> ConnectorMetadata : uses
RewriteManifestsProcedure ..> Table : obtains
RewriteManifestsProcedure ..> RewriteManifests : configures
RewriteManifestsProcedure ..> IcebergErrorCode : uses ICEBERG_INVALID_SPEC_ID
RewriteManifestsProcedure ..> PrestoException : throws
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- In
TestRewriteManifestsProcedure, you're mixing TestNG (@Test) with JUnit'sorg.junit.Assert.assertEquals; for consistency with the rest of the Presto test suite and to avoid dependency mixing, consider switching toorg.testng.Assert.assertEquals.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `TestRewriteManifestsProcedure`, you're mixing TestNG (`@Test`) with JUnit's `org.junit.Assert.assertEquals`; for consistency with the rest of the Presto test suite and to avoid dependency mixing, consider switching to `org.testng.Assert.assertEquals`.
## Individual Comments
### Comment 1
<location> `presto-iceberg/src/test/java/com/facebook/presto/iceberg/procedure/TestRewriteManifestsProcedure.java:40-43` </location>
<code_context>
+ .getQueryRunner();
+ }
+
+ private void createTable(String tableName)
+ {
+ assertUpdate("CREATE TABLE IF NOT EXISTS " + tableName + " (id INTEGER, value VARCHAR)");
+ }
+
</code_context>
<issue_to_address>
**suggestion (testing):** Avoid `CREATE TABLE IF NOT EXISTS` in tests to prevent masking leftover state from previous runs
In tests, `CREATE TABLE IF NOT EXISTS` can hide leftover tables from previous runs (e.g., when a test aborts before `DROP TABLE`). It’s better for tests to fail if the table already exists. Use a plain `CREATE TABLE` instead (optionally preceded by an explicit `DROP TABLE`), so leaked state and flakiness aren’t silently ignored.
```suggestion
private void createTable(String tableName)
{
assertUpdate("DROP TABLE IF EXISTS " + tableName);
assertUpdate("CREATE TABLE " + tableName + " (id INTEGER, value VARCHAR)");
}
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
...eberg/src/test/java/com/facebook/presto/iceberg/procedure/TestRewriteManifestsProcedure.java
Show resolved
Hide resolved
a790e16 to
b7c278b
Compare
hantangwangd
left a comment
There was a problem hiding this comment.
@agrawalreetika thanks for this feature. I've left some initial comments for your consideration.
...o-iceberg/src/main/java/com/facebook/presto/iceberg/procedure/RewriteManifestsProcedure.java
Outdated
Show resolved
Hide resolved
...o-iceberg/src/main/java/com/facebook/presto/iceberg/procedure/RewriteManifestsProcedure.java
Outdated
Show resolved
Hide resolved
...eberg/src/test/java/com/facebook/presto/iceberg/procedure/TestRewriteManifestsProcedure.java
Show resolved
Hide resolved
b7c278b to
91f79f7
Compare
@hantangwangd Thanks for your review & suggestions. I have applied the changes. Please take a look, when you get a chance. |
hantangwangd
left a comment
There was a problem hiding this comment.
Looks good! Just one more comment on adding a test case.
...eberg/src/test/java/com/facebook/presto/iceberg/procedure/TestRewriteManifestsProcedure.java
Show resolved
Hide resolved
91f79f7 to
66275f3
Compare
hantangwangd
left a comment
There was a problem hiding this comment.
Thanks @agrawalreetika, lgtm!
Description
Support rewrite_manifests procedure for iceberg
Motivation and Context
New procedure to rewrite the manifest files of an Iceberg table to optimize table metadata.
Impact
New procedure to rewrite the manifest files of an Iceberg table to optimize table metadata.
Test Plan
Added tests
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.