Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -88,21 +88,15 @@ protected void commitToExistingTable(TableMetadata base, TableMetadata metadata)
.setParameter(METADATA_LOCATION_PROP, newMetadataLocation)
.setParameter(PREVIOUS_METADATA_LOCATION_PROP, currentMetadataLocation)
.build();

// todo privileges should not be replaced for an alter
Comment thread
findepi marked this conversation as resolved.
PrincipalPrivileges privileges = table.getOwner().map(MetastoreUtil::buildInitialPrivilegeSet).orElse(NO_PRIVILEGES);
metastore.replaceTable(database, tableName, table, privileges);
}
catch (RuntimeException e) {
try {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per apache/iceberg#2317

basically we need to group failures into two categories:

  1. Failures that are reports from the Catalog that it could not perform the operation for some reason
  2. Failures where the client has lost contact for some reason (basically everything else)

This implementation never deletes the new metadata file. Is this intentional?

See https://github.com/apache/iceberg/pull/2328/files#diff-e502621d52f86cf0ec3187dda30ac61f6b76efb7b6276bc8d233ccb2c836fb98R482 where the metadata is deleted only in case of FAILURE

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is intentional, we do have a little different logic than iceberg library so it can't be copied one to one, also it is safer not to ever delete these files.

io().deleteFile(newMetadataLocation);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This usecase is probably pretty exotic to catch in a real-world test case.
Would it make sense to have a unit test case to simulate such a situation?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can try to unit test this but in my opinion it wouldn't be very useful, also although code here is pretty simple it would require a lot of mocking and it is only a little part of whole commit process that is initiated in org.apache.iceberg.BaseTransaction#commitTransaction so testing this small piece wouldn't be meaningful.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleteFile(newMetadataLocation) should still be invoked when any of the getter instructions fails (e.g. fromMetastoreApiTable(thriftMetastore.getTable)
the left-over should be left behind only when the mutating call fails.

further, we should have a comment saying that we can introduce error categorization to know when such cleanup would be safe after the mutating call first. We don't need to introduce the categorization in this PR.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, GlueIcebergTableOperations has no delete at all. cc @alexjo2144
obviously, we want to keep the behaviors in sync.

@alexjo2144 intentional?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took them out based on this comment from David #10845 (comment)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I aired on the side of, not cleaning up is safe and figuring out which situations are safe is hard. So leave it for remove_orphan_files to do

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@findepi so what should I do? add deletion here and in Glue for the case when getter fails ? Or be on the super safe side and not to bother with deletion here at all

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's do what Alex did for Glue -- leave behind

}
catch (RuntimeException ex) {
e.addSuppressed(ex);
}
// CommitFailedException is handled as a special case in the Iceberg library. This commit will automatically retry
throw new CommitFailedException(e, "Failed to commit to table %s.%s", database, tableName);
}

// todo privileges should not be replaced for an alter
PrincipalPrivileges privileges = table.getOwner().map(MetastoreUtil::buildInitialPrivilegeSet).orElse(NO_PRIVILEGES);
metastore.replaceTable(database, tableName, table, privileges);
}
finally {
thriftMetastore.releaseTableLock(identity, lockId);
Expand Down