Skip to content

Conversation

@aswinshakil
Copy link
Member

What changes were proposed in this pull request?

When keys are renamed in between snapshots, it is inefficient to find the renamed keys in the previous snapshot keyTable, This is because the TableKey for the renamed key has changed and it becomes impossible to find if the renamed key existed in the previous snapshot's keyTable since there is no way to index the keys with respect to objectID.

This patch proposes to add a new column family renamedKeyTable (objectID --> RenamedKeys) to store this information.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-7984

How was this patch tested?

The patch was tested using Unit tests and CI

@aswinshakil aswinshakil added the snapshot https://issues.apache.org/jira/browse/HDDS-6517 label Mar 1, 2023
@aswinshakil aswinshakil self-assigned this Mar 1, 2023
* |----------------------------------------------------------------------|
* | snapshotInfoTable | /volume/bucket/snapshotName -> SnapshotInfo |
* |----------------------------------------------------------------------|
* | renamedKeyTable | objectID -> RepeatedString |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to also make sure that during snapshot creates, we block renames.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Locking will be done in the follow-up jira that actually uses this table in SDT and KDT

Comment on lines 77 to 85
// Remove all entries from renamedKeyTable
TableIterator<Long, ? extends Table.KeyValue<Long, RepeatedOmString>>
iterator = omMetadataManager.getRenamedKeyTable().iterator();

while (iterator.hasNext()) {
Long objectID = iterator.next().getKey();
omMetadataManager.getRenamedKeyTable()
.deleteWithBatch(batchOperation, objectID);
}
Copy link
Contributor

@smengcl smengcl Mar 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be done inside createOmSnapshotCheckpoint with lock held like in #4280 . For this PR you can put a TODO here since the other PR is not merged yet (which has the locking and deleteRange implementation).

And it should only remove entries in the snapshot scope in renamedKeyTable, similar to deletedTable.

Copy link
Contributor

@smengcl smengcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @aswinshakil for the speedy implementaion. Structurally lgtm.

Just need the table schema change and I'm +1.

Key: Long -> String, prefixed with volume and bucket name for efficient deletion.

Value: OmKeyRenameInfo, which at least holds the original name of the key (because that is the name captured in the previous snapshot).

@aswinshakil
Copy link
Member Author

Thanks @smengcl and @prashantpogde, I have updated the PR with the suggestions. Please take a look at it.

Comment on lines +92 to +96
if (omKeyRenameInfo == null) {
omKeyRenameInfo = new OmKeyRenameInfo(fromDbKey);
omMetadataManager.getRenamedKeyTable().putWithBatch(
batchOperation, renameDbKey, omKeyRenameInfo);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means we are actually only storing the Ozone key's original name correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes correct, We only need the original key name. In between key renames are not necessary. But keeping this to store OmKeyRenameInfo instead single String, so we can expand on it further if needed.

Copy link
Contributor

@smengcl smengcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 pending CI

Change-Id: Ia6262bd8627befbeffd7cb734789f3d38abbb438
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

snapshot https://issues.apache.org/jira/browse/HDDS-6517

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants