
Conversation

@sumitagrawl
Contributor

What changes were proposed in this pull request?

For multipart uploads, the delete key is the multipart key plus the objectId, where the multipart key format is /volumeName/bucketName/keyName/uploadId.
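
A minimal sketch of how such a delete key is assembled (variable names are assumptions for illustration, not the exact PR code; OM_KEY_PREFIX is the "/" separator):

// multipart key: /volumeName/bucketName/keyName/uploadId
String multipartKey = OM_KEY_PREFIX + volumeName
    + OM_KEY_PREFIX + bucketName
    + OM_KEY_PREFIX + keyName
    + OM_KEY_PREFIX + uploadId;
// appending the part's objectId keeps every deleted part entry unique
String deleteKey = multipartKey + OM_KEY_PREFIX + partKeyInfo.getObjectID();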

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-8463

How was this patch tested?

Tested with existing unit tests and integration test cases.

@sumitagrawl
Contributor Author

@szetszwo please review

@szetszwo
Contributor

szetszwo commented May 6, 2023

@sumitagrawl , we should fix HDDS-8550 first.

@sumitagrawl
Contributor Author

@sumitagrawl , we should fix HDDS-8550 first.

It’s fixed and the PR has been raised.

Contributor

@szetszwo left a comment

@sumitagrawl , thanks a lot for working on this! Please see the comments inlined.

Comment on lines 100 to 102
RepeatedOmKeyInfo repeatedOmKeyInfo = OmUtils.prepareKeyForDelete(
    currentKeyPartInfo, null, omMultipartKeyInfo.getUpdateID(),
    isRatisEnabled);
Contributor

After this, the repeatedOmKeyInfo parameter of prepareKeyForDelete(..) is always null. Let's remove it.

+++ b/hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java
@@ -446,8 +446,8 @@ public static File createOMDir(String dirPath) {
    * create a new instance to include this key, else we update the existing
    * repeatedOmKeyInfo instance.
    * 3. Set the updateID to the transactionLogIndex.
    * @param keyInfo args supplied by client
-   * @param repeatedOmKeyInfo key details from deletedTable
    * @param trxnLogIndex For Multipart keys, this is the transactionLogIndex
    *                     of the MultipartUploadAbort request which needs to
    *                     be set as the updateID of the partKeyInfos.
@@ -456,8 +456,7 @@ public static File createOMDir(String dirPath) {
    * @return {@link RepeatedOmKeyInfo}
    */
   public static RepeatedOmKeyInfo prepareKeyForDelete(OmKeyInfo keyInfo,
-      RepeatedOmKeyInfo repeatedOmKeyInfo, long trxnLogIndex,
-      boolean isRatisEnabled) {
+      long trxnLogIndex, boolean isRatisEnabled) {
     // If this key is in a GDPR enforced bucket, then before moving
     // KeyInfo to deletedTable, remove the GDPR related metadata and
     // FileEncryptionInfo from KeyInfo.
@@ -473,15 +472,7 @@ public static RepeatedOmKeyInfo prepareKeyForDelete(OmKeyInfo keyInfo,
     // Set the updateID
     keyInfo.setUpdateID(trxnLogIndex, isRatisEnabled);
 
-    if (repeatedOmKeyInfo == null) {
-      //The key doesn't exist in deletedTable, so create a new instance.
-      repeatedOmKeyInfo = new RepeatedOmKeyInfo(keyInfo);
-    } else {
-      //The key exists in deletedTable, so update existing instance.
-      repeatedOmKeyInfo.addOmKeyInfo(keyInfo);
-    }
-
-    return repeatedOmKeyInfo;
+    return new RepeatedOmKeyInfo(keyInfo);
   }
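
With this change, a call site like the one quoted above simplifies to (a sketch based on the quoted snippet; surrounding code is assumed):

RepeatedOmKeyInfo repeatedOmKeyInfo = OmUtils.prepareKeyForDelete(
    currentKeyPartInfo, omMultipartKeyInfo.getUpdateID(), isRatisEnabled);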

Comment on lines +104 to +105
String deleteKey = omMetadataManager.getOzoneDeletePathKey(
    currentKeyPartInfo.getObjectID(), multipartKey);
Contributor

It should have the partNumber/partName; the object id is the same for all parts.

BTW, let's add a utility method to create multipart delete key.
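
A possible shape for that utility (hypothetical name and placement; a sketch, not the PR's actual implementation):

/** Build the deletedTable key for one part of a multipart upload. */
public static String getMultipartDeleteKey(OMMetadataManager omMetadataManager,
    String multipartKey, OmKeyInfo partKeyInfo) {
  // multipartKey is /volumeName/bucketName/keyName/uploadId; suffixing the
  // part's objectID distinguishes every part entry in deletedTable
  return omMetadataManager.getOzoneDeletePathKey(
      partKeyInfo.getObjectID(), multipartKey);
}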

Contributor Author

@szetszwo Each part is created with a new object ID as an independent upload, and the parts are combined into the multipart table in RepeatedOmKeyInfo. The objectId is not replaced in the part's keyInfo and remains unique.

But the part number/name is not unique, since a user can re-upload with the same part number. A re-upload with the same part number overwrites the previous part, which would cause a duplicate entry in deletedTable. So it is not unique.
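
To illustrate (hypothetical values, not actual OM keys):

// a partNumber-based delete key collides on re-upload of part 1 ...
String byPartNumber = multipartKey + "/1";       // first upload of part 1
String byPartNumberAgain = multipartKey + "/1";  // re-upload: same key, duplicate
// ... while an objectId-based delete key stays unique per upload
String byObjectId = multipartKey + "/" + firstObjectId;        // first upload
String byObjectIdAgain = multipartKey + "/" + secondObjectId;  // distinct objectID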

Contributor

@sumitagrawl , thanks for the info. Do we have a unit test showing it? If not, could you add one?

Contributor Author

sumitagrawl commented May 11, 2023

@szetszwo This is already covered by an existing test case, which is updated in this PR to use the objectId as the correction:
https://github.com/apache/ozone/pull/4660/files#diff-c1e307469ed5f710651ae2ec86baed4d9dce85d742c1c282cb1e7d28a776b503

As part of this, 2 parts are deleted with unique keys: part1DeletedKeyName and part2DeletedKeyName.

Contributor

@sumitagrawl , I am asking for a unit test. If there is one, which one? I want to run it locally. Thanks.

Contributor Author

@szetszwo TestS3MultipartUploadAbortResponse:testAddDBToBatchWithParts covers the scenario as discussed.

Contributor

@sumitagrawl , I tried testAddDBToBatchWithParts, but it is not a real test -- in createPartKeyInfo, it creates parts with a random object id rather than the id generated by OM.

We should have a test in which a real client uploads a multipart key to OM and then the parts get deleted.
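
A rough sketch of such a test, assuming the OzoneBucket client API (initiateMultipartUpload, createMultipartKey, abortMultipartUpload); cluster setup and the deletedTable assertions are elided:

OmMultipartInfo info = bucket.initiateMultipartUpload(keyName);
byte[] data = new byte[1024];
// upload part 1, then re-upload the same part number
for (int i = 0; i < 2; i++) {
  try (OzoneOutputStream out = bucket.createMultipartKey(
      keyName, data.length, 1, info.getUploadID())) {
    out.write(data);
  }
}
// abort: both uploads of part 1 should land in deletedTable
// under distinct objectId-suffixed keys
bucket.abortMultipartUpload(keyName, info.getUploadID());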

Contributor

... in createPartKeyInfo, it creates parts with a random object id rather than the id generated by OM.

Indeed, the random ids could be the same since randomness does not prevent collision.

@sumitagrawl requested a review from szetszwo May 10, 2023 14:41
@szetszwo
Contributor

szetszwo commented May 14, 2023

I actually have a Multipart Upload test and I am able to verify that the object ids for each part are different.

Batch-61: deletedTable put(key: 98 B, value: 542 B), #put=1, batchSize=640 B, discarded: 0 (0 B); key=/s3v/s3bucket/testKey/76d7cb94-3369-470c-a83f-28448135f131-110365366127820801/-9223372036854771968
Batch-61: deletedTable put(key: 98 B, value: 542 B), #put=2, batchSize=1.25 KB, discarded: 0 (0 B); key=/s3v/s3bucket/testKey/76d7cb94-3369-470c-a83f-28448135f131-110365366127820801/-9223372036854771712
Batch-61: deletedTable put(key: 98 B, value: 542 B), #put=3, batchSize=1.88 KB, discarded: 0 (0 B); key=/s3v/s3bucket/testKey/76d7cb94-3369-470c-a83f-28448135f131-110365366127820801/-9223372036854770688
Batch-61: deletedTable put(key: 98 B, value: 542 B), #put=4, batchSize=2.50 KB, discarded: 0 (0 B); key=/s3v/s3bucket/testKey/76d7cb94-3369-470c-a83f-28448135f131-110365366127820801/-9223372036854770432
Batch-61: deletedTable put(key: 98 B, value: 542 B), #put=5, batchSize=3.13 KB, discarded: 0 (0 B); key=/s3v/s3bucket/testKey/76d7cb94-3369-470c-a83f-28448135f131-110365366127820801/-9223372036854771456
Batch-61: deletedTable put(key: 98 B, value: 542 B), #put=6, batchSize=3.75 KB, discarded: 0 (0 B); key=/s3v/s3bucket/testKey/76d7cb94-3369-470c-a83f-28448135f131-110365366127820801/-9223372036854771200
Batch-61: deletedTable put(key: 98 B, value: 542 B), #put=7, batchSize=4.38 KB, discarded: 0 (0 B); key=/s3v/s3bucket/testKey/76d7cb94-3369-470c-a83f-28448135f131-110365366127820801/-9223372036854772992
Batch-61: deletedTable put(key: 98 B, value: 542 B), #put=8, batchSize=5 KB, discarded: 0 (0 B); key=/s3v/s3bucket/testKey/76d7cb94-3369-470c-a83f-28448135f131-110365366127820801/-9223372036854772224
Batch-61: deletedTable put(key: 98 B, value: 542 B), #put=9, batchSize=5.63 KB, discarded: 0 (0 B); key=/s3v/s3bucket/testKey/76d7cb94-3369-470c-a83f-28448135f131-110365366127820801/-9223372036854770176
Batch-61: deletedTable put(key: 98 B, value: 542 B), #put=10, batchSize=6.25 KB, discarded: 0 (0 B); key=/s3v/s3bucket/testKey/76d7cb94-3369-470c-a83f-28448135f131-110365366127820801/-9223372036854772736
  Batch-61: deletedTable batchSize=6.25 KB, discarded: 0 (0 B), #put=10, #del=0

Contributor

@szetszwo left a comment

+1 the change looks good.

@szetszwo szetszwo merged commit 3dd6e29 into apache:master May 14, 2023
errose28 added a commit to errose28/ozone that referenced this pull request May 17, 2023
* master: (78 commits)
  HDDS-8575. Intermittent failure in TestCloseContainerEventHandler.testCloseContainerWithDelayByLeaseManager (apache#4688)
  HDDS-7241. EC: Reconstruction could fail with orphan blocks. (apache#4718)
  HDDS-8577. [Snapshot] Disable compaction log when loading metadata for snapshot (apache#4697)
  HDDS-7080. EC: Offline reconstruction needs better logging (apache#4719)
  HDDS-8626. Config thread pool in ReplicationServer (apache#4715)
  HDDS-8616. Underreplication not fixed if all replicas start decommissioning (apache#4711)
  HDDS-8254. Close containers when volume reaches utilisation threshold (apache#4583)
  HDDS-8615. Explicitly show EC block type in 'ozone debug chunkinfo' command output (apache#4706)
  HDDS-8623. Delete duplicate getBucketInfo in OMKeyCommitRequest (apache#4712)
  HDDS-8339. Recon Show the number of keys marked for Deletion in Recon UI. (apache#4519)
  HDDS-8572. Support CodecBuffer for protobuf v3 codecs. (apache#4693)
  HDDS-8010. Improve DN warning message when getBlock does not find the block. (apache#4698)
  HDDS-8621. IOException is never thrown in SCMRatisServer.getRatisRoles(). (apache#4710)
  HDDS-8463. S3 key uniqueness in deletedTable (apache#4660)
  HDDS-8584. Hadoop client write slowly when stream enabled (apache#4703)
  HDDS-7732. EC: Verify block deletion from missing EC containers (apache#4705)
  HDDS-8581. Avoid random ports in integration tests (apache#4699)
  HDDS-8504. ReplicationManager: Pass used and excluded node separately for Under and Mis-Replication (apache#4694)
  HDDS-8576. Close RocksDB instance in RDBStore if RDBStore's initialization fails after RocksDB instance creation (apache#4692)
  ...