Skip to content
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -567,12 +567,12 @@ public void testHSyncDeletedKey() throws Exception {
// getFileStatus should throw FNFE because the key is deleted
assertThrows(FileNotFoundException.class,
() -> fs.getFileStatus(key1));

// hsync should throw because the open key is gone
try {
os.hsync();
} catch (OMException omEx) {
assertEquals(OMException.ResultCodes.KEY_NOT_FOUND, omEx.getResult());
}
OMException exception = assertThrows(OMException.class,
() -> os.hsync());
assertEquals(OMException.ResultCodes.KEY_NOT_FOUND, exception.getResult());

// key1 should not reappear after failed hsync
assertThrows(FileNotFoundException.class,
() -> fs.getFileStatus(key1));
Expand All @@ -583,6 +583,49 @@ public void testHSyncDeletedKey() throws Exception {
}
}

static Stream<BucketLayout> bucketLayouts() {
return Stream.of(BucketLayout.FILE_SYSTEM_OPTIMIZED, BucketLayout.LEGACY);
}

@ParameterizedTest
@MethodSource("bucketLayouts")
public void testBatchKeyDeletionWithHSync(BucketLayout bucketLayout) throws Exception {
// Create a bucket with the specified layout
OzoneBucket testBucket = TestDataUtil.createVolumeAndBucket(client, bucketLayout);

final String rootPath = String.format("%s://%s/",
OZONE_OFS_URI_SCHEME, CONF.get(OZONE_OM_ADDRESS_KEY));
CONF.set(CommonConfigurationKeysPublic.FS_DEFAULT_NAME_KEY, rootPath);

final String dir = OZONE_ROOT + testBucket.getVolumeName()
+ OZONE_URI_DELIMITER + testBucket.getName();

// Create multiple files, some with hsync
final int numFiles = 5;
List<String> openKeys = new ArrayList<>();

try (FileSystem fs = FileSystem.get(CONF)) {
for (int i = 0; i < numFiles; i++) {
String keyName = "batch-key-" + i;
Path file = new Path(dir, keyName);
try (FSDataOutputStream os = fs.create(file, true)) {
os.write(1);
os.hsync();
openKeys.add(keyName);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok sorry now I recall what we wanted to cover in this test case.

We want to open the files, write, hsync; while it's still open, delete the keys.

The try with resources pattern closes the file after exiting this block, so that's not what we want.

Copy link
Member Author

@peterxcli peterxcli Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jojochuang Thanks for the reminder.
I update the test to not use try with resources pattern, and it failed to receive the exception from the second hsync operation after the keys are deleted. I'm still investigating whether the root cause is my incorrect test or an issue in #6472

@ParameterizedTest
@MethodSource("bucketLayouts")
public void testBatchKeyDeletionWithHSync(BucketLayout bucketLayout) throws Exception {
  // Create a bucket with the specified layout
  OzoneBucket testBucket = TestDataUtil.createVolumeAndBucket(client, bucketLayout);
  
  final String rootPath = String.format("%s://%s/",
      OZONE_OFS_URI_SCHEME, CONF.get(OZONE_OM_ADDRESS_KEY));
  CONF.set(CommonConfigurationKeysPublic.FS_DEFAULT_NAME_KEY, rootPath);

  final String dir = OZONE_ROOT + testBucket.getVolumeName()
      + OZONE_URI_DELIMITER + testBucket.getName();

  // Create multiple files, some with hsync
  final int numFiles = 5;
  Map<String, FSDataOutputStream> openKeysOSMap = new HashMap<>();

  try (FileSystem fs = FileSystem.get(CONF)) {
    for (int i = 0; i < numFiles; i++) {
      String keyName = "batch-key-" + i;
      Path file = new Path(dir, keyName);
      FSDataOutputStream os = fs.create(file, true);
      os.write(1);
      os.hsync();
      openKeysOSMap.put(keyName, os);
    }

    try {
      testBucket.deleteKeys(new ArrayList<>(openKeysOSMap.keySet()));
    } catch (Exception ex) {
      fail("deleteKeys should not throw exception, but got " + ex);
    }

    for (Map.Entry<String, FSDataOutputStream> entry : openKeysOSMap.entrySet()) {
      // test if the os is not closed at client side
      OMException exception = assertThrows(OMException.class,
          () -> entry.getValue().hsync(), "hsync should throw because the key " + entry.getKey() + " is deleted");
      assertEquals(OMException.ResultCodes.KEY_NOT_FOUND, exception.getResult());

      // os.close() throws OMException because the key is deleted
      exception = assertThrows(OMException.class,
          () -> entry.getValue().close());
      assertEquals(OMException.ResultCodes.KEY_NOT_FOUND, exception.getResult());
    }
  }
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like a product bug to me. Potentially something inside OmKeysDeleteRequestWithFSO.

Note: we have two separate request types for handling single delete and batch deletes. So mistakes can happen.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some logs to the existing testHsync#testHSyncDeletedKey and found the following:

// delete key
(OMKeyDeleteResponseWithFSO.java:addToDBBatch(126)) - Write open key: /volume42541/bucket06654/key-hsync-del/113980087550083073 to DB with metadata: {hsyncClientId=113980087550083073, deletedHsyncKey=true}

// second hsync
OMKeyCommitRequestWithFSO.java:validateAndUpdateCache(181)) - Read open key: /-9223372036854775552/-9223372036854775040/-9223372036854775040/key-hsync-del/113980087550083073 from DB with metadata: {hsyncClientId=113980087550083073}

It appears that the key /-9223372036854775552/-9223372036854775040/-9223372036854775040/key-hsync-del/113980087550083073 was written during the first hsync process:

omMetadataManager.getOpenKeyTable(getBucketLayout()).putWithBatch(
batchOperation, getOpenKeyName(), getNewOpenKeyInfo());

Given this, I believe we should modify the approach to pass the complete key name along with the desired open key/file info metadata.

I’d love to hear your thoughts on this. I really appreciate your input, @jojochuang.

Thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I added a fail(...) to test if the exception really happened

 @Test
  public void testHSyncDeletedKey() throws Exception {
    // Verify that a key can't be successfully hsync'ed again after it's deleted,
    // and that key won't reappear after a failed hsync.

    // Set the fs.defaultFS
    final String rootPath = String.format("%s://%s/",
        OZONE_OFS_URI_SCHEME, CONF.get(OZONE_OM_ADDRESS_KEY));
    CONF.set(CommonConfigurationKeysPublic.FS_DEFAULT_NAME_KEY, rootPath);

    final String dir = OZONE_ROOT + bucket.getVolumeName()
        + OZONE_URI_DELIMITER + bucket.getName();
    final Path key1 = new Path(dir, "key-hsync-del");

    try (FileSystem fs = FileSystem.get(CONF)) {
      // Create key1
      try (FSDataOutputStream os = fs.create(key1, true)) {
        os.write(1);
        os.hsync();
        fs.delete(key1, false);

        // getFileStatus should throw FNFE because the key is deleted
        assertThrows(FileNotFoundException.class,
            () -> fs.getFileStatus(key1));

        // hsync should throw because the open key is gone
        try {
          os.hsync();
+          fail("hsync should throw because the open key is gone");
        } catch (OMException omEx) {
          assertEquals(OMException.ResultCodes.KEY_NOT_FOUND, omEx.getResult());
        }
        // key1 should not reappear after failed hsync
        assertThrows(FileNotFoundException.class,
            () -> fs.getFileStatus(key1));
      } catch (OMException ex) {
        // os.close() throws OMException because the key is deleted
        assertEquals(OMException.ResultCodes.KEY_NOT_FOUND, ex.getResult());
      }
    }
  }

then it failed with:

[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 38.015 s <<< FAILURE! - in org.apache.hadoop.fs.ozone.TestHSync
[ERROR] org.apache.hadoop.fs.ozone.TestHSync.testHSyncDeletedKey  Time elapsed: 0.804 s  <<< FAILURE!
org.opentest4j.AssertionFailedError: hsync should throw because the open key is gone
        at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:38)
        at org.junit.jupiter.api.Assertions.fail(Assertions.java:138)
        at org.apache.hadoop.fs.ozone.TestHSync.testHSyncDeletedKey(TestHSync.java:578)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)

[INFO] 
[INFO] Results:
[INFO] 
[ERROR] Failures: 
[ERROR]   TestHSync.testHSyncDeletedKey:578 hsync should throw because the open key is gone

}

client.getObjectStore().getVolume(testBucket.getVolumeName())
.getBucket(testBucket.getName())
.deleteKeys(openKeys);

for (String key : openKeys) {
assertThrows(FileNotFoundException.class,
() -> fs.open(new Path(dir, key)));
}
}
}

@Test
public void testHSyncOpenKeyDeletionWhileDeleteDirectory() throws Exception {
// Verify that when directory is deleted recursively hsync related openKeys should be deleted,
Expand Down