-
Notifications
You must be signed in to change notification settings - Fork 588
HDDS-10220. Add test case to cover HDDS-9930 batch key deletion #7792
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-10220. Add test case to cover HDDS-9930 batch key deletion #7792
Conversation
|
I was confusing why the delete keys request works fine with Legacy layout even if the request map like this: Lines 137 to 143 in 5e8d4de
Till I see these😅: Lines 291 to 299 in 5e8d4de
Lines 342 to 345 in 5e8d4de
|
| try { | ||
| fs.open(new Path(dir, key)); | ||
| fail("HSync should throw because the open key is gone"); | ||
| } catch (FileNotFoundException ex) { | ||
| // expected | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this style of test should only exist in the legacy code. Please use assertThrows() for new code
| try { | |
| fs.open(new Path(dir, key)); | |
| fail("HSync should throw because the open key is gone"); | |
| } catch (FileNotFoundException ex) { | |
| // expected | |
| } | |
| assertThrows(FileNotFoundException.class, | |
| () -> fs.open(new Path(dir, key))); |
| os.write(1); | ||
| os.hsync(); | ||
| openKeys.add(keyName); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok sorry now I recall what we wanted to cover in this test case.
We want to open the files, write, hsync; while it's still open, delete the keys.
The try with resources pattern closes the file after exiting this block, so that's not what we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jojochuang Thanks for the reminder.
I update the test to not use try with resources pattern, and it failed to receive the exception from the second hsync operation after the keys are deleted. I'm still investigating whether the root cause is my incorrect test or an issue in #6472
@ParameterizedTest
@MethodSource("bucketLayouts")
public void testBatchKeyDeletionWithHSync(BucketLayout bucketLayout) throws Exception {
// Create a bucket with the specified layout
OzoneBucket testBucket = TestDataUtil.createVolumeAndBucket(client, bucketLayout);
final String rootPath = String.format("%s://%s/",
OZONE_OFS_URI_SCHEME, CONF.get(OZONE_OM_ADDRESS_KEY));
CONF.set(CommonConfigurationKeysPublic.FS_DEFAULT_NAME_KEY, rootPath);
final String dir = OZONE_ROOT + testBucket.getVolumeName()
+ OZONE_URI_DELIMITER + testBucket.getName();
// Create multiple files, some with hsync
final int numFiles = 5;
Map<String, FSDataOutputStream> openKeysOSMap = new HashMap<>();
try (FileSystem fs = FileSystem.get(CONF)) {
for (int i = 0; i < numFiles; i++) {
String keyName = "batch-key-" + i;
Path file = new Path(dir, keyName);
FSDataOutputStream os = fs.create(file, true);
os.write(1);
os.hsync();
openKeysOSMap.put(keyName, os);
}
try {
testBucket.deleteKeys(new ArrayList<>(openKeysOSMap.keySet()));
} catch (Exception ex) {
fail("deleteKeys should not throw exception, but got " + ex);
}
for (Map.Entry<String, FSDataOutputStream> entry : openKeysOSMap.entrySet()) {
// test if the os is not closed at client side
OMException exception = assertThrows(OMException.class,
() -> entry.getValue().hsync(), "hsync should throw because the key " + entry.getKey() + " is deleted");
assertEquals(OMException.ResultCodes.KEY_NOT_FOUND, exception.getResult());
// os.close() throws OMException because the key is deleted
exception = assertThrows(OMException.class,
() -> entry.getValue().close());
assertEquals(OMException.ResultCodes.KEY_NOT_FOUND, exception.getResult());
}
}
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds like a product bug to me. Potentially something inside OmKeysDeleteRequestWithFSO.
Note: we have two separate request types for handling single delete and batch deletes. So mistakes can happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some logs to the existing testHsync#testHSyncDeletedKey and found the following:
// delete key
(OMKeyDeleteResponseWithFSO.java:addToDBBatch(126)) - Write open key: /volume42541/bucket06654/key-hsync-del/113980087550083073 to DB with metadata: {hsyncClientId=113980087550083073, deletedHsyncKey=true}
// second hsync
OMKeyCommitRequestWithFSO.java:validateAndUpdateCache(181)) - Read open key: /-9223372036854775552/-9223372036854775040/-9223372036854775040/key-hsync-del/113980087550083073 from DB with metadata: {hsyncClientId=113980087550083073}
It appears that the key /-9223372036854775552/-9223372036854775040/-9223372036854775040/key-hsync-del/113980087550083073 was written during the first hsync process:
Lines 83 to 84 in 0ba9147
| omMetadataManager.getOpenKeyTable(getBucketLayout()).putWithBatch( | |
| batchOperation, getOpenKeyName(), getNewOpenKeyInfo()); |
Given this, I believe we should modify the approach to pass the complete key name along with the desired open key/file info metadata.
I’d love to hear your thoughts on this. I really appreciate your input, @jojochuang.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I added a fail(...) to test if the exception really happened
@Test
public void testHSyncDeletedKey() throws Exception {
// Verify that a key can't be successfully hsync'ed again after it's deleted,
// and that key won't reappear after a failed hsync.
// Set the fs.defaultFS
final String rootPath = String.format("%s://%s/",
OZONE_OFS_URI_SCHEME, CONF.get(OZONE_OM_ADDRESS_KEY));
CONF.set(CommonConfigurationKeysPublic.FS_DEFAULT_NAME_KEY, rootPath);
final String dir = OZONE_ROOT + bucket.getVolumeName()
+ OZONE_URI_DELIMITER + bucket.getName();
final Path key1 = new Path(dir, "key-hsync-del");
try (FileSystem fs = FileSystem.get(CONF)) {
// Create key1
try (FSDataOutputStream os = fs.create(key1, true)) {
os.write(1);
os.hsync();
fs.delete(key1, false);
// getFileStatus should throw FNFE because the key is deleted
assertThrows(FileNotFoundException.class,
() -> fs.getFileStatus(key1));
// hsync should throw because the open key is gone
try {
os.hsync();
+ fail("hsync should throw because the open key is gone");
} catch (OMException omEx) {
assertEquals(OMException.ResultCodes.KEY_NOT_FOUND, omEx.getResult());
}
// key1 should not reappear after failed hsync
assertThrows(FileNotFoundException.class,
() -> fs.getFileStatus(key1));
} catch (OMException ex) {
// os.close() throws OMException because the key is deleted
assertEquals(OMException.ResultCodes.KEY_NOT_FOUND, ex.getResult());
}
}
}then it failed with:
[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 38.015 s <<< FAILURE! - in org.apache.hadoop.fs.ozone.TestHSync
[ERROR] org.apache.hadoop.fs.ozone.TestHSync.testHSyncDeletedKey Time elapsed: 0.804 s <<< FAILURE!
org.opentest4j.AssertionFailedError: hsync should throw because the open key is gone
at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:38)
at org.junit.jupiter.api.Assertions.fail(Assertions.java:138)
at org.apache.hadoop.fs.ozone.TestHSync.testHSyncDeletedKey(TestHSync.java:578)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
[INFO]
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestHSync.testHSyncDeletedKey:578 hsync should throw because the open key is gone
|
Hi @jojochuang, I just left a comment on another PR - #6079 (comment). |
|
This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days. |
|
Thank you for your contribution. This PR is being closed due to inactivity. If needed, feel free to reopen it. |
What changes were proposed in this pull request?
HDDS-9930 added test cases for single key deletion testHSyncDeletedKey. But not for batch key deletion (see PR description checkbox).
Batch key deletion tests remain to be added (in TestHSync or in a new test class).
New test should ensure the behavior of OMKeysDeleteRequest and OmKeysDeleteRequestWithFSO is correct.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-10220
How was this patch tested?
Add test case in TestHSync to cover HDDS-9930 batch key deletion
CI: https://github.com/peterxcli/ozone/actions/runs/13100233276