Skip to content

Conversation

@steveloughran
Copy link
Contributor

@steveloughran steveloughran commented Sep 7, 2020

status

This PR was merged into trunk and branch-3.3, but reverted after it became clear that it was incomplete.

See #2310 for the successor PR

description

This changes directory tree deletion so that only files are incrementally deleted
from S3Guard after the objects are deleted; the directories are left alone
until metadataStore.deleteSubtree(path) is invoked

This avoids directory tombstones being added above files/child directories,
which stop that subtree operation from actually finishing.

Also:

  • callback to delete objects splits files and dirs so that
    any problems deleting the dirs doesn't trigger s3guard updates
  • new statistic to measure #of objects deleted, alongside request count.
  • callback listFilesAndEmptyDirectories renamed listFilesAndDirectoryMarkers
    to clarify behavior.
    Test enhancements in ITestDirectoryOperationCost, but I didn't manage to
    replicate it there (didn't quite understand the cause); the FileContext
    test works for regression testing.

Change-Id: I8f96536bc6ffaf548b3da1068b9bfc897d34a993

This changes directory tree deletion so that only files are incrementally deleted
from S3Guard after the objects are deleted; the directories are left alone
until   metadataStore.deleteSubtree(path) is invoked

This avoids directory tombstones being added above files/child directories,
which stop that subtree operation from actually finishing.

Also:

* callback to delete objects splits files and dirs so that
any problems deleting the dirs doesn't trigger s3guard updates
* new statistic to measure #of objects deleted, alongside request count.
* callback listFilesAndEmptyDirectories renamed listFilesAndDirectoryMarkers
  to clarify behavior.
Test enhancements in ITestDirectoryOperationCost, but I didn't manage to
replicate it there (didn't quite understand the cause); the FileContext
test works for regression testing.

Change-Id: I8f96536bc6ffaf548b3da1068b9bfc897d34a993
@steveloughran
Copy link
Contributor Author

Tests in progress

@steveloughran
Copy link
Contributor Author

broke the local fs tests

[ERROR] Failures: 
[ERROR]   TestLocalFSFileContextMainOperations>FileContextMainOperationsBaseTest.testRenameDirectoryAsEmptyDirectory:1192 Expected exception was not thrown
[ERROR]   TestLocalFSFileContextMainOperations>FileContextMainOperationsBaseTest.testRenameDirectoryAsFile:1241 Expected exception was not thrown
[ERROR]   TestLocalFSFileContextMainOperations>FileContextMainOperationsBaseTest.testRenameDirectoryAsNonEmptyDirectory:1216 Expected exception was not thrown
[ERROR]   TestLocalFSFileContextMainOperations>FileContextMainOperationsBaseTest.testRenameDirectoryToItself:1112 Renamed directory to itself
[ERROR]   TestLocalFSFileContextMainOperations>FileContextMainOperationsBaseTest.testRenameDirectoryToNonExistentParent:1135 Expected exception was not thrown
[ERROR]   TestLocalFSFileContextMainOperations>FileContextMainOperationsBaseTest.testRenameFileAsExistingDirectory:1093 Expected exception was not thrown
[ERROR]   TestLocalFSFileContextMainOperations>FileContextMainOperationsBaseTest.testRenameFileAsExistingFile:1072 Expected exception was not thrown
[ERROR]   TestLocalFSFileContextMainOperations>FileContextMainOperationsBaseTest.testRenameFileToDestinationWithParentFile:1018 Expected exception was not thrown
[ERROR]   TestLocalFSFileContextMainOperations>FileContextMainOperationsBaseTest.testRenameFileToItself:1047 Renamed file to itself
[ERROR]   TestLocalFSFileContextMainOperations>FileContextMainOperationsBaseTest.testRenameFileToNonExistentDirectory:994 Expected exception was not thrown
[ERROR]   TestLocalFSFileContextMainOperations>FileContextMainOperationsBaseTest.testRenameNonExistentPath:971 Should throw FileNotFoundException
[ERROR]   TestFcMainOperationsLocalFs>FileContextMainOperationsBaseTest.testRenameDirectoryAsEmptyDirectory:1192 Expected exception was not thrown
[ERROR]   TestFcMainOperationsLocalFs>FileContextMainOperationsBaseTest.testRenameDirectoryAsFile:1241 Expected exception was not thrown
[ERROR]   TestFcMainOperationsLocalFs>FileContextMainOperationsBaseTest.testRenameDirectoryAsNonEmptyDirectory:1216 Expected exception was not thrown
[ERROR]   TestFcMainOperationsLocalFs>FileContextMainOperationsBaseTest.testRenameDirectoryToItself:1112 Renamed directory to itself
[ERROR]   TestFcMainOperationsLocalFs>FileContextMainOperationsBaseTest.testRenameDirectoryToNonExistentParent:1135 Expected exception was not thrown
[ERROR]   TestFcMainOperationsLocalFs>FileContextMainOperationsBaseTest.testRenameFileAsExistingDirectory:1093 Expected exception was not thrown
[ERROR]   TestFcMainOperationsLocalFs>FileContextMainOperationsBaseTest.testRenameFileAsExistingFile:1072 Expected exception was not thrown
[ERROR]   TestFcMainOperationsLocalFs>FileContextMainOperationsBaseTest.testRenameFileToDestinationWithParentFile:1018 Expected exception was not thrown
[ERROR]   TestFcMainOperationsLocalFs>FileContextMainOperationsBaseTest.testRenameFileToItself:1047 Renamed file to itself
[ERROR]   TestFcMainOperationsLocalFs>FileContextMainOperationsBaseTest.testRenameFileToNonExistentDirectory:994 Expected exception was not thrown
[ERROR]   TestFcMainOperationsLocalFs>FileContextMainOperationsBaseTest.testRenameNonExistentPath:971 Should throw FileNotFoundException
[INFO] 
[ERROR] Tests run: 4439, Failures: 22, Errors: 0, Skipped: 165

* Fix S3A test failures caused by DeleteOperation bugs added
* Extend S3A tests
* Fix FileContext base test so that raised IOEs are always rethrown

regarding change apache#3, I think those IOEs should be thrown first, but
as the existing code does the asserts in finally(), I'm leaving it alone

Change-Id: Ia3159db7760424264d84fa2e38d2cbb452a9854a
@bgaborg bgaborg self-requested a review September 9, 2020 10:24
@steveloughran
Copy link
Contributor Author

Note: I ran the hdfs contract tests in my IDE to verify it was happy too; all good.

Copy link

@bgaborg bgaborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, +1 pending on integration test results, and which endpoint you tested against.

@steveloughran
Copy link
Contributor Author

repeatedly tested against london with options showing the error and the -Dkeep option and unguarded. Some transient failures related to local network issues, addressed in HADOOP-17181.

@steveloughran steveloughran merged commit 9960c01 into apache:trunk Sep 10, 2020
asfgit pushed a commit that referenced this pull request Sep 10, 2020
#2280)

This changes directory tree deletion so that only files are incrementally deleted
from S3Guard after the objects are deleted; the directories are left alone
until metadataStore.deleteSubtree(path) is invoke.

This avoids directory tombstones being added above files/child directories,
which stop the treewalk and delete phase from working.

Also:

* Callback to delete objects splits files and dirs so that
any problems deleting the dirs doesn't trigger s3guard updates
* New statistic to measure #of objects deleted, alongside request count.
* Callback listFilesAndEmptyDirectories renamed listFilesAndDirectoryMarkers
  to clarify behavior.
* Test enhancements to replicate the failure and verify the fix

Contributed by Steve Loughran

Change-Id: I0e6ea2c35e487267033b1664228c8837279a35c7
asfgit pushed a commit that referenced this pull request Sep 11, 2020
…maturely. (#2280)"

This reverts commit 0c82eb0.

Change-Id: I6bd100d9de19660b0f28ee0ab16faf747d6d9f05
asfgit pushed a commit that referenced this pull request Sep 11, 2020
…maturely. (#2280)"

This reverts commit 9960c01.

Change-Id: I820534c3292f2a343693d835f625488c325fb5d6
@steveloughran
Copy link
Contributor Author

This patch is incomplete. Reverted and will create a new one.

Issues

  • cleanup on a multipart fail will tombstone everything; no information at that point as to what is/isn't a dir
  • delete needs more coverage

I think I may want to push this delete/no-delete logic into s3guard -it can do the check rather than relying on the layer above. But: we will need to pass down for each path whether a path is a dir or a file, so that s3guard doesn't do a DDB query on every single path entry to work that out for itself

@apache apache deleted a comment from hadoop-yetus Sep 17, 2020
@apache apache deleted a comment from hadoop-yetus Sep 17, 2020
@steveloughran steveloughran added the fs/s3 changes related to hadoop-aws; submitter must declare test endpoint label Sep 17, 2020
@steveloughran steveloughran deleted the s3/HADOOP-17244-auth-dir-rename-test branch October 15, 2021 19:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

fs/s3 changes related to hadoop-aws; submitter must declare test endpoint

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants