HADOOP-16721. Improve S3A rename resilience #2742
|
Testing with -Dparallel-tests -DtestsThreadCount=8 -Dmarkers=keep -Dscale. If the store is set to raise exceptions, a lot of tests which expect rename(bad options) to return false now get exceptions. One of the contract tests would downgrade if the raised exception was FileAlreadyExistsException and the contract XML said that was OK. I'm reluctant to go with a bigger patch. This PR is so that Hive and friends can get better reporting on errors, rather than have them lost. It will be optional. |
|
Testing: s3 london with markers==keep and delete, s3guard on and off.
|
|
LGTM overall. Tests against Tokyo region worked. I just left comments for nits. |
|
I got an error in ITestS3AContractDistCp, but it does not seem to be related. I cannot reproduce the error by running ITestS3AContractDistCp alone by |
Should this JIRA be marked as incompatible for applications assuming the existing behavior? |
fs.s3a.rename.raises.exceptions: raise exceptions on rename failures
fs.s3a.rename.reduced.probes: don't look for parent dir (LIST), just verify it isn't a file.
The reduced probe not only saves money, it avoids race conditions where one thread deleting a subdir can cause LIST <parent> to fail before a dir marker is recreated.
Note:
* file:// rename() creates parent dirs, so this isn't too dangerous.
* tests will switch modes.
We could always just do the HEAD; topic for discussion. This patch: optional
Change-Id: Ic0f8a410b45fef14ff522cb5aa1ae2bc19c8eeee
Check for parent dir is now always !file, not isDirectory; avoids race conditions with parallel deletes of subdirectories under destination.
Also switches connector to raise exceptions on conditions which don't trigger exceptions on HDFS (just return false):
* source file missing
* destination existing
Some other connectors already report these as failures. With this change S3 moves away from HDFS behaviour, but towards one with better error reporting, and which we know is handled in existing applications.
Documented the changes in filesystem.md and troubleshooting s3a.
Change-Id: I5daf54636e63c19f273c86519d5eeb3cbeffeb49
…turning false. about to revert. Change-Id: I0d91330e2a59e8c6863ebeedce17573e69e2bef5
revert option to raise exceptions instead of returning false; explicitly raising FileNotFoundException/FileAlreadyExistsException is better. Change-Id: Ica78e0abf5a1444227e4843288679a2d3441dd70
Change-Id: Ic8787cb6f87d4b5cda23c6ced1e73922b665e797
modified: src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md modified: src/test/java/org/apache/hadoop/fs/s3a/S3ATestUtils.java renamed: src/test/java/org/apache/hadoop/fs/s3a/impl/ITestlRenameDeleteRace.java -> src/test/java/org/apache/hadoop/fs/s3a/impl/ITestRenameDeleteRace.java moved the "disable s3guard" test setup into S3ATestUtils; new suites are going to be adopting it until we pull out S3Guard completely. Change-Id: I0303aec2f6e4c98685da62e9f31c7e44a1e67d41
Change-Id: I0dc69dd913d7f9ca36930ab09da1060bdfee8970
Change-Id: I288e65899302adcebab5221508a005784cfe1d89
|
testing: s3 london, most recently with |
iwasakims
left a comment
+1. The ITestRenameDeleteRace failed without the fix of S3AFileSystem as expected. The test passed with the fix regardless of -Ds3guard. Thanks, @steveloughran.
|
ah, thanks -lovely. just tweaked the imports slightly for better backporting; I'll merge once the next compile is good, then do a cp and retest for branch-3.3. |
The S3A connector's rename() operation now raises FileNotFoundException if the source doesn't exist; a FileAlreadyExistsException if the destination exists and is unsuitable for the source file/directory. When renaming to a path which does not exist, the connector no longer checks for the destination parent directory existing -instead it simply verifies that there is no file immediately above the destination path. This is needed to avoid race conditions with delete() and rename() calls working on adjacent subdirectories. Contributed by Steve Loughran.
The S3A connector's rename() operation now raises FileNotFoundException if
the source doesn't exist; a FileAlreadyExistsException if the destination
exists and is unsuitable for the source file/directory.
When renaming to a path which does not exist, the connector no longer checks
for the destination parent directory existing -instead it simply verifies
that there is no file immediately above the destination path.
This is needed to avoid race conditions with delete() and rename()
calls working on adjacent subdirectories.
Contributed by Steve Loughran.
Conflicts:
hadoop-tools/hadoop-aws/src/test/resources/contract/s3a.xml
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md
Change-Id: I1996625d750f62c3e51686ff5317bd47ca0233bf
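For callers, the practical effect of the commit message above is that a failed rename can now arrive as an exception rather than as a bare `false`. The following is only a sketch of how calling code might distinguish the cases. The `Renamer` interface and `Outcome` enum are hypothetical names invented for illustration, not the Hadoop `FileSystem` API, and the JDK's `java.nio.file.FileAlreadyExistsException` stands in for Hadoop's own `org.apache.hadoop.fs.FileAlreadyExistsException` so that the example compiles without Hadoop on the classpath.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;

final class RenameCallerSketch {

    /** Hypothetical stand-in for a filesystem client; NOT the Hadoop FileSystem API. */
    interface Renamer {
        boolean rename(String src, String dest) throws IOException;
    }

    /** Illustrative classification of what happened to a rename call. */
    enum Outcome { RENAMED, FAILED_NO_DETAIL, SOURCE_MISSING, DEST_EXISTS, OTHER_IO_ERROR }

    /** Calls rename() and maps both the boolean result and the exceptions to an Outcome. */
    static Outcome renameAndClassify(Renamer fs, String src, String dest) {
        try {
            // A false return carries no detail about why the rename failed.
            return fs.rename(src, dest) ? Outcome.RENAMED : Outcome.FAILED_NO_DETAIL;
        } catch (FileNotFoundException e) {
            // Source path does not exist.
            return Outcome.SOURCE_MISSING;
        } catch (FileAlreadyExistsException e) {
            // Destination exists and is unsuitable for the source.
            return Outcome.DEST_EXISTS;
        } catch (IOException e) {
            // Any other I/O failure; real code would normally log or rethrow.
            return Outcome.OTHER_IO_ERROR;
        }
    }

    public static void main(String[] args) {
        Renamer missingSource = (s, d) -> { throw new FileNotFoundException(s); };
        System.out.println(renameAndClassify(missingSource, "staging/a", "dest/a"));
    }
}
```

Code that only checks the boolean result keeps working for the success case, but it would need a `try`/`catch` along these lines to benefit from the more detailed failure reporting.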
S3A Rename to
If thread/process 1 deleted the subdir dest/subdir1 and there are no sibling subdirectories, then the dir dest would not exist until maybeCreateFakeParentDirectory() had performed a LIST and, if needed, a PUT of a marker.
This creates a window where thread/process 2, trying to rename staging/subdir2 into dest, could fail with "parent does not exist".
And guess what:
performance of S3.
the window of parent-dir-not-found can last long enough for things to fail.
Prior to S3 being consistent this wouldn't have been an issue
The fix: go from verifying that the parent dir exists to simply making sure that it isn't a file.
This is a weakening of the requirement "parent dir must exist", but file:// already doesn't
require that.
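The difference between the two parent checks can be sketched with a toy in-memory model. This is not the S3A implementation: `RenameProbeSketch`, its key sets, and the method names `strictParentCheck` and `relaxedParentCheck` are all assumptions made up for illustration. A directory "exists" in the model only if it has a marker or something beneath it, which is roughly what a LIST-based probe observes.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy object-store model: file keys plus directory-marker keys. Hypothetical, not S3A code. */
final class RenameProbeSketch {
    private final Set<String> files = new HashSet<>();
    private final Set<String> dirMarkers = new HashSet<>();

    void putFile(String key) { files.add(key); }
    void deleteFile(String key) { files.remove(key); }
    void putDirMarker(String key) { dirMarkers.add(key); }

    boolean isFile(String key) { return files.contains(key); }

    /** LIST-style probe: true if a marker exists or any key lies under key + "/". */
    boolean isDirectory(String key) {
        if (dirMarkers.contains(key)) {
            return true;
        }
        String prefix = key + "/";
        for (String f : files) {
            if (f.startsWith(prefix)) return true;
        }
        for (String d : dirMarkers) {
            if (d.startsWith(prefix)) return true;
        }
        return false;
    }

    /** Old-style check: the destination parent must exist as a directory. */
    boolean strictParentCheck(String parent) { return isDirectory(parent); }

    /** New-style check: the destination parent merely must not be a file. */
    boolean relaxedParentCheck(String parent) { return !isFile(parent); }

    public static void main(String[] args) {
        RenameProbeSketch store = new RenameProbeSketch();
        store.putFile("dest/subdir1/part-0000");
        // Process 1 deletes the only child; the marker for "dest" is not yet recreated.
        store.deleteFile("dest/subdir1/part-0000");
        // Process 2 probes the parent inside that window.
        System.out.println("strict check passes:  " + store.strictParentCheck("dest"));
        System.out.println("relaxed check passes: " + store.relaxedParentCheck("dest"));
    }
}
```

In the window modelled in `main`, the strict check fails because nothing is left under `dest`, while the relaxed check passes because `dest` is not a file. Once a marker is written with `putDirMarker("dest")`, both checks agree again.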
Consequences
(FS contract test has switch for this)
isn't consistent with HDFS, but is with other stores (FS contract test has switch for this)
create() lets you do that already, and nobody has noticed in production.
If directory marker retention was enabled there's more likelihood that an empty dir marker existed -if it did then the race condition wouldn't exist. But as there are no guarantees that the marker will be there, that's not safe to rely on.