@steveloughran
Contributor

HADOOP-15999 improve test resilience and probes

  • Add delays long enough for timestamps to be different
  • Add delays for S3 to stabilize after writes/deletes, so that listings and HEAD calls will get the new value, not old ones
  • Probes for differences check file lengths ahead of timestamps, for more tangible failures.
  • They also validate the raw FS status acquired after the stabilization delay.
  • Package-private (currently) probe for S3A to verify that an FS instance considers its store to be authoritative.
    Currently we've been checking the config, but to really know what's happening, let's query the internal state of the FS.

Change-Id: Ib0184a2aacbec1e4b316cb8cad0265bd0b579bcd
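
The "lengths ahead of timestamps" ordering can be illustrated with a minimal sketch. This is not the code in the patch: `StatusSnapshot`, `OverwriteProbe` and `assertOverwritten` are hypothetical names, and the sketch assumes the overwriting write uses a different length from the original file.

```java
/** Hypothetical stand-in for the two file-status fields the probe compares. */
final class StatusSnapshot {
  final long length;
  final long modificationTime;

  StatusSnapshot(long length, long modificationTime) {
    this.length = length;
    this.modificationTime = modificationTime;
  }
}

final class OverwriteProbe {
  private OverwriteProbe() {
  }

  /**
   * Fails unless {@code updated} looks like an overwrite of {@code original}.
   * Assumption: the overwrite is written with a different length.
   */
  static void assertOverwritten(StatusSnapshot original, StatusSnapshot updated) {
    // Length first: an unchanged length is an unambiguous sign of a stale
    // status and does not depend on clock or timestamp granularity.
    if (original.length == updated.length) {
      throw new AssertionError("File length unchanged at " + original.length
          + " bytes; the status still describes the old file");
    }
    // Timestamp second: only meaningful once the length is known to differ.
    if (updated.modificationTime <= original.modificationTime) {
      throw new AssertionError("Modification time did not advance: "
          + original.modificationTime + " -> " + updated.modificationTime);
    }
  }

  public static void main(String[] args) {
    assertOverwritten(new StatusSnapshot(10, 1_000L), new StatusSnapshot(25, 3_000L));
    System.out.println("overwrite detected");
  }
}
```

A length mismatch in the failure message says directly that the old object was returned, whereas a timestamp failure alone could also be caused by coarse clock resolution.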

This is PR #624 with another patch applied

@steveloughran
Contributor Author

Tested: S3 ireland w/ DDB in both parallel and standalone runs, DDB set up for PAYG.

-Dparallel-tests -DtestsThreadCount=12 -Ds3guard -Ddynamo

I've not seen the problem surface again. It could still be there, but I've gone through the tests line by line trying to think where stabilisation problems could surface, and I think it'll be OK. The only variable is how long to allow for stabilisation. I've chosen 2000 as something non-zero but not large enough to slow down the tests. We could up that to, say, 10s. Or we could play games with eventually(), repeatedly calling getFileStatus() until the modtime has actually increased.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
| --- | --- | --- | --- |
| 0 | reexec | 28 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 2 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| 0 | mvndep | 62 | Maven dependency ordering for branch |
| +1 | mvninstall | 1071 | trunk passed |
| +1 | compile | 983 | trunk passed |
| +1 | checkstyle | 190 | trunk passed |
| +1 | mvnsite | 115 | trunk passed |
| +1 | shadedclient | 985 | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 143 | trunk passed |
| +1 | javadoc | 88 | trunk passed |
| | | | _ Patch Compile Tests _ |
| 0 | mvndep | 23 | Maven dependency ordering for patch |
| +1 | mvninstall | 76 | the patch passed |
| +1 | compile | 900 | the patch passed |
| +1 | javac | 900 | the patch passed |
| -0 | checkstyle | 184 | root: The patch generated 3 new + 6 unchanged - 0 fixed = 9 total (was 6) |
| +1 | mvnsite | 116 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedclient | 668 | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 173 | the patch passed |
| +1 | javadoc | 100 | the patch passed |
| | | | _ Other Tests _ |
| -1 | unit | 496 | hadoop-common in the patch failed. |
| +1 | unit | 289 | hadoop-aws in the patch passed. |
| +1 | asflicense | 43 | The patch does not generate ASF License warnings. |
| | | 6693 | |

| Reason | Tests |
| --- | --- |
| Failed junit tests | hadoop.util.TestDiskCheckerWithDiskIo |
| | hadoop.util.TestBasicDiskValidator |
| | hadoop.util.TestReadWriteDiskValidator |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-630/1/artifact/out/Dockerfile |
| GITHUB PR | #630 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux c55c0a5fa5c5 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / f2b862c |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-630/1/artifact/out/diff-checkstyle-root.txt |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-630/1/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-630/1/testReport/ |
| Max. process+thread count | 1654 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-630/1/console |
| Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |

This message was automatically generated.

@steveloughran
Contributor Author

Gabor: have a look at the patch and make sure it works for you with DDB as the store, rather than just local.

If you are happy with my changes, I'll do another run and, if all is well, give it a vote up.

(cherry picked from commit 5c78e44)

Change-Id: I34ed39a82893ed2f1caed8ac80609075a68dec57
@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
| --- | --- | --- | --- |
| 0 | reexec | 28 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 2 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| 0 | mvndep | 22 | Maven dependency ordering for branch |
| +1 | mvninstall | 993 | trunk passed |
| +1 | compile | 961 | trunk passed |
| +1 | checkstyle | 194 | trunk passed |
| +1 | mvnsite | 127 | trunk passed |
| +1 | shadedclient | 1055 | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 154 | trunk passed |
| +1 | javadoc | 104 | trunk passed |
| | | | _ Patch Compile Tests _ |
| 0 | mvndep | 20 | Maven dependency ordering for patch |
| +1 | mvninstall | 76 | the patch passed |
| +1 | compile | 891 | the patch passed |
| +1 | javac | 891 | the patch passed |
| -0 | checkstyle | 188 | root: The patch generated 3 new + 6 unchanged - 0 fixed = 9 total (was 6) |
| +1 | mvnsite | 125 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedclient | 670 | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 171 | the patch passed |
| +1 | javadoc | 101 | the patch passed |
| | | | _ Other Tests _ |
| -1 | unit | 504 | hadoop-common in the patch failed. |
| +1 | unit | 290 | hadoop-aws in the patch passed. |
| +1 | asflicense | 34 | The patch does not generate ASF License warnings. |
| | | 6655 | |

| Reason | Tests |
| --- | --- |
| Failed junit tests | hadoop.util.TestDiskChecker |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-630/2/artifact/out/Dockerfile |
| GITHUB PR | #630 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 7e8ff71b7686 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 9f1c017 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-630/2/artifact/out/diff-checkstyle-root.txt |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-630/2/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-630/2/testReport/ |
| Max. process+thread count | 1345 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-630/2/console |
| Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |

This message was automatically generated.

@steveloughran
Contributor Author

The failed tests are about a directory not existing on the test VM/container:

java.nio.file.NoSuchFileException: /testptch/hadoop/hadoop-common-project/hadoop-common/target/test/data/4/test3157601392604950757
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
	at java.nio.file.Files.createDirectory(Files.java:674)
	at java.nio.file.TempFileHelper.create(TempFileHelper.java:136)
	at java.nio.file.TempFileHelper.createTempDirectory(TempFileHelper.java:173)
	at java.nio.file.Files.createTempDirectory(Files.java:950)
	at org.apache.hadoop.util.TestDiskChecker.createTempDir(TestDiskChecker.java:153)
	at org.apache.hadoop.util.TestDiskChecker._checkDirs(TestDiskChecker.java:158)
	at org.apache.hadoop.util.TestDiskChecker.testCheckDir_notReadable(TestDiskChecker.java:121)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
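
The trace above shows `Files.createTempDirectory` failing inside `Files.createDirectory` because the parent directory is missing. A minimal sketch of that failure mode and one way around it, assuming the remedy is simply to create the parent first; the thread does not say how the test environment was actually repaired, and `TempDirSketch` and `createTempDirUnder` are hypothetical names:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

final class TempDirSketch {
  private TempDirSketch() {
  }

  /** Creates any missing parent directories, then a temp directory inside them. */
  static Path createTempDirUnder(Path parent, String prefix) throws IOException {
    // createDirectories is a no-op when the directory already exists.
    Files.createDirectories(parent);
    return Files.createTempDirectory(parent, prefix);
  }

  public static void main(String[] args) throws IOException {
    Path base = Files.createTempDirectory("sketch");
    // A parent path that does not exist yet, as in the stack trace.
    Path parent = base.resolve("target").resolve("test").resolve("data");
    try {
      Files.createTempDirectory(parent, "test");
    } catch (NoSuchFileException e) {
      System.out.println("parent missing: " + e.getMessage());
    }
    Path created = createTempDirUnder(parent, "test");
    System.out.println("created: " + Files.isDirectory(created));
  }
}
```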

@steveloughran changed the title from "HADOOP-15999 improve test resilience and probes" to "HADOOP-15999 S3Guard OOB: improve test resilience and probes" on Mar 22, 2019

Use eventually() to await status changes in files: creation, deletion, overwrites. Allows 20s for state to change, wrapping:

* getFileStatus() calls on new file (swallowing IOEs until happy)
* getFileStatus() on deleted files, expecting FNFEs to eventually surface
* the checks for changed files in the overwrite tests.

Change-Id: Ia128ebe4e1dc02f904794966b7a51f6ad91dd50f
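
The retry pattern described in this commit message can be sketched in plain Java. This is an illustration only, not the patch: the local `eventually` and `awaitDeleted` helpers below are hypothetical stand-ins written for this example, and the probes are ordinary `Callable`s rather than real `getFileStatus()` calls against S3.

```java
import java.io.FileNotFoundException;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

final class EventuallySketch {
  private EventuallySketch() {
  }

  /**
   * Re-invokes the probe until it returns, swallowing failures until the
   * timeout expires; the last failure is then rethrown.
   */
  static <T> T eventually(long timeoutMillis, long intervalMillis, Callable<T> probe)
      throws Exception {
    final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (true) {
      try {
        return probe.call();
      } catch (InterruptedException e) {
        // Never swallow interruption: restore the flag and give up.
        Thread.currentThread().interrupt();
        throw e;
      } catch (Exception | AssertionError e) {
        if (System.nanoTime() - deadline >= 0) {
          throw e;
        }
      }
      Thread.sleep(intervalMillis);
    }
  }

  /** Waits until the status probe starts raising FileNotFoundException. */
  static void awaitDeleted(long timeoutMillis, long intervalMillis, Callable<?> getStatus)
      throws Exception {
    EventuallySketch.<Void>eventually(timeoutMillis, intervalMillis, () -> {
      try {
        getStatus.call();
      } catch (FileNotFoundException expected) {
        return null;
      }
      // A successful status call means the deletion is not yet visible.
      throw new AssertionError("file is still visible");
    });
  }

  public static void main(String[] args) throws Exception {
    // Simulated store: the new file only becomes visible on the third probe.
    AtomicInteger calls = new AtomicInteger();
    String status = eventually(20_000, 100, () -> {
      if (calls.incrementAndGet() < 3) {
        throw new FileNotFoundException("not yet visible");
      }
      return "visible after " + calls.get() + " probes";
    });
    System.out.println(status);
  }
}
```

Compared with a fixed sleep, this returns as soon as the store converges, so a generous timeout such as the 20s mentioned above only costs time when something is actually wrong.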
@steveloughran added the "fs/s3" label (changes related to hadoop-aws; submitter must declare test endpoint) on Mar 26, 2019

@steveloughran
Contributor Author

Updated the test with eventually() called around operations where eventual consistency is possible, including some of the assertions. The only thing that isn't checked is the S3Guard list operations.

Tested: S3 ireland with DDB, standalone and in parallel runs

@hadoop-yetus

🎊 +1 overall

| Vote | Subsystem | Runtime | Comment |
| --- | --- | --- | --- |
| 0 | reexec | 26 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 2 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| 0 | mvndep | 23 | Maven dependency ordering for branch |
| +1 | mvninstall | 1181 | trunk passed |
| +1 | compile | 954 | trunk passed |
| +1 | checkstyle | 204 | trunk passed |
| +1 | mvnsite | 118 | trunk passed |
| +1 | shadedclient | 1079 | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 156 | trunk passed |
| +1 | javadoc | 91 | trunk passed |
| | | | _ Patch Compile Tests _ |
| 0 | mvndep | 20 | Maven dependency ordering for patch |
| +1 | mvninstall | 77 | the patch passed |
| +1 | compile | 893 | the patch passed |
| +1 | javac | 893 | the patch passed |
| -0 | checkstyle | 212 | root: The patch generated 3 new + 6 unchanged - 0 fixed = 9 total (was 6) |
| +1 | mvnsite | 117 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedclient | 722 | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 170 | the patch passed |
| +1 | javadoc | 88 | the patch passed |
| | | | _ Other Tests _ |
| +1 | unit | 525 | hadoop-common in the patch passed. |
| +1 | unit | 279 | hadoop-aws in the patch passed. |
| +1 | asflicense | 42 | The patch does not generate ASF License warnings. |
| | | 6912 | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-630/3/artifact/out/Dockerfile |
| GITHUB PR | #630 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux eab766e33607 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed Oct 31 10:55:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 82d4772 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-630/3/artifact/out/diff-checkstyle-root.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-630/3/testReport/ |
| Max. process+thread count | 1344 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-630/3/console |
| Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |

This message was automatically generated.

@steveloughran
Contributor Author

Checkstyle

./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractGetFileStatusTest.java:400:      + "-" + UUID.randomUUID());: '+' has incorrect indentation level 6, expected level should be 8. [Indentation]
./hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3GuardOutOfBandOperations.java:401:      assertArraySize("Added one file to the new dir and modified the same file, ": Line is longer than 80 characters (found 82). [LineLength]
./hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3GuardOutOfBandOperations.java:509:  /**: First sentence should end with a period. [JavadocStyle]

@bgaborg

bgaborg commented Mar 27, 2019

Tested against ireland. Found one flaky/intermittent race condition issue fixed in https://issues.apache.org/jira/browse/HADOOP-16186. Otherwise it looks good.

…hich will be committed, once my local test run completes

Change-Id: I448d74af1914b3037a661dd8d2048846ebe22891
@steveloughran
Contributor Author

OK, if it works for you, I'm +1 too. I'm fixing up the checkstyle issues and rerunning the tests before I do the final commit.

@steveloughran deleted the s3/HADOOP-15999-oob branch on March 28, 2019 16:01