-
Notifications
You must be signed in to change notification settings - Fork 588
HDDS-4057. Failed acceptance test missing from bundle #1283
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
elek
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks to work on it @adoroszlai
As usual, I am fine with the patch, but I have related thoughts to discuss.
set +eandset -e
I think it was a mistake to not using set -e everywhere from the beginning. I mean with --nostatusrc al the other lines supposed to be error free. (or we need to add | tree).
Not clear why do we need the set +e here if we introduce --nostatusrc
failing1-2
I have mixed feeling about including the failing1 directory to here. The original vision with compose was to provide easy to understood examples for the users for different use-cases. From this point of view failing1 is a confusing name as it's test related.
But we already started to add more and more test related stuff (for example upgrade), even if upgrade can be interesting for the users as well.
What I am thinking about is some kind of separations. What about keeping the compose folder and introduce a new smoketest. compose setup should be tested as before, but complex test cases (like a restart test, or log snapshotting) can be moved to the smoketest folder.
Not sure what is the best approach, interested about your opinion.
I agree it would be nice to implement both: consistent
|
Yes, that's a good question. The scripts of If we would like to keep the same contract for the individual test.sh files we need this Another option what I started to follow is to make Don't know which one is better. Anyway, let me merge this PR and we can continue to think about theses questions. |
|
Thanks @elek for reviewing and committing it. |
* master: (28 commits) HDDS-4037. Incorrect container numberOfKeys and usedBytes in SCM after key deletion (apache#1295) HDDS-3232. Include the byteman scripts in the distribution tar file (apache#1309) HDDS-4095. Byteman script to debug HCFS performance (apache#1311) HDDS-4057. Failed acceptance test missing from bundle (apache#1283) HDDS-4040. [OFS] BasicRootedOzoneFileSystem to support batchDelete (apache#1286) HDDS-4061. Pending delete blocks are not always included in #BLOCKCOUNT metadata (apache#1288) HDDS-4067. Implement toString for OMTransactionInfo (apache#1300) HDDS-3878. Make OMHA serviceID optional if one (but only one) is defined in the config (apache#1149) HDDS-3833. Use Pipeline choose policy to choose pipeline from exist pipeline list (apache#1096) HDDS-3979. Make bufferSize configurable for stream copy (apache#1212) HDDS-4048. Show more information while SCM version info mismatch (apache#1278) HDDS-4078. Use HDDS InterfaceAudience/Stability annotations (apache#1302) HDDS-4034. Add Unit Test for HadoopNestedDirGenerator. (apache#1266) HDDS-4076. Translate CSI.md into Chinese (apache#1299) HDDS-4046. Extensible subcommands for CLI applications (apache#1276) HDDS-4051. Remove whitelist/blacklist terminology from Ozone (apache#1306) HDDS-4055. Cleanup GitHub workflow (apache#1282) HDDS-4042. Update documentation for the GA release (apache#1269) HDDS-4066. Add core-site.xml to intellij configuration (apache#1292) HDDS-4073. Remove leftover robot.robot (apache#1297) ...
What changes were proposed in this pull request?
The problem is caused by this line in
test-all.sh:https://github.com/apache/hadoop-ozone/blob/e219aae7c7bcd8eacd592ca2ccfcf58626477f37/hadoop-ozone/dist/src/main/compose/test-all.sh#L28
because
testlib.shturns on "exit on error":https://github.com/apache/hadoop-ozone/blob/e219aae7c7bcd8eacd592ca2ccfcf58626477f37/hadoop-ozone/dist/src/main/compose/testlib.sh#L17
So
test-all.shexits after the first failedtest.shexecution,... is FAILEDis not present in output.https://github.com/apache/hadoop-ozone/blob/e219aae7c7bcd8eacd592ca2ccfcf58626477f37/hadoop-ozone/dist/src/main/compose/test-all.sh#L46-L50
The fix is simple: temporarily turn off "exit on error" before running
test.sh, then turn it back on. We also need to makerebotreturn zero code even if some tests failed.https://issues.apache.org/jira/browse/HDDS-4057
How was this patch tested?
Added two always failing tests and two environments. These can be used to test failure handling in acceptance test runner. Without the fix, only the first test in the first environment is run, and its logs are not copied to
target/acceptance:With the fix, both tests in both environments are run:
Regular CI:
https://github.com/adoroszlai/hadoop-ozone/runs/937941098