@timmylicheng
Contributor

What changes were proposed in this pull request?

#HDDS-2115 Add acceptance test for createPipeline CLI and datanode list CLI

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-2115

How was this patch tested?

Run test-single.sh locally and all passed.

@timmylicheng
Contributor Author

timmylicheng commented Dec 19, 2019

../test-single.sh scm scmcli/datanode.robot
==============================================================================
ozone-datanode :: Smoketest ozone cluster startup
==============================================================================
Run list pipeline                                                     | PASS |
------------------------------------------------------------------------------
ozone-datanode :: Smoketest ozone cluster startup                     | PASS |
1 critical test, 1 passed, 0 failed
1 test total, 1 passed, 0 failed
==============================================================================
Output:  /tmp/smoketest/ozone/result/robot-ozone-ozone-datanode-scm.xml

@timmylicheng
Contributor Author

timmylicheng commented Dec 19, 2019

../test-single.sh scm scmcli/pipeline.robot
==============================================================================
ozone-pipeline :: Smoketest ozone cluster startup
==============================================================================
Run list pipeline                                                     | PASS |
------------------------------------------------------------------------------
Run create pipeline                                                   | PASS |
------------------------------------------------------------------------------
ozone-pipeline :: Smoketest ozone cluster startup                     | PASS |
2 critical tests, 2 passed, 0 failed
2 tests total, 2 passed, 0 failed
==============================================================================
Output:  /tmp/smoketest/ozone/result/robot-ozone-ozone-pipeline-scm.xml
Robot framework is not installed, the reports can be generated (sudo pip install robotframework).

Contributor

@adoroszlai adoroszlai left a comment

Thanks @timmylicheng for working on this. LGTM, except two typos. Please also consider two suggestions.

@timmylicheng
Contributor Author

ERROR: Test execution of /home/runner/work/hadoop-ozone/hadoop-ozone/hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/compose/ozonesecure is FAILED!!!!

The failed acceptance test is unrelated to this change.

Contributor

@adoroszlai adoroszlai left a comment

Thanks @timmylicheng for updating the patch.

@hanishakoneru
Contributor

Thanks @timmylicheng for working on this.
The patch LGTM overall. The acceptance test failure you are seeing is caused by another issue, which was fixed by HDDS-2774. I have rerun the checks.

@hanishakoneru
Contributor

@timmylicheng can you please rebase with latest master. We wouldn't get a clean acceptance test run without the fix in HDDS-2774.

@timmylicheng
Contributor Author

> @timmylicheng can you please rebase with latest master. We wouldn't get a clean acceptance test run without the fix in HDDS-2774.

Yes, sure. We are going to merge the tip of master into HDDS-1564.

@timmylicheng
Contributor Author

After merge with master, acceptance fails at s3 gateway test. @hanishakoneru

@elek
Member

elek commented Jan 6, 2020

> After merge with master, acceptance fails at s3 gateway test.

Can't see the exact error message. The acceptance test log files are missing from the archive.

Let me try to retrigger it....

@hanishakoneru
Contributor

I am not able to expand the acceptance test logs. I am going to re-trigger the checks to see whether this is a one-time problem or a persistent one.

@timmylicheng
Contributor Author

Removing network ozonesecure_default
cp: cannot stat '/home/runner/work/hadoop-ozone/hadoop-ozone/hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/compose/ozonesecure/result/robot-.xml': No such file or directory
[ ERROR ] Reading XML source '/home/runner/work/hadoop-ozone/hadoop-ozone/hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/compose/result/robot-
.xml' failed: No such file or directory

Try --help for usage information.
cp: cannot stat '/home/runner/work/hadoop-ozone/hadoop-ozone/hadoop-ozone/dev-support/checks/../../../target/acceptance/log.html': No such file or directory
##[error]Process completed with exit code 1.

@hanishakoneru @elek

@adoroszlai
Contributor

@timmylicheng all acceptance tests failed because SCM could not exit safe mode:

2020-01-07T03:46:31.5337614Z SCM is in safe mode.
2020-01-07T03:46:41.4364121Z SCM is in safe mode.
2020-01-07T03:46:50.4562615Z SCM is in safe mode.
2020-01-07T03:47:00.1458169Z SCM is in safe mode.
2020-01-07T03:47:08.0186479Z SCM is in safe mode.
2020-01-07T03:47:17.1032118Z SCM is in safe mode.
2020-01-07T03:47:26.6599830Z SCM is in safe mode.
2020-01-07T03:47:28.6637806Z WARNING! Safemode is still on. Please check the docker-compose files

Docker container logs are full of messages like group-9415212EE472 not found. My guess is that this is the same problem as fixed in HDDS-2679 and RATIS-783. Can you please try merging from master again?
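
The length of the polling window in the log excerpt above can be read off its timestamps. The following is a minimal Python sketch, not part of the Ozone test scripts: the helper names `parse_ts` and `safe_mode_window` are hypothetical, and it assumes every relevant line starts with a UTC timestamp carrying a fractional-second part, as in the excerpt.

```python
import re
from datetime import datetime, timedelta

# Lines copied from the excerpt above: first poll, last poll, final warning.
LOG = """\
2020-01-07T03:46:31.5337614Z SCM is in safe mode.
2020-01-07T03:47:26.6599830Z SCM is in safe mode.
2020-01-07T03:47:28.6637806Z WARNING! Safemode is still on. Please check the docker-compose files
"""

TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.(\d+)Z\s+(.*)$")


def parse_ts(line):
    """Return (timestamp, message) for one log line, or None if it has no timestamp."""
    m = TS_RE.match(line)
    if m is None:
        return None
    base = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
    # Truncate the fraction to microseconds, the finest unit datetime supports.
    micros = int(m.group(2)[:6].ljust(6, "0"))
    return base + timedelta(microseconds=micros), m.group(3)


def safe_mode_window(log_text):
    """Seconds between the first safe-mode poll and the final warning."""
    entries = [e for e in map(parse_ts, log_text.splitlines()) if e is not None]
    first = next(ts for ts, msg in entries if "safe mode" in msg)
    last = next(ts for ts, msg in entries if msg.startswith("WARNING"))
    return (last - first).total_seconds()


print(f"{safe_mode_window(LOG):.1f} s")
```

For the excerpt above this comes to roughly 57 seconds between the first poll and the warning.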

@hanishakoneru
Contributor

Thanks @adoroszlai. I am running CI with @timmylicheng's changes on top of current master to see if rebasing to master would resolve the issue. I'll post the results when it completes.

@timmylicheng
Contributor Author

timmylicheng commented Jan 8, 2020

I merged with master and acceptance failed again with a different message. @hanishakoneru @adoroszlai

hadoop27-mapreduce :: Execute MR jobs

Execute PI calculation | FAIL |
Test timeout 4 minutes exceeded.
Execute WordCount | FAIL |
255 != 0
hadoop27-mapreduce :: Execute MR jobs | FAIL |
2 critical tests, 0 passed, 2 failed
2 tests total, 0 passed, 2 failed

Output: /tmp/smoketest/hadoop27/result/robot-hadoop27-hadoop27-mapreduce-rm.xml
/home/runner/work/hadoop-ozone/hadoop-ozone/hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/compose/ozone-mr/hadoop27/../../testlib.sh: line 125: 97441 Killed docker-compose -f "$COMPOSE_FILE" --no-ansi logs > "$RESULT_DIR/docker-$OUTPUT_NAME.log"
ERROR: Test execution of /home/runner/work/hadoop-ozone/hadoop-ozone/hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/compose/ozone-mr/hadoop27 is FAILED!!!!
##[error]Process completed with exit code 137.
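
The exit code 137 in the last line is consistent with the `Killed` message reported by `testlib.sh`: by shell convention, a status above 128 means the process was terminated by a signal whose number is the status minus 128, and 137 - 128 = 9 is SIGKILL. A minimal Python sketch of that decoding follows; `describe_exit_code` is a hypothetical helper name, not something from the Ozone scripts, and the log itself does not say what sent the signal.

```python
import signal


def describe_exit_code(code):
    """Decode a shell-style exit status into a short description."""
    if code > 128:
        signum = code - 128
        try:
            return f"killed by {signal.Signals(signum).name}"
        except ValueError:
            # The signal number has no name on this platform.
            return f"killed by signal {signum}"
    return f"exited with status {code}"


print(describe_exit_code(137))
print(describe_exit_code(1))
```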

@elek
Member

elek commented Jan 8, 2020

Acceptance tests on the HDDS-1564 branch seem to be unstable

https://github.com/apache/hadoop-ozone/actions?query=branch%3AHDDS-1564

compared to the master:

https://github.com/apache/hadoop-ozone/actions?query=branch%3Amaster

I think it's a generic problem with the branch, as I have seen the same failures on different PRs (e.g. #406).

I would suggest merging this PR as is. It wouldn't break the master (feature branch), and the failure is clearly not caused by this PR.

And after committing the issues in the queue, we need to find the problem and fix the tests on this branch...

@hanishakoneru
Contributor

Thanks @elek. I also tested the patch extensively. Most of the time the acceptance test fails without any logs. I tested the robot tests locally and they work fine. I also tested with just this change on top of master, and the results are as expected.
I will merge this PR shortly.

@hanishakoneru
Contributor

Thank you @timmylicheng for the contributions, @adoroszlai and @elek for the reviews.
+1 to merge into branch HDDS-1564.

@hanishakoneru hanishakoneru merged commit 129d464 into apache:HDDS-1564 Jan 8, 2020
@timmylicheng timmylicheng deleted the HDDS-2115 branch January 9, 2020 02:50
timmylicheng added a commit to timmylicheng/hadoop-ozone that referenced this pull request Feb 10, 2020
…t CLI (apache#375)

* HDDS-2115 Add acceptance test for createPipeline CLI and datanode list CLI.
timmylicheng added a commit to timmylicheng/hadoop-ozone that referenced this pull request Feb 12, 2020
…t CLI (apache#375)

* HDDS-2115 Add acceptance test for createPipeline CLI and datanode list CLI.
anuengineer pushed a commit that referenced this pull request Feb 19, 2020
* HDDS-1577. Add default pipeline placement policy implementation. (#1366)



(cherry picked from commit b640a5f6d53830aee4b9c2a7d17bf57c987962cd)

* HDDS-1571. Create an interface for pipeline placement policy to support network topologies. (#1395)

(cherry picked from commit 753fc6703a39154ed6013e44dbae572391748906)

* HDDS-2089: Add createPipeline CLI. (#1418)

(cherry picked from commit 326b5acd4a63fe46821919322867f5daff30750c)

* HDDS-1569 Support creating multiple pipelines with same datanode. Contributed by Li Cheng. 

This closes #28

* HDDS-1572 Implement a Pipeline scrubber to clean up non-OPEN pipeline. (#237)

* Rebase Fix

* HDDS-2650 Fix createPipeline CLI. (#340)

* HDDS-2035 Implement datanode level CLI to reveal pipeline relation. (#348)

* Revert "HDDS-2650 Fix createPipeline CLI. (#340)"

This reverts commit 7c71710.

* HDDS-2650 Fix createPipeline CLI and make it message based. (#370)

* HDDS-1574 Average out pipeline allocation on datanodes and add metrcs/test (#291)

* Resolve rebase conflict.

* HDDS-2756. Handle pipeline creation failure in different way when it exceeds pipeline limit

Closes #401

* HDDS-2115 Add acceptance test for createPipeline CLI and datanode list CLI (#375)

* HDDS-2115 Add acceptance test for createPipeline CLI and datanode list CLI.

* HDDS-2772 Better management for pipeline creation limitation. (#410)

*  HDDS-2913 Update config names and CLI for multi-raft feature. (#462)

* HDDS-2924. Fix Pipeline#nodeIdsHash collision issue. (#478)

* HDDS-2923 Add fall-back protection for rack awareness in pipeline creation. (#516)

* HDDS-3007 Fix CI test failure for TestSCMNodeManager. (#550)

Co-authored-by: Sammi Chen <[email protected]>
Co-authored-by: Xiaoyu Yao <[email protected]>