[MXNET-644] Automated flaky test detection #11991
Conversation
Why do you have some updates in mshadow and tvm? Did you do
Force-pushed from 9b3edb9 to 8ec8f08
Some reviews for now
tools/flaky_tests/check_branch.py
Outdated
return [(filename, test)
        for filename in deps.keys()
        for test in deps[filename]
        if test.startswith("test_")]
Make this "test_" prefix a named constant and document it clearly.
Done
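For reference, a minimal sketch of the requested change, assuming deps maps filenames to the tests that depend on them; the wrapper function name is illustrative, not necessarily the one committed in the PR:

    # Prefix that identifies test functions, following the nose/pytest
    # naming convention used by the MXNet test suite.
    TEST_PREFIX = "test_"

    def dependent_tests(deps):
        """Return (filename, test) pairs for every test that depends on a changed file."""
        return [(filename, test)
                for filename, tests in deps.items()
                for test in tests
                if test.startswith(TEST_PREFIX)]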
tools/flaky_tests/check_branch.py
Outdated
for t in tests:
    total_time += time_test(t)

n = int(TIME_BUDGET / total_time)
Need to handle divide-by-zero.
Done
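One way the guard might look, sketched as a self-contained helper; the function wrapper and the fallback of a single run are assumptions for illustration, with time_test expected to return a test's runtime in seconds:

    def runs_within_budget(tests, time_test, time_budget):
        """Return how many times each selected test can run within the time budget."""
        total_time = sum(time_test(t) for t in tests)
        if total_time > 0:
            return int(time_budget / total_time)
        # Empty selection or timings below the timer's resolution:
        # fall back to a single run rather than dividing by zero.
        return 1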
tools/flaky_tests/check_branch.py
Outdated
def output_results(flaky, nonflaky):
    print("Following tests failed flakiness checker:")
Why not use logger?
Usually my approach is to use logging for information about program execution and print for the actual output. However, I'd be fine with switching over to logging if that's more in line with what MXNet does.
Using logging for everything is preferred in general
done
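A minimal sketch of what the switch from print to logging could look like; the PASS/FAIL message format is an assumption, since only the first message string appears in the diff above:

    import logging

    logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
    logger = logging.getLogger(__name__)

    def output_results(flaky, nonflaky):
        logger.info("Following tests failed flakiness checker:")
        for test in flaky:
            logger.info("FAIL: %s", test)
        logger.info("Following tests passed flakiness checker:")
        for test in nonflaky:
            logger.info("PASS: %s", test)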
@marcoabreu, could you take a look at least at the Jenkinsfile to see if I'm missing anything?
tools/flaky_tests/Jenkinsfile
Outdated
node('mxnetlinux-gpu') {
  ws('workspace/flakiness_check') {
    init_git()
    docker_run('ubuntu_gpu', 'run_flakiness_checker', true)
Does it not need a compiled version of MXNet?
Right, for now I'll build MXNet each time we run this, but really we should only do so when the dependency analyzer detects changed tests. This would take some refactoring, however, and I think it would be good to get a version of this tool running ASAP.
Sorry, I'm currently swamped with high-priority tasks and I don't have time to review your pull request.
reorganized code
changed diff collator output
added logging and improved command-line options
Removed extra space
added some comments
fixed documentation
create folder
wip on check_branch
changed logging and added support for cross-file dependencies
finished basic check_branch
check_branch is demo-ready
renamed test_selector to dependency_analyzer
fixed check_branch output
refactoring
improved logging
wip ci deployment
changed logging levels
code improvement
changed Jenkinsfile to use redesigned system
included config file for dependency analyzer
minor fixes
tools/flaky_tests/Jenkinsfile
Outdated
// only continue if some tests were selected
if (!tests) {
    currentBuild.result = 'SUCCESS'
    return
Please don't return here. The wrapping logic (utils.main_wrapper) takes care of propagating the results properly. Just do nothing in that case.
tools/flaky_tests/Jenkinsfile
Outdated
utils.init_git()
utils.docker_run('ubuntu_cpu', 'select_tests', false)
tests = fileExists('tests.tmp')
stash name: 'flaky_tests', includes: 'tests.tmp'
Excellent!
tools/flaky_tests/Jenkinsfile
Outdated
node(NODE_LINUX_CPU) {
  ws('workspace/fc-preprocessing') {
    utils.init_git()
    utils.docker_run('ubuntu_cpu', 'select_tests', false)
select_tests does not exist
Nice catch, fixed.
Force-pushed from acb5642 to fa27648
Thanks for the reviews @marcoabreu. Anything else needed in order to merge?
There is an error on the run job: http://jenkins.mxnet-ci-dev.amazon-ml.com/job/flaky-test-detector/view/change-requests/job/PR-11991/4/console
@cetsai would you be able to look into it?
@lebeg it was fixed in the last commit. Edit: sorry, I forgot I changed the directory name.
Have you tried to test your changes locally? The command would be:
Force-pushed from 72b63e3 to a442e7a
Force-pushed from a442e7a to 300492b
}

flaky_check_run_flakiness_checker() {
    set -ex
    export PYTHONPATH=./python/
Should this line be export PYTHONPATH=/work/mxnet/python/
like here: https://github.com/apache/incubator-mxnet/pull/11991/files#diff-1335fbaf3930b1438d9be18edb07a1a6R921?
Not sure; I was looking at the python2 unit test runtime function, which uses export PYTHONPATH=./python/: https://github.com/apache/incubator-mxnet/blob/64566872a28a9426f3ec20bcf0210ebb608854f8/ci/docker/runtime_functions.sh#L639
But the pipeline is failing because it's not able to import mxnet here: http://jenkins.mxnet-ci-dev.amazon-ml.com/blue/organizations/jenkins/flaky-test-detector/detail/PR-11991/11/pipeline/67. Maybe we should switch to /work/mxnet/python/?
Perhaps; however, that run was triggered before the last commit, so I don't know which of these options we should go with. Perhaps @marcoabreu can help?
@marcoabreu could you give some help here to get this to work?
@haojin2 @cetsai @lebeg @marcoabreu requesting an update on this
How can I help?
@haojin2 I'm having trouble running this locally, can you try?
Sorry, I don't have any cycles for this at the moment...
Ping @cetsai, any updates?
@nswamy @sandeep-krishnamurthy can you please close this PR? @cetsai feel free to reopen the PR once the changes are ready.
@nswamy @sandeep-krishnamurthy can you please close this PR?
Description
This PR adds the necessary components for an automated flaky test detection system, the design of which is detailed on the wiki.
These components (the diff collator, dependency analyzer, and flakiness checker) are used by the check_flakiness script, which will run in a Jenkins pipeline to automatically check PRs for flaky tests. Once active, the tool will flag PRs that introduce flaky tests so they can be fixed before being merged into master. A sketch of how the components chain together is shown below.
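To make the described flow concrete, here is a hedged sketch of the orchestration; the three callables stand in for the real modules, and every name and signature below is an assumption for illustration, not the script's actual API:

    def check_flakiness(collect_changed_files, find_dependent_tests,
                        is_flaky, base_ref, head_ref):
        """Orchestrate diff collator -> dependency analyzer -> flakiness checker."""
        # 1. Diff collator: which files did the PR change relative to master?
        changed = collect_changed_files(base_ref, head_ref)
        # 2. Dependency analyzer: which tests depend on those files?
        tests = find_dependent_tests(changed)
        # 3. Flakiness checker: rerun each selected test repeatedly and
        #    report the ones that fail intermittently.
        return [t for t in tests if is_flaky(t)]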
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes