Remove regression checks for website links (apache#12507)
* Remove regression checks for website links

* Add redirection ignore regex
sandeep-krishnamurthy authored and anirudh2290 committed Sep 19, 2018
1 parent c0658a2 commit b04f802
Showing 5 changed files with 5 additions and 59 deletions.
4 changes: 0 additions & 4 deletions tests/nightly/broken_link_checker_test/JenkinsfileForBLC
@@ -34,11 +34,7 @@ core_logic: {
timeout(time: 60, unit: 'MINUTES') {
try {
utils.init_git()
sh 'aws s3 cp s3://mxnet-ci-prod-slave-data/url_list.txt ./tests/nightly/broken_link_checker_test/url_list.txt'
utils.docker_run('ubuntu_blc', 'broken_link_checker', false)
} finally {
sh "echo Storing the new url_list.txt to S3 bucket"
sh 'aws s3 cp ./tests/nightly/broken_link_checker_test/url_list.txt s3://mxnet-ci-prod-slave-data/url_list.txt'
}
}
}
5 changes: 1 addition & 4 deletions tests/nightly/broken_link_checker_test/README.md
@@ -1,13 +1,10 @@
# Broken link checker test

This folder contains the scripts that are required to run the nightly job of checking the broken links. The job also checks whether the link that were published before are still accessible.
This folder contains the scripts that are required to run the nightly job of checking the broken links.

## JenkinsfileForBLC
This is configuration file for jenkins job.

## Details
The `broken_link_checker.sh` is a top level script that invokes the `test_broken_links.py` and `check_regression.sh` scripts.
The `test_broken_links.py` invokes broken link checker tool (blc) from nodeJs and reports the list of URLs that are not accessible.
The `check_regression.sh` scripts downloads the file `url_list.txt` that contains links that are publicly accessible from s3 bucket
The scripts merges this list with the output of `test_broken_links.py` and checks whether all those links are accessible using 'curl' command.
The updated `url_list.txt` is uploaded to s3 bucket.
3 changes: 0 additions & 3 deletions tests/nightly/broken_link_checker_test/broken_link_checker.sh
@@ -28,6 +28,3 @@ echo `pwd`

echo "Running test_broken_links.py"
python test_broken_links.py

echo "Running check_regression.sh"
./check_regression.sh
46 changes: 0 additions & 46 deletions tests/nightly/broken_link_checker_test/check_regression.sh

This file was deleted.

6 changes: 4 additions & 2 deletions tests/nightly/broken_link_checker_test/test_broken_links.py
@@ -31,6 +31,8 @@ def prepare_link_test_result(command_output):
# Whitelisted broken links patterns
HTTP_403_REGEX = "(HTTP_403)"
HTTP_401_REGEX = "(HTTP_401)"
HTTP_409_REGEX = "(HTTP_409)"
HTTP_3XX_REGEX = "(HTTP_3"
BLC_UNKNOWN_REGEX = "(BLC_UNKNOWN)"
HTTP_UNDEFINED = "HTTP_undefined"
FALSE_SCALA_API_DOC_LINK = "java$lang.html"
@@ -53,8 +55,8 @@ def prepare_link_test_result(command_output):
current_page_broken_links = ""

if line.find(BROKEN_PAGE_START_REGEX) != -1:
# Skip (401, 403, unknown issues)
if HTTP_403_REGEX not in line and HTTP_401_REGEX not in line and BLC_UNKNOWN_REGEX not in line and HTTP_UNDEFINED not in line and FALSE_SCALA_API_DOC_LINK not in line and FALSE_SCALA_API_DEPRECATED_LINK not in line and FALSE_PAPER_LINK not in line:
# Skip (401, 403, 409, unknown issues)
if HTTP_403_REGEX not in line and HTTP_401_REGEX not in line and HTTP_409_REGEX not in line and HTTP_3XX_REGEX not in line and BLC_UNKNOWN_REGEX not in line and HTTP_UNDEFINED not in line and FALSE_SCALA_API_DOC_LINK not in line and FALSE_SCALA_API_DEPRECATED_LINK not in line and FALSE_PAPER_LINK not in line:
current_page_broken = True
current_page_broken_links += line.split(BROKEN_PAGE_START_REGEX)[1] + "\n"
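
Note on the whitelist check: despite the `*_REGEX` names, the patterns in this hunk are matched with plain `in` substring tests, so the new `"(HTTP_3"` entry works as a prefix that catches any 3xx redirect status. The chained `not in` condition could also be written as a tuple of patterns, roughly as in the sketch below. This is only an illustration, not code from the commit: `WHITELIST_PATTERNS`, `is_whitelisted`, and `filter_broken` are hypothetical names, the sample lines are made up rather than real blc output, and the patterns whose values are not shown in the diff (`FALSE_SCALA_API_DEPRECATED_LINK`, `FALSE_PAPER_LINK`) are left out.

```python
# Substrings taken from the diff above. Despite the *_REGEX names in the
# original file, they are matched with plain `in`, not with the `re` module.
WHITELIST_PATTERNS = (
    "(HTTP_403)",
    "(HTTP_401)",
    "(HTTP_409)",
    "(HTTP_3",        # prefix match: covers any 3xx redirect status
    "(BLC_UNKNOWN)",
    "HTTP_undefined",
)


def is_whitelisted(line):
    """Return True if the line contains any whitelisted pattern (hypothetical helper)."""
    return any(pattern in line for pattern in WHITELIST_PATTERNS)


def filter_broken(lines):
    """Keep only the lines that should still be reported as broken."""
    return [line for line in lines if not is_whitelisted(line)]


# Illustrative input only; real blc output may be formatted differently.
sample = [
    "http://example.com/a (HTTP_404)",
    "http://example.com/b (HTTP_301)",
    "http://example.com/c (HTTP_409)",
    "http://example.com/d (BLC_UNKNOWN)",
]
reported = filter_broken(sample)
print(reported)
```

With a tuple, adding another ignored status means adding one entry instead of extending the long `if` condition.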
