TorBot Stable Version1.2 #56

Merged
merged 85 commits into from
Feb 12, 2018

Conversation

Member

@PSNAppz PSNAppz commented Feb 1, 2018

#49

PSNAppz and others added 30 commits November 16, 2017 22:38
Fixed requirements.txt by adding missing "="
Fixed the error occurring when given a broken link with -u flag and other minor improvements.
PEP8 code and fixed some minor bugs
To socket code and put it inside of a function. Also took out the
Controller module since it wasn't actually being used.
Removed code that wasn't being called and now using
regular expressions to validate URLs
instead of just checking whether the string contains http or https
Uses two functions which use regular expressions to match
valid URLs. One function is specifically geared towards
onion addresses and the other is for general URL validation.
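The two-function validation described in this commit could look roughly like the sketch below. The function names and both patterns are hypothetical illustrations, not TorBot's actual code; the onion pattern assumes 16-character (v2) or 56-character (v3) base32 hostnames without subdomains.

```python
import re

# Hypothetical patterns for illustration only; TorBot's real expressions may differ.
# Onion hostnames are assumed to be 16 (v2) or 56 (v3) base32 characters.
ONION_URL = re.compile(
    r"^https?://(?:[a-z2-7]{16}|[a-z2-7]{56})\.onion(?:/\S*)?$",
    re.IGNORECASE,
)
# A deliberately loose general pattern: scheme, hostname, optional port and path.
GENERAL_URL = re.compile(
    r"^https?://[A-Za-z0-9](?:[A-Za-z0-9.-]*[A-Za-z0-9])?(?::\d+)?(?:/\S*)?$"
)


def is_onion_url(url):
    """Return True if url looks like an http(s) .onion address."""
    return ONION_URL.match(url) is not None


def is_valid_url(url):
    """Return True if url looks like a general http(s) URL."""
    return GENERAL_URL.match(url) is not None
```

Unlike a plain substring check for http or https, both functions reject strings that merely contain the scheme somewhere in the text.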
Using the -s flag will now save the results in a json file
within the current working directory.
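As a rough sketch of what saving results with the -s flag might involve: the function name, the default file name links.json, and the {"links": [...]} layout are all assumptions made for illustration, since the commit message does not state them.

```python
import json
import os


def save_json(links, filename="links.json", directory=None):
    """Write the crawled links to a JSON file and return the file's path.

    The file name and the {"links": [...]} layout are assumptions made for
    this sketch. When no directory is given, the current working directory
    is used, as the commit message describes.
    """
    directory = directory or os.getcwd()
    path = os.path.join(directory, filename)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"links": list(links)}, handle, indent=2)
    return path
```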
Takes a list of URLs and asynchronously sends a HEAD request to each
link in the list, testing the status code for a 200 response. If the
response is not 200 or takes longer than 8 seconds, the link is
declared dead. Also switched from urllib.request to requests, for
thread-safety as well as simplicity.
Links were previously shown as dead for any status code other than
200, so this now uses the raise_for_status function from requests, which
raises an error only if the status code is an HTTP error status code
such as 4xx or 5xx.
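The live-link check described in the last two commits might be sketched as follows. This assumes a thread pool provides the concurrency, which the commit messages do not confirm; the function names and the injectable head parameter are hypothetical, added so the logic can be exercised without network access. Only requests.head, its timeout argument, and raise_for_status come from the requests library itself.

```python
from concurrent.futures import ThreadPoolExecutor

TIMEOUT_SECONDS = 8  # the limit named in the commit message


def is_live(url, head, timeout=TIMEOUT_SECONDS):
    """Return False if the HEAD request fails, times out, or returns 4xx/5xx."""
    try:
        response = head(url, timeout=timeout)
        response.raise_for_status()  # raises only for 4xx and 5xx responses
    except Exception:  # broad on purpose in this sketch
        return False
    return True


def check_links(urls, head=None, max_workers=8):
    """Check every URL concurrently and return a {url: is_live} mapping."""
    if head is None:
        import requests  # third-party; imported lazily so a stub can be injected
        head = requests.head
    urls = list(urls)
    if not urls:
        return {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda url: is_live(url, head), urls)
        return dict(zip(urls, results))
```

With raise_for_status, a redirect or other non-200 success code no longer marks the link as dead; only client and server error codes, timeouts, and connection failures do.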
To use the -e flag, pass a URL name like www.url.com
and it will try to establish an https connection, then http if
the secure connection fails. If both fail, an error message is
printed. If the connection is successful, we then search for the
onion domain name and the others that were passed with the flag.
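The https-then-http fallback described for the -e flag could be sketched like this. The function name, the error message text, the 8-second timeout, and the injectable get parameter are assumptions for illustration; the real implementation may differ.

```python
def connect_with_fallback(host, get=None, timeout=8):
    """Try https://host first, then http://host; return the response or None.

    The name, message text, and timeout are assumptions made for this sketch.
    """
    if get is None:
        import requests  # third-party; imported lazily so a stub can be injected
        get = requests.get
    for scheme in ("https", "http"):
        url = f"{scheme}://{host}"
        try:
            response = get(url, timeout=timeout)
            response.raise_for_status()
            return response
        except Exception:  # broad on purpose in this sketch
            continue
    print(f"Unable to connect to {host} over https or http")
    return None
```

A caller would pass the bare name, for example connect_with_fallback("www.url.com"), and search the returned page for the requested domain extensions only when the result is not None.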
Member

@KingAkeem KingAkeem left a comment

- Should remove Python Stem Module from Dependencies (README.md, Line 88)
- Fix grammar error. Remove the word "should" before "default setting". (README.md, Line 105)
- Still using urllib.request, should be requests.

KingAkeem and others added 3 commits February 11, 2018 15:02
Fixed requirements.txt, I took out the stem module since it's not
used anymore. Added requests module also
Member Author

PSNAppz commented Feb 12, 2018

@KingAkeem Thanks for pointing that out. I will fix those ASAP.

@@ -1,17 +1,36 @@
import urllib.request
import urllib.request
Member

Needs to be switched to requests

def readPage(site):
    headers = {'User-Agent':
               'TorBot - Onion crawler | www.github.com/DedSecInside/TorBot'}
    req = urllib.request.Request(site, None, headers)
Member

Also needs to be switched to requests

while (attempts_left):
    try:
        response = urllib.request.urlopen(req)
        page = BeautifulSoup(response.read(), 'html.parser')
Member

Requests
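One possible shape for the requested change, replacing urllib.request with requests in the page reader, is sketched below. The snake_case name, the default of three attempts, the timeout, and the injectable get parameter are assumptions; only the User-Agent string comes from the diff above.

```python
def read_page(site, get=None, attempts=3, timeout=8):
    """Fetch a page and return its HTML text, or None if every attempt fails.

    The attempt count and timeout are assumed values for this sketch.
    """
    if get is None:
        import requests  # third-party; imported lazily so a stub can be injected
        get = requests.get
    headers = {'User-Agent':
               'TorBot - Onion crawler | www.github.com/DedSecInside/TorBot'}
    for _ in range(attempts):
        try:
            response = get(site, headers=headers, timeout=timeout)
            response.raise_for_status()
            return response.text
        except Exception:  # broad on purpose in this sketch
            continue
    return None
```

The returned text could then be parsed as in the diff, for example with BeautifulSoup(html, 'html.parser').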

requirements.txt Outdated
@@ -1,3 +1,4 @@
beautifulsoup4==4.6.0
PySocks==1.6.7
stem==1.5.4
Member

Stem is no longer being used, and a requests requirement should be added for version 2.18.4.

Member Author

PSNAppz commented Feb 12, 2018

This PR will be updated once #66 is merged. @KingAkeem

Member Author

PSNAppz commented Feb 12, 2018

@KingAkeem Please review this

- sudo apt-get -y install python3-pip
- pip3 install bs4
- pip3 install -r requirements.txt
- cd tests
Member

Since we are using pytest, once you are in the test directory you only need to run pytest to run all of the tests.

Member Author

Fixed

README.md Outdated
5. Save crawl info to JSON file.(Completed)
6. Crawl custom domains.(Completed)
7. Check if the link is live.(Not Started)
4. Built-in Updater.(Completed)
Member

Built-in Updater should be number 8 instead of 4.

Member

@KingAkeem KingAkeem left a comment

Everything looks fine.

Member

@KingAkeem KingAkeem left a comment

Everything looks good.

Member Author

PSNAppz commented Feb 12, 2018

@KingAkeem @agostinelli @agrepravin @leaen Thanks for the awesome work

@PSNAppz PSNAppz merged commit abe6e8a into master Feb 12, 2018

5 participants