Major Upgrade v1.3 #145

Merged
merged 248 commits into from
Oct 28, 2018
Changes from 16 commits
Commits
73ba9f4
TO-DO list update
PSNAppz Feb 14, 2018
5ad194f
Added Go version of getweblinks getLinks function
KingAkeem Feb 17, 2018
7696666
Forgot to add folders :)
KingAkeem Feb 17, 2018
ad5f686
Put go_modules inside of modules to not break code
KingAkeem Feb 17, 2018
00d4744
Added modules folder
KingAkeem Feb 18, 2018
5584d9e
title feature added
Feb 23, 2018
deced4f
title feature tested, plus little refactor
Feb 24, 2018
3238e4d
Merge pull request #68 from KingAkeem/goBot
PSNAppz Mar 13, 2018
84424c9
Merge pull request #69 from Agostinelli/dev
PSNAppz Mar 13, 2018
0e3c7db
Torbot is now OSNIT Tool .
PSNAppz Mar 14, 2018
7e390e3
TorBot is now OSINT tool
PSNAppz Mar 14, 2018
16782bb
Added new http requests mock for pagereader test
Mar 19, 2018
6ea473a
Adding requests_mock version 1.4.0 to requirements
Mar 19, 2018
e9b79de
Adding BeautifulSoup import to test
Mar 19, 2018
96301e4
Fixing typo
KingAkeem Mar 20, 2018
e76a578
Added Database Register
tiagoCMatias Mar 22, 2018
42c66ef
Removed debug messages
tiagoCMatias Mar 22, 2018
7b50af0
description page function added
Mar 23, 2018
0fb17c8
description page function added
Mar 23, 2018
09df97d
Handle exception
shivankar-madaan Mar 27, 2018
2189948
changed the check for exception
shivankar-madaan Mar 27, 2018
fed695a
reverted
shivankar-madaan Mar 27, 2018
8f4b9f3
Changed exit code
shivankar-madaan Mar 27, 2018
cb01e68
fixed indentation
shivankar-madaan Mar 27, 2018
dfde4f5
Save to Database Refactoring
tiagoCMatias Mar 27, 2018
246ab96
changed verification of credentials to what was requested in the pull…
tiagoCMatias Mar 27, 2018
2fb66a1
Merge pull request #74 from KingAkeem/dev
PSNAppz Mar 28, 2018
30e9b62
Made Schema optional for more robust choices url
KingAkeem Mar 29, 2018
5a4252f
Merge pull request #80 from KingAkeem/dev
PSNAppz Mar 29, 2018
87cd399
Merge branch 'dev' into patch-1
PSNAppz Mar 29, 2018
b178180
Merge pull request #76 from Agostinelli/dev
PSNAppz Mar 29, 2018
b723ba3
Requested changes applied #75
tiagoCMatias Apr 1, 2018
cf07f2f
Adding Contributor
PSNAppz Apr 2, 2018
bc86db3
Merge pull request #78 from shivankar-madaan/patch-1
PSNAppz Apr 8, 2018
336559b
Contributor Updates
PSNAppz Apr 8, 2018
b96fda6
Update README.md
PSNAppz Apr 9, 2018
c2d56fd
Added feature to -i flag that displays all header information from HT…
KingAkeem Apr 23, 2018
c0345be
Merge pull request #82 from KingAkeem/dev
PSNAppz Apr 23, 2018
391421d
Update README.md
PSNAppz May 4, 2018
efed6d5
Adding installation script and instructions
KingAkeem May 4, 2018
c638cd0
Added some comments
KingAkeem May 4, 2018
ee65592
Merge branch 'dev' of https://github.com/DedSecInside/TorBot into dev
KingAkeem May 4, 2018
194737a
Merge pull request #83 from KingAkeem/dev
PSNAppz May 5, 2018
e55ff0c
Removed unneccessary dependencies
KingAkeem May 5, 2018
21cc45a
Adding pyinstaller (module being used to turn torBot into executable)…
May 6, 2018
c9add89
Beginning development on Go branch again
May 7, 2018
d112b5b
Adding documentation and modularizing functions.
May 7, 2018
5d9d7bd
Fixing modules, going to add color next
May 7, 2018
dd93363
Adding interrupt and fixing concurrency issues
KingAkeem May 8, 2018
d353d7a
Checking system calls for weird responses
KingAkeem May 8, 2018
e4ed4b4
Added interface to make checkURL function reusuable and refactoring
May 8, 2018
ca93975
Added some testing for go module
May 8, 2018
cca1124
Handling case of no urls being found
May 8, 2018
376f45f
Adding more coverage for tests
May 9, 2018
8ec4e62
Removed shared object and added file to .gitignore
KingAkeem May 9, 2018
fb8a944
Adding testing documenation
KingAkeem May 9, 2018
fd7a83a
Fixing the check of HTTPError
SubaruSama May 16, 2018
66d154e
Fixed some errors with README.md
KingAkeem May 16, 2018
e12123d
Merge pull request #90 from SubaruSama/patch-2
PSNAppz May 17, 2018
6cd12a6
Merge remote-tracking branch 'origin/gobot' into dev
KingAkeem May 17, 2018
816b24a
Merge branch 'dev' of github.com:DedSecInside/TorBot into dev
KingAkeem May 17, 2018
6fa3bb2
Updated Change log and requirements
KingAkeem May 17, 2018
da52bb3
Merge branch 'dev' of github.com:KingAkeem/TorBot into dev
KingAkeem May 17, 2018
caa2536
Added some shell scripts and made some minor fixes
KingAkeem May 17, 2018
b8ef7b6
Merge branch 'dev' into gobot
KingAkeem May 17, 2018
bb1192a
Merge branch 'gobot' into dev
KingAkeem May 17, 2018
4e00944
Updated Change log
KingAkeem May 17, 2018
255906a
added new line at the end of tests/test_savetofile.py
robly78746 May 31, 2018
a73a37d
removed argument live from get_links and removed parameter live from …
robly78746 May 31, 2018
9783684
Merge pull request #91 from robly78746/fix-final-newline-missing
PSNAppz Jun 1, 2018
bc42c8b
Revert "removed argument live from get_links and removed parameter li…
robly78746 Jun 1, 2018
bb1c2f4
used live argument to determine whether to print live statuses of web…
robly78746 Jun 1, 2018
01c351a
Merge pull request #92 from robly78746/fix-unused-argument
PSNAppz Jun 1, 2018
816df4a
Merge branch 'dev' of github.com:DedSecInside/TorBot into dev
KingAkeem Jun 16, 2018
89f07ea
Update README.md
PSNAppz Jun 22, 2018
774c91f
Update TO-DO list
PSNAppz Jul 1, 2018
a903992
Refactored tests and TorBot app so that tests no longer need to touch
KingAkeem Jul 1, 2018
86026b6
Pep8. Whitespace. Fixed raised string error
AlwaysSayingPleaseAndThankYou Jul 2, 2018
21c5494
Fixing errors with CodeFactor
KingAkeem Jul 4, 2018
c4ff40e
Adding yattag to requirements for testing
KingAkeem Jul 4, 2018
503353a
Merge pull request #96 from HotPushUpGuy420/dev
PSNAppz Jul 4, 2018
bc56cb9
Merge branch 'dev' into refactor_for_testing
KingAkeem Jul 4, 2018
4d2eb46
Merge pull request #95 from KingAkeem/refactor_for_testing
KingAkeem Jul 4, 2018
0b391a3
Merge branch 'dev' of github.com:DedSecInside/TorBot into dev
KingAkeem Jul 4, 2018
bf12820
Adding configurable ip and port arguments
KingAkeem Jul 5, 2018
3f825a0
Formatted .gitignore file and added __init__to modules dir
KingAkeem Jul 5, 2018
d7016dc
Merge branch 'dev' into formatting_gitignore
KingAkeem Jul 5, 2018
7fafcf0
Update README.md
PSNAppz Jul 7, 2018
c02aa5e
Update README.md
PSNAppz Jul 8, 2018
14b250b
Update README.md
PSNAppz Jul 8, 2018
a829cc1
Update README.md
PSNAppz Jul 8, 2018
72368bc
Adding Contributors
PSNAppz Jul 9, 2018
2638f8d
Testing Doc Partial Commit
PSNAppz Jul 9, 2018
9addc15
typo in Testing.MD fixed
PSNAppz Jul 9, 2018
87cb7d5
Merge pull request #99 from KingAkeem/configurable_ip/port
PSNAppz Jul 9, 2018
295953f
Update README.md
PSNAppz Jul 16, 2018
a89409b
Merge pull request #101 from DedSecInside/formatting_gitignore
PSNAppz Jul 19, 2018
b04da59
Remove pointless-string-statement
Jul 21, 2018
85116d2
Fix bad-indentation
Jul 21, 2018
ca6179e
Fix redefined-builtin 'ConnectionError' and 'exit'
Jul 21, 2018
d99c11c
Remove unused-argument 'conn'
Jul 21, 2018
a7212c6
Merge branch 'dev' into feature-db
tiagoCMatias Jul 22, 2018
877a91e
Merge pull request #103 from aldokkani/Feature_PEP8
PSNAppz Jul 26, 2018
5b228e0
Updating LOGO
PSNAppz Jul 29, 2018
f19a657
Update README.md
PSNAppz Jul 29, 2018
4465185
Update torBot.py
PSNAppz Jul 29, 2018
42a56dc
Update README.md
PSNAppz Jul 29, 2018
620eebd
Merge branch 'dev' into dev
KingAkeem Jul 30, 2018
d12f077
Merge pull request #85 from KingAkeem/dev
PSNAppz Jul 30, 2018
95247b3
Use Python library instead of Golang
KingAkeem Jul 30, 2018
a32af34
Correcting comment
KingAkeem Jul 30, 2018
282b3aa
New Templates when someone is trying to edit the project
KingAkeem Jul 30, 2018
9efabf9
Merge pull request #105 from DedSecInside/KingAkeem-patch-1
PSNAppz Jul 30, 2018
e661f85
Update README.md
PSNAppz Jul 30, 2018
f080aae
Rename pull-request.md to PULL_REQUEST_TEMPLATE.md
KingAkeem Jul 31, 2018
78533d3
Moving to correct location
KingAkeem Jul 31, 2018
f41a646
Merge pull request #106 from DedSecInside/production
KingAkeem Jul 31, 2018
68eb9a2
Update PULL_REQUEST_TEMPLATE.md
KingAkeem Jul 31, 2018
c4dd2e8
Update settings.py
PSNAppz Aug 3, 2018
bab571b
Adding get_args
KingAkeem Jul 5, 2018
075eea5
Merge pull request #108 from DedSecInside/changelog
PSNAppz Aug 4, 2018
a6e310a
Changes
Aug 4, 2018
7151656
Revert "Changes"
Aug 4, 2018
a9a99cd
Bug_fixes
Aug 4, 2018
8c7263b
Merge pull request #109 from DedSecInside/bug_fixes
PSNAppz Aug 4, 2018
c049a6a
Fixing Codefactor issue
PSNAppz Aug 5, 2018
e2e7690
Merge pull request #75 from tiagoCMatias/feature-db
PSNAppz Aug 5, 2018
509145f
Using multithreading and queues for speed increase
KingAkeem Aug 5, 2018
e068cd7
Removing daemon threads for safety
KingAkeem Aug 5, 2018
434854a
Made multi-threading queue more generalized so it can't be used in other
KingAkeem Aug 6, 2018
ac2986b
Adding comments
KingAkeem Aug 6, 2018
7f96121
Using title string instead of html
KingAkeem Aug 6, 2018
3735b3d
Merge branch 'prestarting_threads' into MyDev
KingAkeem Aug 8, 2018
148002c
Added BFS traveral function for links
KingAkeem Aug 8, 2018
10e3cb9
Fixing for CodeFactor
KingAkeem Aug 8, 2018
dd8e6a8
Adding more test coverage
KingAkeem Aug 8, 2018
86c5aee
Fixing issue with CodeFactor
KingAkeem Aug 8, 2018
49e193a
Merge pull request #112 from KingAkeem/more_test_coverage
KingAkeem Aug 8, 2018
d80d33a
Merge pull request #110 from KingAkeem/prestarting_threads
KingAkeem Aug 8, 2018
08d58ce
Adding support for optional parameters
KingAkeem Aug 8, 2018
b1586d4
Adding exception handling for get requests within traversal function
KingAkeem Aug 9, 2018
eb83451
Skip to next element if GET requests fail for traversal
KingAkeem Aug 9, 2018
a9920c0
Merge branch 'MainDev' into bfs_crawl
KingAkeem Aug 9, 2018
063a209
Adding specific exceptions to satisfy CodeFactor
KingAkeem Aug 9, 2018
5ca2d21
Merge pull request #111 from KingAkeem/bfs_crawl
PSNAppz Aug 10, 2018
86ae5f9
Updated Req.txt
PSNAppz Aug 11, 2018
9cef70d
CodeFactor Fix
PSNAppz Aug 11, 2018
f147ed5
removing __pycache__
PSNAppz Aug 11, 2018
9b1ab8c
MySQLdb
PSNAppz Aug 14, 2018
f65c50c
libmysqlclient-dev
PSNAppz Aug 14, 2018
d2c0aef
Adding FAQ.md
PSNAppz Aug 15, 2018
2d863a6
Merge pull request #114 from DedSecInside/bug_fixes
PSNAppz Aug 15, 2018
a5e75d0
Revert "Merge pull request #75 from tiagoCMatias/feature-db"
KingAkeem Aug 19, 2018
6643cc0
Delete savedb.py
KingAkeem Aug 19, 2018
f302bd8
Merge pull request #117 from DedSecInside/MainDev
KingAkeem Aug 19, 2018
37db2c5
Update TESTING.md
KingAkeem Aug 19, 2018
9313822
Update TESTING.md
KingAkeem Aug 19, 2018
a9b4cf8
Update README.md
KingAkeem Aug 19, 2018
17b9902
Update README.md
PSNAppz Aug 26, 2018
82d36bb
Refactoring
KingAkeem Aug 26, 2018
69b2374
Fixing PyLint
KingAkeem Sep 14, 2018
4615455
Fixing imports
KingAkeem Sep 14, 2018
28a3133
A lot more refactoring
KingAkeem Sep 14, 2018
3409666
Trying to remove cyclic import error
KingAkeem Sep 14, 2018
9e31c33
Updating requirements
KingAkeem Sep 14, 2018
f32fe12
Updating README
KingAkeem Sep 14, 2018
b2f9967
Adding pyinstaller to requirements and to install script
KingAkeem Sep 14, 2018
551192f
Updating README
KingAkeem Sep 14, 2018
0fe0d58
Merge pull request #119 from KingAkeem/refactoring
PSNAppz Sep 15, 2018
1ac142d
Merge pull request #8 from DedSecInside/dev
KingAkeem Sep 20, 2018
4818fc7
Fixing indentation
KingAkeem Sep 20, 2018
f30e9f4
Merge pull request #122 from KingAkeem/dev
KingAkeem Sep 20, 2018
21fbce5
Correctly generating tree and tree visuals
KingAkeem Oct 6, 2018
393a796
Code cleanup
KingAkeem Oct 6, 2018
3fedf98
Correcting variable names and function parameters
KingAkeem Oct 6, 2018
352aa48
Code cleanup
KingAkeem Oct 6, 2018
46562ba
Adding ability to set file name and correctly setting depth
KingAkeem Oct 6, 2018
9b27256
Adding visualizer module
KingAkeem Oct 7, 2018
742f218
Fixing test
KingAkeem Oct 7, 2018
0b5f42f
Minor refactoring
KingAkeem Oct 8, 2018
96d95cd
remove Inappropriate space
tharudaya Oct 10, 2018
fa8eb53
Merge pull request #128 from tharudaya/dev
PSNAppz Oct 10, 2018
b374363
Removed extra param from hasattr
rmad17 Oct 12, 2018
35d9a80
Merge pull request #132 from rmad17/dev
KingAkeem Oct 12, 2018
581ed44
Fix cprints by adding commas separating text and color parameters
42B Oct 12, 2018
20982c3
Merge pull request #133 from 42B/fix_cprints
KingAkeem Oct 12, 2018
ed6b89c
Removing example file
KingAkeem Oct 12, 2018
920327b
Refactoring colors and adding try/catch for displaying urls
KingAkeem Oct 12, 2018
638ee2a
More refacotring
KingAkeem Oct 12, 2018
a6042d6
Created LinkTree class
KingAkeem Oct 13, 2018
28a761d
Update CHANGELOG.md
BlackBox712 Oct 13, 2018
a26d5d7
Merge pull request #134 from Sankalp00/patch-1
PSNAppz Oct 13, 2018
aba3e77
Adding LinkTree class
KingAkeem Oct 13, 2018
73ec8f2
Fixing tests and more refactoring
KingAkeem Oct 13, 2018
5eec3c8
Merge branch 'dev' into add_visualizer_module
KingAkeem Oct 13, 2018
197a2a6
Fixing imports
KingAkeem Oct 13, 2018
6106962
Merge branch 'add_visualizer_module' of github.com:DedSecInside/TorBo…
KingAkeem Oct 13, 2018
03b682d
More refactoring
KingAkeem Oct 13, 2018
bd27e54
Upgrading yaml to 3.6 to use f-strings
KingAkeem Oct 13, 2018
a4edd62
Fixing typ
KingAkeem Oct 13, 2018
c28c268
Fixing another tyo
KingAkeem Oct 13, 2018
c22da46
More refactoring
KingAkeem Oct 14, 2018
baa56cd
Fixing travis.yml
KingAkeem Oct 14, 2018
ff36007
Removing unnecessary comment
KingAkeem Oct 14, 2018
1801d03
Adding consistency
KingAkeem Oct 14, 2018
eb3aaa6
More uniformity for tests
KingAkeem Oct 14, 2018
5bbfe1d
Finishing up refactoring
KingAkeem Oct 14, 2018
23efaac
Finishing up refactoring
KingAkeem Oct 14, 2018
141675c
Merge branch 'add_visualizer_module' of github.com:DedSecInside/TorBo…
KingAkeem Oct 14, 2018
64a9ee4
Fixing last tests
KingAkeem Oct 14, 2018
f06586e
CodeFactor
KingAkeem Oct 14, 2018
f9bed35
Merge pull request #125 from DedSecInside/add_visualizer_module
PSNAppz Oct 14, 2018
1bd5b9a
Update README.md
PSNAppz Oct 16, 2018
2c93f03
Update README.md
PSNAppz Oct 16, 2018
540badf
Beginning refactoring with classes
KingAkeem Oct 18, 2018
80b3cb4
TorBot working again after refactoring
KingAkeem Oct 18, 2018
2a07f88
Adding PyQt5 to requirements.txt
KingAkeem Oct 18, 2018
f85a5eb
More passing tests
KingAkeem Oct 18, 2018
2d1a767
Fixed another test :)
KingAkeem Oct 18, 2018
d051f1b
Removing Go because we don't need it and fixing requirements
KingAkeem Oct 19, 2018
8f55cf0
Beginning to add tests for analyzer module
KingAkeem Oct 19, 2018
dadac88
Updating install script
KingAkeem Oct 19, 2018
66a2ccd
Fixed Usage and Updated Version to 1.3
PSNAppz Oct 19, 2018
fa08e5e
Fixed Usage and Updated Version to 1.3
PSNAppz Oct 19, 2018
a707dd7
Catch both ConnectionErrors
KingAkeem Oct 19, 2018
56d227e
Merge branch 'refactoring' of https://github.com/KingAkeem/TorBot int…
PSNAppz Oct 19, 2018
43fbe96
Removing ununsed flag
KingAkeem Oct 19, 2018
146c623
Catching ChunkEncodingError
KingAkeem Oct 19, 2018
ac22ab5
Update torBot.py
PSNAppz Oct 19, 2018
8664b2b
Merge pull request #139 from KingAkeem/refactoring
PSNAppz Oct 19, 2018
a7310c5
Ignore .DS_Store
PSNAppz Oct 19, 2018
88ce98b
Updated CHANGELOG & Adding Hound Automation
PSNAppz Oct 19, 2018
f78b6d9
Adding coveralls to .travis.yml
PSNAppz Oct 19, 2018
2fda2a8
Adding coveralls to .travis.yml
PSNAppz Oct 19, 2018
0ceb61b
Adding coveralls to .travis.yml
PSNAppz Oct 19, 2018
65bee14
Adding coveralls to .travis.yml
PSNAppz Oct 19, 2018
191e1f1
Adding multithreading to tree generation
KingAkeem Oct 20, 2018
3a55f25
Removing debugging print statement
KingAkeem Oct 20, 2018
f847f83
Fixing houndbot complaints
KingAkeem Oct 20, 2018
a1815a5
Adding documentation and using properties for class
KingAkeem Oct 20, 2018
b36eb1c
Fixing tests and properties
KingAkeem Oct 20, 2018
dc36fa6
Fixing houndbot complaints
KingAkeem Oct 20, 2018
a9f61af
More documentation
KingAkeem Oct 20, 2018
3265e1b
Fixing houndbot complaints
KingAkeem Oct 20, 2018
5832d1a
Using correct property
KingAkeem Oct 20, 2018
a9d945d
Fixing properties and checking for existence of link before validation
KingAkeem Oct 20, 2018
1fd77b7
Merge pull request #144 from KingAkeem/multi_thread_tree
PSNAppz Oct 27, 2018
1bd1a52
Merge branch 'master' into dev
PSNAppz Oct 28, 2018
Binary file added .DS_Store
Binary file not shown.
22 changes: 12 additions & 10 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -2,34 +2,36 @@
--------------------
All notable changes to this project will be documented in this file.

## 1.3.0 | Future
## 1.3.0 | Present

### Changed
* Moderate code improvements

* Major code improvements
* Updated README.md
* Updated dependencies
* Modularize and documented Golang library
* Using Golang library instead of Python library for getting links
* Refactored TorBot

### Added
* Unit tests for Golang Library

* Visualizer Module
* Download option to save Tree into different formats.
* DB module
* Installation shell script to create torBot binary
* Testing documentation for Golang test suite.
* Test for getting links that uses a Mock Object to reproduce tests without touching actual servers.
* Script for getting Golang dependencies
* Script for building Golang shared object
* Installs Golang dependencies when install.sh is executed
* BFS algorithm for crawling


## 1.2.0 | Present (Stable)
## 1.2.0 | Nov 16, 2017 - Oct 19, 2018

### Changed

* Major code improvements
* Pep 8 Standard
* Tests
* Library changes

### Added

* Documentation
* Save to JSON
* Testcase for Save to JSON
4 changes: 0 additions & 4 deletions install.sh
@@ -1,9 +1,5 @@
#!/bin/bash

# Get Golang Dependencies
go get github.com/mgutz/ansi
go get golang.org/x/net/html

# Makes directory for dependencies and executable to be installed
mkdir -p tmp_build
mkdir -p tmp_dist
53 changes: 18 additions & 35 deletions modules/analyzer.py
@@ -1,13 +1,10 @@
"""
Module is used for analyzing link relationships
"""
import requests
from requests.exceptions import HTTPError

from bs4 import BeautifulSoup
from ete3 import Tree, TreeStyle, TextFace, add_face_to_node
from .getweblinks import get_urls_from_page
from .pagereader import read
from .link import LinkNode

class LinkTree:
"""
@@ -20,14 +17,14 @@ class LinkTree:
tld (bool): Decides whether or not to use additional top-level-domains besides .tor
stop_depth (int): Depth of which to stop searching for links
"""
def __init__(self, root, tld=False, stop_depth=1):
self._tree = build_tree(root, tld=tld, stop=stop_depth)
def __init__(self, root_node, *, tld=False, stop_depth=1):
self._tree = build_tree(root_node, tld=tld, stop=stop_depth)

def __len__(self):
return len(self._tree)

def __contains__(self, link):
return link in self._tree
return self._tree.search_nodes(name=link)

def save(self, file_name):
"""
@@ -57,25 +54,8 @@ def my_layout(node):
style.layout_fn = my_layout
self._tree.show(tree_style=style)

def get_node_children(link, tld):
"""
Returns children for link node

Args:
link (str): link node to get children for
tld (bool): Additional top-level-domains
Returns:
children (list): A list of children from linknode
"""
try:
resp = requests.get(link)
soup = BeautifulSoup(resp.text, 'html.parser')
children = get_urls_from_page(soup, tld)
except (HTTPError, ConnectionError):
children = []
return children

def initialize_tree(link, tld):
def initialize_tree(root_node):
"""
Creates root of tree
Args:
@@ -85,13 +65,11 @@ def initialize_tree(link, tld):
root (ete3.Tree): root node of tree
to_visit (list): Children of root node
"""
root = Tree(name=link)
html_content = read(link)
soup = BeautifulSoup(html_content, 'html.parser')
to_visit = get_urls_from_page(soup, extension=tld)
return root, to_visit
root = Tree(name=root_node.name)
children = root_node.get_children()
return root, children

def build_tree(link, tld, stop=1, *, rec=0, to_visit=None, tree=None):
def build_tree(link, *, tld, stop=1, rec=0, to_visit=None, tree=None):
"""
Builds tree using Breadth First Search. You can specify stop depth.
Rec & tree arguments are used for recursion.
@@ -111,7 +89,7 @@ def build_tree(link, tld, stop=1, *, rec=0, to_visit=None, tree=None):
tree (ete3.Tree): built tree
"""
if rec == 0:
tree, to_visit = initialize_tree(link, tld)
tree, to_visit = initialize_tree(link)

sub_tree = Tree(name=tree.name)

@@ -121,8 +99,13 @@ def build_tree(link, tld, stop=1, *, rec=0, to_visit=None, tree=None):

children_to_visit = list()
for link_name in to_visit:
link_node = sub_tree.add_child(name=link_name)
link_children = get_node_children(link_name, tld)
try:
node = LinkNode(link_name, tld=tld)
except (ValueError, ConnectionError, HTTPError):
continue

link_node = sub_tree.add_child(name=node.name)
link_children = node.get_children()
# No need to find children if we aren't going to visit them
if stop != rec + 1:
for child in link_children:
@@ -135,4 +118,4 @@ def build_tree(link, tld, stop=1, *, rec=0, to_visit=None, tree=None):
return sub_tree

new_tree = tree.add_child(sub_tree)
return build_tree(to_visit, tld, stop, rec=rec, tree=new_tree)
return build_tree(to_visit, tld=tld, stop=stop, rec=rec, tree=new_tree)
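For readers skimming the `build_tree` changes above, here is a minimal sketch of depth-limited breadth-first traversal. It is not the PR's implementation: `build_tree` recurses and assembles an `ete3.Tree`, whereas this sketch uses an iterative queue and a visited map. The `GRAPH` dictionary, the `bfs_links` name, and the `get_children` callback are hypothetical stand-ins for real pages and for `LinkNode.get_children()`.

```python
from collections import deque


def bfs_links(start, get_children, stop_depth=1):
    """Return {link: depth} for every link within stop_depth hops of start."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        link = queue.popleft()
        if depths[link] == stop_depth:
            continue  # children of this link would lie beyond the stop depth
        for child in get_children(link):
            if child not in depths:  # skip links that were already discovered
                depths[child] = depths[link] + 1
                queue.append(child)
    return depths


# Hypothetical link graph standing in for fetched pages.
GRAPH = {
    'http://root.onion': ['http://dog.onion', 'http://cat.onion'],
    'http://dog.onion': ['http://foo.onion', 'http://root.onion'],
    'http://cat.onion': [],
    'http://foo.onion': [],
}

found = bfs_links('http://root.onion', lambda link: GRAPH.get(link, []), stop_depth=1)
deeper = bfs_links('http://root.onion', lambda link: GRAPH.get(link, []), stop_depth=2)
```

Tracking visited links keeps a cycle such as `dog -> root` from being crawled twice; whether the merged `build_tree` should do the same is a design question this sketch does not settle.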
6 changes: 6 additions & 0 deletions modules/color.py
@@ -38,3 +38,9 @@ def __init__(self, message, selected):

    def __str__(self):
        return self._color + self._msg + COLORS['end']

    def __add__(self, other):
        return str(self) + other

    def __radd__(self, other):
        return other + str(self)
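The two methods added above let a colored message be concatenated with plain strings on either side. A minimal sketch of that behavior follows, assuming a simplified class: the `Color` name and the escape codes in `COLORS` are hypothetical stand-ins, not the contents of `modules/color.py`.

```python
# Hypothetical stand-in for modules/color.py; the escape codes are assumptions.
COLORS = {'green': '\033[92m', 'end': '\033[0m'}


class Color:
    """Wraps a message in ANSI color codes."""

    def __init__(self, message, selected):
        self._msg = message
        self._color = COLORS[selected]

    def __str__(self):
        return self._color + self._msg + COLORS['end']

    def __add__(self, other):
        # Color + str: render this object, then append the plain string.
        return str(self) + other

    def __radd__(self, other):
        # str + Color: the str operand cannot handle a Color,
        # so Python calls this reflected method instead.
        return other + str(self)


banner = Color('Links Found - 3', 'green') + '\n' + '---'
prefixed = 'Status: ' + Color('ok', 'green')
```

Both results are ordinary `str` objects, which is what makes expressions like `sucess_msg + '\n' + '----'` in `link_io.py` work.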
34 changes: 0 additions & 34 deletions modules/getemails.py

This file was deleted.

109 changes: 0 additions & 109 deletions modules/getweblinks.py

This file was deleted.

21 changes: 0 additions & 21 deletions modules/go_linker.py

This file was deleted.

4 changes: 2 additions & 2 deletions modules/info.py
@@ -5,11 +5,11 @@
from requests.exceptions import HTTPError
import requests

from .pagereader import read
from .link_io import LinkIO


def execute_all(link, *, display_status=False):
page, response = read(link, response=True, show_msg=display_status)
page, response = LinkIO.read(link, response=True, show_msg=display_status)
soup = BeautifulSoup(page, 'html.parser')
validation_functions = [get_robots_txt, get_dot_git, get_dot_svn, get_dot_git]
for validate_func in validation_functions:
2 changes: 0 additions & 2 deletions modules/lib/build.sh

This file was deleted.

3 changes: 0 additions & 3 deletions modules/lib/go_dep.sh

This file was deleted.

122 changes: 0 additions & 122 deletions modules/lib/go_get_urls.go

This file was deleted.

72 changes: 0 additions & 72 deletions modules/lib/go_get_urls.h

This file was deleted.

74 changes: 0 additions & 74 deletions modules/lib/go_get_urls_test.go

This file was deleted.

72 changes: 72 additions & 0 deletions modules/link.py
@@ -0,0 +1,72 @@
import re

import requests
import requests.exceptions
import validators

from bs4 import BeautifulSoup
from .color import color

class LinkNode:

    def __init__(self, link, *, tld=False):
        if not self.valid_link(link):
            raise ValueError("Invalid link format.")

        self.tld = tld
        self._children = []
        self._emails = []

        try:
            self.response = requests.get(link)
        except (requests.exceptions.ChunkedEncodingError, requests.exceptions.HTTPError, requests.exceptions.ConnectionError, ConnectionError) as err:
            raise err

        self._node = BeautifulSoup(self.response.text, 'html.parser')
        if not self._node.title:
            self.name = "TITLE NOT FOUND"
            self.status = color(link, 'yellow')
        else:
            self.name = self._node.title.string
            self.status = color(link, 'green')

    def get_emails(self):
        if self._emails:
            return self._emails

        children = self._node.find_all('a')
        email_nodes = []
        for child in children:
            link = child.get('href')
            if link and 'mailto' in link:
                email_addr = link.split(':')
                if self.valid_email(email_addr[1]) and len(email_addr) > 1:
                    email_nodes.append(email_addr[1])
        self._emails = email_nodes
        return email_nodes

    def get_children(self):
        if self._children:
            return self._children

        children = self._node.find_all('a')
        child_nodes = []
        for child in children:
            link = child.get('href')
            if link and self.valid_link(link):
                child_nodes.append(link)

        self._children = child_nodes
        return child_nodes

    @staticmethod
    def valid_email(email):
        if validators.email(email):
            return True
        return False

    @staticmethod
    def valid_link(link):
        if validators.url(link):
            return True
        return False
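One detail in `get_emails` above: `email_addr[1]` is read before `len(email_addr) > 1` is checked, so an href such as `mailto` with no colon would raise `IndexError`. The sketch below shows the same extraction with the length check first. It is an illustration only: it uses the standard library's `HTMLParser` instead of BeautifulSoup, and `EMAIL_RE` is a simplified, assumed pattern standing in for `validators.email`. The names `MailtoCollector` and `extract_emails` are hypothetical.

```python
import re
from html.parser import HTMLParser

# Simplified pattern standing in for validators.email (an assumption,
# not the library's actual rule set).
EMAIL_RE = re.compile(r'^[^@\s]+@[^@\s]+\.[^@\s]+$')


class MailtoCollector(HTMLParser):
    """Collects addresses from anchors whose href mentions mailto."""

    def __init__(self):
        super().__init__()
        self.emails = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        href = dict(attrs).get('href')
        if not href or 'mailto' not in href:
            return
        parts = href.split(':', 1)
        # Check the length before indexing so a bare "mailto" cannot raise IndexError.
        if len(parts) > 1 and EMAIL_RE.match(parts[1]):
            self.emails.append(parts[1])


def extract_emails(html):
    collector = MailtoCollector()
    collector.feed(html)
    return collector.emails


sample = (
    '<a href="mailto:admin@example.com">a</a>'
    '<a href="mailto">b</a>'
    '<a href="http://dog.onion">c</a>'
    '<a href="mailto:not-an-email">d</a>'
)
emails = extract_emails(sample)
```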
85 changes: 85 additions & 0 deletions modules/link_io.py
@@ -0,0 +1,85 @@
"""
This module is used for reading HTML pages using either bs4.BeautifulSoup objects or url strings
"""
import requests.exceptions
from bs4 import BeautifulSoup

from .link import LinkNode
from .utils import multi_thread
from .color import color

class LinkIO:

    @staticmethod
    def display_children(link_node):
        children = link_node.get_children()
        sucess_msg = color(f'Links Found - {len(children)}', 'green')
        print(sucess_msg + '\n' + '---------------------------------')
        multi_thread(children, LinkIO.display)

    @staticmethod
    def read(link, *, response=False, show_msg=False, headers=None, schemes=None):
        """
        Attempts to retrieve HTML from link
        Args:
            headers (dict)
            schemes (list)
        Returns:
            resp.text (str): html from page
        """
        headers = {'User-Agent': 'XXXX-XXXXX-XXXX'} if not headers else headers
        # Attempts to connect directly to site if no scheme is passed
        if not schemes:
            if show_msg:
                print(f'Attempting to connect to {link}')
            if LinkNode.valid_link(link):
                node = LinkNode(link, tld=True)
                if response:
                    return node.response.text, node.response
                return node.response.text

        schemes = ['https://', 'http://'] if not schemes else schemes

        for scheme in schemes:
            temp_url = scheme + link
            if show_msg:
                print(f'Attempting to connect to {link}')
            if LinkNode.valid_link(temp_url):
                node = LinkNode(temp_url, tld=True)
                if response:
                    return node.response.text, node.response
                return node.response.text
        raise ConnectionError

    @staticmethod
    def display(link):
        """
        Prints the status of a link
        """
        if LinkNode.valid_link(link):
            try:
                node = LinkNode(link, tld=True)
                title = node.name
                link_status = node.status
            except (requests.exceptions.HTTPError, requests.exceptions.ConnectionError, ConnectionError):
                title = 'Not Found'
                link_status = color(link, 'red')

            print("%-80s %-30s" % (link_status, title))


    @staticmethod
    def display_ip():
        """Returns users tor ip address
        https://check.torproject.org/ tells you if you are using tor and it
        displays your IP address which we scape and return
        """

        page = LinkIO.read('https://check.torproject.org/', show_msg=True)
        page = BeautifulSoup(page, 'html.parser')
        ip_cont = page.find('strong')
        ip_addr = ip_cont.renderContents()
        ip_string = color(ip_addr.decode("utf-8"), 'yellow')
        print(f'Tor IP Address: {ip_string}')
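`LinkIO.read` above falls back to trying `https://` and then `http://` when the link has no usable scheme. A minimal sketch of that fallback pattern follows. It is a variation rather than the merged code: the PR relies on `LinkNode.valid_link` and lets connection errors propagate, while this sketch catches `ConnectionError` per scheme and moves on. The `read_with_schemes` name, the injected `fetch` callable, `PAGES`, and `fake_fetch` are all hypothetical.

```python
def read_with_schemes(link, fetch, schemes=('https://', 'http://')):
    """Return (url, body) for the first scheme-prefixed URL that fetch can load."""
    for scheme in schemes:
        url = scheme + link
        try:
            return url, fetch(url)
        except ConnectionError:
            continue  # this scheme failed, try the next one
    raise ConnectionError(f'Could not connect to {link}')


# Hypothetical in-memory "network" so the sketch runs without touching real servers.
PAGES = {'http://example.onion': '<html>ok</html>'}


def fake_fetch(url):
    if url not in PAGES:
        raise ConnectionError(url)
    return PAGES[url]


url, body = read_with_schemes('example.onion', fake_fetch)
```

Injecting `fetch` keeps the fallback logic testable without a mock HTTP layer; in the real module the equivalent call would go through `LinkNode`.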
80 changes: 0 additions & 80 deletions modules/pagereader.py

This file was deleted.

48 changes: 48 additions & 0 deletions modules/tests/test_analyzer.py
@@ -0,0 +1,48 @@
import pytest
import requests_mock

from yattag import Doc
from ..analyzer import LinkTree
from ..link import LinkNode

def create_page(name):
    doc, tag, _, line = Doc().ttl()
    doc.asis('<!DOCTYPE html>')
    with tag('html'):
        line('title', name)
        with tag('body'):
            line('h1', 'Something')
    return doc.getvalue()

def create_root_page_with_links(root, links):
    doc, tag, _, line = Doc().ttl()
    doc.asis('<!DOCTYPE html>')
    with tag('html'):
        line('title', root)
        with tag('body'):
            for link in links:
                line('a', 'test', href=link)

    return doc.getvalue()

@pytest.fixture
def test_links_in_tree():
    links = ['http://dog.onion', 'http://cat.onion', 'http://foo.cnion']
    with requests_mock.Mocker() as mock_connection:
        root_page = create_root_page_with_links('http://root.onion', links)
        for link in links:
            page = create_page(link)
            mock_connection.register_uri('GET', link, text=page)
        mock_connection.register_uri('GET', 'http://root.onion', text=root_page)

        node = LinkNode('http://root.onion')
        tree = LinkTree(node)

        for link in links:
            assert link in tree

def test_run():
    test_links_in_tree()

if __name__ == '__main__':
    test_run()
57 changes: 0 additions & 57 deletions modules/tests/test_getemails.py

This file was deleted.

53 changes: 27 additions & 26 deletions modules/tests/test_getweblinks.py
@@ -6,10 +6,10 @@

from bs4 import BeautifulSoup
from yattag import Doc
from ..getweblinks import get_links
from ..link import LinkNode


def setup_html(test_links):
def setup_html(test_links, *, fail=False):
"""
Sets up test html containing links
@@ -23,7 +23,8 @@ def setup_html(test_links):
with tag('html'):
with tag('body'):
for data in test_links:
line('a', 'test_anchor', href=data)
if not fail:
line('a', 'test_anchor', href=data)

return doc.getvalue()

@@ -33,38 +34,38 @@ def test_get_links_fail():
"""
Test links that have incorrect scheme
"""
test_data = ['ssh://aff.ironsocket.tor',
'ftp://aff.ironsocket.tor',
'lol://wsrs.tor',
'dial://cmsgear.tor']
test_data = ['ssh://aff.ironsocket.onion',
'ftp://aff.ironsocket.onion',
'lol://wsrs.onion',
'dial://cmsgear.onion']

mock_html = setup_html(test_data)
mock_soup = BeautifulSoup(mock_html, 'html.parser')
mock_html = setup_html(test_data, fail=True)
with requests_mock.Mocker() as mock_connection:
for data in test_data:
mock_connection.register_uri('GET', data, text='Received')

result = get_links('test', test_html=mock_soup)
assert result == []

mock_connection.register_uri('GET', data, text=mock_html)
with pytest.raises(ValueError):
node = LinkNode(data)
result = node.get_children()
assert result == []

@pytest.fixture
def test_get_links_tor():
"""
Test links that return sucessfully
"""
test_data = ['https://aff.ironsocket.tor',
'https://aff.ironsocket.tor',
'https://wsrs.tor',
'https://cmsgear.tor']
test_data = ['https://aff.ironsocket.onion',
'https://aff.ironsocket.onion',
'https://wsrs.onion',
'https://cmsgear.onion']

mock_html = setup_html(test_data)
mock_soup = BeautifulSoup(mock_html, 'html.parser')
mock_link = 'http://test.tor'
with requests_mock.Mocker() as mock_connection:
for data in test_data:
mock_connection.register_uri('GET', data, text='Received')
mock_connection.register_uri('GET', mock_link, text=mock_html)

result = get_links('test', test_html=mock_soup, ext=['.tor'])
node = LinkNode(mock_link)
result = node.get_children()
assert result == test_data


@@ -86,14 +87,14 @@ def test_get_links_tld():
line('a', 'test_anchor', href=data)

mock_html = doc.getvalue()

mock_soup = BeautifulSoup(mock_html, 'html.parser')
mock_url = 'http://test.tor'
with requests_mock.Mocker() as mock_connection:
for data in test_data:
mock_connection.register_uri('GET', data, text='Received')
mock_connection.register_uri('GET', mock_url, text=mock_html)

result = get_links(data, test_html=mock_soup, ext=['.com', '.gov', '.net'])
assert result == test_data
node = LinkNode(mock_url)
links = node.get_children()
assert links == test_data


def test_run():
4 changes: 2 additions & 2 deletions modules/tests/test_pagereader.py
@@ -5,7 +5,7 @@
import requests_mock

from yattag import Doc
from ..pagereader import read
from ..link_io import LinkIO


@pytest.fixture
@@ -36,7 +36,7 @@ def test_read():
mock_connection.register_uri('GET',
test_data[i][0],
text=test_data[i][1])
result = read(test_data[i][0])
result = LinkIO.read(test_data[i][0])
assert result == test_data[i][1]


2 changes: 2 additions & 0 deletions requirements.txt
@@ -6,3 +6,5 @@ requests_mock==1.4.0
yattag==1.10.0
pyinstaller==3.4.0
ete3==3.1.1
PyQt5==5.11.3
validators==0.12.2
43 changes: 21 additions & 22 deletions torBot.py
@@ -5,11 +5,12 @@
import socket
import socks

from requests.exceptions import HTTPError

from modules.analyzer import LinkTree
from modules.getweblinks import get_links
from modules.color import color
from modules.pagereader import display_ip
from modules.getemails import get_mails
from modules.link_io import LinkIO
from modules.link import LinkNode
from modules.updater import updateTor
from modules.savefile import saveJson
from modules.info import execute_all
@@ -19,7 +20,7 @@
DEFPORT = 9050

# TorBot VERSION
__VERSION = "1.2"
__VERSION = "1.3"


def connect(address, port):
@@ -94,7 +95,7 @@ def get_args():
"""
parser = argparse.ArgumentParser(prog="TorBot",
usage="Gather and analayze data from Tor sites.")
parser.add_argument("-v", "--version", action="store_true",
parser.add_argument("--version", action="store_true",
help="Show current version of TorBot.")
parser.add_argument("--update", action="store_true",
help="Update TorBot to the latest stable version")
@@ -110,14 +111,12 @@ def get_args():
default=[],
help=' '.join(("Specifiy additional website",
"extensions to the list(.com , .org, .etc)")))
parser.add_argument("-l", "--live", action="store_true",
help="Check if websites are live or not (slow)")
parser.add_argument("-i", "--info", action="store_true",
help=' '.join(("Info displays basic info of the",
"scanned site, (very slow)")))
parser.add_argument("--visualize", action="store_true",
"scanned site")))
parser.add_argument("-v", "--visualize", action="store_true",
help="Visualizes tree of data gathered.")
parser.add_argument("--download", action="store_true",
parser.add_argument("-d", "--download", action="store_true",
help="Downloads tree of data gathered.")
return parser.parse_args()

@@ -128,7 +127,10 @@ def main():
"""
args = get_args()
connect(args.ip, args.port)
link = args.url
try:
node = LinkNode(args.url, tld=args.extension)
except (ValueError, HTTPError, ConnectionError) as err:
raise err

# If flag is -v, --update, -q/--quiet then user only runs that operation
# because these are single flags only
@@ -143,34 +145,31 @@ def main():
# If url flag is set then check for accompanying flag set. Only one
# additional flag can be set with -u/--url flag
if args.url:
display_ip()
LinkIO.display_ip()
# -m/--mail
if args.mail:
emails = get_mails(link)
emails = node.get_emails()
print(emails)
if args.save:
saveJson('Emails', emails)
# -i/--info
elif args.info:
execute_all(link)
execute_all(node.name)
if args.save:
print('Nothing to save.\n')
elif args.visualize:
tree = LinkTree(link, args.extension)
tree = LinkTree(node, tld=node.tld)
tree.show()
elif args.download:
tree = LinkTree(link, args.extension)
tree = LinkTree(node, tld=node.tld)
file_name = str(input("File Name (.pdf/.png/.svg): "))
tree.save(file_name)
else:
# Golang library isn't being used.
# links = go_linker.GetLinks(link, LOCALHOST, PORT, 15)
links = get_links(link, ext=args.extension, display_status=args.live)
LinkIO.display_children(node)
if args.save:
saveJson("Links", links)
saveJson("Links", node.get_children())
else:
print("usage: torBot.py [-h] [-v] [--update] [-q] [-u URL] [-s] [-m]",
"[-e EXTENSION] [-l] [-i]")
print("usage: See torBot.py -h for possible arguments.")

print("\n\n")
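The `get_args` hunk above moves `-v` from `--version` to `--visualize` and adds `-d` for `--download`. A minimal sketch of just those three options, using the flag names and help strings from the diff, shows how the short flags now resolve. The `build_parser` helper is hypothetical, and every other TorBot argument is omitted.

```python
import argparse


def build_parser():
    """Builds a parser with only the three options touched by this diff."""
    parser = argparse.ArgumentParser(prog="TorBot")
    # --version no longer has a short form.
    parser.add_argument("--version", action="store_true",
                        help="Show current version of TorBot.")
    # -v now selects the visualizer.
    parser.add_argument("-v", "--visualize", action="store_true",
                        help="Visualizes tree of data gathered.")
    parser.add_argument("-d", "--download", action="store_true",
                        help="Downloads tree of data gathered.")
    return parser


short_v = build_parser().parse_args(["-v"])
long_version = build_parser().parse_args(["--version"])
```

Anyone with scripts that call `torBot.py -v` to print the version would get the visualizer after this change, which may be worth a line in the changelog.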