1
1
<pre >
2
2
3
3
4
- ████████╗ ██████╗ ██████╗ ██████╗ ██████╗ ████████╗
5
- ╚══██╔══╝██╔═══██╗██╔══██╗ ██╔══██╗██╔═████╗╚══██╔══╝
6
- ██║ ██║ ██║██████╔╝ ██████╔╝██║██╔██║ ██║
7
- ██║ ██║ ██║██╔══██╗ ██╔══██╗████╔╝██║ ██║
8
- ██║ ╚██████╔╝██║ ██║ ██████╔╝╚██████╔╝ ██║
9
- ╚═╝ ╚═════╝ ╚═╝ ╚═╝ ╚═════╝ ╚═════╝ ╚═╝
4
+ ████████╗ ██████╗ ██████╗ ██████╗ ██████╗ ████████╗
5
+ ╚══██╔══╝██╔═══██╗██╔══██╗ ██╔══██╗██╔═████╗╚══██╔══╝
6
+ ██║ ██║ ██║██████╔╝ ██████╔╝██║██╔██║ ██║
7
+ ██║ ██║ ██║██╔══██╗ ██╔══██╗████╔╝██║ ██║
8
+ ██║ ╚██████╔╝██║ ██║ ██████╔╝╚██████╔╝ ██║
9
+ ╚═╝ ╚═════╝ ╚═╝ ╚═╝ ╚═════╝ ╚═════╝ ╚═╝
10
10
11
-
12
-
13
- `.` `
14
- ``.:.--.`
15
- .-+++/-`
16
- `+sso:`
17
- `` /yy+.
18
- -+.oho.
19
- o../+y
20
- -s.-/:y:`
21
- .:o+-`--::oo/-`
22
- `/o+:.```---///oss+-
23
- .+o:.``...`-::-+++++sys-
24
- :y/```....``--::-yooooosh+
25
- -h-``--.```..-:-::ssssssssd+
26
- h:``:.``....`--:-++hsssyyyym.
27
- .d.`/.``--.```:--//odyyyyyyym/
28
- `d.`+``:.```.--/-+/smyyhhhhhm:
29
- os`./`/````/`-/:+oydhhhhhhdh`
30
- `so.-/-:``./`.//osmddddddmd.
31
- /s/-/:/.`/..+/ydmdddddmo`
32
- `:oosso/:+/syNmddmdy/.
33
- `-/++oosyso+/.`
34
-
35
-
36
- ██████╗ ███████╗██████╗ ███████╗██████╗ ██████╗ ██╗███╗ ██╗███████╗██╗██████╗ ███████╗
37
- ██╔══██╗██╔════╝██╔══██╗██╔════╝╚════██╗██╔════╝ ██║████╗ ██║██╔════╝██║██╔══██╗██╔════╝
38
- ██║ ██║█████╗ ██║ ██║███████╗ █████╔╝██║ ██║██╔██╗ ██║███████╗██║██║ ██║█████╗
39
- ██║ ██║██╔══╝ ██║ ██║╚════██║ ╚═══██╗██║ ██║██║╚██╗██║╚════██║██║██║ ██║██╔══╝
40
- ██████╔╝███████╗██████╔╝███████║██████╔╝╚██████╗ ██║██║ ╚████║███████║██║██████╔╝███████╗
41
- ╚═════╝ ╚══════╝╚═════╝ ╚══════╝╚═════╝ ╚═════╝ ╚═╝╚═╝ ╚═══╝╚══════╝╚═╝╚═════╝ ╚══════╝
11
+
12
+
13
+ `.` `
14
+ ``.:.--.`
15
+ .-+++/-`
16
+ `+sso:`
17
+ `` /yy+.
18
+ -+.oho.
19
+ o../+y
20
+ -s.-/:y:`
21
+ .:o+-`--::oo/-`
22
+ `/o+:.```---///oss+-
23
+ .+o:.``...`-::-+++++sys-
24
+ :y/```....``--::-yooooosh+
25
+ -h-``--.```..-:-::ssssssssd+
26
+ h:``:.``....`--:-++hsssyyyym.
27
+ .d.`/.``--.```:--//odyyyyyyym/
28
+ `d.`+``:.```.--/-+/smyyhhhhhm:
29
+ os`./`/````/`-/:+oydhhhhhhdh`
30
+ `so.-/-:``./`.//osmddddddmd.
31
+ /s/-/:/.`/..+/ydmdddddmo`
32
+ `:oosso/:+/syNmddmdy/.
33
+ `-/++oosyso+/.`
34
+
35
+
36
+ ██████╗ ███████╗██████╗ ███████╗██████╗ ██████╗ ██╗███╗ ██╗███████╗██╗██████╗ ███████╗
37
+ ██╔══██╗██╔════╝██╔══██╗██╔════╝╚════██╗██╔════╝ ██║████╗ ██║██╔════╝██║██╔══██╗██╔════╝
38
+ ██║ ██║█████╗ ██║ ██║███████╗ █████╔╝██║ ██║██╔██╗ ██║███████╗██║██║ ██║█████╗
39
+ ██║ ██║██╔══╝ ██║ ██║╚════██║ ╚═══██╗██║ ██║██║╚██╗██║╚════██║██║██║ ██║██╔══╝
40
+ ██████╔╝███████╗██████╔╝███████║██████╔╝╚██████╗ ██║██║ ╚████║███████║██║██████╔╝███████╗
41
+ ╚═════╝ ╚══════╝╚═════╝ ╚══════╝╚═════╝ ╚═════╝ ╚═╝╚═╝ ╚═══╝╚══════╝╚═╝╚═════╝ ╚══════╝
42
42
43
43
44
44
45
45
</pre >
46
46
47
47
## A Python web crawler for the Deep and Dark Web.
48
48
[![Build Status](https://travis-ci.org/DedSecInside/TorBoT.svg?branch=master)](https://travis-ci.org/DedSecInside/TorBoT)
49
- [![](https://img.shields.io/badge/Donate-Bitcoin-blue.svg?style=flat-square)](https://blockchain.info/address/14st7SzDbQZuu8fpQ74x477WoRJ7gpHFaj)
50
- [![forthebadge](http://forthebadge.com/images/badges/built-with-love.svg)](http://forthebadge.com)
51
- [![forthebadge](http://forthebadge.com/images/badges/made-with-python.svg)](http://forthebadge.com)
49
+ [![](https://img.shields.io/badge/Donate-Bitcoin-blue.svg?style=flat)](https://blockchain.info/address/14st7SzDbQZuu8fpQ74x477WoRJ7gpHFaj)
50
+ [![](https://img.shields.io/badge/Built%20with-❤-orange.svg?style=flat)]()
51
+ [![](https://img.shields.io/badge/Made%20with-Python-red.svg?style=flat)]()
52
52
53
53
54
54
### Working Procedure/Basic Plan
@@ -65,62 +65,69 @@ the following steps:
65
65
8. After all URLs are processed, return the most relevant page.
66
66
67
67
### Features
68
- 1. Crawls Tor links (.onion) only.
69
- 2. Returns Page title and address.
70
- 3. Cache links so that there won't be duplicate links.
68
+ 1. Crawls Tor links (.onion). (Completed)
69
+ 2. Returns page title and address with a short description of the site. (Not Started)
70
+ 3. Saves links to a database. (Not Started)
71
+ 4. Gets emails from the site. (Completed)
72
+ 5. Saves crawl info to a JSON file. (Completed)
73
+ 6. Crawls custom domains. (Completed)
74
+ 7. Checks if the link is live. (Not Started)
75
+ 8. Built-in updater. (Completed)
71
76
...(will be updated)
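
As a rough illustration of features 1, 4, 5 and 6 above, the sketch below pulls links and e-mail addresses out of an HTML page, keeps only links whose host ends in an allowed extension, and writes the result to a JSON file. It uses only the Python standard library rather than Beautiful Soup or tldextract, and the names `LinkCollector`, `extract` and `save_json` are hypothetical; this is not TorBot's actual code.

```python
import json
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Deliberately simple pattern for illustration; real e-mail matching is messier.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract(html, extensions=(".onion",)):
    """Return de-duplicated links with an allowed host extension, plus e-mails."""
    collector = LinkCollector()
    collector.feed(html)
    links = []
    for href in collector.hrefs:
        host = urlparse(href).hostname or ""
        if host.endswith(tuple(extensions)) and href not in links:
            links.append(href)
    emails = sorted(set(EMAIL_RE.findall(html)))
    return {"links": links, "emails": emails}


def save_json(result, path):
    """Write the crawl result to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(result, handle, indent=2)
```

Passing `extensions=(".onion", ".com")` would mimic crawling custom domains in addition to .onion links.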
72
77
73
78
## Contribute
74
79
Contributions to this project are always welcome.
75
- To add a new feature fork this repository and give a pull request when your new feature is tested and complete.
80
+ To add a new feature, fork the dev branch and open a pull request once your feature is tested and complete.
76
81
If it's a new module, put it inside the modules directory and import it in the main file.
77
82
The branch name should be your new feature name in the format <Feature_featurename_version(optional)>. For example, <i>Feature_FasterCrawl_1.0</i>.
78
83
Contributor names will be added to the list below. :D
79
84
80
85
## Dependencies
81
86
1. Tor
82
- 2. Python 3.x (Make sure pip3 is there)
83
- 3. Python Stem Module
84
- 4. urllib
85
- 5. Beautiful Soup 4
86
- 6. Socket
87
- 7. Sock
88
- 8. Argparse
89
- 9. Stem module
90
- 10. Git
87
+ 2. Python 3.x (make sure pip3 is installed)
88
+ 3. requests
89
+ 4. Beautiful Soup 4
90
+ 5. Socket
91
+ 6. Sock
92
+ 7. Argparse
93
+ 8. Git
94
+ 9. termcolor
95
+ 10. tldextract
91
96
92
97
## Basic setup
93
98
Before you run TorBot, make sure the following steps are done:
94
99
95
100
* Run the Tor service
96
101
`sudo service tor start`
97
102
98
- * Set a password for tor
99
- ` tor --hash-password "my_password" `
100
-
101
- * Give the password inside torbot.py
102
- `from stem.control import Controller
103
- with Controller.from_port(port = 9051) as controller:
104
- controller.authenticate("your_password_hash")
105
- controller.signal(Signal.NEWNYM)`
103
+ * Make sure that your torrc sets the SOCKS port to localhost:9050 (the `SocksPort` option)
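
To illustrate why the SOCKS port matters, here is a minimal sketch that checks whether something is listening on localhost:9050 and builds a proxy mapping in the format the `requests` library accepts. The helper names are hypothetical, and the `socks5h` scheme assumes `requests` is installed with SOCKS support (PySocks); TorBot's own connection code may differ.

```python
import socket

TOR_HOST = "localhost"
TOR_SOCKS_PORT = 9050  # must match the SOCKS port configured in torrc


def socks_port_open(host=TOR_HOST, port=TOR_SOCKS_PORT, timeout=3.0):
    """Return True if a TCP connection to the SOCKS port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def tor_proxies(host=TOR_HOST, port=TOR_SOCKS_PORT):
    """Build a proxies mapping for requests.

    The socks5h scheme asks the proxy to resolve hostnames,
    which .onion addresses require.
    """
    proxy = f"socks5h://{host}:{port}"
    return {"http": proxy, "https": proxy}
```

With the Tor service running, a call such as `requests.get(url, proxies=tor_proxies())` would then be routed through Tor; if `socks_port_open()` returns `False`, the service or the torrc port setting is the first thing to check.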
106
104
107
- ` python3 torBot.py `
108
- `usage: torBot.py [ -h] [ -q] [ -u URL] [ -m] [ -e EXTENSION] [ -l]
105
+ Run `python3 torBot.py`, or use the -h/--help argument to list the options:
106
+ <pre >
107
+ `usage: torBot.py [-h] [-v] [--update] [-q] [-u URL] [-s] [-m] [-e EXTENSION]
108
+ [-l] [-i]
109
109
110
110
optional arguments:
111
- -h, --help show this help message and exit
112
- -q, --quiet
113
- -u URL, --url URL Specifiy a website link to crawl
111
+ -h, --help Show this help message and exit
112
+ -v, --version Show current version of TorBot.
113
+ --update Update TorBot to the latest stable version
114
+ -q, --quiet Prevent header from displaying
115
+ -u URL, --url URL Specify a website link to crawl, currently returns links on that page
116
+ -s, --save Save results to a file in json format
114
117
-m, --mail Get e-mail addresses from the crawled sites
115
118
-e EXTENSION, --extension EXTENSION
116
119
Specify additional website extensions to the
117
120
list(.com or .org etc)
118
- -l, --live Check if websites are live or not (slow)`
121
+ -l, --live Check if websites are live or not (slow)
122
+ -i, --info Info displays basic info of the scanned site (very
123
+ slow)` </pre >
124
+
125
+ * NOTE: All flags listed below -u URL, --url URL only work when the -u flag is passed as well.
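
The note above can be sketched with `argparse`: the parser below declares a subset of the flags from the usage text and rejects URL-dependent flags when `-u` is missing. The `build_parser` and `parse` names, the choice of which flags count as URL-dependent, and the repeatable `-e` option are assumptions for illustration, not TorBot's actual implementation.

```python
import argparse

# Flags listed below -u/--url in the usage text (assumed to depend on it).
URL_DEPENDENT = ("save", "mail", "extension", "live", "info")


def build_parser():
    parser = argparse.ArgumentParser(prog="torBot.py")
    parser.add_argument("-q", "--quiet", action="store_true",
                        help="Prevent header from displaying")
    parser.add_argument("-u", "--url",
                        help="Specify a website link to crawl")
    parser.add_argument("-s", "--save", action="store_true",
                        help="Save results to a file in json format")
    parser.add_argument("-m", "--mail", action="store_true",
                        help="Get e-mail addresses from the crawled sites")
    parser.add_argument("-e", "--extension", action="append",
                        help="Additional website extension (.com, .org, ...)")
    parser.add_argument("-l", "--live", action="store_true",
                        help="Check if websites are live or not (slow)")
    parser.add_argument("-i", "--info", action="store_true",
                        help="Display basic info of the scanned site")
    return parser


def parse(argv):
    """Parse argv and fail if a URL-dependent flag is used without -u."""
    parser = build_parser()
    args = parser.parse_args(argv)
    used = [name for name in URL_DEPENDENT if getattr(args, name)]
    if used and not args.url:
        # parser.error prints the usage text and exits with status 2.
        parser.error("-u/--url is required with: "
                     + ", ".join("--" + name for name in used))
    return args
```

For example, `parse(["-u", "http://example.onion", "-m"])` succeeds, while `parse(["-m"])` exits with a usage error.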
119
126
120
127
Read more about torrc here: [Torrc](https://github.com/DedSecInside/TorBoT/blob/master/Tor.md)
121
128
122
129
## TO-DO
123
- A TO-DO list will be added here as soon as its complete.
130
+ - [ ] Implement A\* search for the web crawler
124
131
125
132
### Have ideas?
126
133
If you have new ideas worth implementing, mention them by opening a new issue with the title [FEATURE_REQUEST].
@@ -133,7 +140,11 @@ GNU Public License
133
140
134
141
- [X] [P5N4PPZ](https://github.com/PSNAppz) - Owner
135
142
- [X] [agrepravin](https://github.com/agrepravin) - Contributor, Reviewer
136
- - [X] [y-mehta](https://github.com/y-mehta) - Contributer
143
+ - [X] [y-mehta](https://github.com/y-mehta) - Contributor
144
+ - [X] [Manfredi Martorana](https://github.com/Agostinelli) - Contributor
145
+ - [X] [KingAkeem](https://github.com/KingAkeem) - Contributor
146
+ - [X] [Evan Sia Wai Suan](https://github.com/waisuan) - New Contributor
147
+
137
148
138
149
![](https://upload.wikimedia.org/wikipedia/commons/thumb/4/42/Opensource.svg/200px-Opensource.svg.png)
139
150