Commit 13841a8

Merge pull request #221 from DedSecInside/feature/torbotv2.1
Torbot v2.1.0
2 parents 79030c8 + 2aa776c commit 13841a8

84 files changed: +646 -16330 lines changed

.flake8

+2
@@ -0,0 +1,2 @@
+[flake8]
+max-line-length = 119

.gitignore

+2-1
@@ -21,7 +21,7 @@ __pycache*
 __pycache__/
 
 # Misc
-torBot
+
 .*.swp
 .ropeproject/
 .idea/
@@ -33,3 +33,4 @@ venv/
 .DS_Store
 .env
 data/*.csv
+torbot/modules/nlp/training_data/

.hound.yml

-2
This file was deleted.

.style.yapf

+66
@@ -0,0 +1,66 @@
+[style]
+based_on_style=pep8
+
+# The column limit.
+column_limit=119
+
+# Align closing bracket with visual indentation.
+align_closing_bracket_with_visual_indent=False
+
+allow_split_before_dict_value = False
+
+# Put closing brackets on a separate line, dedented, if the bracketed
+# expression can't fit in a single line. Applies to all kinds of brackets,
+# including function definitions and calls. For example:
+#
+# config = {
+# 'key1': 'value1',
+# 'key2': 'value2',
+# } # <--- this bracket is dedented and on a separate line
+#
+# time_series = self.remote_client.query_entity_counters(
+# entity='dev3246.region1',
+# key='dns.query_latency_tcp',
+# transform=Transformation.AVERAGE(window=timedelta(seconds=60)),
+# start_ts=now()-timedelta(days=3),
+# end_ts=now(),
+# ) # <--- this bracket is dedented and on a separate line
+dedent_closing_brackets=True
+
+# Insert a space between the ending comma and closing bracket of a list,
+# etc.
+space_between_ending_comma_and_closing_bracket=False
+
+# Split after the opening parenthesis which surrounds an expression if it doesn't
+# fit on a single line.
+split_before_expression_after_opening_paren=True
+
+# Set to True to split list comprehensions and generators that have
+# non-trivial expressions and multiple clauses before each of these
+# clauses. For example:
+#
+# result = [
+# a_long_var + 100 for a_long_var in xrange(1000)
+# if a_long_var % 10]
+#
+# would reformat to something like:
+#
+# result = [
+# a_long_var + 100
+# for a_long_var in xrange(1000)
+# if a_long_var % 10]
+split_complex_comprehension=True
+
+# Insert a blank line before a 'def' or 'class' immediately nested
+# within another 'def' or 'class'. For example:
+#
+# class Foo:
+# # <------ this blank line
+# def method():
+# ...
+blank_line_before_nested_class_or_def=True
+
+# The i18n function call names. The presence of this function stops
+# reformatting on that line, because the string it has cannot be moved
+# away from the i18n comment.
+i18n_function_call=['_']

CHANGELOG.md

+31
@@ -2,6 +2,37 @@
 --------------------
 All notable changes to this project will be documented in this file.
 
+## 2.1.0
+
+### Added
+* GoTor API - A Golang implementation of Core TorBot functionality.
+* Phone number extractor - Extracts phone numbers from urls.
+* Integrated NLP module with TorBot
+* Major code refactoring
+
+### Removed
+* No longer using the tree module
+* Poetry Implementation removed
+
+## 2.0.0
+
+### Added
+* Fix data collection and add progress indicator by @KingAkeem in #192
+* convert port to integer by @KingAkeem in #193
+* Use hiddenwiki.org as default URL for collecting data by @KingAkeem in #194
+* Bump jinja2 from 2.11.2 to 2.11.3 in /src/api by @dependabot in #200
+* Simplify LinkNode and add new display by @KingAkeem in #202
+* Remove live flag by @KingAkeem in #203
+* Poetry Implementation by @NeoLight1010 in #206
+* Delete .DS_Store by @stefins in #204
+* Fix the basic functionality of tree features by @KingAkeem in #214
+* Save results as json by @KingAkeem in #215
+* Organize data file location by @KingAkeem in #216
+* Add CodeTriage link and image by @KingAkeem in #213
+* Add website classification by @KingAkeem in #218
+* Use GoTor HTTP service by @KingAkeem in #219
+
+
 ## 1.4.0 | Present
 
 ### Added
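Review note on the "Phone number extractor" entry: the changelog does not say how the extractor works, and its code is not part of the visible diff. The sketch below is purely illustrative. It assumes, hypothetically, that numbers are taken from `tel:` links in fetched HTML; the names `TEL_HREF` and `extract_phone_numbers` are invented for this example and are not TorBot's API.

```python
import re
from typing import List

# Hypothetical pattern: matches href="tel:..." attributes in raw HTML.
TEL_HREF = re.compile(r'href\s*=\s*["\']tel:([^"\']+)["\']', re.IGNORECASE)


def extract_phone_numbers(html: str) -> List[str]:
    """Return unique phone numbers found in tel: links, in order of appearance."""
    seen: List[str] = []
    for raw in TEL_HREF.findall(html):
        # Normalise by dropping everything except digits and '+'.
        number = re.sub(r"[^\d+]", "", raw)
        if number and number not in seen:
            seen.append(number)
    return seen


if __name__ == "__main__":
    page = '<a href="tel:+1-202-555-0143">Call us</a>'
    print(extract_phone_numbers(page))
```

A regex over raw HTML is enough for a sketch; a real crawler would more likely walk parsed anchors (for example with beautifulsoup4, which the README lists as a dependency).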

CITATION.cff

+40
@@ -0,0 +1,40 @@
+# @InProceedings{10.1007/978-981-15-0146-3_19,
+# author="Narayanan, P. S.
+# and Ani, R.
+# and King, Akeem T. L.",
+# editor="Ranganathan, G.
+# and Chen, Joy
+# and Rocha, {\'A}lvaro",
+# title="TorBot: Open Source Intelligence Tool for Dark Web",
+# booktitle="Inventive Communication and Computational Technologies",
+# year="2020",
+# publisher="Springer Singapore",
+# address="Singapore",
+# pages="187--195",
+# abstract="The dark web has turned into a dominant source of illegal activities. With several volunteered networks, it is becoming more difficult to track down these services. Open source intelligence (OSINT) is a technique used to gather intelligence on targets by harvesting publicly available data. Performing OSINT on the Tor network makes it a challenge for both researchers and developers because of the complexity and anonymity of the network. This paper presents a tool which shows OSINT in the dark web. With the use of this tool, researchers and Law Enforcement Agencies can automate their task of crawling and identifying different services in the Tor network. This tool has several features which can help extract different intelligence.",
+# isbn="978-981-15-0146-3"
+# }
+
+cff-version: 1.2.0
+message: "If you use this software, please cite the following paper:"
+authors:
+  - family-names: P. S.
+    given-names: Narayanan
+    affiliation: Department of Computer Science and Applications, Amrita Vishwa Vidyapeetham, Amritapuri, Kollam, India
+  - family-names: Akeem T. L.
+    given-names: King
+    affiliation: USPA Technologies
+  - family-names: R
+    given-names: Ani
+    affiliation: Department of Computer Science and Applications, Amrita Vishwa Vidyapeetham, Amritapuri, Kollam, India
+keywords:
+  - tor
+  - research
+  - osint
+identifiers:
+  - type: doi
+    value: 10.1007/978-981-15-0146-3_19
+license: GNU Public License
+reposiory-code: https://github.com/DedSecInside/TorBot
+title: TorBot - Open Source Intelligence Tool for Dark Web
+date-released: 2020-01-30

README.md

+49-52
@@ -45,6 +45,7 @@ If its a new module, it should be put inside the modules directory.
 The branch name should be your new feature name in the format <Feature_featurename_version(optional)>. For example, <i>Feature_FasterCrawl_1.0</i>.
 Contributor name will be updated to the below list. 😀
 <br>
+
 <b> NOTE : The PR should be made only to `dev` branch of TorBot. </b>
 
 ### OS Dependencies
@@ -54,53 +55,62 @@ Contributor name will be updated to the below list. 😀
 
 ### Python Dependencies
 
-(see pyproject.toml for more detail)
-- beautifulsoup4
-- pyinstaller
-- PySocks
-- termcolor
-- requests
-- requests_mock
-- yattag
-- numpy
-
+(see requirements.txt for more details)
+altgraph==0.17.2
+beautifulsoup4==4.11.1
+certifi==2022.5.18.1
+charset-normalizer==2.0.12
+decorator==5.1.1
+ete3==3.1.2
+idna==3.3
+macholib==1.16
+numpy==1.22.4
+progress==1.6
+pyinstaller==5.1
+pyinstaller-hooks-contrib==2022.7
+PySocks==1.7.1
+python-dotenv==0.20.0
+requests==2.28.0
+requests-mock==1.9.3
+six==1.16.0
+soupsieve==2.3.2.post1
+termcolor==1.1.0
+threadsafe==1.0.0
+urllib3==1.26.9
+validators==0.20.0
+yattag==1.14.0
+pyqt5==5.15.6 (Install using apt/brew if pip installation fails.)
 ### Golang Dependencies
 - https://github.com/KingAkeem/gotor (This service needs to be ran in tandem with TorBot)
 
-## Basic setup
+## Installation
+
+### From source
 Before you run the torBot make sure the following things are done properly:
 
 * Run tor service
 `sudo service tor start`
 
 * Make sure that your torrc is configured to SOCKS_PORT localhost:9050
 
-* Install [Poetry](https://python-poetry.org/docs/)
+* Open a new terminal and run `cd gotor && go run main.go -server`
 
-* Disable Poetry virtualenvs (not required)
-`poetry config settings.virtualenvs.create false`
+* Install TorBot Python requirements using
+`pip install -r requirements.txt`
 
-* Install TorBot Python requirements
-`poetry install`
+Finally run the following command
 
-On Linux platforms, you can make an executable for TorBot by using the install.sh script.
-You will need to give the script the correct permissions using `chmod +x install.sh`
-Now you can run `./install.sh` to create the torBot binary.
-Run `./torBot` to execute the program.
-
-An alternative way of running torBot is shown below, along with help instructions.
-
-`python3 torBot.py or use the -h/--help argument`
+`python3 run.py -h`
 <pre>
-usage: torBot.py [-h] [-v] [--update] [-q] [-u URL] [-s] [-m] [-e EXTENSION]
+usage: run.py [-h] [-v] [--update] [-q] [-u URL] [-s] [-m] [-e EXTENSION]
 [-i]
 
 optional arguments:
 -h, --help Show this help message and exit
 -v, --version Show current version of TorBot.
 --update Update TorBot to the latest stable version
 -q, --quiet Prevent header from displaying
--u URL, --url URL Specifiy a website link to crawl, currently returns links on that page (if used alone e.g. python3 torBot.py -u https://www.github.com)
+-u URL, --url URL Specifiy a website link to crawl, currently returns links on that page (if used alone e.g. python3 run.py -u https://www.github.com)
 -s, --save Save results to a file in json format
 -m, --mail Get e-mail addresses from the crawled sites
 -e EXTENSION, --extension EXTENSION
@@ -113,11 +123,7 @@ optional arguments:
 
 Read more about torrc here : [Torrc](https://github.com/DedSecInside/TorBoT/blob/master/Tor.md)
 
-
-#### Using the GUI
-
-
-#### Using Docker
+### Using Docker
 
 - Ensure than you have a tor container running on port 9050.
 - Build the image using following command (in the root directory):
@@ -127,6 +133,14 @@ Read more about torrc here : [Torrc](https://github.com/DedSecInside/TorBoT/blob
 
 `docker run --link tor:tor --rm -ti dedsecinside/torbot`
 
+### Using executable (Linux Only)
+
+On Linux platforms, you can make an executable for TorBot by using the install.sh script.
+You will need to give the script the correct permissions using `chmod +x install.sh`
+Now you can run `./install.sh` to create the torBot binary.
+Run `./torBot` to execute the program.
+
+
 ## TO-DO
 - [x] Visualization Module
 - [x] Implement BFS Search for webcrawler
@@ -140,27 +154,8 @@ Read more about torrc here : [Torrc](https://github.com/DedSecInside/TorBoT/blob
 - [x] Increase efficiency
 
 ### Have ideas?
-If you have new ideas which is worth implementing, mention those by starting a new issue with the title [FEATURE_REQUEST].
-If the idea is worth implementing, congratz, you are now a contributor.
-
-### Cite this [paper](https://link.springer.com/chapter/10.1007/978-981-15-0146-3_19)
-
-@InProceedings{10.1007/978-981-15-0146-3_19,
-author="Narayanan, P. S.
-and Ani, R.
-and King, Akeem T. L.",
-editor="Ranganathan, G.
-and Chen, Joy
-and Rocha, {\'A}lvaro",
-title="TorBot: Open Source Intelligence Tool for Dark Web",
-booktitle="Inventive Communication and Computational Technologies",
-year="2020",
-publisher="Springer Singapore",
-address="Singapore",
-pages="187--195",
-abstract="The dark web has turned into a dominant source of illegal activities. With several volunteered networks, it is becoming more difficult to track down these services. Open source intelligence (OSINT) is a technique used to gather intelligence on targets by harvesting publicly available data. Performing OSINT on the Tor network makes it a challenge for both researchers and developers because of the complexity and anonymity of the network. This paper presents a tool which shows OSINT in the dark web. With the use of this tool, researchers and Law Enforcement Agencies can automate their task of crawling and identifying different services in the Tor network. This tool has several features which can help extract different intelligence.",
-isbn="978-981-15-0146-3"
-}
+If you have new ideas which is worth implementing, mention those by creating a new issue with the title [FEATURE_REQUEST].
+
 
 
 ### References
@@ -208,4 +203,6 @@ GNU Public License
 - [X] [SubaruSama](https://github.com/SubaruSama) - New Contributor
 - [X] [robly78746](https://github.com/robly78746) - New Contributor
 
+... see all contributors here (https://github.com/DedSecInside/TorBot/graphs/contributors)
+
 
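Review note on the "From source" steps: the README asks the reader to start the tor service and to make sure torrc exposes the SOCKS port on localhost:9050 before running TorBot. A quick way to confirm that precondition could look like the sketch below. It is not part of TorBot; the function name `socks_port_is_open` is an assumption made for this example, and the check only confirms that something accepts TCP connections on the port, not that it is a SOCKS proxy.

```python
import socket


def socks_port_is_open(host: str = "127.0.0.1", port: int = 9050, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if socks_port_is_open():
        print("Something is accepting connections on 127.0.0.1:9050.")
    else:
        print("Nothing is listening on 127.0.0.1:9050; start tor and check torrc.")
```

Running this before `python3 run.py -h` separates "tor is not running" from problems inside TorBot itself.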

docker/Dockerfile

+10-7
@@ -1,21 +1,24 @@
-FROM python:3
+FROM python:3.9
 LABEL maintainer="dedsec_inside"
 
-# Install PyQt5
+
 RUN apt-get update \
-&& apt-get install -y --no-install-recommends python3-pyqt5 \
+&& apt-get install -y virtualenv \
+&& apt-get install -y tor \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*
 
 WORKDIR /app
 
 COPY . .
 
-RUN pip install --no-cache-dir poetry
-RUN poetry config virtualenvs.create false
-RUN python -m poetry install --no-dev
+# Create virtual env
+RUN virtualenv venv --python=python3.9
+RUN source venv/bin/activate
+RUN pip install -r requirements.txt
+
 
 RUN chmod +x install.sh
 RUN bash install.sh
 
-ENTRYPOINT ["./torBot", "--ip", "tor"]
+ENTRYPOINT ["./run.py", "--ip", "tor"]

gotor

Submodule gotor updated from ddf4a70 to d123947

install.sh

+5-13
@@ -1,25 +1,17 @@
 #!/bin/bash
 
 # Makes directory for dependencies and executable to be installed
-mkdir -p tmp_build
+mkdir -p tmp_build
 mkdir -p tmp_dist
 
-# attempt to install pyinstaller using pip, python3 is prioritized
-if command -v poetry &> /dev/null; then
-poetry install
-poetry update
-else
-echo "poetry is required for installation."
-exit 1
-fi
-
+pip install pyinstaller
 
 # Creates executable file and sends dependences to the recently created directories
-pyinstaller --onefile --workpath ./tmp_build --distpath ./tmp_dist --paths=src src/torBot.py
+pyinstaller --onefile --workpath ./tmp_build --distpath ./tmp_dist --paths=src torbot/main.py
 
 # Puts the executable in the current directory
-mv tmp_dist/torBot .
+mv tmp_dist/torBot .
 
 # Removes both directories and unneeded file
 rm -r tmp_build tmp_dist
-rm torBot.spec
+rm torBot.spec
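Review note on install.sh: PyInstaller's documented default is to name the bundled app and the spec file after the first script's basename unless `--name` is given. With the entry point changed from `src/torBot.py` to `torbot/main.py`, the build would therefore be expected to emit `tmp_dist/main` and `main.spec` on Linux, while the script still moves `tmp_dist/torBot` and removes `torBot.spec`. This has not been run against the repository. The sketch below only models the naming rule; `pyinstaller_args` and `expected_binary` are hypothetical helpers, not TorBot code.

```python
from pathlib import Path
from typing import List, Optional


def pyinstaller_args(script: str, name: Optional[str] = None) -> List[str]:
    """Build a PyInstaller command line mirroring the flags used in install.sh."""
    args = [
        "pyinstaller", "--onefile",
        "--workpath", "./tmp_build",
        "--distpath", "./tmp_dist",
        "--paths=src",
    ]
    if name is not None:
        # An explicit name overrides the script-basename default.
        args += ["--name", name]
    args.append(script)
    return args


def expected_binary(script: str, name: Optional[str] = None) -> str:
    """Executable name on Linux: --name if given, otherwise the script's stem."""
    return name if name is not None else Path(script).stem


if __name__ == "__main__":
    print(expected_binary("torbot/main.py"))
    print(" ".join(pyinstaller_args("torbot/main.py", name="torBot")))
```

If the naming rule applies as documented, passing `--name torBot` would keep the later `mv` and `rm` lines in install.sh valid without further changes.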
