Pelusa Server

Description

Pelusa (Predictive Engine for Legitimate & Unverified Site Assessment) is a machine-learning application that predicts whether a website is legitimate based on its URL. It is built with FastAPI and PostgreSQL and deployed with Docker Compose.

Installation

To get started, clone the repository and run it with Docker Compose:

git clone https://github.com/javi-aranda/pelusa-server
cd pelusa-server
docker-compose up  # add the -d flag to run detached
docker-compose exec -T backend alembic upgrade head  # apply the Alembic migrations (use a second terminal unless running detached)

The application should then be available at http://localhost:8000.

Usage

A detailed API reference is available at http://localhost:8000/docs. The main endpoint is api/v1/analysis, which accepts a JSON body of the form {"input": "<URL_TO_CHECK>"} and returns the predicted legitimacy of the website (1 means potentially malicious, 0 means potentially safe).
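As a minimal sketch of calling the endpoint from Python, the snippet below builds the request with the standard library only. It assumes the endpoint is served over POST at http://localhost:8000/api/v1/analysis; the HTTP method and the response schema are not stated in this README, so check http://localhost:8000/docs. The helper names build_request and analyze are illustrative.

```python
import json
import urllib.request

# Assumed base URL and path; adjust if your deployment differs.
API_URL = "http://localhost:8000/api/v1/analysis"


def build_request(url_to_check: str) -> urllib.request.Request:
    """Build a JSON request carrying {"input": "<URL_TO_CHECK>"}."""
    body = json.dumps({"input": url_to_check}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the endpoint accepts POST
    )


def analyze(url_to_check: str) -> object:
    """Send the request and return the decoded JSON response unchanged."""
    with urllib.request.urlopen(build_request(url_to_check), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, calling analyze("http://example.com/login") returns whatever JSON the API responds with; the sketch does not assume a particular response shape.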

The results are stored in a PostgreSQL database, which could be useful for retraining the model in the future, or for reusing a stored result when the same URL is submitted several times in a short period.
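The idea of reusing stored results can be sketched as a lookup before prediction. This is a hypothetical illustration, not the project's code: an in-memory dict stands in for the PostgreSQL table, and the names CachedAnalyzer, predict, ttl_seconds, and clock are assumptions.

```python
import time
from typing import Callable, Dict, Tuple


class CachedAnalyzer:
    """Reuse a recent result for a URL instead of re-running the model."""

    def __init__(
        self,
        predict: Callable[[str], int],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._predict = predict
        self._ttl = ttl_seconds
        self._clock = clock
        # url -> (label, timestamp); stands in for a database table
        self._store: Dict[str, Tuple[int, float]] = {}

    def analyze(self, url: str) -> int:
        now = self._clock()
        hit = self._store.get(url)
        if hit is not None and now - hit[1] < self._ttl:
            return hit[0]  # stored result is still fresh, skip the model
        label = self._predict(url)
        self._store[url] = (label, now)
        return label
```

The clock is injectable so that expiry can be exercised without waiting; in a real deployment the timestamp would more likely come from the stored row.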

Exploring the database

A pgAdmin instance runs at http://localhost:5050, with credentials defined in the .env file. After connecting to the PostgreSQL server, you can explore the database and run queries against it.

Dataset

The dataset used to train the model was assembled by hand. It consists of 30,000 URLs, 50% legitimate and 50% malicious.

Malicious URLs were randomly sampled from PhishTank active threats, and legitimate URLs were sampled from multiple Kaggle datasets. After extracting features for both classes, the resulting dataset is phishing_dataset.csv.
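The features stored in phishing_dataset.csv are not listed in this README. As a rough illustration of what URL-based feature extraction can look like, the sketch below computes a few common lexical features. The function name extract_features and every feature name are assumptions, not the project's actual feature set.

```python
import ipaddress
from urllib.parse import urlparse


def extract_features(url: str) -> dict:
    """Compute a few illustrative lexical features from a URL string."""
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # A raw IP address in place of a domain name is a common phishing signal.
    try:
        ipaddress.ip_address(host)
        host_is_ip = 1
    except ValueError:
        host_is_ip = 0

    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
        "host_is_ip": host_is_ip,
    }
```

Each URL becomes one row of numeric values, which is the shape a tree-based classifier expects.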

Training

The model is a Random Forest classifier that reaches 94% accuracy on the training dataset. The training code is available as a Jupyter Notebook in train.ipynb.
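Accuracy measured on the data a model was trained on tends to be optimistic, so a held-out split is a useful complement. The sketch below is not the contents of train.ipynb. It assumes scikit-learn is installed and uses synthetic data from make_classification in place of phishing_dataset.csv, whose columns are not described here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real feature matrix and labels (assumption).
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

# Hold out 20% of the rows, keeping the class balance in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Compare accuracy on seen rows with accuracy on unseen rows.
train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))
print(f"train accuracy: {train_acc:.3f}, held-out accuracy: {test_acc:.3f}")
```

A large gap between the two numbers would suggest overfitting; the values printed here describe only the synthetic data.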

Credits

This project used FastAPI Starter as a reference, but keeps the frontend in a separate repository, Pelusa React.

