Twitter-sentiment-analysis

A sentiment analysis model trained on a Kaggle GPU using the Sentiment140 dataset (1.6 million tweets).

Deployed on my personal Docker Hub repository: Click here

Kaggle notebook link: Kaggle notebook

Dataset (Sentiment140+GloVe)

Train/test split : 90% / 10%
Size : 1.6M samples
Link : Dataset

Model

Model type : Sequential, RNN, Binary classification
Optimizer : Adam
Loss function : Binary cross entropy
Outputs : Sentiment score in [0, 1]
Thresholds (fine-tuned) : >= 0.625 -> "Positive", < 0.625 -> "Negative"
Best validation accuracy : 83%
F1-score : 0.8340
Version : 4

Metric	Negative	Positive
Precision	0.84	0.82
Recall	0.82	0.84
F1-score	0.83	0.83
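The fine-tuned threshold listed above turns the model's score into a label. The following is a minimal sketch of that rule; the helper name `label_from_score` and the constant name are illustrative and not taken from the repository.

```python
# Decision threshold reported in the Model section.
POSITIVE_THRESHOLD = 0.625


def label_from_score(score: float, threshold: float = POSITIVE_THRESHOLD) -> str:
    """Map a sentiment score in [0, 1] to a label (hypothetical helper)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    # Scores at or above the threshold are positive, everything else negative.
    return "Positive" if score >= threshold else "Negative"


if __name__ == "__main__":
    for s in (0.10, 0.625, 0.90):
        print(s, label_from_score(s))
```

Because the threshold sits above 0.5, a score between 0.5 and 0.625 is labelled "Negative" even though the raw output leans positive.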

Training

Training epochs : 50 configured; training stopped at epoch 22 through early stopping (patience = 10)
Training environment : Kaggle GPU
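To illustrate how a patience of 10 can end a 50-epoch run at epoch 22, here is a framework-free sketch of the usual early-stopping rule: stop once the monitored validation loss has not improved for `patience` consecutive epochs. In Keras this is typically handled by `tf.keras.callbacks.EarlyStopping`; whether the training notebook monitors validation loss is an assumption, and the loss values below are invented for the example.

```python
def epochs_run(val_losses, patience=10, max_epochs=50):
    """Return the epoch (1-indexed) at which training would stop."""
    best = float("inf")
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    # No early stop: training ran through every available epoch.
    return min(len(val_losses), max_epochs)


# Made-up curve: the loss improves for 12 epochs, then plateaus.
illustrative_losses = [1.0 - 0.05 * i for i in range(12)] + [0.5] * 38

if __name__ == "__main__":
    print(epochs_run(illustrative_losses))
```

Under this rule, a stop at epoch 22 with patience 10 would mean the last improvement happened at epoch 12.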

Architecture

Inferences (with TensorFlow Serving REST API)
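A request to the TensorFlow Serving REST predict endpoint can be sketched as follows. The endpoint shape (`POST /v1/models/<name>:predict` with an `instances` list, answered by a `predictions` list) is the standard TensorFlow Serving REST API. The host, port, model name `sentiment`, and the assumption that each instance is an already tokenized and padded sequence are placeholders, not details from this repository.

```python
import json
from urllib import request


def predict_url(host="localhost", port=8501, model_name="sentiment"):
    """Build the predict URL (host, port and model name are assumptions)."""
    return f"http://{host}:{port}/v1/models/{model_name}:predict"


def build_payload(instances):
    """Encode model inputs in TensorFlow Serving's row format."""
    return json.dumps({"instances": instances}).encode("utf-8")


def parse_scores(response_body):
    """Extract one float score per instance from the JSON response."""
    predictions = json.loads(response_body)["predictions"]
    # A single-unit sigmoid output usually comes back as [[score], ...].
    return [float(p[0]) if isinstance(p, list) else float(p) for p in predictions]


def predict(instances, url=None, timeout=10.0):
    """Send the request; needs a running TensorFlow Serving instance."""
    req = request.Request(
        url or predict_url(),
        data=build_payload(instances),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return parse_scores(resp.read())
```

Only `predict` touches the network; the other helpers can be checked offline.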

Some results using Power BI + Python

Positive tweets

Negative tweets

Data by country (when available)

Useful scripts and notebooks

Notebooks

Training notebook

How inferences were made on our dataset

Data cleaning notebook

Data exploration notebook

Scripts

Link to the TensorFlow Serving script

There's also a useful script (a command-line runner) that converts .h5 models to the TF SavedModel format here

Data collection (tweets about Messi and Ronaldo)

Collected using the Twitter API
Scripts for searching and saving 100*n tweets containing a keyword : Tweets about Messi & Tweets about Ronaldo

NOTE: Running these scripts requires a Twitter developer account and a bearer_token, either stored in a text file whose path is set manually in the code or exported as an environment variable.
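The two ways of supplying the token described in the note can be sketched like this. The environment variable name `BEARER_TOKEN`, the helper names, and the use of the Twitter API v2 recent search endpoint with 100 tweets per page are assumptions for illustration; the repository's scripts may differ.

```python
import os
from pathlib import Path
from urllib.parse import urlencode

# Twitter API v2 recent search endpoint (assumed; not confirmed by the scripts).
SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"


def load_bearer_token(env_var="BEARER_TOKEN", token_file=None):
    """Read the token from the environment, else from a text file."""
    token = os.environ.get(env_var)
    if token:
        return token.strip()
    if token_file is not None:
        return Path(token_file).read_text(encoding="utf-8").strip()
    raise RuntimeError(f"set {env_var} or pass token_file")


def auth_headers(token):
    """Bearer-token header for the HTTP request."""
    return {"Authorization": f"Bearer {token}"}


def search_request_url(keyword, next_token=None):
    """URL for one page of up to 100 tweets matching the keyword."""
    params = {"query": keyword, "max_results": 100}
    if next_token:
        # Pagination token returned in the previous page's metadata.
        params["next_token"] = next_token
    return f"{SEARCH_URL}?{urlencode(params)}"
```

Requesting n pages this way, passing each response's next token to the following call, would yield up to 100*n tweets per keyword.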

Libraries

Deep learning framework : TensorFlow 2.6 or higher
Data handling and visualization : Pandas, Seaborn, Matplotlib
Regular expressions : re
NLP library : NLTK
Train/test splitting, classification_report : Scikit-learn
