🤖 Sentiment Analysis Using TF-IDF and a Neural Network

📝 Project Overview

This project focuses on binary sentiment classification of book reviews. Given the text of a review, the goal is to predict whether the sentiment is positive or negative. The classifier is a feedforward neural network trained on TF-IDF features extracted from the review text.

This work is part of my learning journey through the Break Through Tech AI Program, where I applied machine learning concepts in natural language processing (NLP).

📚 What the Project Covers

✅ 1. Define the ML Problem

Problem Type: Binary classification
Input: Raw text from book reviews
Output: Sentiment label (positive or negative)

🧹 2. Data Preprocessing

Tokenized and cleaned text reviews
Removed stopwords and punctuation
Transformed text using TfidfVectorizer with:
- Custom max_df and min_df thresholds
- Max features set to 3000
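
The vectorization step can be sketched roughly as follows. This is a minimal illustration, not the notebook's exact code: the toy reviews and the `max_df` / `min_df` values are assumptions, and only `max_features=3000` comes from the settings listed above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy reviews standing in for the book-review text (hypothetical data).
reviews = [
    "A wonderful, moving story. I loved every page!",
    "Boring plot and flat characters. I did not finish it.",
    "Loved the characters, and the plot kept me hooked.",
    "A boring, predictable story. Not recommended.",
]

vectorizer = TfidfVectorizer(
    lowercase=True,        # normalize case before tokenizing
    stop_words="english",  # drop common English stopwords
    max_df=0.9,            # assumed value: ignore terms in more than 90% of reviews
    min_df=1,              # assumed value: keep terms seen in at least 1 review
    max_features=3000,     # cap the vocabulary size, as in the project
)

# Sparse matrix with one row per review and one column per vocabulary term.
X = vectorizer.fit_transform(reviews)
print(X.shape)
print(sorted(vectorizer.vocabulary_)[:10])
```

The default token pattern keeps only word characters, so punctuation is dropped without a separate cleaning pass.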

🤖 3. Model Development

Model Type: Feedforward Neural Network using tensorflow.keras
Architecture:
- Dense hidden layers with ReLU activation
- Dropout regularization
- Final sigmoid layer for binary output
Trained on the TF-IDF transformed text data
Used binary crossentropy loss and SGD optimizer
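
A network of this shape might look like the sketch below. The layer widths, dropout rate, learning rate, and the random stand-in data are all assumptions made for illustration; the source only specifies dense ReLU layers, dropout, a sigmoid output, binary crossentropy loss, and SGD.

```python
import numpy as np
from tensorflow import keras

n_features = 3000  # matches the TF-IDF max_features setting

model = keras.Sequential([
    keras.layers.Input(shape=(n_features,)),
    keras.layers.Dense(64, activation="relu"),    # assumed width
    keras.layers.Dropout(0.25),                   # assumed rate
    keras.layers.Dense(32, activation="relu"),    # assumed width
    keras.layers.Dropout(0.25),                   # assumed rate
    keras.layers.Dense(1, activation="sigmoid"),  # probability of the positive class
])

model.compile(
    optimizer=keras.optimizers.SGD(learning_rate=0.1),  # assumed learning rate
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

# Random stand-in data, used only to show the expected input and output shapes.
rng = np.random.default_rng(0)
X_demo = rng.random((8, n_features)).astype("float32")
y_demo = rng.integers(0, 2, size=8)

model.fit(X_demo, y_demo, epochs=1, batch_size=4, verbose=0)
probs = model.predict(X_demo, verbose=0)  # shape (8, 1), values between 0 and 1
```

With real data, the sparse TF-IDF matrix would typically be converted with `.toarray()` before being passed to `fit`.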

💾 4. Model Persistence

Saved both the trained model and the vectorizer using:
- model.save() from TensorFlow
- pickle.dump() for the TF-IDF vectorizer
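
As a rough sketch of the vectorizer half of this step, the round trip below pickles a fitted `TfidfVectorizer` and loads it back. The toy training text and the temporary file location are assumptions; the Keras model would be saved separately with `model.save()`.

```python
import os
import pickle
import tempfile

from sklearn.feature_extraction.text import TfidfVectorizer

# Fit a small vectorizer on toy text (hypothetical data).
vectorizer = TfidfVectorizer(max_features=3000)
vectorizer.fit(["a great book", "a dull book"])

# Hypothetical location; the project would use its own file path.
path = os.path.join(tempfile.mkdtemp(), "tfidf_vectorizer.pkl")

with open(path, "wb") as f:
    pickle.dump(vectorizer, f)

with open(path, "rb") as f:
    restored = pickle.load(f)

# The restored vectorizer carries the same term-to-column mapping.
print(restored.vocabulary_ == vectorizer.vocabulary_)
```

Persisting the fitted vectorizer matters because the model's input columns are only meaningful under the vocabulary learned at training time.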

📈 5. Evaluation

Accuracy, precision, recall, and F1-score calculated on test data
Plotted the precision-recall curve to analyze classifier performance
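
The metrics above could be computed along these lines. The labels and probabilities are made-up values for illustration, and the 0.5 decision threshold is an assumption.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    precision_recall_curve,
    precision_recall_fscore_support,
)

# Hypothetical true labels and predicted probabilities for six test reviews.
y_true = np.array([1, 0, 1, 1, 0, 0])
y_prob = np.array([0.9, 0.2, 0.6, 0.4, 0.7, 0.1])

# Turn probabilities into hard labels at an assumed threshold of 0.5.
y_pred = (y_prob >= 0.5).astype(int)

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary"
)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")

# Precision and recall at every candidate threshold, ready for plotting.
pr_precision, pr_recall, thresholds = precision_recall_curve(y_true, y_prob)
```

Passing `pr_recall` and `pr_precision` to `matplotlib.pyplot.plot` would draw the precision-recall curve.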

🔧 Technologies Used

Python
pandas, numpy
scikit-learn (TfidfVectorizer, metrics)
TensorFlow / Keras (modeling)
matplotlib, seaborn (visualizations)
pickle (persistence)

🎯 Key Takeaways

Implemented a custom neural network for NLP tasks
Tuned TF-IDF vectorizer parameters to improve model generalization
Understood the importance of precision-recall trade-offs
Learned to persist and reload models and vectorizers for deployment use cases

📌 How to Reuse the Model

Load the saved TF-IDF vectorizer from the .pkl file
Load the neural network using tensorflow.keras.models.load_model()
Pass new text data through the vectorizer, then predict with the model
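
The three steps above can be sketched end to end as shown here. To keep the example self-contained, it first creates and saves a tiny untrained stand-in model and vectorizer; the file paths, toy text, and 0.5 threshold are assumptions, and the predictions are not meaningful because the stand-in model has no trained weights.

```python
import os
import pickle
import tempfile

from sklearn.feature_extraction.text import TfidfVectorizer
from tensorflow import keras

workdir = tempfile.mkdtemp()
vec_path = os.path.join(workdir, "tfidf_vectorizer.pkl")     # hypothetical path
model_path = os.path.join(workdir, "sentiment_model.keras")  # hypothetical path

# Stand-in for the training run: create and save tiny artifacts.
vectorizer = TfidfVectorizer().fit(["great book", "dull book"])
model = keras.Sequential([
    keras.layers.Input(shape=(len(vectorizer.vocabulary_),)),
    keras.layers.Dense(1, activation="sigmoid"),
])
with open(vec_path, "wb") as f:
    pickle.dump(vectorizer, f)
model.save(model_path)

# Reuse: load both artifacts, vectorize new text, then predict.
with open(vec_path, "rb") as f:
    loaded_vectorizer = pickle.load(f)
loaded_model = keras.models.load_model(model_path)

new_reviews = ["a great read", "dull and slow"]
features = loaded_vectorizer.transform(new_reviews).toarray()  # dense input for Keras
probs = loaded_model.predict(features, verbose=0).ravel()
labels = ["positive" if p >= 0.5 else "negative" for p in probs]
print(list(zip(new_reviews, labels)))
```

New text must go through `transform`, not `fit_transform`, so that it is mapped onto the vocabulary the model was trained with.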

🤔 Reflection

This project helped me solidify my understanding of:

NLP text vectorization techniques
Neural network design and optimization
Model persistence for real-world ML pipelines
Evaluation strategies beyond accuracy
