# Fine-Tuning DistilBERT for Sentiment Analysis

This repository contains the code and workflow for fine-tuning the DistilBERT (`distilbert-base-uncased`) model from Hugging Face on a sentiment analysis task. The training data is sourced from Kaggle.

## 🚀 Features

- Fine-tuning `distilbert-base-uncased` for sentiment classification.
- Data preprocessing and tokenization using Hugging Face Transformers.
- Model training and evaluation.
- Inference script to predict sentiment on new text samples.

## 📂 Dataset

The three datasets used for fine-tuning are available on Kaggle, and the final merged dataset is hosted on Hugging Face. You can download them using the links below:

  1. IMDB dataset (Sentiment analysis) in CSV format link
  2. Sentiment Analysis Dataset link
  3. Stock News Sentiment Analysis(Massive Dataset) link
  4. Final merged dataset on Hugging Face link (see the loading sketch below)
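
For reference, here is a minimal sketch of loading the published dataset with the `datasets` library. The repo id below is a placeholder, since the actual identifier is only available through the link above:

```python
from datasets import load_dataset

# Placeholder repo id: substitute the actual Hugging Face dataset
# identifier linked above.
dataset = load_dataset("your-username/sentiment-dataset")

print(dataset)               # available splits and column names
print(dataset["train"][0])   # inspect a single example
```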

## 📊 Data Preprocessing & Visualization

The dataset is cleaned, preprocessed, and visualized using Pandas, Matplotlib, and Seaborn. Open and run the notebook:

📜 Notebook: `notebooks/data_preprocessing.ipynb`
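
For orientation, here is a minimal sketch of the kind of cleaning and visualization the notebook performs. The CSV path and the `text`/`label` column names are assumptions, not the repository's confirmed schema:

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# File path and column names ("text", "label") are assumptions about
# the dataset schema, not confirmed by the repository.
df = pd.read_csv("data/sentiment.csv")

# Basic cleaning: drop missing values and duplicate texts.
df = df.dropna(subset=["text", "label"]).drop_duplicates(subset=["text"])

# Visualize the class balance before training.
sns.countplot(x="label", data=df)
plt.title("Sentiment class distribution")
plt.show()
```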

## 📦 Installation

Clone the repository and install the required dependencies:

```bash
git clone https://github.com/KaushiML3/Fine-tuning-a-LLM-for-sentiment-analysis.git
cd Fine-tuning-a-LLM-for-sentiment-analysis
pip install -r requirements.txt
```

## 🛠 Training the Model

The DistilBERT model is fine-tuned using Hugging Face's Transformers library, with LoRA adapters for parameter-efficient training. Training includes learning rate scheduling and evaluation metrics. There are two ways to run it:

  1. 📜 Notebook: `notebook/Fine tune LLM with LoRA for sentiment analysis.ipynb`
  2. Alternatively, run the training script:

```bash
python train.py
```
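
For orientation, the sketch below shows what a LoRA fine-tuning setup for DistilBERT typically looks like with Transformers and PEFT. It is a minimal illustration under stated assumptions, not the repository's exact configuration: the hyperparameters, the `num_labels=2` choice, and the public IMDB dataset standing in for the combined dataset are all assumptions; consult the notebook for the real settings.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)
from peft import LoraConfig, TaskType, get_peft_model

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)  # binary labels: an assumption

# Attach LoRA adapters; rank and alpha are illustrative choices.
# "q_lin"/"v_lin" are the attention projection layers in DistilBERT.
peft_config = LoraConfig(task_type=TaskType.SEQ_CLS, r=8, lora_alpha=16,
                         lora_dropout=0.1, target_modules=["q_lin", "v_lin"])
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # only the adapter weights train

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

# Stand-in dataset; the repo trains on its combined Kaggle datasets.
ds = load_dataset("imdb").map(tokenize, batched=True)

args = TrainingArguments(output_dir="outputs",
                         learning_rate=2e-5,
                         lr_scheduler_type="linear",  # LR scheduling
                         num_train_epochs=3,
                         per_device_train_batch_size=16)

Trainer(model=model, args=args,
        train_dataset=ds["train"], eval_dataset=ds["test"]).train()
```

Because only the LoRA adapter weights are trainable, fine-tuning updates a small fraction of DistilBERT's parameters, which keeps memory use and checkpoint size low.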

## 🔍 Inference

To test the model on new text inputs, run:

```bash
python app.py
```
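
As a rough illustration of what such an inference script might do (the checkpoint path below is a placeholder, and `app.py` may instead serve the model behind a UI or API):

```python
from transformers import pipeline

# "outputs/checkpoint-final" is a placeholder path to the fine-tuned
# (and, for LoRA, merged) model checkpoint.
classifier = pipeline("text-classification",
                      model="outputs/checkpoint-final",
                      tokenizer="distilbert-base-uncased")

print(classifier("The movie was an absolute delight!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99}]  (label names depend on config)
```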

Screenshots: sentiment analysis DistilBERT model demo.

## 📄 Acknowledgments

  1. Hugging Face for the DistilBERT model.
  2. Kaggle for the datasets.
