advil64/whats-the-tea

What's the Tea

Web Application

Abstract

We implemented real-time topic classification for news articles. A deep learning model, trained on news articles, classifies each article into 42 categories. We then apply this model to real-time Tweets from various authorized Twitter news handles to predict the topics being discussed at any given time. Users can also view the top N most popular Twitter topics at any given time, along with their related Tweets.
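
As an illustration of the top-N view described above, here is a minimal sketch that ranks topics by how many Tweets were assigned to them. It assumes the classifier's output is already available as a list of (tweet text, predicted topic) pairs; the function name `top_n_topics` and the sample topic labels are hypothetical and are not taken from the project code.

```python
from collections import Counter


def top_n_topics(predictions, n):
    """Rank topics by Tweet count and group the Tweets under each top topic.

    predictions: list of (tweet_text, predicted_topic) pairs (assumed format).
    Returns (ranked, tweets_by_topic), where ranked is a list of
    (topic, count) pairs ordered from most to least frequent.
    """
    counts = Counter(topic for _, topic in predictions)
    ranked = counts.most_common(n)
    tweets_by_topic = {
        topic: [text for text, predicted in predictions if predicted == topic]
        for topic, _ in ranked
    }
    return ranked, tweets_by_topic
```

Calling `top_n_topics(predictions, 5)` would return the five most frequent predicted topics along with the Tweets assigned to each.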

View the report here.

Data Source

We are using multiple datasets for this project:

HuffPost Dataset: https://www.kaggle.com/datasets/rmisra/news-category-dataset
RealNews Dataset: https://paperswithcode.com/dataset/realnews
News Aggregator Dataset: https://www.kaggle.com/datasets/uciml/news-aggregator-dataset
A Million News Headlines: https://www.kaggle.com/datasets/therohk/million-headlines
All the News 2.0: https://components.one/datasets/all-the-news-2-news-articles-dataset/
India News Headlines Dataset: https://www.kaggle.com/datasets/therohk/india-headlines-news-dataset

Installation

  • Create a .env file in the root directory with the following fields for Tweepy user authentication:
bearer_token=YOUR_BEARER_TOKEN
consumer_key=YOUR_CONSUMER_KEY
consumer_secret=YOUR_CONSUMER_SECRET
access_token=YOUR_ACCESS_TOKEN
access_token_secret=YOUR_ACCESS_TOKEN_SECRET
  • Install required libraries: pip install -r requirements.txt
  • In the deep_learning_clustering/twitter_dash directory, run the server: flask --app main.py run
  • Open a web browser and visit the following URL: http://127.0.0.1:5000/api/docs
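
Before starting the server, it can help to confirm that the .env file contains every field listed above. The following is a minimal sketch using only the standard library; the helper names (`load_env`, `missing_keys`, `make_client`) are hypothetical, and the project itself may load these values differently (for example, with a dotenv package). The `make_client` helper assumes Tweepy v4, whose `tweepy.Client` accepts these five credentials as keyword arguments.

```python
from pathlib import Path

# Field names taken from the .env template in the installation steps.
REQUIRED_KEYS = (
    "bearer_token",
    "consumer_key",
    "consumer_secret",
    "access_token",
    "access_token_secret",
)


def load_env(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values):
    """Return the required fields that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


def make_client(values):
    """Build a Tweepy client from the parsed values (assumes Tweepy v4)."""
    import tweepy  # imported lazily so the checks above work without Tweepy

    return tweepy.Client(
        bearer_token=values["bearer_token"],
        consumer_key=values["consumer_key"],
        consumer_secret=values["consumer_secret"],
        access_token=values["access_token"],
        access_token_secret=values["access_token_secret"],
    )
```

A typical check would be `missing_keys(load_env())`, which returns an empty list when all five fields are set.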

Contributors


Advith Chegu

Vipul Gharde


Diksha Wuthoo
