Final year project for CS&AI degree at the University of Edinburgh

cristeaadrian/dissertation

Mining Reddit to Identify Factors that Describe Prominent Links Between Different Communities

This project attempted to identify factors that describe prominent links between online communities on Reddit. Data was mined from Google BigQuery, a cloud data warehouse, and then processed. From this dataset we built a social network graph in which nodes are subreddits and edges represent shared users. Each edge carries a weight: whenever a user posted a comment in two subreddits, the weight of the edge between them increased. We then chose four centrality measures, all widely used in social network analysis, as measures of subreddit popularity.
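The graph construction described above can be sketched as follows. This is a minimal illustration and not the project's actual code: the function names, the sample records, and the choice to add one unit of weight per shared user are all assumptions. The four centrality measures used in the project are not named here, so the sketch computes weighted degree (the sum of incident edge weights) purely as a stand-in.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_edge_weights(comments):
    """Return {(sub_a, sub_b): weight} from (user, subreddit) records.

    Assumption: an edge gains one unit of weight for every user who
    commented in both subreddits.
    """
    subs_by_user = defaultdict(set)
    for user, subreddit in comments:
        subs_by_user[user].add(subreddit)

    weights = Counter()
    for subs in subs_by_user.values():
        # Sort so each undirected edge has a single canonical key.
        for a, b in combinations(sorted(subs), 2):
            weights[(a, b)] += 1
    return weights


def weighted_degree(weights):
    """Sum the weights of all edges touching each subreddit."""
    strength = Counter()
    for (a, b), w in weights.items():
        strength[a] += w
        strength[b] += w
    return strength


# Hypothetical comment records: (user, subreddit).
comments = [
    ("alice", "python"), ("alice", "datascience"),
    ("bob", "python"), ("bob", "datascience"), ("bob", "learnprogramming"),
    ("carol", "python"),
]
edges = build_edge_weights(comments)
centrality = weighted_degree(edges)
```

With these sample records, the `datascience`/`python` edge has weight 2 because two users commented in both, while `carol` contributes no edge because she commented in only one subreddit.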

We had three research objectives: replicate results from a previous study, find additional factors that are unique to Reddit, and use an advanced sentiment analysis tool to analyse the content of the comments in context. For each objective, we analysed different types of factors that could be correlated with the results obtained earlier, and found several predictors that helped explain the variance in those results. Finally, we discussed and interpreted our findings by comparing them with previous research in this field, and suggested future areas to explore.
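As a rough illustration of relating a candidate factor to a popularity score, the sketch below computes a Pearson correlation and its square, which for a single predictor equals the share of variance explained by a linear fit. The statistical method actually used in the project is not stated here, so this is an assumption, and the factor name and all numbers are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical values: one factor and one centrality score per subreddit.
factor = [1.0, 2.0, 3.0, 4.0]
score = [2.0, 4.1, 5.9, 8.0]

r = pearson_r(factor, score)
r_squared = r ** 2  # share of variance explained by a one-predictor linear fit
```

A value of `r_squared` close to 1 would suggest the factor is a strong linear predictor of the score. With several predictors at once, a multiple regression would be the more natural tool.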
