This repository contains Apache Spark-based projects written in Python or Scala. The intent is for each project directory to contain both implementations. A detailed explanation of each project and its specifications is in that project's directory.
This project uses Spark's Streaming API to gather and process Twitter data. It analyzes both live-stream and historical data to answer questions such as which hashtag is currently most common, which users a specified user mentions most often, and which hashtags a specified user uses most often.
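As a rough illustration of the "most common hashtag" question, the counting step can be sketched in plain Python. This is not the project's Spark code: the function name `top_hashtags`, the regular expression, and the sample tweets are all assumptions made up for this example, and hashtags are lowercased here as one possible normalization choice.

```python
import re
from collections import Counter

# Assumption: a hashtag is "#" followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def top_hashtags(tweets, n=3):
    """Return the n most frequent hashtags (lowercased) with their counts."""
    counts = Counter()
    for text in tweets:
        # Lowercase so "#Spark" and "#spark" count as the same tag.
        counts.update(tag.lower() for tag in HASHTAG_RE.findall(text))
    return counts.most_common(n)


# Hypothetical sample input, not real Twitter data.
sample_tweets = [
    "Learning #Spark streaming today #bigdata",
    "#spark makes #BigData fun",
    "Coffee first, then #spark",
]

top = top_hashtags(sample_tweets, n=2)
print(top)
```

In a Spark job the same idea would typically be expressed as extracting the tags from each tweet and then aggregating counts per tag across the cluster; the per-tweet extraction logic stays the same.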
This project uses the Our World in Data and IMF World Economies datasets to probe questions about the pandemic, its effects on global economies, and how countries responded to it.
This project uses data from the MeetUp.com API to decipher and chart trends, such as which state has the most MeetUp venues, whether longer or shorter events are more popular, and which payment method is most common.
This project uses historical cryptocurrency exchange data from CryptoDataDownload.com to find trends in price fluctuations, as well as symbiotic movements in which coin prices rise and fall together.
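One common way to quantify whether two coins move together is the Pearson correlation of their period-over-period returns. The sketch below assumes that approach; the project itself may use a different measure. The helper names `pct_returns` and `pearson` and the price series are hypothetical, chosen only to illustrate the calculation.

```python
from math import sqrt


def pct_returns(prices):
    """Fractional change between consecutive prices."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up closing prices for two coins (not real market data).
coin_a = [100.0, 110.0, 99.0, 108.9]
coin_b = [10.0, 11.0, 9.9, 10.89]

corr = pearson(pct_returns(coin_a), pct_returns(coin_b))
print(round(corr, 6))
```

A value near 1 means the two coins' returns tend to move in the same direction, a value near -1 means they tend to move in opposite directions, and a value near 0 suggests little linear relationship.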