patmendoza330/README.md

Hi, I'm Pat Mendoza, a recent grad from Cornell. I enjoy figuring out ways to integrate and present data in an aesthetically pleasing and informative format. This often involves transforming files in Python, then integrating them and creating visualizations in R.

I've worked in SQL, R, and Python for several years and am currently working on uploading some examples of my work.

Here are a few that I have so far (more upcoming):

  1. Google Data Analytics Professional Certificate Capstone Project - Tableau Link
    • This is the final project for my certificate. I extract viewership data from an API and convert the JSON into tabular data, which gets loaded onto Kaggle. Then I create a viz in Tableau that lets users explore the data.
    1. Extracting the data - here I go through the code that extracts data from MyAnimeList via their API and converts it into tables.
    2. Cleaning the data - here I go through my cleaning process to ensure that the data is ready for loading into Tableau and Kaggle.
  2. R
    1. Mirrorplot - a simple mirrorplot, which can be a good visualization for showing up- and down-regulated genes in an RNA-seq experiment.
    2. Clustering Samples - clustering is a common exercise to determine how closely samples are related to each other. This shows how samples can be clustered using PCoA and PCA and visualized using ggplot.
    3. Data Wrangling with tidyr and dplyr - converting and integrating data from multiple sources is often tricky business. Luckily there are some great tools available that make this a breeze.
  3. Python
    1. Converting files from non-tabular to tabular format - Oftentimes, we come across data that isn't in the form we need to make joins. When that happens, we can convert it using simple Python scripts.
    2. Data Wrangling with pandas and numpy - This is a replica of the above R data wrangling, but using Python with pandas and numpy in place of tidyr and dplyr.
  4. Misc Python (in development) - some misc scripts that I used to aid in RNA-seq generation.
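
To give a flavor of the JSON-to-table conversion described in the capstone and Python items above, here is a minimal sketch using only the Python standard library. It is not the code from the linked repos: the `flatten` and `json_to_csv` helpers and the sample field names (`stats`, `genres`, and so on) are hypothetical and do not reflect the actual MyAnimeList API schema.

```python
import csv
import io
import json


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse so {"stats": {"watching": 1}} becomes "stats.watching".
            flat.update(flatten(value, full_key, sep))
        elif isinstance(value, list):
            # Join list values into one cell; a fuller pipeline might
            # split them into a separate table instead.
            flat[full_key] = "|".join(str(item) for item in value)
        else:
            flat[full_key] = value
    return flat


def json_to_csv(json_text):
    """Convert a JSON array of objects into CSV text."""
    rows = [flatten(record) for record in json.loads(json_text)]
    # Union of all keys in first-seen order, so ragged records still fit.
    columns = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample payload; the field names are made up for illustration.
sample = json.dumps([
    {"id": 1, "title": "Show A",
     "stats": {"watching": 120, "completed": 300},
     "genres": ["Action", "Drama"]},
    {"id": 2, "title": "Show B", "stats": {"watching": 45}},
])

print(json_to_csv(sample))
```

Records that are missing a field simply get an empty cell, which keeps every row aligned to the same columns for later joins.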

Popular repositories

  1. animelistextract (Public)

    Extract API information from MyAnimeList

    3

  2. crunchyrolltitles (Public)

    Scraping Crunchyroll titles and dataset

    3 1

  3. redfinrentinc2021 (Public)

    1

  4. patmendoza330 (Public)

    Config files for my GitHub profile.

  5. mirrorplot (Public)

    A simple mirrorplot can be a good visualization for showing up- and down-regulated genes in an RNA-seq experiment. This details how to create a mirrorplot using ggplot2.

    R

  6. clustering (Public)

    Clustering is a common exercise to determine how closely samples are related to each other. This shows how samples can be clustered using a PCoA and PCA and visualizing using ggplot. Particularly, …

    R