A carefully curated repository of Artificial Intelligence (AI), Machine Learning (ML), and Data Science (DS) literature with practical implementations using Jupyter Notebook "skill-builders".
Check out our new Reading Rack, where we review the important literature in each sub-field of ML. We tackle 10 papers a week so that we quickly learn the current state of the art. We will also add tutorials that apply the papers to real problems using Python or MATLAB code.
The Data Science Boot Camp (DSBC) team has spent years developing tools and training materials for Applied Math and Data Science. In this project you will find the following tools.
- "5 Questions a Data Scientist Should Ask Their Customer" is a one-page list of questions any data scientist should have with them to define the problem they are trying to solve and set expectations with their customer.
- "Notes on ..." are cheat sheets on various math topics for Machine Learning, e.g. algebra, probability theory, etc., as well as Machine Learning topics, e.g. linear regression, hypothesis testing, etc.
- "Machine Learning flow-charts" will help you navigate the enormous number of algorithms. These will help you select what algorithm or plot you need to complete your task, based on the data that you have and the desired outcome or story you are trying to tell.
- "Cheat-Sheets" are for various Python toolboxes, e.g. NumPy, Matplotlib, scikit-learn, Keras, etc.
- "10 steps to Data Science" is a series of notebooks to teach you the most common tools used in Data Science.
- "10 steps to Machine Learning" is a series of notebooks to teach you some advanced tools used in Machine Learning.
- "Python in 2 days" is a series of notebooks to get you started in Jupyter notebooks and Python.
- "Machine Learning in 1 day" is a series of notebooks focused on a basic toolbox for Machine Learning.
- "Deep Learning in 1 day" is a series of notebooks focused on advanced CNNs, RNNs, GANs and Transformers.
- "Examples" is a series of notebooks that most folks will find useful for a variety of real-life applications of Data Science.
In the Top-Down approach, you don't need to know the math, or be a deep expert in Python. We will teach you the tools, using industry best practices and rules-of-thumb, so that you will be a solid contributing member of a Data Science team.
To get started, we recommend the following self-paced tutorials performed in order:
In the Bottom-Up approach, we assume that you already have a solid foundation in math and/or computer science. We will teach you the algorithms, including how to manipulate and optimize them for your application. With these powerful skills, you will be a technical leader enabling the full potential of a Data Science team.
To get started, we recommend that you review the following:
- ML technical notes.
- "Math Refresher for Machine Learning".
- "Machine Learning: A Conceptual Approach".
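
As a flavor of what "optimizing an algorithm" means in the Bottom-Up approach, here is a minimal sketch of fitting a straight line by batch gradient descent in plain Python. It is not taken from the notebooks: the function name `fit_line_gd`, the learning rate, the step count, and the toy data are all assumptions chosen for illustration.

```python
def fit_line_gd(xs, ys, lr=0.05, steps=2000):
    """Fit y = w*x + b by batch gradient descent on mean squared error.

    Hypothetical helper for illustration; lr and steps are hand-picked
    for the toy data below and would need tuning on other data.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2) w.r.t. w and b.
        grad_w = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Step against the gradient to reduce the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (made up for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line_gd(xs, ys)
print(f"w={w:.3f}, b={b:.3f}")
```

On this noise-free data the fitted slope and intercept should land very close to 2 and 1. Changing `lr` shows the trade-off the Bottom-Up material is about: too small and convergence is slow, too large and the updates diverge.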
Have you heard about Jupyter Notebooks, but don't know how to get started? Here is a quick tutorial.
Adversarial Machine Learning (AML) links
- https://infosec-conferences.com/events-in-2019/artificial-intelligence-security-summit/
- http://pralab.diee.unica.it/en/node/1121
- https://evademl.org/
- https://github.com/yenchenlin/awesome-adversarial-machine-learning
Google Resources
- colab.research.google.com - Google Colab Free Jupyter Notebooks and GPUs - great for sharing and training. Spun up resources only last for 12 hours though.
- https://research.google.com/seedbank/seeds - Collection of Interactive Machine Learning Examples and Open Sourced Research
- https://developers.google.com/machine-learning/guides/ - ML Rules Guide for ML Projects
Research Paper Sites
- http://www.arxiv-sanity.com/ - Developed by Andrej Karpathy (Tesla Director of AI) - Nice concise display of the latest research papers in ML and AI. Much better than arXiv.
- https://openai.com/research/#publications - Research done by OpenAI
- https://www.microsoft.com/en-us/research/research-area/artificial-intelligence/publications/ - Microsoft Research Publications
Datasets
- The OG - https://archive.ics.uci.edu/ml/index.php
- Kaggle - https://www.kaggle.com/datasets
- Great List of datasets - https://medium.com/datadriveninvestor/the-50-best-public-datasets-for-machine-learning-d80e9f030279
Conference Videos
- http://scaledml.org/2019/ - Scaled ML - talks from all previous years are available. Great speakers like Andrej Karpathy, Jeff Dean, Yangqing Jia, Francois Chollet, etc.
- https://www.ieee-security.org/TC/SPW2018/DLS/# - Deep Learning and Security Workshop
Training Curriculum - All Practical based
- fast.ai - free, and very good.
- machinelearningmastery.com - Jason Brownlee does a great job breaking down the latest algorithms and model implementations. His guides are excellent as well.
- https://aws.amazon.com/training/learning-paths/machine-learning/
- https://www.deeplearning.ai/ - Andrew Ng is amazing.
- https://www.udacity.com/school-of-ai - great but the most expensive of this list.
- https://www.coursera.org/learn/machine-learning - also consider the deep learning specialization.
- https://www.dataquest.io/ - $33.25/month
- https://www.datacamp.com/ - $29/month
GitHub Pages
- Facebook Research - https://github.com/facebookresearch
- OpenAI - https://github.com/openai
- OpenAI Gym is great for reinforcement learning.
- UPenn Research Lab - https://github.com/EpistasisLab
- ML Cheatsheet - https://github.com/bfortuner/ml-cheatsheet
- Udacity Deep Learning Nanodegree Code - https://github.com/udacity/deep-learning
- Good Character Level Language Model in Tensorflow - https://github.com/sherjilozair/char-rnn-tensorflow
Performance Benchmarks
Other links
This project is licensed under the MIT License - see the LICENSE.md file for details.
This repo was built using material from our private industry and academic experience, as well as material borrowed from:
- Kao - UCLA ECE 239AS.
- Eaton - UPenn CIS 419/519.
- Ungar - UPenn CIS 520.
- Ng - Stanford CS 229.
- Andrew Ng - Machine Learning Yearning.
- Robert Tibshirani and Trevor Hastie - An Introduction to Statistical Learning with Applications in R.
- Jake VanderPlas - Python Data Science Handbook.
- Jason Brownlee - Machine Learning Mastery.
- Towards Data Science (various articles).
- Medium (various articles).
- Randy Olson's data analysis and machine learning projects.
- Many thanks to Andreas Mueller for some of his examples in the Machine Learning section. We drew inspiration from several of his excellent examples.
- Many thanks to Kaggle for the datasets.
- Numerous others that we cannot remember.