strates-git/ml_training


AI/ML Training

A carefully curated repository of Artificial Intelligence (AI), Machine Learning (ML), and Data Science (DS) literature, with practical implementations in Jupyter Notebook "skill-builders".


Tip of the Day

Check out our new Reading Rack, where we review the important literature in each sub-field of ML. We tackle 10 papers a week so that we quickly learn the current state of the art. We will also add tutorials that apply the papers to real problems using Python or MATLAB code.

1. Project Overview

The Data Science Boot Camp (DSBC) team has spent years developing tools and training materials for Applied Math and Data Science. In this project you will find the following tools.

  • "5 Questions a Data Scientist Should Ask Their Customer" is a one-page list of questions any data scientist should keep on hand to define the problem they are trying to solve and to set expectations with their customer.
  • "Notes on ..." are cheat sheets on the math behind Machine Learning (e.g. algebra, probability theory) and on Machine Learning topics themselves (e.g. linear regression, hypothesis testing).
  • "Machine Learning flow-charts" help you navigate the enormous number of algorithms. They help you select which algorithm or plot you need to complete your task, based on the data you have and the outcome or story you are trying to tell.
  • "Cheat-Sheets" cover various Python toolboxes, e.g. NumPy, Matplotlib, scikit-learn, Keras, etc.
  • "10 steps to Data Science" is a series of notebooks to teach you the most common tools used in Data Science.
  • "10 steps to Machine Learning" is a series of notebooks to teach you some advanced tools used in Machine Learning.
  • "Python in 2 days" is a series of notebooks to get you started in Jupyter notebooks and Python.
  • "Machine Learning in 1 day" is a series of notebooks focused on a basic toolbox for Machine Learning.
  • "Deep Learning in 1 day" is a series of notebooks focused on advanced CNNs, RNNs, GANs and Transformers.
  • "Examples" is a series of notebooks that most folks will find useful for a variety of real-life applications of Data Science.

2. The DSBC Approach

The DSBC team teaches AI/ML, and more broadly Data Science, with two approaches:

2.1 "Top-Down" Approach

In the Top-Down approach, you don't need to know the math, or be a deep expert in Python. We will teach you the tools, using industry best practices and rules-of-thumb, so that you will be a solid contributing member of a Data Science team.

To get started, we recommend the following self-paced tutorials performed in order:

  1. "Python in 2 days".
  2. "Machine Learning in 1 day".
  3. "Deep Learning in 1 day".

2.2 "Bottom-Up" Approach

In the Bottom-Up approach, we assume that you already have a solid foundation in math and/or computer science. We will teach you the algorithms, including how to manipulate and optimize them for your application. With these powerful skills, you will be a technical leader who enables the full potential of a Data Science team.

To get started, we recommend that you review the following:

  1. ML technical notes.
  2. "Math Refresher for Machine Learning".
  3. "Machine Learning: A Conceptual Approach".

3. New to Python and Jupyter Notebooks?

Have you heard about Jupyter Notebooks, but don't know how to get started? Here is a quick tutorial.
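As a rough idea of what working in a notebook looks like, here is a minimal sketch of a first cell. It is not taken from the tutorial itself: the function name `summarize` and the sample data are made up for illustration, and it uses only the Python standard library so it should run in any notebook, local or hosted.

```python
# A hypothetical first notebook cell: plain Python, standard library only.
import statistics
import sys


def summarize(values):
    """Return the mean and sample standard deviation of a list of numbers."""
    return statistics.mean(values), statistics.stdev(values)


scores = [2.0, 4.0, 4.0, 6.0]  # made-up illustrative data
mean, spread = summarize(scores)

# Confirm which Python version the notebook kernel is running.
print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
print(f"mean={mean:.2f}, stdev={spread:.2f}")
```

Paste the cell into a new notebook and run it with Shift+Enter. The printed output appears directly below the cell, and names such as `scores` stay available to later cells.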

4. Additional Resources

AML links

Google Resources

  • colab.research.google.com - Google Colab: free Jupyter Notebooks and GPUs, great for sharing and training. Spun-up resources only last for 12 hours, though.

Research Paper Sites

Datasets

Conference Videos

Training Curriculum - All Practical Based

  • fast.ai - free, and very good.
  • machinelearningmastery.com - Jason Brownlee does a great job breaking down the latest algorithms and model implementations. His guides are excellent as well.

GitHub Pages

Performance Benchmarks

Other links

5. License

This project is licensed under the MIT License - see the LICENSE.md file for details.

6. Acknowledgments

This repo was built using material from our private industry and academic experience, as well as material borrowed from:

  • Kao - UCLA ECE 239AS.
  • Eaton - UPenn CIS 419/519.
  • Ungar - UPenn CIS 520.
  • Ng - Stanford CS 229.
  • Andrew Ng - Machine Learning Yearning.
  • Robert Tibshirani and Trevor Hastie - An Introduction to Statistical Learning with Applications in R.
  • Jake VanderPlas - Python Data Science Handbook.
  • Jason Brownlee - Machine Learning Mastery.
  • Towards Data Science (various articles).
  • Medium (various articles).
  • Randy Olson's data analysis and machine learning projects.
  • Many thanks to Andreas Mueller for some of his examples in the Machine Learning section. We drew inspiration from several of his excellent examples.
  • Many thanks to Kaggle for the datasets.
  • Numerous others that we cannot remember.
