Use Cases

Sample Projects for Data Scientists in Training by V Granville, 2018

Audio Recognition

CNN Architectures for Large-Scale Audio Classification, S. Hershey et al, 2017
- VGGish model used to generate Google's AudioSet
- VGGish model adapted for Keras
Audio Set: An ontology and human-labeled dataset for audio events, 2017
Large-Scale Audio Event Discovery in One Million YouTube Videos, A. Jansen et al, ICASSP 2017
How do I listen for a sound that matches a pre-recorded sound?
The Sound Sensor Alert App sentector

Speech to Text

Data Augmentation

Design

Games

Gesture Recognition

Using wearable sensors (phones, watches etc.)

Apps

Code repositories

Image Recognition

Face Recognition

Smart Photos in VKontakte (original title: Умные фотографии ВКонтакте), 2018
FaceNet: A Unified Embedding for Face Recognition and Clustering by Florian Schroff et al, 2015
- the model: FaceNet

Food Recognition

Image Captioning

Person Detection

Automatic Portrait Segmentation for Image Stylization by Xiaoyong Shen et al, 2016

Semantic Segmentation

Interpretability

The Building Blocks of Interpretability, 2018
GoogleNet for image classification is used as an example
Attributing a deep network’s prediction to its input features by Mukund Sundararajan, 2017
- Integrated Gradients method
A unified approach to interpreting model predictions by Scott M Lundberg et al, 2017
"Why Should I Trust You?": Explaining the Predictions of Any Classifier by Marco Tulio Ribeiro et al, 2016
- Lime Framework: Explaining the predictions of any machine learning classifier
Monotonic Calibrated Interpolated Look-Up Tables by Maya Gupta et al, 2016
see Decision trees
see Distillation
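
The Integrated Gradients method listed above attributes a prediction to input features by accumulating gradients along a straight path from a baseline to the input. The following is a minimal sketch, not the authors' implementation: `toy_model`, the all-zero baseline, and the numerical (finite-difference) gradients are assumptions made for illustration, and a real network would use automatic differentiation instead.

```python
from typing import Callable, List


def integrated_gradients(
    f: Callable[[List[float]], float],
    x: List[float],
    baseline: List[float],
    steps: int = 200,
    eps: float = 1e-5,
) -> List[float]:
    """Approximate Integrated Gradients with a midpoint Riemann sum.

    Gradients are estimated with central finite differences, so this
    only suits small toy functions.
    """
    n = len(x)
    attributions = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th path segment
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i in range(n):
            up = point.copy()
            down = point.copy()
            up[i] += eps
            down[i] -= eps
            grad_i = (f(up) - f(down)) / (2 * eps)
            attributions[i] += grad_i * (x[i] - baseline[i]) / steps
    return attributions


def toy_model(v: List[float]) -> float:
    # Hypothetical model: one feature interaction plus a linear term.
    return v[0] * v[1] + 3.0 * v[2]


x = [2.0, 1.0, -1.0]
baseline = [0.0, 0.0, 0.0]
attr = integrated_gradients(toy_model, x, baseline)
```

A useful sanity check is the completeness property: the attributions should sum (approximately) to `toy_model(x) - toy_model(baseline)`.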

Programming and ML

Predict defects

To type or not to type: quantifying detectable bugs in JavaScript by Gao et al, 2017
Predicting Defects for Eclipse by T Zimmermann et al, 2007
- used code complexity metrics as features and logistic regression for classification (if file/module has defects) and linear regression for ranking (how many defects)
Predicting Component Failures at Design Time by Adrian Schroter et al, 2006
- showed that design data such as import relationships can predict failures
- used the number of failures in a component as dependent variable and the imported resources used from this component as input features
Mining Version Histories to Guide Software Changes by T Zimmermann et al, 2004
- used the Apriori algorithm to predict likely changes in files/modules
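
To make the version-history idea concrete, here is a small sketch of co-change rule mining in the spirit of Apriori. It is a simplification that only considers pairs of files (frequent 2-itemsets), not the paper's method; the function name, thresholds, and the toy commit list are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_cochange_rules(commits, min_support=2, min_confidence=0.6):
    """Return rules (lhs, rhs, support, confidence): 'if lhs changes, rhs likely changes'."""
    file_counts = Counter()
    pair_counts = Counter()
    for files in commits:
        unique = sorted(set(files))
        file_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    rules = []
    for (a, b), support in pair_counts.items():
        if support < min_support:
            continue  # prune infrequent pairs, as Apriori prunes itemsets
        for lhs, rhs in ((a, b), (b, a)):
            confidence = support / file_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    # Highest confidence first, then highest support.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))


# Toy history: each inner list is the set of files touched by one commit.
commits = [
    ["parser.py", "lexer.py"],
    ["parser.py", "lexer.py", "README.md"],
    ["parser.py", "ast.py"],
    ["lexer.py", "parser.py"],
    ["README.md"],
]
rules = mine_cochange_rules(commits)
```

On this toy history the strongest rule says that a change to `lexer.py` has always been accompanied by a change to `parser.py`.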

Predict performance

https://codescene.io
3 ways AI will change project management for the better, 2017
A deep learning model for estimating story points by Morakot Choetkiertikul et al, 2016
- estimating story points based on long short-term memory and recurrent highway network

Searching code

Writing code

https://www.deepcode.ai

NLP

Chatbots

Behind the Chat: How E-commerce Robot Assistant AliMe Works, 2018
How I Used Deep Learning To Train A Chatbot To Talk Like Me (Sorta), 2017
- Short-Text Conversations generative model based on TensorFlow’s embedding_rnn_seq2seq() with a custom dataset. Deployed as a Facebook chatbot using Heroku (hosting) + Express (frontend) + Flask (backend)
Deep Learning for Chatbots, Part 1 – Introduction, 2016
Deep Learning for Chatbots, Part 2 – Implementing a Retrieval-Based Model in Tensorflow, 2016
https://github.com/gunthercox/ChatterBot
- Retrieval-based model based on naive Bayesian classification and search algorithms
- see Sequence to sequence
A Persona-Based Neural Conversation Model by Jiwei Li et al, 2016
Smart reply
- Smart Reply: Automated Response Suggestion for Email by Anjuli Kannan et al, 2016
- Computer, respond to this email, 2015
Chatbot projects: https://github.com/fendouai/Awesome-Chatbot
see Chatbot platforms

Crossword question answerers

see Reverse dictionaries

Database queries

Learning a Natural Language Interface with Neural Programmer by Arvind Neelakantan et al, 2017
- weakly supervised, end-to-end neural network model mapping natural language queries to logical forms or programs that provide the desired response when executed on the database

Named entity resolution

Also known as deduplication and record linkage (not to be confused with entity recognition, which picks out names in running text and classifies them)

Collective Entity Resolution in Familial Networks by Pigi Kouki et al, 2017
- combines machine learning (although not NNs) with collective inference
Entity Resolution Using Convolutional Neural Network by Ram Deepak Gottapu et al, 2016
Adaptive Blocking: Learning to Scale Up Record Linkage by Mikhail Bilenko et al, 2006
- extremely high recall but low precision
https://stats.stackexchange.com/questions/136755/popular-named-entity-resolution-software

Reverse dictionaries

Also known as concept finders. They return the name of a concept given a definition or description:

Learning to Understand Phrases by Embedding the Dictionary by Felix Hill et al, 2016
- used models: Bag-of-Words NLMs and LSTM
comparing definitions in a database to the input query, and returning the word whose definition is ‘closest’ to that query
see RNNs (with LSTMs)
see bag-of-words
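
The "closest definition" lookup described above can be sketched with plain bag-of-words vectors and cosine similarity. This is only an illustration: the paper learns embeddings, whereas this sketch uses raw word counts, and the `toy_dictionary` contents and function names are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, whitespace-tokenised word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def reverse_lookup(query, dictionary):
    """Return the headword whose definition is closest to the query."""
    q = bag_of_words(query)
    return max(dictionary, key=lambda word: cosine(q, bag_of_words(dictionary[word])))


toy_dictionary = {
    "piano": "a large musical instrument with a keyboard",
    "river": "a large natural stream of water flowing to the sea",
    "doctor": "a person trained to treat people who are ill",
}
best = reverse_lookup("stream of water that flows to the sea", toy_dictionary)
```

Note that raw counts cannot match "flows" with "flowing"; learned embeddings, as in the paper above, are one way to close that gap.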

Text to Image

Text to Speech

Personality recognition

Mining Facebook Data for Predictive Personality Modeling (Dejan Markovikj,Sonja Gievska, Michal Kosinski, David Stillwell)
Personality Traits Recognition on Social Network — Facebook (Firoj Alam, Evgeny A. Stepanov, Giuseppe Riccardi)
The Relationship Between Dimensions of Love, Personality, and Relationship Length (Gorkan Ahmetoglu, Viren Swami, Tomas Chamorro-Premuzic)

Search

Neural Architecture Search with Reinforcement Learning by Barret Zoph et al, 2017
Can word2vec be used for search?
- alternative search queries can be built using approximate nearest neighbors in embedding vectors space of terms (using https://github.com/spotify/annoy e.g.)
- Improving Document Ranking with Dual Word Embeddings by Eric Nalisnick et al, 2016
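
As a rough sketch of the query-expansion idea above, the snippet below finds the nearest neighbours of a search term in embedding space by exact brute-force cosine similarity. The 3-dimensional `toy_embeddings` are invented for illustration; in practice the vectors would come from a trained word2vec model and the exact search would be swapped for an approximate index such as Annoy.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def expand_query(term, embeddings, top_k=2):
    """Return the top_k terms closest to `term` (exact search, O(vocabulary))."""
    target = embeddings[term]
    scored = [
        (cosine_similarity(target, vector), other)
        for other, vector in embeddings.items()
        if other != term
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:top_k]]


# Hypothetical embeddings, hand-picked so related terms point the same way.
toy_embeddings = {
    "laptop": [0.9, 0.1, 0.0],
    "notebook": [0.8, 0.2, 0.1],
    "computer": [0.7, 0.3, 0.0],
    "banana": [0.0, 0.1, 0.9],
}
alternatives = expand_query("laptop", toy_embeddings)
```

The returned terms can then be added to the original query as alternatives.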

Transfer Learning

Uber

Engineering More Reliable Transportation with Machine Learning and AI at Uber, 2017

Video recognition

Body recognition

Enabling full body AR with Mask R-CNN2Go by Fei Yang et al, 2018

Object detection

Scene Segmentation

Detects when one video (shot/scene/chapter) ends and another begins
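
One simple baseline for finding such boundaries is to compare intensity histograms of consecutive frames and flag large jumps. The sketch below assumes frames are already decoded into flat lists of 0-255 grayscale values; the bin count and threshold are arbitrary illustrative choices, and this is not the method of any specific tool listed here.

```python
def histogram(frame, bins=4, max_value=256):
    """Normalised intensity histogram of one grayscale frame."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    total = len(frame)
    return [c / total for c in counts]


def detect_shot_boundaries(frames, threshold=0.5):
    """Return indices of frames that start a new shot."""
    if not frames:
        return []
    boundaries = []
    previous = histogram(frames[0])
    for index in range(1, len(frames)):
        current = histogram(frames[index])
        # Half the L1 distance lies in [0, 1]: 0 = identical, 1 = disjoint.
        distance = sum(abs(p - c) for p, c in zip(previous, current)) / 2
        if distance > threshold:
            boundaries.append(index)
        previous = current
    return boundaries


dark = [10, 20, 30, 40]
bright = [200, 220, 240, 250]
frames = [dark, dark, dark, bright, bright]
cuts = detect_shot_boundaries(frames)
```

Here the only detected cut is at frame 3, where the content switches from dark to bright. Gradual transitions such as fades would need a more tolerant comparison.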

Video captioning

Video classification

Learnable pooling with Context Gating for video classification by Antoine Miech et al, 2018
- Rank #1 at Google Cloud & YouTube-8M Video Understanding Challenge
- Slow for inference/training
- NOT a sequential problem
- Needs lots of data for training
- not clear about very long videos
The Monkeytyping Solution to the YouTube-8M Video Understanding Challenge, 2017
- Rank #2 at Google Cloud & YouTube-8M Video Understanding Challenge
Hierarchical Deep Recurrent Architecture for Video Understanding by Luming Tang et al, 2017
Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet? by Kensho Hara et al, 2017
- https://github.com/kenshohara/video-classification-3d-cnn-pytorch
  - trained on the Kinetics dataset from scratch using only RGB input
  - pretrained ResNeXt-101 achieved 94.5% and 70.2% on UCF-101 and HMDB-51
Appearance-and-Relation Networks for Video Classification by Limin Wang et al, 2017
- https://github.com/wanglimin/ARTNet
  - trained on the Kinetics dataset from scratch using only RGB input
  - 70.9% and 94.3% on HMDB51 and UCF101
Five video classification methods implemented in Keras and TensorFlow by Matt Harvey, 2017
- https://github.com/harvitronix/five-video-classification-methods
Video Understanding: From Video Classification to Captioning by Jiajun Sun et al, 2017
Video Classification using Two Stream CNNs, 2016 (code based on the articles below)
- Two-Stream Convolutional Networks for Action Recognition in Videos
- Fusing Multi-Stream Deep Networks for Video Classification
- Modeling Spatial-Temporal Clues in a Hybrid Deep Learning Framework for Video Classification
- Towards Good Practices for Very Deep Two-Stream ConvNets
Beyond Short Snippets: Deep Networks for Video Classification by Joe Yue-Hei Ng et al, 2015
- In order to learn a global description of the video while maintaining a low computational footprint, we propose processing only one frame per second
Large-scale Video Classification with Convolutional Neural Networks by Andrej Karpathy et al, 2014
- 63.3% on UCF-101
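
The one-frame-per-second sampling quoted for Beyond Short Snippets above can be illustrated with a small helper that picks which frame indices to keep. This is a sketch under simple assumptions (constant frame rate, frame count known in advance); the function name is hypothetical and decoding the frames themselves is out of scope.

```python
def one_frame_per_second(total_frames, fps):
    """Return the indices of the frames closest to each whole second."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    indices = []
    second = 0
    while True:
        index = round(second * fps)
        if index >= total_frames:
            break
        indices.append(index)
        second += 1
    return indices


sampled = one_frame_per_second(total_frames=100, fps=25.0)
```

For a 4-second clip at 25 fps this keeps 4 of the 100 frames, which is where the reduced computational footprint comes from.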

Multiple Modalities

Multimodal Classification for Analysing Social Media by Chi Thang Duong et al, 2017
- Blog post: Detecting Emotions with CNN Fusion Models
- https://emoclassifier.github.io/

Open problems

Recycled goods (not solved, no dataset)
- Recycling symbols explained
- similar to traffic signs recognition
Safety symbols on cardboard boxes (not solved, no dataset)

Tools

50+ Useful Machine Learning & Prediction APIs, 2018
- Face and Image Recognition
- Text Analysis, NLP, Sentiment Analysis
- Language Translation
- Machine Learning and prediction
Command-line tricks for data scientists
Deep Video Analytics
- Data-centric platform for Computer Vision
- https://github.com/akshayubhat/deepvideoanalytics

Google Cloud AutoML

Pros:

let users train their own custom machine learning algorithms from scratch, without having to write a single line of code
uses Transfer Learning (the more data and customers, the better results)
is fully integrated with other Google Cloud services (Google Cloud Storage to store data, use Cloud ML or Vision API to customize the model etc.)

Cons:

limited to image recognition (2018-Q1)
doesn't allow downloading a trained model

Google Cloud ML Engine

Samples & Tutorials

Google Mobile Vision

Pros:

Detect Faces (finds facial landmarks such as the eyes, nose, and mouth; doesn't identify a person)
Scan barcodes
Recognize Text

Cons:

Google Video Intelligence

Label Detection - Detect entities within the video, such as "dog", "flower" or "car"
Shot Change Detection - Detect scene changes within the video
Explicit Content Detection - Detect adult content within a video
Video Transcription - Automatically transcribes video content in English

Experiments Frameworks

Tools to help you configure, organize, log and reproduce experiments

https://www.reddit.com/r/MachineLearning/comments/5gyzqj/d_how_do_you_keep_track_of_your_experiments/, 2017
How to Plan and Run Machine Learning Experiments Systematically by Jason Brownlee, 2017
- using a spreadsheet with a template
https://github.com/IDSIA/sacred

Jupyter Notebook

Microsoft Azure Bot Service

Microsoft Azure Machine Learning

Microsoft Cognitive Services

Microsoft Cognitive Toolkit

Syn Bot Oscova

finds similarity between the expressions
https://github.com/SynHub/syn-bot-samples
MS Visual Studio is required (doesn't work with VS Code)
activating the Deep Learning feature requires license activation
number of requests to the server is limited by the license

TensorFlow

TensorFlow Hub

Playgrounds

IDEs

Repositories

https://github.com/bulutyazilim/awesome-datascience

Models

Decision Trees

Pros:

can model nonlinearities
are highly interpretable
do not require extensive feature preprocessing
do not require enormous data sets

Cons:

tend to overfit
- fixed by building a decision forest with boosting
unstable/nondeterministic (can generate different results when trained on the same data)
- fixed by using bootstrap aggregation/bagging (a bagged forest)
do mapping directly from the raw input to the label
- better use neural nets that can learn intermediate representations

Hyperparameters:

tree depth
maximum number of leaf nodes
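
To illustrate bagging and the tree-depth hyperparameter together, the sketch below trains an ensemble of depth-1 trees (decision stumps) on bootstrap resamples of a 1-D toy dataset and predicts by majority vote. Everything here (function names, the dataset, the number of models) is a hypothetical minimal example, not a replacement for a library implementation.

```python
import random
from collections import Counter


def fit_stump(points):
    """Fit a depth-1 tree on (x, label) pairs: predict 1 when x > threshold."""
    best = None
    for threshold, _ in points:
        errors = sum((x > threshold) != bool(y) for x, y in points)
        if best is None or errors < best[0]:
            best = (errors, threshold)
    return best[1]


def bagged_stumps(points, n_models=25, seed=0):
    """Train each stump on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    return [
        fit_stump([rng.choice(points) for _ in points])
        for _ in range(n_models)
    ]


def predict(thresholds, x):
    """Majority vote over all stumps."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]


data = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 1), (5.0, 1), (6.0, 1)]
ensemble = bagged_stumps(data)
low, high = predict(ensemble, 0.5), predict(ensemble, 6.5)
```

Individual stumps may disagree near the class boundary because each sees a different resample; the vote averages that instability out.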

Distillation

trains a model to mimic the behavior of a pretrained model so it can work independently of the pretrained model
can train the smaller model with unlabeled examples
not all target classes need to be represented in the distillation training set
reduces the need for regularization
Distilling the Knowledge in a Neural Network by Geoffrey Hinton et al, 2015
“Why Should I Trust You?” Explaining the Predictions of Any Classifier by Marco Tulio Ribeiro et al, 2016
Detecting Bias in Black-Box Models Using Transparent Model Distillation by Sarah Tan et al, 2017
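
A common way to make a student mimic a teacher is to match temperature-softened output distributions, as in the Hinton et al paper listed above. The following is a minimal sketch of that loss on raw logits; the logit values and the default temperature are made up for illustration, and a full training setup would usually add gradient-based optimisation on top.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax; higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions."""
    soft_targets = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(soft_targets, student_probs))


teacher = [4.0, 1.0, 0.0]
agreeing_student = [4.0, 1.0, 0.0]
disagreeing_student = [0.0, 1.0, 4.0]
loss_good = distillation_loss(teacher, agreeing_student)
loss_bad = distillation_loss(teacher, disagreeing_student)
```

Because the loss only needs the teacher's outputs, it can be computed on unlabeled examples, which matches the point made in the list above.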

Embedding models

https://github.com/Hironsan/awesome-embedding-models
gensim's word2vec (embedded words and phrases)
- online vocabulary update tutorial
- How to Develop Word Embeddings in Python with Gensim
gensim's doc2vec
https://github.com/jhlau/doc2vec
see recursive autoencoders
see bag-of-words models

Evolutionary Algorithms

Metrics of dataset quality

Statistical metrics
- descriptive statistics: dimensionality, unique subject counts, systematic replicates counts, PDFs, CDFs (probability density and cumulative distribution functions)
- cohort design
- power analysis
- sensitivity analysis
- multiple testing correction analysis
- dynamic range sensitivity
Numerical analysis metrics
- number of clusters
- PCA dimensions
- MDS space dimensions/distances/curves/surfaces
- variance between buckets/bags/trees/branches
- informative/discriminative indices (i.e. how much do the top 10 features differ from one another and from the group)
- feature engineering differentiators

Neural Networks

Approaches when our model doesn’t work:

Fetch more data
Add more layers to Neural Network
Try some new approach in Neural Network
Train longer (increase the number of iterations)
Change batch size
Try Regularisation
Check Bias Variance trade-off to avoid under and overfitting
Use more GPUs for faster computation

Back-propagation problems:

it requires labeled training data, while almost all data is unlabeled
the learning time does not scale well, which means it is very slow in networks with multiple hidden layers
it can get stuck in poor local optima, so for deep nets the solutions it finds can be far from optimal

Capsule Networks

Convolutional Neural Networks

Deep Residual Networks

Understand Deep Residual Networks — a simple, modular learning framework that has redefined state-of-the-art, 2017

Distributed Neural Networks

Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer by Jeff Dean et al
PathNet: Evolution Channels Gradient Descent in Super Neural Networks by DeepMind
Feature extraction - uses layers of a pretrained model as inputs to another model, effectively chaining two models together

Feed-Forward Neural Networks

Perceptrons

Gated Recurrent Neural Networks

Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling by Junyoung Chung et al, 2014

Generative Adversarial Networks

Progressive Growing of GANs for Improved Quality, Stability, and Variation by Tero Karras et al, 2017

Long-Short Term Memory Networks

Exploring LSTMs, 2017
Understanding LSTM Networks by Christopher Olah, 2015
- “Almost all exciting results based on recurrent neural networks are achieved with [LSTMs].”
Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks by Graves & Schmidhuber, 2009
- showed that RNNs with LSTM are currently the best systems for reading cursive writing
Long Short-Term Memory by Hochreiter & Schmidhuber, 1997

Recurrent Neural Networks

Symmetrically Connected Networks

Hopfield Nets (without hidden units)
- Neural networks and physical systems with emergent collective computational abilities by Hopfield, 1982
Boltzmann machines (stochastic recurrent neural network with hidden units)
Restricted Boltzmann Machines by Salakhutdinov and Hinton, 2014
Deep Boltzmann Machines by Salakhutdinov and Hinton, 2012
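
As a small illustration of a Hopfield net (symmetric weights, no hidden units), the sketch below stores one pattern with the Hebbian rule and recovers it from a corrupted copy. The pattern, the number of update sweeps, and the function names are arbitrary choices for the example.

```python
def train_hopfield(patterns):
    """Hebbian weights for patterns of +1/-1 values; no self-connections."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += pattern[i] * pattern[j] / len(patterns)
    return weights


def recall(weights, state, sweeps=5):
    """Asynchronous updates: each unit takes the sign of its weighted input."""
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            activation = sum(weights[i][j] * state[j] for j in range(len(state)))
            state[i] = 1 if activation >= 0 else -1
    return state


stored = [1, -1, 1, -1, 1, -1]
weights = train_hopfield([stored])
noisy = [1, -1, -1, -1, 1, -1]  # third unit flipped
recovered = recall(weights, noisy)
```

The corrupted unit is pulled back to the stored value because the stored pattern is an energy minimum of the network.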

Guidelines

AI at Google: our principles, 2018
Rules of Machine Learning: Best Practices for ML Engineering by Martin Zinkevich, 2018
Practical advice for analysis of large, complex data sets by Patrick Riley, 2016
What’s your ML test score? A rubric for ML production systems by Eric Breck, 2016
Machine Learning: The High Interest Credit Card of Technical Debt by D. Sculley et al, 2014
- Complex Models Erode Boundaries
  - Entanglement
  - Hidden Feedback Loops
  - Undeclared Consumers
- Data Dependencies Cost More than Code Dependencies
  - Unstable Data Dependencies
  - Underutilized Data Dependencies
  - Static Analysis of Data Dependencies
  - Correction Cascades
- System-level Spaghetti
  - Glue Code
  - Pipeline Jungles
  - Dead Experimental Codepaths
  - Configuration Debt
- Dealing with Changes in the External World
  - Fixed Thresholds in Dynamic Systems
  - When Correlations No Longer Correlate
  - Monitoring and Testing
Principles of Research Code by Charles Sutton, 2012
Patterns for Research in Machine Learning by Ali Eslami, 2012
Lessons learned developing a practical large scale machine learning system by Simon Tong, 2010
The Professional Data Science Manifesto
Machine Learning Glossary

Deep learning

Deep Learning: A Critical Appraisal by Gary Marcus, 2018
- Deep learning thus far is data hungry
- Deep learning thus far is shallow and has limited capacity for transfer
- Deep learning thus far has no natural way to deal with hierarchical structure
- Deep learning thus far has struggled with open-ended inference
- Deep learning thus far is not sufficiently transparent
- Deep learning thus far has not been well integrated with prior knowledge
- Deep learning thus far cannot inherently distinguish causation from correlation
- Deep learning presumes a largely stable world, in ways that may be problematic
- Deep learning thus far works well as an approximation, but its answers often cannot be fully trusted
- Deep learning thus far is difficult to engineer with
Software 2.0 by Andrej Karpathy, 2017

Interview preparation

Acing AI Interviews

MOOC

Google oriented courses

https://developers.google.com/machine-learning/crash-course/
- for beginners, explains hard things with simple words
- from Google gurus
- uses TensorFlow and codelabs
https://www.coursera.org/specializations/gcp-data-machine-learning
- shows how to use GCP for machine learning

Books

NLP

Natural Language Processing with Python by Steven Bird et al, 2014

Statistics

Datasets

Microsoft Research Open Data
- users can also copy datasets directly to an Azure based Data Science virtual machine

Audios

The VU sound corpus - based on https://freesound.org/ database
- See article The VU Sound Corpus by Emiel van Miltenburg et al
AudioSet - consists of an expanding ontology of 632 audio event classes and a collection of 2,084,320 human-labeled 10-second sound clips drawn from YouTube videos

Images

Landmarks 2018
ImageNet
COCO
SUN
Caltech 256
Pascal
CIFAR-10 - 60000 32x32 colour images in 10 classes, with 6000 images per class
- commonly used to train image classifiers

Videos

Microsoft multimedia challenge dataset, 2017
- largest dataset in terms of sentence and vocabulary
- challenge: to automatically generate a complete and natural sentence to describe video content
Kinetics, 2017
YouTube-8M, 2017
- large, but annotations are slightly noisy and only video-level labels have been assigned (include frames that do not relate to target actions)
- youtube-dl - Command-line program to download videos from YouTube.com and other video sites
Sports-1M by A. Karpathy, 2016
- large, but annotations are slightly noisy and only video-level labels have been assigned (include frames that do not relate to target actions)
FCVID
ActivityNet
http://crcv.ucf.edu/data/UCF101.php 2013
Hollywood2
HMDB-51
CCV

Research Groups

Cartoons

The Browser of a Data Scientist

Jokes

A statistician drowned crossing a river that was only three feet deep on average


License

CoventryResearch/neuromantic


TOC

Use Cases

Audio Recognition

Speech to Text

Data Augmentation

Design

Games

Gesture Recognition

Using wearable sensors (phones, watches etc.)

Image Recognition

Face Recognition

Food Recognition

Image Captioning

Person Detection

Semantic Segmentation

Interpretability

Programming and ML

Predict defects

Predict performance

Searching code

Writing code

NLP

Chatbots

Crossword question answerers

Database queries

Named entity resolution

Reverse dictionaries

Sequence to sequence

Semantic analysis

Spelling

Summarization

Text to Image

Text to Speech

Personality recognition

Search

Transfer Learning

Uber

Video recognition

Body recognition

Object detection

Scene Segmentation

Video captioning

Video classification

Multiple Modalities

Open problems

Tools

Experiments Frameworks

Jupyter Notebook

Playgrounds

IDEs

Repositories

Models

Decision Trees

Distillation

Embedding models

Evolutionary Algorithms

Metrics of dataset quality

Neural Networks

Capsule Networks

Convolutional Neural Networks

Deep Residual Networks

Distributed Neural Networks

Feed-Forward Neural Networks

Gated Recurrent Neural Networks

Generative Adversarial Networks

Long-Short Term Memory Networks

Recurrent Neural Networks

Symmetrically Connected Networks

Guidelines

Deep learning

Interview preparation

MOOC

Google oriented courses

Books