- Use Cases
- Tools
- Playgrounds
- IDEs
- Repositories
- Models
- Guidelines
- Interview preparation
- Books
- MOOC
- Datasets
- Research groups
- Cartoons
- CNN Architectures for Large-Scale Audio Classification, S. Hershey et al, 2017
- Audio Set: An ontology and human-labeled dataset for audio events, 2017
- Large-Scale Audio Event Discovery in One Million YouTube Videos, A. Jansen et al, ICASSP 2017
- How do I listen for a sound that matches a pre-recorded sound?
- The Sound Sensor Alert App sentector
- Google Duplex: An AI System for Accomplishing Real-World Tasks Over the Phone, 2018
- Looking to Listen at the Cocktail Party: A Speaker-Independent Audio-Visual Model for Speech Separation by Ariel Ephrat et al, 2018
- https://github.com/facebookresearch/wav2letter
- Data Augmentation Techniques in CNN using Tensorflow, 2017
- How HBO’s Silicon Valley built “Not Hotdog” with mobile TensorFlow, Keras & React Native, 2017
- My solution for the Galaxy Zoo challenge, 2014
- Facebook Open Sources ELF OpenGo, 2018
- Mastering the game of Go without human knowledge by David Silver et al, 2017
- Physical Human Activity Recognition Using Wearable Sensors by Ferhat Attal et al, 2015
- Activity Recognition with Smartphone Sensors by Xing Su et al, 2014
- Motion gesture detection using Tensorflow on Android
- Run or Walk : Detecting Motion Activity Type with Machine Learning and Core ML
- Android DetectedActivity class
- Android ActivityRecognitionApi
Apps
Code repositories
- https://github.com/droiddeveloper1/android-wear-gestures-recognition
- https://github.com/drejkim/AndroidWearMotionSensors
- MobileNetV2: The Next Generation of On-Device Computer Vision Networks, 2018
- Large-Scale Evolution of Image Classifiers by Esteban Real et al, 2017
- Rethinking the Inception Architecture for Computer Vision by Christian Szegedy et al, 2015
- Inception in TensorFlow - 1.4M images and 1000 classes
- MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications by Andrew G. Howard et al, 2017
- Going Deeper with Convolutions by C. Szegedy et al, 2014
- ImageNet Classification with Deep Convolutional Neural Networks by Alex Krizhevsky et al, 2012
- ImageNet
- the model is based on CNN
- Xception: Deep Learning with Depthwise Separable Convolutions by François Chollet, 2017
- ImageNet Classification with Deep Convolutional Neural Networks by Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, 2012
- Smart photos in VKontakte (Умные фотографии ВКонтакте), 2018
- FaceNet: A Unified Embedding for Face Recognition and Clustering by Florian Schroff et al, 2015
- the model: FaceNet
- NutriNet: A Deep Learning Food and Drink Image Recognition System for Dietary Assessment by Simon Mezgec et al, 2017
- uses 520 food and drink items (in Slovene) and the Google Custom Search API to search for these images
- Food Classification with Deep Learning in Keras / Tensorflow, 2017
- Im2Calories: towards an automated mobile vision food diary by Austin Myers et al, 2015
- Food 101 Dataset, 2014
- Calories nutrition dataset
- Exploring the Limits of Weakly Supervised Pretraining by Dhruv Mahajan et al, 2018
- Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge by Oriol Vinyals et al, 2016
- What do we learn from region based object detectors (Faster R-CNN, R-FCN, FPN)? 2018
- What do we learn from single shot object detectors (SSD, YOLOv3), FPN & Focal loss (RetinaNet)? 2018
- Design choices, lessons learned and trends for object detections?
- Semantic Image Segmentation with DeepLab in Tensorflow, 2018
- model DeepLab-v3+ built on top of CNN
- https://github.com/facebookresearch/Detectron, see links to articles at the end of the page
- Rethinking Atrous Convolution for Semantic Image Segmentation by Liang-Chieh Chen et al, 2017
- DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs by Liang-Chieh Chen et al, 2017
- Mask R-CNN by Kaiming He et al, 2017
- The Building Blocks of Interpretability, 2018
- GoogleNet for image classification is used as an example
- Attributing a deep network’s prediction to its input features by Mukund Sundararajan, 2017
- Integrated Gradients method
- A unified approach to interpreting model predictions by Scott M Lundberg et al, 2017
- "Why Should I Trust You?": Explaining the Predictions of Any Classifier by Marco Tulio Ribeiro et al, 2016
- Monotonic Calibrated Interpolated Look-Up Tables by Maya Gupta et al, 2016
- see Decision trees
- see Distillation
- Tree-to-Tree Neural Networks for Program Translation by Xinyun Chen et al, 2018
- Software is eating the world, but ML is going to eat software by Erik Meijer, Facebook, 2018
- A Survey of Machine Learning for Big Code and Naturalness by Miltiadis Allamanis et al, 2017
- To type or not to type: quantifying detectable bugs in JavaScript by Gao et al, 2017
- Predicting Defects for Eclipse by T Zimmermann et al, 2007
- used code complexity metrics as features and logistic regression for classification (if file/module has defects) and linear regression for ranking (how many defects)
- Predicting Component Failures at Design Time by Adrian Schroter et al, 2006
- showed that design data such as import relationships can predict failures
- used the number of failures in a component as dependent variable and the imported resources used from this component as input features
- Mining Version Histories to Guide Software Changes by T Zimmermann et al, 2004
- used the Apriori algorithm to predict likely changes in files/modules
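
The idea of predicting likely co-changes from version history can be sketched as a small association-rule miner. This is a simplified illustration that only considers pairs of files (itemsets of size 2), not the full Apriori algorithm and not the paper's tool; the function name `co_change_rules`, the thresholds, and the toy commit list are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def co_change_rules(commits, min_support=2, min_confidence=0.6):
    """Return rules (a, b, confidence): 'when a changes, b tends to change too'."""
    file_counts = Counter()
    pair_counts = Counter()
    for files in commits:
        unique = sorted(set(files))
        file_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    rules = []
    for (a, b), support in pair_counts.items():
        if support < min_support:
            continue  # Apriori-style pruning: drop infrequent pairs early
        for src, dst in ((a, b), (b, a)):
            confidence = support / file_counts[src]
            if confidence >= min_confidence:
                rules.append((src, dst, confidence))
    # Highest confidence first, then alphabetical for a stable order
    return sorted(rules, key=lambda r: (-r[2], r[0], r[1]))


# Hypothetical commit history: each entry lists the files touched by one commit
commits = [
    ["parser.c", "parser.h"],
    ["parser.c", "parser.h", "lexer.c"],
    ["lexer.c"],
    ["parser.c", "docs.md"],
]
rules = co_change_rules(commits)
for src, dst, confidence in rules:
    print(f"{src} -> {dst} (confidence {confidence:.2f})")
```

A rule such as `parser.h -> parser.c` can then be read as a suggestion: a developer editing the first file is likely to need to edit the second one as well.
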
- https://codescene.io
- 3 ways AI will change project management for the better, 2017
- A deep learning model for estimating story points by Morakot Choetkiertikul et al, 2016
- estimating story points based on long short-term memory and recurrent highway network
- Deep code search by Xiaodong Gu et al, 2018
- How To Create Natural Language Semantic Search For Arbitrary Objects With Deep Learning, 2018
- Text Embedding Models Contain Bias. Here's Why That Matters, 2018
- How to Clean Text for Machine Learning with Python
- Behind the Chat: How E-commerce Robot Assistant AliMe Works, 2018
- How I Used Deep Learning To Train A Chatbot To Talk Like Me (Sorta), 2017
- Short-Text Conversations generative model based on TensorFlow’s embedding_rnn_seq2seq() with a custom dataset. Deployed as a Facebook chatbot using Heroku (hosting), Express (frontend) and Flask (backend)
- Deep Learning for Chatbots, Part 1 – Introduction, 2016
- Deep Learning for Chatbots, Part 2 – Implementing a Retrieval-Based Model in Tensorflow, 2016
- https://github.com/gunthercox/ChatterBot
- Retrieval-based model based on naive Bayesian classification and search algorithms
- see Sequence to sequence
- A Persona-Based Neural Conversation Model by Jiwei Li et al, 2016
- Smart reply
- Chatbot projects: https://github.com/fendouai/Awesome-Chatbot
- see Chatbot platforms
- Learning a Natural Language Interface with Neural Programmer by Arvind Neelakantan et al, 2017
- weakly supervised, end-to-end neural network model mapping natural language queries to logical forms or programs that provide the desired response when executed on the database
Also known as deduplication and record linkage (but not entity recognition, which picks out names in running text and classifies them)
- Collective Entity Resolution in Familial Networks by Pigi Kouki et al, 2017
- combines machine learning (although not NNs) with collective inference
- Entity Resolution Using Convolutional Neural Network by Ram Deepak Gottapu et al, 2016
- Adaptive Blocking: Learning to Scale Up Record Linkage by Mikhail Bilenko et al, 2006
- extremely high recall but low precision
- https://stats.stackexchange.com/questions/136755/popular-named-entity-resolution-software
Also known as concept finders. Return the name of a concept given a definition or description:
- Learning to Understand Phrases by Embedding the Dictionary by Felix Hill et al, 2016
- used models: Bag-of-Words NLMs and LSTM
- compares definitions in a database to the input query and returns the word whose definition is ‘closest’ to that query
- see RNNs (with LSTMs)
- see bag-of-words
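
The lookup described above (compare the query with every stored definition and return the closest word) can be sketched as follows. As a stand-in for the learned embeddings used in the paper, this example compares raw bag-of-words counts with cosine similarity; the helper names and the tiny dictionary are assumptions made for illustration.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words representation: lower-cased token counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def reverse_lookup(query, definitions):
    """Return the word whose definition is closest to the query."""
    q = bow(query)
    return max(definitions, key=lambda word: cosine(q, bow(definitions[word])))


# Hypothetical three-entry dictionary
DEFINITIONS = {
    "elephant": "a very large grey animal with a long trunk",
    "piano": "a large musical instrument with black and white keys",
    "river": "a natural stream of water flowing to the sea",
}

print(reverse_lookup("big animal with a trunk", DEFINITIONS))
```

Replacing `bow` with a sentence encoder (for example an LSTM over word embeddings) keeps the same retrieval loop while allowing matches between paraphrases that share no words.
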
- Smart Compose: Using Neural Networks to Help Write Emails, 2018
- Introducing Semantic Experiences with Talk to Books and Semantris by Ray Kurzweil et al, 2018
- Keras LSTM tutorial – How to easily build a powerful deep learning language model by Andy, 2018
- Generating High-Quality and Informative Conversation Responses with Sequence-to-Sequence Models by Louis Shao et al, 2017
- trained on a combined data set of over 2.3B conversation messages mined from the web
- the model: LSTM on TensorFlow
- Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features by Matteo Pagliardini et al, 2017
- the model: Sent2Vec, an extension of word2vec (CBOW) to sentences
- Skip-Thought Vectors by Ryan Kiros et al, 2015
- based on RNN encoder-decoder models
- Sequence to Sequence Learning with Neural Networks by Ilya Sutskever et al, 2014
- the model: seq2seq based on LSTM
- Distributed Representations of Sentences and Documents by Quoc V. Le and Tomas Mikolov, 2014
- Distributed Representations of Words and Phrases and their Compositionality by Tomas Mikolov et al, 2013
- word2vec based on Mikolov's Skip-gram model
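
For readers new to the Skip-gram model, the sketch below shows only how its training examples are formed: each word is paired with the words inside a context window around it. The function name `skipgram_pairs` and the window size are assumptions for illustration; the actual word2vec training (negative sampling, subsampling of frequent words) is not shown.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token and its neighbours."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


print(skipgram_pairs(["a", "b", "c"], window=1))
# [('a', 'b'), ('b', 'a'), ('b', 'c'), ('c', 'b')]
```

The model is then trained to predict the context word from the center word, and the learned input weights become the word vectors.
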
- Learning Continuous Phrase Representations and Syntactic Parsing with Recursive Neural Networks by Richard Socher et al, 2010
- based on context-sensitive recursive neural networks (CRNN)
- see Reverse dictionaries
- How to calculate the sentence similarity using word2vec model
- Doc2Vec
- Average w2v vectors
- Weighted average w2v vectors (e.g. tf-idf)
- RNN-based embeddings (e.g. deep LSTM networks)
- Document Similarity With Word Movers Distance
- A Simple but Tough-to-Beat Baseline for Sentence Embeddings by Sanjeev Arora et al, 2017
- uses smooth inverse frequency
- computes the weighted average of the word vectors in each sentence and then removes the projection of these averaged vectors onto their first principal component
- example
- https://github.com/peter3125/sentence2vec - requires writing the get_word_frequency() method which can be easily accomplished by using Python's Counter() and returning a dict with keys: unique words w, values: #w/#total doc len
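
The smooth inverse frequency (SIF) procedure summarised above can be sketched with NumPy. This is a minimal illustration, not the code from the linked repository: `get_word_frequency` follows the description above (a `Counter` turned into relative frequencies), while `sif_embeddings`, the smoothing constant `a=1e-3`, and the random toy word vectors are assumptions made for the example.

```python
from collections import Counter

import numpy as np


def get_word_frequency(docs):
    """Map each unique word to (#occurrences / total number of tokens)."""
    counts = Counter(w for doc in docs for w in doc)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def sif_embeddings(docs, word_vectors, a=1e-3):
    """Weighted average of word vectors, minus the first principal component."""
    freq = get_word_frequency(docs)
    rows = []
    for doc in docs:
        weights = np.array([a / (a + freq[w]) for w in doc])  # rare words weigh more
        vecs = np.array([word_vectors[w] for w in doc])
        rows.append(weights @ vecs / len(doc))
    emb = np.vstack(rows)
    # First right singular vector of the (uncentered) sentence matrix
    _, _, vt = np.linalg.svd(emb, full_matrices=False)
    u = vt[0]
    return emb - np.outer(emb @ u, u)  # remove the projection onto u


# Toy data: tokenised sentences and random 4-dimensional "word vectors"
docs = [["the", "cat", "sat"], ["the", "dog", "sat"], ["stocks", "fell", "sharply"]]
rng = np.random.default_rng(0)
vocab = sorted({w for d in docs for w in d})
word_vectors = {w: rng.normal(size=4) for w in vocab}

embeddings = sif_embeddings(docs, word_vectors)
print(embeddings.shape)
```

In practice the word vectors would come from a pretrained model such as word2vec or GloVe, and the word frequencies from a large reference corpus rather than from the sentences being embedded.
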
- Advances in Semantic Textual Similarity, 2018
- Semantic Textual Similarity Wiki, 2017
- A Deeper Look into Sarcastic Tweets Using Deep Convolutional Neural Networks by Soujanya Poria et al, 2017
- Twitter Sentiment Analysis Using Combined LSTM-CNN Models by SOSAVPM, 2018
- https://github.com/pmsosa/CS291K
- used pre-trained embeddings with LSTM-CNN model with dropouts
- 75.2% accuracy for binary classification (positive-negative tweet)
- doc2vec example, 2015
- ChatPainter: Improving text-to-image generation by using dialogue
- AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks by Tao Xu et al, 2017
- Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention by Hideyuki Tachibana et al, 2017
- https://github.com/r9y9/ (Ryuichi Yamamoto)
- https://github.com/keithito/
- WaveNet: A Generative Model for Raw Audio, 2016
- Mining Facebook Data for Predictive Personality Modeling (Dejan Markovikj,Sonja Gievska, Michal Kosinski, David Stillwell)
- Personality Traits Recognition on Social Network — Facebook (Firoj Alam, Evgeny A. Stepanov, Giuseppe Riccardi)
- The Relationship Between Dimensions of Love, Personality, and Relationship Length (Gorkan Ahmetoglu, Viren Swami, Tomas Chamorro-Premuzic)
- Neural Architecture Search with Reinforcement Learning by Barret Zoph et al, 2017
- Can word2vec be used for search?
- alternative search queries can be built using approximate nearest neighbors in the embedding vector space of terms (e.g. using https://github.com/spotify/annoy)
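
As a rough illustration of the query-expansion idea, the sketch below finds the nearest terms to a query term by cosine similarity. It uses exact brute-force search in NumPy rather than an approximate index such as Annoy, and the vocabulary, the 2-dimensional vectors, and the function name `nearest_terms` are made up for the example.

```python
import numpy as np


def nearest_terms(term, vocab, vectors, k=2):
    """Return the k vocabulary terms most similar to `term` (cosine similarity)."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    idx = vocab.index(term)
    sims = unit @ unit[idx]
    sims[idx] = -np.inf  # exclude the query term itself
    top = np.argsort(-sims)[:k]
    return [vocab[i] for i in top]


# Hypothetical embeddings; real ones would come from a trained word2vec model
vocab = ["laptop", "notebook", "banana", "apple"]
vectors = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])

print(nearest_terms("laptop", vocab, vectors, k=1))
```

A search for "laptop" could then also be issued for the returned neighbours. For large vocabularies the brute-force scan would be replaced by an approximate nearest-neighbor index.
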
- Improving Document Ranking with Dual Word Embeddings by Eric Nalisnick et al, 2016
- Deep Learning & Art: Neural Style Transfer – An Implementation with Tensorflow in Python
- Image Classification using Flowers dataset on Cloud ML Engine
- YOLO: Real-Time Object Detection
- Mobile Real-time Video Segmentation, 2018
- integrated into YouTube Stories
- Supercharge your Computer Vision models with the TensorFlow Object Detection API, 2017
- Ridiculously Fast Shot Boundary Detection with Fully Convolutional Neural Networks by Michael Gygli, 2017
- Video Shot Boundary Detection based on Color Histogram by J. Mas and G. Fernandez, 2003
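
A colour-histogram approach to shot boundary detection can be sketched in a few lines: compute an intensity histogram per frame and flag a cut when consecutive histograms differ by more than a threshold. This is a generic illustration rather than the method of either paper above; the bin count, the threshold, and the synthetic grayscale frames are assumptions.

```python
import numpy as np


def histogram(frame, bins=16):
    """Normalised intensity histogram of a frame with values in [0, 256)."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
    return hist / hist.sum()


def shot_boundaries(frames, threshold=0.5):
    """Return indices i where frame i starts a new shot."""
    hists = [histogram(f) for f in frames]
    cuts = []
    for i in range(1, len(hists)):
        # Total variation distance between consecutive histograms, in [0, 1]
        diff = 0.5 * np.abs(hists[i] - hists[i - 1]).sum()
        if diff > threshold:
            cuts.append(i)
    return cuts


# Synthetic clip: three dark frames followed by two bright frames
frames = [np.full((8, 8), 30)] * 3 + [np.full((8, 8), 220)] * 2
print(shot_boundaries(frames))
```

Real footage would use per-channel colour histograms and a threshold tuned on labelled data; gradual transitions such as fades need a comparison over a longer window of frames.
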
Detects when one video (shot/scene/chapter) ends and another begins
- Recurrent Switching Linear Dynamical Systems by Scott W. Linderman et al, 2016
- Video Scene Segmentation Using Markov Chain Monte Carlo by Yun Zhai et al, 2006
- Automatic Video Scene Segmentation based on Spatial-Temporal Clues and Rhythm by Walid Mahdi et al, 2000
- DeepStory: Video Story QA by Deep Embedded Memory Networks by Kyung-Min Kim et al, 2017
- Video Understanding: From Video Classification to Captioning by Jiajun Sun et al, 2017
- Unsupervised Learning from Narrated Instruction Videos by Jean-Baptiste Alayrac et al, 2015
- Learnable pooling with Context Gating for video classification by Antoine Miech et al, 2018
- Rank #1 at Google Cloud & YouTube-8M Video Understanding Challenge
- Slow for inference/training
- NOT a sequential problem
- Needs lots of data for training
- not clear about very long videos
- The Monkeytyping Solution to the YouTube-8M Video Understanding Challenge, 2017
- Hierarchical Deep Recurrent Architecture for Video Understanding by Luming Tang et al, 2017
- Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet? by Kensho Hara et al, 2017
- https://github.com/kenshohara/video-classification-3d-cnn-pytorch
- trained on the Kinetics dataset from scratch using only RGB input
- pretrained ResNeXt-101 achieved 94.5% and 70.2% on UCF-101 and HMDB-51
- Appearance-and-Relation Networks for Video Classification by Limin Wang et al, 2017
- https://github.com/wanglimin/ARTNet
- trained on the Kinetics dataset from scratch using only RGB input
- 70.9% and 94.3% on HMDB-51 and UCF-101
- Five video classification methods implemented in Keras and TensorFlow by Matt Harvey, 2017
- Video Understanding: From Video Classification to Captioning by Jiajun Sun et al, 2017
- Video Classification using Two Stream CNNs, 2016 (code based on the articles below)
- Two-Stream Convolutional Networks for Action Recognition in Videos
- Fusing Multi-Stream Deep Networks for Video Classification
- Modeling Spatial-Temporal Clues in a Hybrid Deep Learning Framework for Video Classification
- Towards Good Practices for Very Deep Two-Stream ConvNets
- Beyond Short Snippets: Deep Networks for Video Classification by Joe Yue-Hei Ng et al, 2015
- In order to learn a global description of the video while maintaining a low computational footprint, we propose processing only one frame per second
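
Processing one frame per second amounts to keeping every `fps`-th frame. The helper below is a minimal sketch of that sampling step under the assumption of a constant frame rate; the function name is made up, and decoding the video itself is out of scope.

```python
def one_frame_per_second(num_frames, fps):
    """Indices of the frames to keep when sampling roughly one frame per second."""
    step = max(1, round(fps))  # number of frames that make up one second
    return list(range(0, num_frames, step))


print(one_frame_per_second(100, 25))
# [0, 25, 50, 75]
```

For non-integer frame rates such as 29.97 fps the rounded step drifts slightly over long videos, which is usually acceptable for classification.
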
- Large-scale Video Classification with Convolutional Neural Networks by Andrej Karpathy et al, 2014
- 63.3% on UCF-101
- Recycled goods (not solved, no dataset)
- Recycling symbols explained
- similar to traffic signs recognition
- Safety symbols on cardboard boxes (not solved, no dataset)
- 50+ Useful Machine Learning & Prediction APIs, 2018
- Face and Image Recognition
- Text Analysis, NLP, Sentiment Analysis
- Language Translation
- Machine Learning and prediction
- Command-line tricks for data scientists
- Deep Video Analytics
- Data-centric platform for Computer Vision
- https://github.com/akshayubhat/deepvideoanalytics
Pros:
- lets users train their own custom machine learning algorithms from scratch, without having to write a single line of code
- uses Transfer Learning (the more data and customers, the better results)
- is fully integrated with other Google Cloud services (Google Cloud Storage to store data, use Cloud ML or Vision API to customize the model etc.)
Cons:
- limited to image recognition (2018-Q1)
- doesn't allow downloading a trained model
Pros:
- Detect Faces (finds facial landmarks such as the eyes, nose, and mouth; doesn't identify a person)
- Scan barcodes
- Recognize Text
Cons:
- Label Detection - Detect entities within the video, such as "dog", "flower" or "car"
- Shot Change Detection - Detect scene changes within the video
- Explicit Content Detection - Detect adult content within a video
- Video Transcription - Automatically transcribes video content in English
Tools to help you configure, organize, log and reproduce experiments
- https://www.reddit.com/r/MachineLearning/comments/5gyzqj/d_how_do_you_keep_track_of_your_experiments/, 2017
- How to Plan and Run Machine Learning Experiments Systematically by Jason Brownlee, 2017
- using a spreadsheet with a template
- https://github.com/IDSIA/sacred
- finds similarity between the expressions
- https://github.com/SynHub/syn-bot-samples
- MS Visual Studio is required (doesn't work with VS Code)
- activating the Deep Learning feature requires license activation
- number of requests to the server is limited by the license
Pros:
- can model nonlinearities
- are highly interpretable
- do not require extensive feature preprocessing
- do not require enormous data sets
Cons:
- tend to overfit
- fixed by building a decision forest with boosting
- unstable/non-deterministic (small changes in the training data can produce very different trees)
- fixed by using bootstrap aggregation/bagging (e.g. a random forest)
- do mapping directly from the raw input to the label
- better use neural nets that can learn intermediate representations
Hyperparameters:
- tree depth
- maximum number of leaf nodes
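
Bootstrap aggregation (bagging), mentioned above as a remedy for unstable trees, can be sketched with depth-1 trees (decision stumps) on a single feature. Everything here is an illustrative assumption: the function names, the one-directional split rule, the number of estimators, and the toy data. Real projects would normally use a library implementation such as a random forest.

```python
import numpy as np


def fit_stump(x, y):
    """Return the threshold t with the fewest errors for 'predict 1 if x > t'."""
    values = np.unique(x)
    midpoints = (values[:-1] + values[1:]) / 2
    # Candidates: below all points, between neighbouring points, above all points
    candidates = np.concatenate(([values[0] - 1.0], midpoints, [values[-1] + 1.0]))
    errors = [np.mean((x > t).astype(int) != y) for t in candidates]
    return candidates[int(np.argmin(errors))]


def fit_bagged_stumps(x, y, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = np.random.default_rng(seed)
    thresholds = []
    for _ in range(n_estimators):
        idx = rng.integers(0, len(x), size=len(x))
        thresholds.append(fit_stump(x[idx], y[idx]))
    return np.array(thresholds)


def predict(thresholds, x):
    """Majority vote over all stumps."""
    votes = (x[:, None] > thresholds[None, :]).mean(axis=1)
    return (votes > 0.5).astype(int)


# Toy data: one feature, label is 1 for x >= 10
x = np.arange(20, dtype=float)
y = (x >= 10).astype(int)

thresholds = fit_bagged_stumps(x, y)
accuracy = float(np.mean(predict(thresholds, x) == y))
print(f"training accuracy: {accuracy:.2f}")
```

Each stump sees a slightly different sample and so picks a slightly different threshold; the vote averages out that variation, which is the stabilising effect bagging is used for.
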
- trains a model to mimic the behavior of a pretrained model so it can work independently of the pretrained model
- can train the smaller model with unlabeled examples
- not all target classes need to be represented in the distillation training set
- reduces the need for regularization
- Distilling the Knowledge in a Neural Network by Geoffrey Hinton et al, 2015
- “Why Should I Trust You?” Explaining the Predictions of Any Classifier by Marco Tulio Ribeiro et al, 2016
- Detecting Bias in Black-Box Models Using Transparent Model Distillation by Sarah Tan et al, 2017
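
The core of distillation, training a student to mimic a teacher's output distribution, can be sketched as a loss on temperature-softened probabilities. This is a simplified NumPy illustration: it omits the temperature-squared scaling and the optional hard-label term described by Hinton et al, and the function names, the temperature value, and the example logits are assumptions.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Softmax over the last axis; higher temperature gives softer probabilities."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    log_p_student = np.log(softmax(student_logits, temperature))
    return float(-(p_teacher * log_p_student).sum(axis=-1).mean())


# Hypothetical logits for one example with three classes
teacher = np.array([[4.0, 1.0, 0.0]])
matching_student = np.array([[4.0, 1.0, 0.0]])
mismatched_student = np.array([[0.0, 1.0, 4.0]])

print(distillation_loss(matching_student, teacher))
print(distillation_loss(mismatched_student, teacher))
```

The loss needs only the teacher's outputs, not ground-truth labels, which is why the smaller model can be trained on unlabeled examples.
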
- https://github.com/Hironsan/awesome-embedding-models
- gensim's word2vec (embedded words and phrases)
- gensim's doc2vec
- https://github.com/jhlau/doc2vec
- see recursive autoencoders
- see bag-of-words models
- Using Evolutionary AutoML to Discover Neural Network Architectures by Esteban Real, 2018
- Regularized Evolution for Image Classifier Architecture Search by Esteban Real et al, 2018
- Welcoming the Era of Deep Neuroevolution by Jeff Clune, 2017
- Hierarchical Representations for Efficient Architecture Search by Hanxiao Liu et al, 2017
- Learning Transferable Architectures for Scalable Image Recognition by Barret Zoph et al, 2017
- Large-Scale Evolution of Image Classifiers by Esteban Real et al, 2017
- Evolving Neural Networks through Augmenting Topologies by Stanley and Miikkulainen, 2002
- Statistical metrics
- descriptive statistics: dimensionality, unique subject counts, systematic replicate counts, PDFs and CDFs (probability density and cumulative distribution functions)
- cohort design
- power analysis
- sensitivity analysis
- multiple testing correction analysis
- dynamic range sensitivity
- Numerical analysis metrics
- number of clusters
- PCA dimensions
- MDS space dimensions/distances/curves/surfaces
- variance between buckets/bags/trees/branches
- informative/discriminative indices (i.e. how much do the top 10 features differ from one another and from the group)
- feature engineering differentiators
Approaches when our model doesn’t work:
- Fetch more data
- Add more layers to Neural Network
- Try some new approach in Neural Network
- Train longer (increase the number of iterations)
- Change batch size
- Try Regularisation
- Check Bias Variance trade-off to avoid under and overfitting
- Use more GPUs for faster computation
Back-propagation problems:
- it requires labeled training data, while almost all data is unlabeled
- the learning time does not scale well, which means it is very slow in networks with multiple hidden layers
- it can get stuck in poor local optima, so for deep nets the result can be far from optimal
- Understanding Hinton’s Capsule Networks by Max Pechyonkin, 2017
- Capsule Networks (CapsNets) – Tutorial, 2017
- Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer by Noam Shazeer et al, 2017
- PathNet: Evolution Channels Gradient Descent in Super Neural Networks by DeepMind
- Feature extraction - uses layers of a pretrained model as inputs to another model, effectively chaining two models together
- Perceptrons
- Exploring LSTMs, 2017
- Understanding LSTM Networks by Christopher Olah, 2015
- “Almost all exciting results based on recurrent neural networks are achieved with [LSTMs].”
- Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks by Graves & Schmidhuber, 2009
- showed that RNNs with LSTM are currently the best systems for reading cursive writing
- Long Short-Term Memory by Hochreiter & Schmidhuber, 1997
- The Unreasonable Effectiveness of Recurrent Neural Networks by Andrej Karpathy, 2015
- see Long-Short Term Memory Networks
- Hopfield Nets (without hidden units)
- Boltzmann machines (stochastic recurrent neural network with hidden units)
- Restricted Boltzmann Machines by Salakhutdinov and Hinton, 2014
- Deep Boltzmann Machines by Salakhutdinov and Hinton, 2012
- AI at Google: our principles, 2018
- Rules of Machine Learning: Best Practices for ML Engineering by Martin Zinkevich, 2018
- Practical advice for analysis of large, complex data sets by Patrick Riley, 2016
- What’s your ML test score? A rubric for ML production systems by Eric Breck, 2016
- Machine Learning: The High Interest Credit Card of Technical Debt by D. Sculley et al, 2014
- Complex Models Erode Boundaries
- Entanglement
- Hidden Feedback Loops
- Undeclared Consumers
- Data Dependencies Cost More than Code Dependencies
- Unstable Data Dependencies
- Underutilized Data Dependencies
- Static Analysis of Data Dependencies
- Correction Cascades
- System-level Spaghetti
- Glue Code
- Pipeline Jungles
- Dead Experimental Codepaths
- Configuration Debt
- Dealing with Changes in the External World
- Fixed Thresholds in Dynamic Systems
- When Correlations No Longer Correlate
- Monitoring and Testing
- Principles of Research Code by Charles Sutton, 2012
- Patterns for Research in Machine Learning by Ali Eslami, 2012
- Lessons learned developing a practical large scale machine learning system by Simon Tong, 2010
- The Professional Data Science Manifesto
- Machine Learning Glossary
- Deep Learning: A Critical Appraisal by Gary Marcus, 2018
- Deep learning thus far is data hungry
- Deep learning thus far is shallow and has limited capacity for transfer
- Deep learning thus far has no natural way to deal with hierarchical structure
- Deep learning thus far has struggled with open-ended inference
- Deep learning thus far is not sufficiently transparent
- Deep learning thus far has not been well integrated with prior knowledge
- Deep learning thus far cannot inherently distinguish causation from correlation
- Deep learning presumes a largely stable world, in ways that may be problematic
- Deep learning thus far works well as an approximation, but its answers often cannot be fully trusted
- Deep learning thus far is difficult to engineer with
- Software 2.0 by Andrej Karpathy, 2017
- https://developers.google.com/machine-learning/crash-course/
- for beginners, explains hard things in simple words
- from Google experts
- uses TensorFlow and codelabs
- https://www.coursera.org/specializations/gcp-data-machine-learning
- shows how to use GCP for machine learning
- Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference by Cameron Davidson-Pilon, 2015
- Statistics is Easy! by Dennis Shasha, 2010
- Microsoft Research Open Data
- users can also copy datasets directly to an Azure based Data Science virtual machine
- The VU sound corpus - based on https://freesound.org/ database
- AudioSet - consists of an expanding ontology of 632 audio event classes and a collection of 2,084,320 human-labeled 10-second sound clips drawn from YouTube videos
- Landmarks 2018
- ImageNet
- COCO
- SUN
- Caltech 256
- Pascal
- CIFAR-10 - 60000 32x32 colour images in 10 classes, with 6000 images per class
- commonly used to train image classifiers
- Microsoft multimedia challenge dataset, 2017
- largest dataset in terms of sentence and vocabulary
- challenge: to automatically generate a complete and natural sentence to describe video content
- Kinetics, 2017
- YouTube-8M, 2017
- large, but annotations are slightly noisy and only video-level labels have been assigned (include frames that do not relate to target actions)
- youtube-dl - Command-line program to download videos from YouTube.com and other video sites
- Sports-1M by A. Karpathy, 2016
- large, but annotations are slightly noisy and only video-level labels have been assigned (include frames that do not relate to target actions)
- FCVID
- ActivityNet
- http://crcv.ucf.edu/data/UCF101.php 2013
- Hollywood2
- HMDB-51
- CCV
- DeepMind
- Facebook AI Research (FAIR)
- Google Brain
- Microsoft Research AI
- OpenAI
- Sentient Labs
- Uber Labs
The Browser of a Data Scientist
A statistician drowned crossing a river that was only three feet deep on average