Automated MTL provides two generalized, multi-tasking recurrent deep learning architectures. It uses the statistical regularities within the original dataset itself to reinforce the representations learned for the primary task. The two architectures are the CRNN (Cascaded Recurrent Neural Network) and the MRNN (Multi-tasking Recurrent Neural Network).
The automated MTL architectures have achieved state-of-the-art performance in sentiment analysis, topic prediction, and hashtag recommendation on a diverse set of text corpora, including Twitter, Rotten Tomatoes, and IMDB.
A side project of automated MTL resulted in the Infinite Data Pipeline, which is built on Java, Apache Storm, Kafka, and the Twitter API. The Infinite Data Pipeline streams and preprocesses Twitter data online and injects it directly into a running TensorFlow topology.
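The stream, preprocess, and inject flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's implementation: `preprocess` and `batch_stream` are hypothetical names, the lowercasing and whitespace normalization is an assumed cleanup step, and a plain list stands in for the Kafka-fed tweet stream.

```python
from itertools import islice


def preprocess(tweet):
    """Hypothetical cleanup step: lowercase and collapse whitespace."""
    return " ".join(tweet.lower().split())


def batch_stream(stream, batch_size):
    """Yield lists of up to batch_size preprocessed items from an iterable.

    `stream` stands in for whatever iterator delivers tweets (for example a
    Kafka consumer); here it only needs to be iterable.
    """
    iterator = iter(stream)
    while True:
        batch = [preprocess(t) for t in islice(iterator, batch_size)]
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # A fixed list used in place of a live stream.
    fake_stream = ["Hello  World", "Storm + Kafka", "Third   TWEET"]
    for batch in batch_stream(fake_stream, batch_size=2):
        print(batch)
```

Because `batch_stream` is a generator, it never needs the whole stream in memory, which is the property an unbounded tweet stream requires; each yielded batch could then be handed to a model's training step.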
- cuDNN (tested on cuDNN 5105)
- CUDA Drivers + NVIDIA Graphics Card with 5.0+ support (tested on GTX 1080)
- Apache Zookeeper (tested on version 3.4.6)
- Apache Storm (tested on version 0.9.5)
- Twitter API + Developer Credentials (tested on version 4.0.4)
- Theano (tested on version 0.8.2)
- Keras (tested on latest version as of January 9, 2017)
- Linux-based OS (tested on Ubuntu 16.04 LTS)
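Before running anything, it can help to confirm that the Python packages listed above (Theano and Keras) are importable. The helper below is a minimal sketch; `missing_modules` is a hypothetical name, and the check should be run with the same interpreter that will run tweetnet.py. It only reports whether an import succeeds, not which version is installed.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "theano" and "keras" are the Python packages from the list above.
    missing = missing_modules(["theano", "keras"])
    print("missing: %s" % (", ".join(missing) if missing else "none"))
```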
- Install CUDA and cuDNN
- Apache Storm and Twitter API Setup
- Install Keras and Theano
- Download Kafka 2.10
- Run systemStartMac.sh to start your Storm instance. Make sure KAFKAHOME is set correctly in scripts/startKafkaServer.sh.
- Edit src/storm/pom.xml with the appropriate Twitter credentials. Run mvn install inside src/storm to compile and mvn exec:java to start the data collection and streaming.
- Run systemStartUbuntu.sh to start your Storm instance.
- Run runAPI.sh to open the Twitter stream and start collection. This requires editing runAPI.sh with your Twitter API credentials first.
- Run tweetnet.py.
Note: The system start script opens five new terminals: Apache Zookeeper, the Nimbus, the Supervisor, StormUI, and the Kafka server. Each new terminal requires sudo access and will prompt for the user's password. To view StormUI, navigate to localhost:8080.
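To confirm that the services started by the script are actually listening, a small TCP probe is enough. The sketch below is an assumption-laden convenience, not part of the project: `port_is_open` and `SERVICES` are hypothetical names, the StormUI port comes from the note above, and the other ports are the usual defaults for each service, which may differ in your configuration.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        return sock.connect_ex((host, port)) == 0
    finally:
        sock.close()


# StormUI's port is taken from the note above; the remaining ports are
# common defaults and are assumptions about this particular setup.
SERVICES = {
    "StormUI": 8080,
    "Zookeeper": 2181,
    "Nimbus": 6627,
    "Kafka": 9092,
}

if __name__ == "__main__":
    for name, port in sorted(SERVICES.items()):
        state = "up" if port_is_open("localhost", port) else "down"
        print("%-10s port %-5d %s" % (name, port, state))
```

The Supervisor is left out because it does not expose a single well-known port to probe; its status is easiest to read from StormUI.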
Note: In the CUDA setup, the section where you link cuda to cuda-7.5 is outdated.
Instead of following this step:
export CUDA_HOME=/usr/local/cuda-7.5
Make sure you are using and linking CUDA v8.0:
export CUDA_HOME=/usr/local/cuda-8.0
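A quick way to verify the export above took effect is to inspect the environment from Python. This is a minimal sketch; `cuda_home_problem` is a hypothetical helper, and it only checks that the path ends in cuda-8.0, not that a working CUDA install exists there.

```python
import os


def cuda_home_problem(environ, expected_suffix="cuda-8.0"):
    """Return a description of what is wrong with CUDA_HOME, or None if it looks fine."""
    value = environ.get("CUDA_HOME")
    if not value:
        return "CUDA_HOME is not set"
    # Ignore a trailing slash so /usr/local/cuda-8.0/ is also accepted.
    if not value.rstrip("/").endswith(expected_suffix):
        return "CUDA_HOME is %s, expected a path ending in %s" % (value, expected_suffix)
    return None


if __name__ == "__main__":
    problem = cuda_home_problem(os.environ)
    print(problem or "CUDA_HOME looks correct")
```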
Note: You will need to register for Twitter Developer credentials to run the data miner.