Development status
The library is under active development and there is still much to be done.
Items below are tagged as [priority] Component; no priority tag means the item is done.
- Core components
  - Environment
  - Objective
  - Agent architecture
    - MDP (RL) agent
    - Generator
    - Fully customizable agent
- Experiment platform
  - [high] Experiment setup zoo
  - [medium] Pre-trained model zoo
  - [medium] quick experiment running (see the sketch after this list)
    - an experiment is defined as a tuple of (environment, objective function, NN architecture, training algorithm)
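To make that definition concrete, here is a minimal sketch of such an experiment tuple. The `Experiment` container and its field names are hypothetical illustrations, not the library's actual API.

```python
from collections import namedtuple

# Hypothetical container mirroring the definition above; the library's
# real experiment API may differ.
Experiment = namedtuple(
    "Experiment",
    ["environment", "objective", "architecture", "training_algorithm"],
)

# e.g. Experiment(env, qlearning_objective, qnetwork, "1-step q-learning")
```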
- Layers
  - Memory
    - Simple RNN
    - One-step GRU memory (update rule sketched below, after the Resolvers list)
    - Custom GateLayer
    - LSTM as GRU + output GateLayer
    - Window augmentation (K last states)
    - Stack augmentation
    - [low] List augmentation
    - [low] Neural Turing Machine controller
  - Resolvers
    - Greedy resolver (as BaseResolver)
    - Epsilon-greedy resolver (sketched below)
    - Probabilistic resolver
    - [high] Adversarial resolver (test whether it works)
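The one-step GRU memory above amounts to the standard GRU update applied once per environment tick. Below is a plain numpy sketch of the equations; the function and parameter names (`W`, `U`, `b`) are illustrative, and gate conventions differ slightly between implementations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_memory_step(h_prev, x, W, U, b):
    """One tick of GRU memory: h_t = GRU(h_{t-1}, x_t).

    W[g], U[g], b[g] are the input weights, recurrent weights and bias
    of gate g, for g in {"r", "z", "c"} (reset, update, candidate).
    """
    r = sigmoid(np.dot(x, W["r"]) + np.dot(h_prev, U["r"]) + b["r"])      # reset gate
    z = sigmoid(np.dot(x, W["z"]) + np.dot(h_prev, U["z"]) + b["z"])      # update gate
    c = np.tanh(np.dot(x, W["c"]) + np.dot(r * h_prev, U["c"]) + b["c"])  # candidate state
    return (1.0 - z) * h_prev + z * c                                     # new memory state
```

A resolver then turns predicted action values into actions. Here is a minimal numpy sketch of the epsilon-greedy rule; the name and signature are illustrative rather than the library's actual interface.

```python
import numpy as np

def epsilon_greedy(qvalues, epsilon=0.1, rng=np.random):
    """Pick the argmax-Q action with probability 1 - epsilon,
    a uniformly random action otherwise.

    qvalues: [batch, n_actions] array of predicted action values.
    Returns: [batch] integer array of chosen action ids.
    """
    batch_size, n_actions = qvalues.shape
    greedy_actions = qvalues.argmax(axis=-1)                     # greedy resolver
    random_actions = rng.randint(0, n_actions, size=batch_size)  # exploration
    explore = rng.uniform(size=batch_size) < epsilon
    return np.where(explore, random_actions, greedy_actions)
```

A probabilistic resolver would instead sample each action from a distribution over actions, e.g. a softmax of the predicted values.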
- Learning objectives and algorithms
  - Q-learning
  - SARSA
  - k-step Q-learning (reference targets are sketched after this list)
  - k-step Advantage Actor-Critic methods
  - Can use any Theano/Lasagne expressions for loss, gradients and updates
  - Experience replay pool
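For reference, the k-step Q-learning targets mentioned above can be computed as in the sketch below. This is plain numpy code for the textbook formula, written independently of the library; the function and array names are illustrative.

```python
import numpy as np

def k_step_q_targets(rewards, qvalues_next, is_done, gamma=0.99, k=3):
    """Reference targets for k-step Q-learning over one recorded session:
    target_t = r_t + gamma*r_{t+1} + ... + gamma^k * max_a Q(s_{t+k}, a).

    rewards      : [T] rewards r_t
    qvalues_next : [T, n_actions] predictions Q(s_{t+1}, .) for each step t
    is_done      : [T] boolean, True if the episode ended at step t
    """
    T = len(rewards)
    targets = np.zeros(T)
    for t in range(T):
        ret, discount, i = 0.0, 1.0, t
        while i < min(T, t + k):
            ret += discount * rewards[i]
            discount *= gamma
            if is_done[i]:
                break  # terminal state inside the window: no bootstrap
            i += 1
        else:
            # window ended mid-episode: bootstrap from the predicted
            # value of the state after the last summed step
            ret += discount * qvalues_next[i - 1].max()
        targets[t] = ret
    return targets
```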
- Experiment setups
  - boolean reasoning - a basic "tutorial" experiment about learning to exploit dependencies between variables
  - Wikicat - guessing a person's traits based on Wikipedia biographies
  - [high, in progress] OpenAI Gym training/evaluation API and demos (a minimal interaction loop is sketched after this list)
  - [medium] KSfinder - detecting particle decays in the Large Hadron Collider beauty (LHCb) experiment
  - [medium] 2048-in-a-browser played through Selenium
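For context, a bare-bones interaction loop with the Gym API of that era looks like the sketch below, with a random policy standing in for an agent; plugging in a trained agent's resolver is the point of the planned demos.

```python
import gym

env = gym.make("CartPole-v0")  # any Gym environment id works here
for episode in range(10):
    observation = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = env.action_space.sample()  # replace with the agent's resolver
        observation, reward, done, info = env.step(action)
        total_reward += reward
    print("episode %i: total reward = %.1f" % (episode, total_reward))
```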
- Visualization tools
  - basic monitoring tools
  - [medium] generic tunable session visualizer
- Technical stuff
  - [high] Ensuring Python 3 compatibility, including the examples (see the snippet after this list)
  - [medium] TensorFlow support
  - [high] Turning the examples into tests
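A common way to keep one codebase running under both Python 2 and Python 3 during such a transition is the standard `__future__` header, shown here as a suggestion rather than a description of the repository's current code.

```python
# placed at the top of each module and example
from __future__ import print_function, division, absolute_import
```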
- Explanatory material
  - [medium] readthedocs pages
  - [global] MOAR sensible examples
  - [medium] report on prior basic research (optimizer comparison, training algorithm comparison, layers, etc.)