Development status

Current state & priorities

The library is under active development, and much remains to be done.

Legend: items are written as "[priority] Component"; an item with no priority tag is done.

Core components
Environment
Objective
Agent architecture
- MDP (RL) agent
- Generator
- Fully customizable agent
Experiment platform
- [high] Experiment setup zoo
- [medium] Pre-trained model zoo
- [medium] Quick experiment running
  - an experiment is defined as (environment, objective function, NN architecture, training algorithm)
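The four-part definition of an experiment above can be pictured as a plain record. This is only an illustrative sketch: the `Experiment` class, its field names, and the placeholder strings are assumptions made here and are not part of the library's API.

```python
from typing import NamedTuple


class Experiment(NamedTuple):
    """Hypothetical container mirroring the four-part definition above."""
    environment: str          # where the agent acts
    objective: str            # what the agent is rewarded for
    architecture: str         # the neural network that makes decisions
    training_algorithm: str   # how the network is updated


# Placeholder values chosen only for illustration.
exp = Experiment(
    environment="boolean reasoning",
    objective="session reward",
    architecture="one-step GRU memory",
    training_algorithm="Q-learning",
)
```

Keeping the four parts separate is what would let a setup zoo swap one of them (say, the training algorithm) while reusing the other three.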
Layers
Memory
- Simple RNN
- One-step GRU memory
- Custom GateLayer
  - LSTM as GRU + output GateLayer
- Window augmentation (K last states)
- Stack Augmentation
- [low] List augmentation
- [low] Neural Turing Machine controller
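As a rough picture of what window augmentation (K last states) does, the sketch below keeps the K most recent observations in a fixed-size buffer. It uses only the Python standard library; the helper names `make_window` and `push` and the zero padding are assumptions for illustration, not the library's layer implementation.

```python
from collections import deque


def make_window(k, pad=0.0):
    """Return a buffer pre-filled with padding that holds exactly k items."""
    return deque([pad] * k, maxlen=k)


def push(window, observation):
    """Append the newest observation; the oldest one is dropped automatically.

    Returns the current window contents, oldest first.
    """
    window.append(observation)
    return list(window)


window = make_window(k=3)
for obs in [1.0, 2.0, 3.0, 4.0]:
    state = push(window, obs)
# After four observations with k=3, only the last three remain in `state`.
```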
Resolvers
- Greedy resolver (as BaseResolver)
- Epsilon-greedy resolver
- Probabilistic resolver
- [high] Adversarial resolver (test if it works)
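The difference between the first three resolvers can be sketched in a few lines of plain Python. This assumes a resolver turns a vector of per-action values (or probabilities) into one chosen action index; the function names and signatures below are hypothetical and do not reproduce the library's resolver layers.

```python
import random


def greedy(q_values):
    """Pick the index of the largest value (first one wins on ties)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action, else act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy(q_values)


def probabilistic(probs, rng=random):
    """Sample an action index in proportion to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Setting `epsilon=0.0` reduces the epsilon-greedy sketch to the greedy one, which is one way to read "Greedy resolver (as BaseResolver)" above.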
Learning objectives and algorithms
- Q-learning
- SARSA
- k-step Q-learning
- k-step Advantage Actor-critic methods
- Can use any Theano/Lasagne expressions for loss, gradients and updates
- Experience replay pool
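For readers comparing the algorithms listed above, the textbook target values for one-step Q-learning, SARSA, and the k-step variant can be sketched as below. This is a scalar, standard-library illustration under the usual definitions; the function names and the `gamma`/`done` parameters are assumptions, and the library itself expresses these as symbolic Theano expressions rather than Python floats.

```python
def q_learning_target(reward, next_q_values, gamma=0.99, done=False):
    """Off-policy target: bootstrap from the best next action."""
    bootstrap = 0.0 if done else max(next_q_values)
    return reward + gamma * bootstrap


def sarsa_target(reward, next_q_values, next_action, gamma=0.99, done=False):
    """On-policy target: bootstrap from the action actually taken next."""
    bootstrap = 0.0 if done else next_q_values[next_action]
    return reward + gamma * bootstrap


def k_step_target(rewards, bootstrap_value, gamma=0.99):
    """Discounted sum of k rewards plus a discounted bootstrap value.

    Computed backwards: target = r_t + gamma * (r_{t+1} + gamma * (...)).
    """
    target = bootstrap_value
    for r in reversed(rewards):
        target = r + gamma * target
    return target
```

With a single reward, `k_step_target` gives the same number as the one-step targets when `bootstrap_value` is chosen as the max (Q-learning) or the taken action's value (SARSA).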
Experiment setups
- Boolean reasoning - basic "tutorial" experiment about learning to exploit variable dependencies
- Wikicat - guessing a person's traits based on Wikipedia biographies
- [high, in progress] OpenAI Gym training/evaluation API and demos
- [medium] KSfinder - detecting particle decays in the Large Hadron Collider beauty (LHCb) experiment
- [medium] 2048-in-a-browser with Selenium
Visualization tools
- basic monitoring tools
- [medium] generic tunable session visualizer
Technical stuff
- [high] Ensuring Python3 compatibility
  - including examples
- [medium] TensorFlow
- [high] Making tests out of examples
Explanatory material
- [medium] readthedocs pages
- [global] MOAR sensible examples
- [medium] report on prior basic research (optimizer comparison, training algorithm comparison, layers, etc.)
