Development status

Current state & priorities

The library is under active development, and much remains to be done.

Below is a list of library components and development directions. All of them are open to your ideas and contributions.

Legend: [priority] Component; a component with no priority tag is done.

Core components
Environment
Objective
Agent architecture
- MDP (RL) agent
- Generator
- Fully customizable agent
Experiment platform
- [high] Experiment setup zoo
- [medium] Pre-trained model zoo
- [medium] Quick experiment running
  - an experiment is defined as (environment, objective function, NN architecture, training algorithm)
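
The four-part experiment definition above could be captured in a small container like the one below. This is only an illustrative sketch: the `Experiment` class, its field names, and the `describe` helper are hypothetical and are not part of the library's API.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Experiment:
    """Hypothetical container for the four parts of an experiment."""
    environment: Any                # object the agent interacts with
    objective: Callable[..., Any]   # objective (reward) function
    architecture: Any               # description of the NN architecture
    training_algorithm: str         # e.g. "q-learning" or "sarsa"


def describe(experiment):
    """Return a short human-readable summary of an experiment."""
    return "{} trained with {}".format(
        type(experiment.environment).__name__,
        experiment.training_algorithm,
    )


class DummyEnvironment:
    """Placeholder environment used only for this illustration."""


example = Experiment(
    environment=DummyEnvironment(),
    objective=lambda reward: reward,
    architecture="one-step GRU memory + Q-value output",
    training_algorithm="q-learning",
)
summary = describe(example)
```

Keeping the four parts in one immutable record makes it straightforward to store experiment setups in a "zoo" and rerun them later.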
Layers
Memory
- Simple RNN
- One-step GRU memory
- Custom GateLayer
  - LSTM as GRU + output GateLayer
- Window augmentation (K last states)
- Stack Augmentation
- [low] List augmentation
- [low] Neural Turing Machine controller
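
As a rough illustration of window augmentation (keeping the K last states), the sketch below uses a fixed-length deque. The `WindowMemory` class and its `update` method are hypothetical names chosen for this example; the library's own layers operate on Theano tensors rather than Python lists.

```python
from collections import deque


class WindowMemory:
    """Hypothetical memory that exposes the K most recent states."""

    def __init__(self, window_size, fill_value=0.0):
        # Pre-fill so the window always has exactly `window_size` entries.
        self._window = deque([fill_value] * window_size, maxlen=window_size)

    def update(self, state):
        """Append the newest state, drop the oldest, return the window."""
        self._window.append(state)
        return list(self._window)


memory = WindowMemory(window_size=3)
first = memory.update(1)
for state in (2, 3):
    memory.update(state)
latest = memory.update(4)
```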
Resolvers
- Greedy resolver (as BaseResolver)
- Epsilon-greedy resolver
- Probabilistic resolver
- [high] Adversarial resolver (test if it works)
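
To make the difference between the first three resolvers concrete, here is a minimal pure-Python sketch. It assumes a resolver takes a vector of action values (or action probabilities) and returns a single action index; the function names and signatures are illustrative and do not reproduce the library's layer classes.

```python
import random


def greedy_resolver(action_values):
    """Pick the index of the highest-valued action."""
    return max(range(len(action_values)), key=action_values.__getitem__)


def epsilon_greedy_resolver(action_values, epsilon=0.1, rng=random):
    """With probability epsilon pick a uniformly random action, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(action_values))
    return greedy_resolver(action_values)


def probabilistic_resolver(action_probabilities, rng=random):
    """Sample an action index in proportion to the given probabilities."""
    indices = range(len(action_probabilities))
    return rng.choices(indices, weights=action_probabilities, k=1)[0]


values = [0.1, 0.9, 0.3]
greedy_choice = greedy_resolver(values)
no_exploration_choice = epsilon_greedy_resolver(values, epsilon=0.0)
sampled_choice = probabilistic_resolver([0.0, 1.0, 0.0])
```

With `epsilon=0.0` the epsilon-greedy resolver reduces to the greedy one, which is a handy sanity check when switching from training to evaluation.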
Learning objectives and algorithms
- Q-learning
- SARSA
- k-step Q-learning
- k-step Advantage Actor-critic methods
- Can use any Theano/Lasagne expressions for loss, gradients and updates
- Experience replay pool
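
The listed value-based algorithms differ mainly in the target they regress towards. The sketch below shows the standard textbook targets for Q-learning, SARSA, and k-step Q-learning on plain Python numbers. The function names and the default `gamma` are assumptions for illustration; the library itself expresses these objectives as Theano expressions.

```python
def q_learning_target(reward, next_action_values, gamma=0.99):
    """One-step Q-learning: bootstrap from the best next action."""
    return reward + gamma * max(next_action_values)


def sarsa_target(reward, next_action_values, next_action, gamma=0.99):
    """SARSA: bootstrap from the action actually taken next."""
    return reward + gamma * next_action_values[next_action]


def k_step_q_target(rewards, final_action_values, gamma=0.99):
    """k-step Q-learning: discounted sum of k rewards plus a bootstrapped tail."""
    target = max(final_action_values)
    # Fold the rewards in from the last step back to the first.
    for reward in reversed(rewards):
        target = reward + gamma * target
    return target


q_target = q_learning_target(1.0, [0.0, 2.0], gamma=0.5)
s_target = sarsa_target(1.0, [0.0, 2.0], next_action=0, gamma=0.5)
k_target = k_step_q_target([1.0, 1.0], [0.0, 2.0], gamma=0.5)
```

With a single reward, `k_step_q_target` gives the same value as `q_learning_target`, so the one-step method is the k = 1 special case.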
Experiment setups
- Boolean reasoning - basic "tutorial" experiment about learning to exploit variable dependencies
- Wikicat - guessing a person's traits from Wikipedia biographies
- [high, in progress] OpenAI Gym training/evaluation API and demos
- [medium] KSfinder - detecting particle decays in Large Hadron Collider beauty experiment
- [medium] 2048-in-a-browser with Selenium
Visualization tools
- basic monitoring tools
- [medium] generic tunable session visualizer
Technical stuff
- [high] Ensuring Python 3 compatibility
  - including examples
- [medium] TensorFlow
- [high] Making tests out of examples
Explanatory material
- [medium] readthedocs pages
- [global] MOAR sensible examples
- [medium] report on prior basic research (optimizer comparison, training algorithm comparison, layers, etc)
