This repository has been archived by the owner on Mar 31, 2019. It is now read-only.

Development status

justheuristic edited this page May 4, 2016 · 4 revisions

Current state & priorities

The library is in active development, and much remains to be done.

Legend: [priority] Component. Items with no priority tag are done.

  • Core components

  • Environment

  • Objective

  • Agent architecture

    • MDP (RL) agent
    • Generator
    • Fully customizable agent
  • Experiment platform

    • [high] Experiment setup zoo
    • [medium] Pre-trained model zoo
    • [medium] Quick experiment running
      • an experiment is defined as (environment, objective function, NN architecture, training algorithm)
  • Layers

  • Memory

    • Simple RNN
    • One-step GRU memory
    • Custom GateLayer
      • LSTM as GRU + output GateLayer
    • Window augmentation (K last states)
    • Stack Augmentation
    • [low] List augmentation
    • [low] Neural Turing Machine controller
  • Resolvers

    • Greedy resolver (as BaseResolver)
    • Epsilon-greedy resolver
    • Probabilistic resolver
    • [high] Adversarial resolver (test if it works)
  • Learning objectives and algorithms

    • Q-learning
    • SARSA
    • k-step Q-learning
    • k-step Advantage Actor-critic methods
    • Can use any Theano/Lasagne expressions for loss, gradients and updates
    • Experience replay pool
  • Experiment setups

    • Boolean reasoning - basic "tutorial" experiment about learning to exploit variable dependencies
    • Wikicat - guessing a person's traits based on Wikipedia biographies
    • [high, in progress] OpenAI Gym training/evaluation API and demos
    • [medium] KSfinder - detecting particle decays in Large Hadron Collider beauty experiment
    • [medium] 2048-in-a-browser with Selenium
  • Visualization tools

    • basic monitoring tools
    • [medium] generic tunable session visualizer
  • Technical stuff

    • [high] Ensuring Python3 compatibility
      • including examples
    • [medium] TensorFlow
    • [high] Making tests out of examples
  • Explanatory material

  • [medium] readthedocs pages

  • [global] MOAR sensible examples

  • [medium] report on prior basic research (optimizer comparison, training algorithm comparison, layers, etc.)
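
For readers unfamiliar with the resolver names in the list above, here is a minimal sketch of what greedy, epsilon-greedy and probabilistic action selection usually mean. It is plain Python for illustration only, not this library's API: the function names `greedy`, `epsilon_greedy` and `probabilistic` and their signatures are assumptions made up for this example, and the library's own resolvers operate on Theano/Lasagne expressions rather than Python lists.

```python
import random


def greedy(q_values):
    """Return the index of the largest value (ties go to the first one)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action, else act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy(q_values)


def probabilistic(probs, rng=random):
    """Sample an action index in proportion to the given policy probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which helps when comparing exploration strategies on the same environment.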
