This repository has been archived by the owner on Mar 31, 2019. It is now read-only.

Development status

justheuristic edited this page May 4, 2016 · 4 revisions

Current state & priorities

The library is under active development, and much remains to be done.

Below is a list of library components and development directions. All of them are open to your ideas and contributions.

Format: [priority] Component. An entry with no priority tag is done.

  • Core components

  • Environment

  • Objective

  • Agent architecture

    • MDP (RL) agent
    • Generator
    • Fully customizable agent
  • Experiment platform

    • [high] Experiment setup zoo
    • [medium] Pre-trained model zoo
    • [medium] Quick experiment running
      • an experiment is defined as (environment, objective function, NN architecture, training algorithm)
  • Layers

  • Memory

    • Simple RNN
    • One-step GRU memory
    • Custom GateLayer
      • LSTM as GRU + output GateLayer
    • Window augmentation (K last states)
    • Stack Augmentation
    • [low] List augmentation
    • [low] Neural Turing Machine controller
  • Resolvers

    • Greedy resolver (as BaseResolver)
    • Epsilon-greedy resolver
    • Probabilistic resolver
    • [high] Adversarial resolver (test if it works)
  • Learning objectives and algorithms

    • Q-learning
    • SARSA
    • k-step Q-learning
    • k-step Advantage Actor-critic methods
    • Can use any Theano/Lasagne expressions for loss, gradients and updates
    • Experience replay pool
  • Experiment setups

    • Boolean reasoning - basic "tutorial" experiment about learning to exploit variable dependencies
    • Wikicat - guessing a person's traits based on Wikipedia biographies
    • [high, in progress] OpenAI Gym training/evaluation API and demos
    • [medium] KSfinder - detecting particle decays in the Large Hadron Collider beauty (LHCb) experiment
    • [medium] 2048-in-a-browser with Selenium
  • Visualization tools

    • basic monitoring tools
    • [medium] generic tunable session visualizer
  • Technical stuff

    • [high] Ensuring Python3 compatibility
      • including examples
    • [medium] TensorFlow
    • [high] Making tests out of examples
  • Explanatory material

  • [medium] readthedocs pages

  • [global] MOAR sensible examples

  • [medium] report on prior basic research (optimizer comparison, training algorithm comparison, layers, etc)
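As a rough illustration of what the greedy, epsilon-greedy and probabilistic resolvers in the list above do, here is a minimal sketch in plain Python. It is not the library's code: the function names, the argument layout, and the assumption that the probabilistic resolver receives a vector of action probabilities are all hypothetical and chosen only for this example. The actual resolvers in the library are built as layers on top of Theano/Lasagne expressions.

```python
import random


def greedy_resolver(q_values):
    """Return the index of the largest action value (first one wins on ties)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def epsilon_greedy_resolver(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy_resolver(q_values)


def probabilistic_resolver(action_probs, rng=random):
    """Sample an action index in proportion to the given weights.

    Assumption for this sketch: the input is a vector of non-negative
    action probabilities (for example, a policy output).
    """
    return rng.choices(range(len(action_probs)), weights=action_probs, k=1)[0]


if __name__ == "__main__":
    q = [0.1, 0.7, 0.2]
    rng = random.Random(0)  # seeded generator so that runs are repeatable
    print(greedy_resolver(q))                    # index of the best action: 1
    print(epsilon_greedy_resolver(q, 0.1, rng))  # usually 1, sometimes random
    print(probabilistic_resolver(q, rng))        # sampled in proportion to q
```

The random number generator is passed in explicitly so that exploration can be seeded in experiments and tests. Setting `epsilon` to 0 makes the epsilon-greedy resolver behave exactly like the greedy one.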
