This repository has been archived by the owner on Nov 1, 2021. It is now read-only.

Update README.md
korymath authored Sep 16, 2016
1 parent 7b124e7 commit b4cd817
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -1,4 +1,4 @@
-[![License](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/twitter/torch-twrl/blob/master/LICENSE)
+[![Build Status](https://travis-ci.com/twitter/torch-twrl.svg?token=JUyATyLn3rqyEx2nzMk9&branch=master)](https://travis-ci.com/twitter/torch-twrl) [![License](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/twitter/torch-twrl/blob/master/LICENSE)

# torch-twrl: Reinforcement Learning in Torch
torch-twrl is an RL framework built in Lua/Torch by Twitter.
@@ -123,7 +123,7 @@ Agents are defined by a model, policy, and learning update.
* learningUpdate: tdLambda - implements temporal difference (Q-learning or SARSA) learning with eligibility traces (replacing or accumulating)
* __Policy Gradient__ [Williams, 1992](http://www-anw.cs.umass.edu/~barto/courses/cs687/williams92simple.pdf):
* model: mlp - multilayer perceptron, final layer: tanh for continuous, softmax for discrete
-* policy: normal for continuous actions, categorical for discrete
+* policy: [stochasticModelPolicy](https://github.com/twitter/torch-twrl/blob/master/src/agent/policy/stochasticModelPolicy.lua), normal for continuous actions, categorical for discrete
* learningUpdate: reinforce
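
The policy-gradient entry above pairs a stochastic policy (normal for continuous actions, categorical for discrete) with a REINFORCE update. As a rough illustration of what those pieces compute, here is a minimal sketch in plain Python rather than Lua/Torch. It is not the torch-twrl implementation: every function name below is hypothetical, and the model output is assumed to be a list of logits (discrete case) or a mean with a fixed standard deviation (continuous case).

```python
import math
import random


def softmax(logits):
    """Turn raw model outputs into a probability vector (discrete case)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_categorical(probs, rng):
    """Discrete actions: draw an index in proportion to its probability."""
    u = rng.random()
    cumulative = 0.0
    for action, p in enumerate(probs):
        cumulative += p
        if u < cumulative:
            return action
    return len(probs) - 1  # guard against floating-point round-off


def sample_normal(mean, std, rng):
    """Continuous actions: draw from a Gaussian centred on the model output."""
    return rng.gauss(mean, std)


def log_prob_categorical(probs, action):
    """Log-probability of a sampled discrete action."""
    return math.log(probs[action])


def log_prob_normal(mean, std, action):
    """Log-density of a sampled continuous action under N(mean, std^2)."""
    var = std * std
    return (-((action - mean) ** 2) / (2 * var)
            - math.log(std) - 0.5 * math.log(2 * math.pi))


def discounted_returns(rewards, gamma):
    """Return-to-go G_t = r_t + gamma * G_{t+1} for every step of an episode."""
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def reinforce_loss(log_probs, rewards, gamma):
    """REINFORCE surrogate loss: minus the return-weighted sum of log-probs."""
    returns = discounted_returns(rewards, gamma)
    return -sum(lp * g for lp, g in zip(log_probs, returns))
```

Minimising `reinforce_loss` with respect to the model parameters raises the probability of actions that were followed by high returns, which is the update described by Williams (1992). An automatic-differentiation library would normally supply the gradient; that step is left out here.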

## Important note about agent/environment compatibility: