ReinforcementLearning

DL4J ships with RL4J, but its code is dense, Java-esque production code that is hard for beginners to read. On top of that, Python and TensorFlow have both been moving targets since RL4J's release, migrating to 3.* and 2.* respectively.

So I made a gym server fork and this repo to try some RL myself.

Easy CartPole experiments with no optimization

DDQN

Vanilla Policy gradient

Actor Critic

PPO

Reinforcement learning summaries

Q-learning is a broad category of learning algorithms that return the value of a state or of a state-action pair. It is conceptually easier, less likely to get stuck in local maxima, and easy to use in deterministic, discrete-action environments. Examples: Deep Q-Network (DQN), Double Deep Q-Network (DDQN), etc.
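To make the "value of a state-action pair" idea concrete, here is a minimal sketch of the targets that Q-learning and Double DQN regress toward, written against plain arrays. The class and method names are hypothetical and do not come from this repo or from RL4J; the arrays stand in for whatever a network or table would output for the next state.

```java
/** Illustrative sketch only: names are hypothetical, not taken from this repo or RL4J. */
final class QLearningSketch {

    /** Index of the largest entry, i.e. the greedy action for these Q-values. */
    static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    /** Q-learning target: r + gamma * max_a Q(s', a), or just r when the episode ended. */
    static double qTarget(double reward, double[] nextQ, double gamma, boolean done) {
        if (done) {
            return reward;
        }
        return reward + gamma * nextQ[argMax(nextQ)];
    }

    /**
     * Double DQN target: the online estimate chooses the next action,
     * and the separate target estimate supplies that action's value.
     */
    static double doubleQTarget(double reward, double[] nextQOnline, double[] nextQTarget,
                                double gamma, boolean done) {
        if (done) {
            return reward;
        }
        return reward + gamma * nextQTarget[argMax(nextQOnline)];
    }

    /** Tabular update: move Q(s, a) a fraction alpha of the way toward the target. */
    static void update(double[][] q, int state, int action, double target, double alpha) {
        q[state][action] += alpha * (target - q[state][action]);
    }
}
```

Splitting action selection from action evaluation, as in `doubleQTarget`, is what distinguishes DDQN from DQN in this sketch; with a neural network the `update` step would be replaced by a gradient step on the squared difference between the prediction and the target.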

On-policy algorithms return the best action directly, without estimating a value for each state or action. They require a hand-crafted reward function and are often combined with a Q-network to overcome local maxima. Stochastic and continuous outputs are easier to implement, which can lead to better exploration. Examples: vanilla policy gradient, actor-critic, A2C, TRPO/PPO.
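As a rough illustration of the policy-based approach, the sketch below shows the three pieces a vanilla policy gradient loop typically needs: turning network outputs into action probabilities, sampling an action stochastically, and computing discounted returns to weight the update. All names are hypothetical and not taken from this repo or RL4J, and the network itself is left out.

```java
import java.util.Random;

/** Illustrative sketch only: names are hypothetical, not taken from this repo or RL4J. */
final class PolicyGradientSketch {

    /** Numerically stable softmax: converts raw scores into action probabilities. */
    static double[] softmax(double[] logits) {
        double max = Double.NEGATIVE_INFINITY;
        for (double logit : logits) {
            max = Math.max(max, logit);
        }
        double[] probs = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            probs[i] = Math.exp(logits[i] - max);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    /** Samples an action index in proportion to its probability (the source of exploration). */
    static int sample(double[] probs, Random rng) {
        double u = rng.nextDouble();
        double cumulative = 0.0;
        for (int i = 0; i < probs.length; i++) {
            cumulative += probs[i];
            if (u < cumulative) {
                return i;
            }
        }
        return probs.length - 1; // guard against floating-point rounding
    }

    /** Discounted return for each step: G_t = r_t + gamma * G_{t+1}, computed backwards. */
    static double[] discountedReturns(double[] rewards, double gamma) {
        double[] returns = new double[rewards.length];
        double running = 0.0;
        for (int t = rewards.length - 1; t >= 0; t--) {
            running = rewards[t] + gamma * running;
            returns[t] = running;
        }
        return returns;
    }
}
```

In a vanilla policy gradient update, each sampled action's log-probability would be scaled by its return from `discountedReturns`; actor-critic variants replace that return with an advantage estimated by a second (value) network.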

Model based
