Hands-On Reinforcement Learning With Python

Master reinforcement and deep reinforcement learning using OpenAI Gym and TensorFlow

About the book

Reinforcement Learning with Python will help you master reinforcement learning, from basic algorithms to advanced deep reinforcement learning algorithms.

The book starts with an introduction to reinforcement learning, followed by OpenAI and TensorFlow. You will then explore various RL algorithms and concepts such as Markov decision processes, Monte Carlo methods, and dynamic programming, including value and policy iteration. This example-rich guide will also introduce you to deep learning, covering various deep learning algorithms. You will then explore deep reinforcement learning, a combination of deep learning and reinforcement learning, in depth. You will master various deep reinforcement learning algorithms such as DQN, Double DQN, Dueling DQN, DRQN, A3C, DDPG, TRPO, and PPO. You will also learn about recent advancements in reinforcement learning such as imagination augmented agents, learning from human preference, DQfD, HER, and many more.

Get the book

For effective reading and better rendering, check all the notebooks here

1. Introduction to Reinforcement Learning

1.1. What is Reinforcement Learning?
1.2. Reinforcement Learning Cycle
1.3. How RL differs from other ML Paradigms?
1.4. Elements of Reinforcement Learning
1.5. Agent Environment Interface
1.6. Types of RL Environments
1.7. Reinforcement Learning Platforms
1.8. Applications of Reinforcement Learning

2. Getting Started with OpenAI and TensorFlow

2.1. Setting Up Your Machine
2.2. Installing Anaconda
2.3. Installing Docker
2.4. Installing OpenAI Gym and Universe
2.5. Common Error Fixes
2.6. OpenAI Gym
2.7. Basic Simulations
2.8. Training a Robot to walk
2.9. Building a Video Game Bot
2.10. TensorFlow Fundamentals
2.11. TensorBoard

3. Markov Decision Process and Dynamic Programming

3.1. Markov Chain and Markov Process
3.2. Markov Decision Process
3.3. Rewards and Returns
3.4. Episodic and Continuous Tasks
3.5. Policy Function
3.6. State Value Function
3.7. State-Action Value Function (Q Function)
3.8. Bellman Equation and Optimality
3.9. Deriving Bellman Equation for Value and Q functions
3.10. Solving the Bellman Equation
3.11. Dynamic Programming
3.12. Solving Frozen Lake Problem using Value Iteration
3.13. Solving Frozen Lake Problem using Policy Iteration

4. Gaming with Monte Carlo Methods

4.1. Monte Carlo Methods
4.2. Estimating Value of Pi Using Monte Carlo
4.3. Monte Carlo Prediction
4.4. First visit Monte Carlo
4.5. Every visit Monte Carlo
4.6. BlackJack with Monte Carlo
4.7. Monte Carlo Control
4.8. Monte Carlo Exploration Starts
4.9. On Policy Monte Carlo Control
4.10. Off Policy Monte Carlo Control

5. Temporal Difference Learning

5.1. Temporal Difference Learning
5.2. TD Prediction
5.3. TD Control
5.4. Q Learning
5.5. Solving the Taxi Problem using Q learning
5.6. SARSA
5.7. Solving the Taxi Problem using SARSA
5.8. Difference Between Q learning and SARSA

6. Multi-Armed Bandit Problem

6.1. Multi-armed Bandit Problem
6.2. Epsilon-Greedy Algorithm
6.3. Softmax Exploration Algorithm
6.4. Upper Confidence Bound Algorithm
6.5. Thompson Sampling Algorithm
6.6. Applications of MAB
6.7. Identifying Right Advertisement Banner Using MAB
6.8. Contextual Bandits

7. Deep Learning Fundamentals

7.1. Artificial Neurons
7.2. Artificial Neural Network
7.3. Activation Functions
7.4. Deep Dive into ANN
7.5. Gradient Descent
7.6. Neural Networks in TensorFlow
7.7. Recurrent Neural Network
7.8. Backpropagation Through Time
7.9. Long Short Term Memory RNN
7.10. Generating Song Lyrics using LSTM RNN
7.11. Convolutional Neural Networks
7.12. CNN Architecture
7.13. Classifying Fashion Products Using CNN

8. Atari Games With Deep Q Network

9. Playing Doom With Deep Recurrent Q Network

9.1. Deep Recurrent Q Network
9.2. Partially Observable MDP
9.3. Architecture of DRQN
9.4. Basic Doom Game
9.5. Build an Agent to Play Doom Game using DRQN
9.6. Deep Attention Recurrent Q Network

10. Asynchronous Advantage Actor Critic Network

10.1. Asynchronous Actor Critic Algorithm
10.2. The three A's
10.3. Architecture of A3C
10.4. Working of A3C
10.5. Drive up the Mountain with A3C
10.6. Visualization in TensorBoard

11. Policy Gradients and Optimization

11.1. Policy Gradient
11.2. Lunar Lander Using Policy Gradient
11.3. Deep Deterministic Policy Gradient
11.4. Swinging up the Pendulum using DDPG
11.5. Trust Region Policy Optimization
11.6. Proximal Policy Optimization

12. Capstone Project: Car Racing using DQN

13. Recent Advancements and Next Steps

13.1. Imagination Augmented Agents
13.2. Learning From Human Preference
13.3. Deep Q Learning From Demonstrations
13.4. Hindsight Experience Replay
13.5. Hierarchical Reinforcement Learning
13.6. Inverse Reinforcement Learning

huzijian1996/Hands-On-Reinforcement-Learning-With-Python
