- Deep Q Network (DQN) (off-policy)
- Double Deep Q Network (Double DQN) (off-policy)
- Dueling Deep Q Network (Dueling DQN) (off-policy)
- Dueling Double Deep Q Network (D3QN) (off-policy)
- Noisy Networks for Exploration (NoisyDQN) (off-policy)
- Advantage Actor-Critic (A2C) (on-policy)
- Asynchronous Advantage Actor-Critic (A3C) (on-policy)
- Proximal Policy Optimization (PPO) with GAE (on-policy, close to off-policy)
- Phasic Policy Gradient (PPG) (on-policy PPO + an off-policy critic that shares parameters with PPO's critic)
- Deep Deterministic Policy Gradient (DDPG) (off-policy)
- Twin Delayed Deep Deterministic Policy Gradient (TD3) (off-policy)
- Soft Actor-Critic (SAC) (off-policy)
- Truncated Quantile Critics (TQC) (off-policy)
- Distribution Correction (DisCor), built on Soft Actor-Critic
- Randomized Ensembled Double Q-Learning (REDQ)
- Stochastic Latent Actor-Critic (SLAC)
- SAC with AutoEncoder (SAC_AE)
- Data-regularized Q (DrQ-v1)
- Data-regularized Q (DrQ-v2)
- Behavior Cloning (BC)
- Generative Adversarial Imitation Learning (GAIL)
- Prioritized Experience Replay (PER)
- Hindsight Experience Replay (HER)
- Deep Dense Architectures in Reinforcement Learning (D2RL)
- Intrinsic Curiosity Module (ICM)
- Ape-X (an approximate implementation)
- MPI
- Multi-Agent Deep Deterministic Policy Gradient (MADDPG)
- MADDPG-style variants of TD3 and SAC
- Multi-Agent Proximal Policy Optimization (MAPPO)
- COMA
- QMIX
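The value-based agents above (DQN through NoisyDQN) all share the same epsilon-greedy exploration rule: act randomly with probability epsilon, otherwise pick the action with the highest estimated Q-value. A minimal NumPy sketch (the function name and signature here are illustrative, not this repo's API):

```python
import numpy as np

def epsilon_greedy_action(q_values: np.ndarray, epsilon: float,
                          rng: np.random.Generator) -> int:
    """Pick a random action with probability epsilon, else the greedy one.

    Illustrative sketch of the exploration rule shared by the DQN-family
    agents; not the exact function used in this repository.
    """
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))  # explore: uniform random action
    return int(np.argmax(q_values))              # exploit: greedy action

rng = np.random.default_rng(0)
q = np.array([0.1, 0.9, 0.3])
action = epsilon_greedy_action(q, epsilon=0.0, rng=rng)  # epsilon=0 -> always greedy
```

In practice epsilon is annealed from 1.0 toward a small floor over training, so early episodes explore broadly and later episodes mostly exploit the learned Q-values.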
- Clone the repo and cd into it:

  ```shell
  git clone https://github.com/namjiwon1023/Code_With_RL
  cd Code_With_RL
  ```

- If you don't already have PyTorch installed, install your favourite flavour of PyTorch. In most cases, you may use

  ```shell
  # PyTorch 1.8.1 LTS, CUDA 10.2 build, if you have a GPU.
  pip3 install torch torchvision torchaudio -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html
  ```

  or

  ```shell
  # PyTorch 1.8.1 LTS, CPU-only build, if you don't have a GPU.
  pip3 install torch==1.8.1+cpu torchvision==0.9.1+cpu torchaudio==0.8.1 -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html
  ```

  to install the GPU or CPU version of PyTorch.
- Hyperparameters: each algorithm has its own YAML configuration file:
  - dqn.yaml
  - doubledqn.yaml
  - duelingdqn.yaml
  - d3qn.yaml
  - noisydqn.yaml
  - ddpg.yaml
  - td3.yaml
  - sac.yaml
  - ppo.yaml
  - a2c.yaml
  - behaviorcloning.yaml
  - etc.
- agent.py
  - Reinforcement learning agent implementations
- network.py
  - QNetwork
  - NoisyLinear
  - ActorNetwork
  - CriticNetwork
- replaybuffer.py
  - Simple PPO rollout buffer
  - Off-policy experience replay
- runner.py
  - Training loop
  - Evaluator
- main.py
  - Start training
  - Start evaluation
- utils.py
  - GIF generation
  - Plotting
  - Basic tools
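The off-policy experience replay in `replaybuffer.py` is typically a fixed-size circular buffer with uniform random sampling: new transitions overwrite the oldest ones, and minibatches are drawn uniformly for training. A minimal sketch with illustrative names (not the repo's exact implementation):

```python
import numpy as np

class ReplayBuffer:
    """Fixed-size circular buffer with uniform random sampling.

    A minimal sketch of off-policy experience replay; field names and
    shapes are illustrative, not this repository's exact implementation.
    """
    def __init__(self, capacity: int, obs_dim: int):
        self.capacity = capacity
        self.obs = np.zeros((capacity, obs_dim), dtype=np.float32)
        self.next_obs = np.zeros((capacity, obs_dim), dtype=np.float32)
        self.actions = np.zeros(capacity, dtype=np.int64)
        self.rewards = np.zeros(capacity, dtype=np.float32)
        self.dones = np.zeros(capacity, dtype=np.float32)
        self.ptr = 0    # next write position
        self.size = 0   # number of transitions stored so far

    def store(self, obs, action, reward, next_obs, done):
        i = self.ptr
        self.obs[i], self.actions[i] = obs, action
        self.rewards[i], self.next_obs[i], self.dones[i] = reward, next_obs, done
        self.ptr = (self.ptr + 1) % self.capacity  # wrap: overwrite oldest when full
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size: int):
        idx = np.random.randint(0, self.size, size=batch_size)
        return (self.obs[idx], self.actions[idx], self.rewards[idx],
                self.next_obs[idx], self.dones[idx])
```

An on-policy rollout buffer (as used by PPO/A2C) differs in that it is cleared after every policy update rather than sampled from repeatedly.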
To train a new network: run `python main.py --algorithm=<algorithm name>`
To test a pre-trained network: run `python main.py --algorithm=<algorithm name> --evaluate=True`
Reinforcement learning algorithms that can currently be selected:
- DQN
- Double_DQN
- Dueling_DQN
- D3QN
- Noisy_DQN
- DDPG
- TD3
- SAC
- PPO
- A2C
- BC_SAC
Discrete action space recommendation: Dueling Double DQN (D3QN).
Continuous action space recommendation: use TD3 if you are good at tuning hyperparameters; use PPO or SAC if you are not. If the environment's reward function was written by a beginner, prefer PPO.
Discrete action:
Continuous action:
Multi-Agent Training Environment:
Value-Based Algorithm Comparison Results:
Policy-Based Algorithm Comparison Results:
Python 3.6+: `conda create -n icsl_rl python=3.6`
PyTorch 1.6+: https://pytorch.org/get-started/locally/
NumPy: `pip install numpy`
OpenAI Gym: https://github.com/openai/gym
matplotlib: `pip install matplotlib`
TensorBoard: `pip install tensorboard`
To cite this repository:
```bibtex
@misc{algorithms_drl,
  author       = {Zhiyuan Nan},
  title        = {Code With Deep Reinforcement Learning},
  year         = {2021},
  publisher    = {GitHub},
  journal      = {GitHub repository},
  howpublished = {\url{https://github.com/namjiwon1023/Code_With_RL}},
}
```