Triple-Q

Author's implementation of the paper:

Triple-Q: A Model-Free Algorithm for Constrained Reinforcement Learning with Sublinear Regret and Zero Constraint Violation

In this paper, we propose the first model-free, simulator-free reinforcement learning algorithm for Constrained Markov Decision Processes (CMDPs) that achieves sublinear regret and zero constraint violation.

A Tabular Case

In the tabular case, we evaluate our algorithm in a grid-world environment.

Train Triple-Q on this environment by simply running the file Triple_Q_tabular.ipynb on Google Colab.
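For a quick feel of the algorithm before opening the notebook, below is a minimal, hypothetical sketch of a Triple-Q-style tabular update. It is not the notebook's code: the constants, the greedy updates (the paper's optimistic, UCB-style updates are omitted), and the cost-budget form of the constraint are simplifications for exposition.

```python
import numpy as np

# Illustrative sketch only: a Triple-Q-style tabular learner on a toy grid
# world.  All constants below are placeholders, not the paper's choices.

n_states, n_actions = 16, 4      # e.g., a 4x4 grid world (placeholder sizes)
cost_budget = 0.2                # per-episode cost budget (assumed value)
eta, alpha = 10.0, 0.1           # queue weight and learning rate (assumed values)

q_reward = np.zeros((n_states, n_actions))  # Q-estimate of the reward
q_cost = np.zeros((n_states, n_actions))    # Q-estimate of the cost
z = 0.0                                     # virtual queue tracking constraint violation

def select_action(s):
    # Greedy action w.r.t. the reward Q penalised by a queue-weighted cost Q:
    # the larger the virtual queue, the more conservative the policy becomes.
    return int(np.argmax(q_reward[s] - (z / eta) * q_cost[s]))

def td_update(s, a, r, c, s_next):
    # Q-learning-style updates for both estimates, evaluated at the action the
    # combined (queue-weighted) policy would take in the next state.
    a_next = select_action(s_next)
    q_reward[s, a] += alpha * (r + q_reward[s_next, a_next] - q_reward[s, a])
    q_cost[s, a] += alpha * (c + q_cost[s_next, a_next] - q_cost[s, a])

def update_virtual_queue(estimated_episode_cost):
    # The queue grows when the estimated cost exceeds the budget and shrinks
    # otherwise, steering the policy back toward the feasible region.
    global z
    z = max(z + estimated_episode_cost - cost_budget, 0.0)
```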

Deep-Triple-Q

The code for Deep-Triple-Q is adapted from Safety Starter Agents and WCSAC.

Triple-Q can also be implemented with neural network function approximation and the actor-critic method.
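As an illustration only, the sketch below shows one way such a combination might look in an actor-critic setup: two critics estimate the return and the cost, and the actor maximizes the return critic minus a virtual-queue-weighted cost critic, mirroring the combined action selection of the tabular sketch above. All names, dimensions, and hyper-parameters are hypothetical and do not correspond to this repository's code, which builds on Safety Starter Agents / WCSAC.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of a queue-weighted actor objective (not this repo's code).
obs_dim, act_dim = 8, 2          # placeholder dimensions
z, eta = 5.0, 10.0               # virtual queue value and trade-off weight (assumed)

q_reward = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))
q_cost = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))
policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim), nn.Tanh())

def actor_loss(obs):
    act = policy(obs)                         # deterministic actor for brevity
    sa = torch.cat([obs, act], dim=-1)
    q_r = q_reward(sa)                        # estimated return
    q_c = q_cost(sa)                          # estimated cost
    # Maximise reward while penalising cost in proportion to the virtual queue.
    return (-(q_r - (z / eta) * q_c)).mean()

# Example gradient step on the actor only (critic and queue updates omitted).
optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)
loss = actor_loss(torch.randn(32, obs_dim))
loss.backward()
optimizer.step()
```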

Train Deep-Triple-Q on the Dynamic Gym benchmark (DynamicEnv) (Yang et al., 2021) by simply running

python ./deep_tripleq/sac/triple_q.py --env 'DynamicEnv-v0' -s 1234 --cost_lim 15 --logger_kwargs_str '{"output_dir":"./temp"}'

Warning: If you want to use the Triple-Q algorithm in Safety Gym, make sure to install Safety Gym according to the instructions on the Safety Gym repo.

Deep-Triple-Q on safe RL with hard constraints

Train Deep-Triple-Q on the Pendulum environment with hard safety constraints (details can be found in Cheng et al.) by running

python ./saferl/main_triple_q.py --seed 1234
