Triple-Q

Author's implementation of the paper:

Triple-Q: A Model-Free Algorithm for Constrained Reinforcement Learning with Sublinear Regret and Zero Constraint Violation

In this paper, we propose the first model-free, simulator-free reinforcement learning algorithm for Constrained Markov Decision Processes (CMDPs) that achieves sublinear regret and zero constraint violation.

A Tabular Case

In the tabular case, we evaluate our algorithm in a grid-world environment.

Train Triple-Q on this environment by simply running the file Triple_Q_tabular.ipynb on Google Colab.
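For a quick feel of the algorithm before opening the notebook, below is a minimal, hypothetical sketch of a Triple-Q-style tabular update. It is not the notebook's code: the constants, the greedy updates (the paper's optimistic, UCB-style updates are omitted), and the cost-budget form of the constraint are simplifications for exposition.

```python
import numpy as np

# Illustrative sketch only: a Triple-Q-style tabular learner on a toy grid
# world.  All constants below are placeholders, not the paper's choices.

n_states, n_actions = 16, 4      # e.g., a 4x4 grid world (placeholder sizes)
cost_budget = 0.2                # per-episode cost budget (assumed value)
eta, alpha = 10.0, 0.1           # queue weight and learning rate (assumed values)

q_reward = np.zeros((n_states, n_actions))  # Q-estimate of the reward
q_cost = np.zeros((n_states, n_actions))    # Q-estimate of the cost
z = 0.0                                     # virtual queue tracking constraint violation

def select_action(s):
    # Greedy action w.r.t. the reward Q penalised by a queue-weighted cost Q:
    # the larger the virtual queue, the more conservative the policy becomes.
    return int(np.argmax(q_reward[s] - (z / eta) * q_cost[s]))

def td_update(s, a, r, c, s_next):
    # Q-learning-style updates for both estimates, evaluated at the action the
    # combined (queue-weighted) policy would take in the next state.
    a_next = select_action(s_next)
    q_reward[s, a] += alpha * (r + q_reward[s_next, a_next] - q_reward[s, a])
    q_cost[s, a] += alpha * (c + q_cost[s_next, a_next] - q_cost[s, a])

def update_virtual_queue(estimated_episode_cost):
    # The queue grows when the estimated cost exceeds the budget and shrinks
    # otherwise, steering the policy back toward the feasible region.
    global z
    z = max(z + estimated_episode_cost - cost_budget, 0.0)
```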

Deep-Triple-Q

The code for Deep-Triple-Q is adapted from Safety Starter Agents and WCSAC.

Triple-Q can also be implemented with neural network function approximation and the actor-critic method.
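As an illustration only, the sketch below shows one way such a combination might look in an actor-critic setup: two critics estimate the return and the cost, and the actor maximizes the return critic minus a virtual-queue-weighted cost critic, mirroring the combined action selection of the tabular sketch above. All names, dimensions, and hyper-parameters are hypothetical and do not correspond to this repository's code, which builds on Safety Starter Agents / WCSAC.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of a queue-weighted actor objective (not this repo's code).
obs_dim, act_dim = 8, 2          # placeholder dimensions
z, eta = 5.0, 10.0               # virtual queue value and trade-off weight (assumed)

q_reward = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))
q_cost = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))
policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim), nn.Tanh())

def actor_loss(obs):
    act = policy(obs)                         # deterministic actor for brevity
    sa = torch.cat([obs, act], dim=-1)
    q_r = q_reward(sa)                        # estimated return
    q_c = q_cost(sa)                          # estimated cost
    # Maximise reward while penalising cost in proportion to the virtual queue.
    return (-(q_r - (z / eta) * q_c)).mean()

# Example gradient step on the actor only (critic and queue updates omitted).
optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)
loss = actor_loss(torch.randn(32, obs_dim))
loss.backward()
optimizer.step()
```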

Train Deep-Triple-Q on the Dynamic Gym benchmark (DynamicEnv) (Yang et al., 2021) by simply running

python ./deep_tripleq/sac/triple_q.py --env 'DynamicEnv-v0' -s 1234 --cost_lim 15 --logger_kwargs_str '{"output_dir":"./temp"}'

Warning: If you want to use the Triple-Q algorithm in Safety Gym, make sure to install Safety Gym according to the instructions on the Safety Gym repo.

Deep-Triple-Q on safe RL with hard constraints

Train Deep-Triple-Q on the Pendulum environment with hard safety constraints (details can be found in Cheng et al.) by running

python ./saferl/main_triple_q.py --seed 1234
