Commit 005ec17: merged master

DenSumy committed Nov 24, 2023, merging 2 parents 825bdb1 + a5d788a.
Showing 176 changed files with 4,709 additions and 1,956 deletions.
265 changes: 163 additions & 102 deletions README.md


32 changes: 32 additions & 0 deletions docs/DEEPMIND_ENVPOOL.md
@@ -0,0 +1,32 @@
# Deepmind Control (https://github.com/deepmind/dm_control)

* I could not find any existing PPO benchmark for deepmind_control, so this is a first version only; it will be updated later.

## How to run:
* **Humanoid (Stand, Walk or Run)**
```
poetry install -E envpool
poetry run pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
poetry run python runner.py --train --file rl_games/configs/dm_control/humanoid_walk.yaml
```

## Results:

* No tuning; I just ran it on a couple of envs.
* I used 4000 epochs (~32M steps) for almost all envs except Humanoid Run, but a few million steps were enough for most of the envs.
* DeepMind used fairly unusual reward and training rules. A simple reward transformation, log(reward + 1), achieves the best scores faster.
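
The log-reward idea above can be sketched as a small Gym-style wrapper (a minimal illustration, assuming a Gym-style `step` API; `LogRewardWrapper` and the environment are hypothetical names, not code from this repo):

```python
import math

def transform_reward(reward):
    """Compress the raw per-step reward with log(reward + 1).

    dm_control rewards are bounded per step, so reward + 1 >= 1
    and the transformed reward stays non-negative.
    """
    return math.log(reward + 1.0)

class LogRewardWrapper:
    """Gym-style wrapper applying the transformation on every step."""

    def __init__(self, env):
        self.env = env

    def reset(self, **kwargs):
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return obs, transform_reward(reward), done, info
```

Because log(x + 1) is concave, the transformation boosts small rewards relative to large ones, which flattens the otherwise steep reward landscape during early training.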

| Env | Rewards |
| ------------- | ------------- |
| Ball In Cup Catch | 938 |
| Cartpole Balance | 988 |
| Cheetah Run | 685 |
| Fish Swim | 600 |
| Hopper Stand | 557 |
| Humanoid Stand | 653 |
| Humanoid Walk | 621 |
| Humanoid Run | 200 |
| Pendulum Swingup | 706 |
| Walker Stand | 907 |
| Walker Walk | 917 |
| Walker Run | 702 |
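
Training can also be driven from Python instead of the `runner.py` CLI, as the notebooks in this commit do. A minimal sketch of that pattern (the config dict below is an illustrative stub, not a full rl_games config; the commented `Runner` calls mirror the notebook cells and assume rl_games is installed):

```python
# Build/override a nested config dict, as the notebook cells do.
# rl_games YAML configs nest settings under 'params' -> 'config'.
config = {'params': {'config': {}}}
config['params']['config']['full_experiment_name'] = 'humanoid_walk'
config['params']['config']['max_epochs'] = 4000  # ~32M steps, per the notes above

# With rl_games installed, the dict is handed to the Runner
# (these calls mirror the notebook cells in this commit):
# from rl_games.torch_runner import Runner
# runner = Runner()
# runner.load(config)
# agent = runner.create_player()
# agent.restore('runs/humanoid_walk/nn/checkpoint.pth')  # hypothetical checkpoint path
```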
492 changes: 24 additions & 468 deletions notebooks/brax_training.ipynb


35 changes: 9 additions & 26 deletions notebooks/mujoco_envpool_training.ipynb
@@ -44,33 +44,27 @@
"metadata": {},
"outputs": [],
"source": [
"!nvidia-smi -L"
"!pip show rl-games"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "6qvHCGgpxrvZ"
},
"metadata": {},
"outputs": [],
"source": [
"%load_ext tensorboard"
"!nvidia-smi -L"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "GFv1FDtJyC0z",
"outputId": "4082ccf2-139d-415a-c832-8b39f622e899"
"id": "6qvHCGgpxrvZ"
},
"outputs": [],
"source": [
"!pip show rl-games"
"%load_ext tensorboard"
]
},
{
@@ -367,17 +361,6 @@
"%tensorboard --logdir 'runs/'"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "fyvlWdM_abGR"
},
"outputs": [],
"source": [
"from rl_games.torch_runner import Runner"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -500,9 +483,10 @@
"outputs": [],
"source": [
"import yaml\n",
"from rl_games.torch_runner import Runner\n",
"\n",
"config = walker_config\n",
"config['params']['config']['full_experiment_name'] = 'mujoco'\n",
"config['params']['config']['full_experiment_name'] = 'Walker2d_mujoco'\n",
"config['params']['config']['max_epochs'] = 500\n",
"config['params']['config']['horizon_length'] = 512\n",
"config['params']['config']['num_actors'] = 8\n",
@@ -531,11 +515,10 @@
"config = player_walker_config\n",
"config['params']['config']['player']['render'] = False\n",
"config['params']['config']['player']['games_num'] = 2\n",
" \n",
"runner = Runner()\n",
"\n",
"runner.load(config)\n",
"agent = runner.create_player()\n",
"agent.restore('runs/mujoco/nn/Walker2d-v4.pth')"
"agent.restore('runs/Walker2d_mujoco/nn/Walker2d-v4.pth')"
]
},
{
180 changes: 0 additions & 180 deletions notebooks/train_and_export_onnx_example.ipynb

This file was deleted.
