Commit 55e6955

Merge pull request coreylynch#20 from bryant1410/master
Fix broken headings in Markdown files
2 parents 7324e10 + b11509b commit 55e6955
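
The fix this commit merges amounts to inserting the space that is missing between an ATX heading marker (`#` through `######`) and the heading text. The source does not show how the pull request author produced the change, so the following is only a minimal sketch of how such a fix could be automated. The names `fix_headings`, `_HEADING`, and `_FENCE` are hypothetical, and the sketch assumes fenced code blocks are delimited by three backticks.

```python
import re

# Three backticks, built indirectly so this snippet can itself sit in a fence.
_FENCE = "`" * 3

# One to six '#' at the start of a line, directly followed by a character
# that is neither whitespace nor another '#': a heading missing its space.
_HEADING = re.compile(r"^(#{1,6})(?=[^\s#])")


def fix_headings(markdown: str) -> str:
    """Return `markdown` with a space inserted after bare heading markers."""
    fixed = []
    in_fence = False
    for line in markdown.splitlines(keepends=True):
        # Toggle on every fence line so code inside fences is left untouched.
        if line.lstrip().startswith(_FENCE):
            in_fence = not in_fence
        if not in_fence:
            line = _HEADING.sub(r"\1 ", line)
        fixed.append(line)
    return "".join(fixed)


if __name__ == "__main__":
    print(fix_headings("## Usage\n###Training\n"), end="")
```

A sketch like this would also rewrite a prose line that happens to start with `#` (for example a hashtag or a shebang outside a fence), so reviewing the resulting diff by hand, as in the four-line change here, remains advisable.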

1 file changed: +4 -4 lines changed

README.md

@@ -16,14 +16,14 @@ It uses Keras to define the deep q network (see model.py), OpenAI's gym library
 * Keras
 
 ## Usage
-###Training
+### Training
 To kick off training, run:
 ```
 python async_dqn.py --experiment breakout --game "Breakout-v0" --num_concurrent 8
 ```
 Here we're organizing the outputs for the current experiment under a folder called 'breakout', choosing "Breakout-v0" as our gym environment, and running 8 actor-learner threads concurrently. See [this](https://gym.openai.com/envs#atari) for a full list of possible game names you can hand to --game.
 
-###Visualizing training with tensorboard
+### Visualizing training with tensorboard
 We collect episode reward stats and max q values that can be vizualized with tensorboard by running the following:
 ```
 tensorboard --logdir /tmp/summaries/breakout
@@ -32,7 +32,7 @@ This is what my per-episode reward and average max q value curves looked like ov
 ![](https://github.com/coreylynch/async-rl/blob/master/resources/episode_reward.png)
 ![](https://github.com/coreylynch/async-rl/blob/master/resources/max_q_value.png)
 
-###Evaluation
+### Evaluation
 To run a gym evaluation, turn the testing flag to True and hand in a current checkpoint file:
 ```
 python async_dqn.py --experiment breakout --testing True --checkpoint_path /tmp/breakout.ckpt-2690000 --num_eval_episodes 100
@@ -44,7 +44,7 @@ gym.upload('/tmp/breakout/eval', api_key='YOUR_API_KEY')
 ```
 Now we can find the eval at https://gym.openai.com/evaluations/eval_uwwAN0U3SKSkocC0PJEwQ
 
-###Next Steps
+### Next Steps
 See a3c.py for a WIP async advantage actor critic implementation.
 
 ## Resources
