Add nethack to sf_examples #289
Codecov Report
All modified and coverable lines are covered by tests ✅

@@           Coverage Diff           @@
##           master     #289   +/-   ##
=======================================
  Coverage   77.96%   77.96%
=======================================
  Files         101      101
  Lines        7759     7759
=======================================
  Hits         6049     6049
  Misses       1710     1710
Great stuff!
A few minor comments, but overall looks super clean! 🔥
UPDATE: Training results, as well as a comparison with the Dungeons and Data APPO, are available in the docs.
👍
Overview
This PR introduces NetHack to Sample Factory.
The code is based on three repositories:
Key Contributions
Easy installation and experiments
By adding NetHack to the Sample Factory examples, my main goal is to improve reproducibility and make it easy to experiment with NetHack, since I ran into many issues with installation and experiments. For example, when trying to reproduce the experiments in the D&D repository, the Dockerfile required fixes to the moolib library and the Cmake.txt file. Additionally, since the D&D repo implemented APPO from scratch (implementation details in RL matter), I found that just by moving to SF I managed to increase the score from 2k to about 2.8k.
Additional metrics
Sample Factory supports logging of additional policy stats if any are found in info["episode_extra_stats"]. I've added wrappers which log blstats and additional auxiliary scores; see sf_examples/nethack/utils/task_rewards.py.
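To illustrate the mechanism described above, here is a minimal sketch of a wrapper that copies selected blstats entries into info["episode_extra_stats"] when an episode ends. It is not the wrapper from this PR: the class name ExtraStatsWrapper, the FakeEnv stand-in, and the field indices are all hypothetical, and the sketch assumes a gymnasium-style step() that returns a 5-tuple with a "blstats" entry in the observation.

```python
class ExtraStatsWrapper:
    """Hypothetical duck-typed wrapper; not the implementation from the PR.

    Assumes env.step() returns (obs, reward, terminated, truncated, info)
    and that obs["blstats"] is an indexable sequence of numbers.
    """

    def __init__(self, env, blstats_fields):
        self.env = env
        # Mapping of stat name -> index into obs["blstats"] (caller-supplied).
        self.blstats_fields = dict(blstats_fields)

    def reset(self, **kwargs):
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        if terminated or truncated:
            # Only report stats on the final step of the episode.
            extra = info.setdefault("episode_extra_stats", {})
            for name, idx in self.blstats_fields.items():
                extra[name] = float(obs["blstats"][idx])
        return obs, reward, terminated, truncated, info


class FakeEnv:
    """Stand-in environment so the sketch runs without NLE installed."""

    def reset(self, **kwargs):
        return {"blstats": [0, 0, 0]}, {}

    def step(self, action):
        # Every step ends the episode; blstats values are made up.
        return {"blstats": [7, 3, 42]}, 0.0, True, False, {}


# Indices 1 and 2 are illustrative, not the real NLE blstats layout.
env = ExtraStatsWrapper(FakeEnv(), {"depth": 1, "gold": 2})
env.reset()
_, _, _, _, info = env.step(0)
print(info["episode_extra_stats"])
```

Passing the name-to-index mapping in from outside keeps the sketch independent of the actual blstats layout, which should be checked against NLE before use.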
render_mode=rgb_array
NLE doesn't natively support rgb_array. I've added an rgb_array mode for rendering by using the RenderCharImagesWithNumpyWrapperV2 wrapper introduced by https://github.com/Miffyli/nle-sample-factory-baseline. By using rgb_array in enjoy we can save and examine episodes. Example: https://github.com/BartekCupial/sample-factory/assets/92169405/47884b73-beeb-4303-a72f-75d202aa87a8
Evaluation
NetHack can have very long episodes (100k env steps), and because the environment is highly stochastic we usually need a lot of evaluation episodes (typically 1024) to measure policy performance. I highly recommend using the recently introduced eval.py, since running that many episodes with enjoy would take a very long time.
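As a rough illustration of why so many evaluation episodes are needed, the sketch below computes the mean return and its standard error for increasing numbers of episodes. The standard error shrinks roughly as 1/sqrt(N), so a noisy environment needs a large N before two policies can be told apart. The helper summarize_returns is hypothetical and the returns are synthetic samples, not NetHack results.

```python
import math
import random
import statistics


def summarize_returns(returns):
    """Return (mean, standard error of the mean) for a list of episode returns."""
    n = len(returns)
    mean = statistics.fmean(returns)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    sem = statistics.stdev(returns) / math.sqrt(n)
    return mean, sem


# Synthetic, heavy-tailed "episode returns" used purely for illustration.
rng = random.Random(0)
synthetic_returns = [rng.expovariate(1 / 1000) for _ in range(1024)]

for n in (16, 128, 1024):
    mean, sem = summarize_returns(synthetic_returns[:n])
    print(f"episodes={n:5d}  mean={mean:8.1f}  standard_error={sem:7.1f}")
```

Reporting the standard error alongside the mean score makes it easier to judge whether a difference such as 2k versus 2.8k is larger than the evaluation noise.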