Add nethack to sf_examples #289

Merged


@BartekCupial BartekCupial commented Dec 29, 2023

Overview

This PR introduces NetHack to Sample Factory.
The code is based on three repositories:

Key Contributions

Easy installation and experiments

By adding NetHack to the Sample Factory examples, my main goal is to improve reproducibility and make experimentation with NetHack easier, since I ran into many issues with installation and with running experiments. For example, when I tried to reproduce the experiments in the D&D repository, the Dockerfile required fixes to the moolib library and the Cmake.txt file. Additionally, the D&D repo implements APPO from scratch, and implementation details matter in RL: just by moving to SF I managed to increase the score from 2k to about 2.8k.

Additional metrics

Sample Factory supports logging of additional policy stats if any are found in info["episode_extra_stats"]. I've added wrappers that log blstats and additional auxiliary scores; see sf_examples/nethack/utils/task_rewards.py.
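
To illustrate the mechanism, here is a minimal sketch of a wrapper that copies a few blstats values into info["episode_extra_stats"] when an episode ends. It is not the code from this PR: the class names, the stub environment, and the blstats indices are assumptions made for the example, and the indices in particular should be checked against your NLE version.

```python
from typing import Any, Dict, Optional, Tuple

# Assumed positions of a few fields inside NLE's `blstats` vector.
# These indices are an assumption; verify them against your NLE version.
ASSUMED_BLSTATS_INDICES: Dict[str, int] = {
    "score": 9,
    "hitpoints": 10,
    "depth": 12,
    "gold": 13,
}


class BlstatsInfoWrapper:
    """Hypothetical wrapper: exposes selected blstats at the end of an episode."""

    def __init__(self, env: Any, indices: Optional[Dict[str, int]] = None) -> None:
        self.env = env
        self.indices = dict(indices if indices is not None else ASSUMED_BLSTATS_INDICES)

    def reset(self, **kwargs: Any) -> Any:
        return self.env.reset(**kwargs)

    def step(self, action: Any) -> Tuple[Any, float, bool, bool, Dict[str, Any]]:
        obs, reward, terminated, truncated, info = self.env.step(action)
        if terminated or truncated:
            # Only report once per episode, using the final observation.
            extra = info.setdefault("episode_extra_stats", {})
            for name, idx in self.indices.items():
                extra[name] = float(obs["blstats"][idx])
        return obs, reward, terminated, truncated, info


class StubEnv:
    """Stand-in for NLE: one-step episodes with a fixed blstats vector."""

    def reset(self, **kwargs: Any) -> Tuple[Dict[str, Any], Dict[str, Any]]:
        return {"blstats": [0] * 27}, {}

    def step(self, action: Any) -> Tuple[Dict[str, Any], float, bool, bool, Dict[str, Any]]:
        blstats = [0] * 27
        blstats[9], blstats[10], blstats[12], blstats[13] = 120, 14, 3, 57
        return {"blstats": blstats}, 1.0, True, False, {}


env = BlstatsInfoWrapper(StubEnv())
env.reset()
_, _, _, _, info = env.step(0)
print(info["episode_extra_stats"])
```

Writing the stats only on the final step keeps one value per episode, which matches the idea of per-episode summary metrics.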

render_mode=rgb_array

NLE does not natively support rgb_array. I've added an rgb_array rendering mode using the RenderCharImagesWithNumpyWrapperV2 wrapper introduced by https://github.com/Miffyli/nle-sample-factory-baseline.
By using rgb_array in enjoy, we can save and examine episodes. Example: https://github.com/BartekCupial/sample-factory/assets/92169405/47884b73-beeb-4303-a72f-75d202aa87a8

Evaluation

NetHack episodes can be very long (100k env steps), and because the environment is highly stochastic we usually need many evaluation episodes (typically 1024) to measure policy performance. I highly recommend using the recently introduced eval.py, since running that many episodes with enjoy would take a very long time.
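
As a rough illustration of why so many episodes are needed: the standard error of the mean return shrinks only with the square root of the number of episodes. The sketch below is not part of this PR; the function names and the numbers in the usage note are hypothetical and chosen only to show the arithmetic.

```python
import math
import statistics
from typing import Sequence


def standard_error(returns: Sequence[float]) -> float:
    """Standard error of the mean episode return (sample std / sqrt(n))."""
    return statistics.stdev(returns) / math.sqrt(len(returns))


def episodes_needed(return_std: float, target_se: float) -> int:
    """Episodes required so that std / sqrt(n) is at most target_se."""
    return math.ceil((return_std / target_se) ** 2)
```

For instance, if per-episode returns had a standard deviation of 3200 (a made-up figure), episodes_needed(3200.0, 100.0) gives 1024 episodes to bring the standard error down to 100, while halving the precision requirement to 200 would need only 256.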


codecov-commenter commented Dec 29, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (f69cf46) 77.96% compared to head (e97c18a) 77.96%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #289   +/-   ##
=======================================
  Coverage   77.96%   77.96%           
=======================================
  Files         101      101           
  Lines        7759     7759           
=======================================
  Hits         6049     6049           
  Misses       1710     1710           


@klyuchnikova-ana klyuchnikova-ana left a comment

Great stuff!

A few minor comments, but overall looks super clean! 🔥

docs/09-environment-integrations/nethack.md
docs/09-environment-integrations/nethack.md (outdated)
docs/09-environment-integrations/nethack.md (outdated)
setup.py
sf_examples/nethack/README.md (outdated)
@BartekCupial

UPDATE:

Training results and a comparison with the Dungeons and Data APPO are in the docs at docs/09-environment-integrations/nethack.md. I've also created a model card on Hugging Face: https://huggingface.co/LLParallax/sample_factory_human_monk (it's also linked in the docs).

@klyuchnikova-ana klyuchnikova-ana left a comment

👍

@klyuchnikova-ana klyuchnikova-ana merged commit d76dd09 into alex-petrenko:master Jan 9, 2024
9 of 10 checks passed