
RTX 30xx compatibility #115

Closed

alex404 opened this issue Oct 13, 2021 · 8 comments

@alex404
Contributor

alex404 commented Oct 13, 2021

Hello, I was wondering if you could update the environment.yaml file to more recent versions of pytorch and cuda (i.e. 1.9.* and 11.*) to "officially" support RTX 30xx cards.

I have an AMD 5950x and an RTX 3090, and I've been able to get sample-factory running with vizdoom by installing vizdoom with pip in a conda environment and running sample-factory straight from the repository. The performance is great (I'm getting about 100K FPS), but I guess because I'm not using your vizdoom branch I'm locked out of the multi-agent framework. I'm also hoping to rely on sample-factory for a longer-term research project, and I worry that other issues might arise if I have to use it in this "unofficial" way.

Thanks for your hard work. This is an impressive piece of software!

@alex-petrenko
Owner

Thank you for this inquiry. I guess there are several separate issues here.

First of all, the most straightforward way to install SF is via PyPI: pip install sample-factory
This will pull in the most recent versions of PyTorch, etc. Since this only requires a Python environment, you are free to set up CUDA and cuDNN any way you like. I recommend initializing a conda env with the latest CUDA libraries, but you can use a system-wide installation as well. Let me know if you need any help with this. Generally, you just need an installation that supports GPU-accelerated PyTorch on your machine; it should not be any different from any other deep learning environment. For example, consult the PyTorch installation guide: https://pytorch.org/get-started/locally/
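
For reference, a rough sketch of that kind of setup (the Python and CUDA versions below are only an illustration, pick whatever matches your driver):

conda create -n sample-factory python=3.8 -y
conda activate sample-factory
# CUDA-enabled PyTorch from the official channels
conda install pytorch cudatoolkit=11.1 -c pytorch -c nvidia
pip install sample-factory
# quick check that the GPU is visible to PyTorch
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"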

VizDoom is a separate issue. If you want to replicate the results from the article exactly, you are advised to use the custom fork:
pip install git+https://github.com/alex-petrenko/ViZDoom@doom_bot_project#egg=vizdoom

If you just want to work with the latest VizDoom, you can install it from PyPI normally. I believe some important fixes from doom_bot_project were merged into VizDoom master as part of this pull request: Farama-Foundation/ViZDoom#486, so they should be included in the latest PyPI release, 1.1.9.
I should perhaps update the installation instructions to reflect this.
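
As a quick sanity check after installing from PyPI (the explicit version pin is only an example):

pip install vizdoom==1.1.9
# confirm the installed version and that the module imports cleanly
pip show vizdoom
python -c "import vizdoom"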

Let me know if the multiplayer stuff works with the latest VizDoom from PyPI! If it does not, I can help debug the issue.

@alex-petrenko
Owner

Also, for 100K FPS, how many CPU cores are you using?
And what are the parameters such as number of workers, environments per worker, and batch size?

@alex404
Contributor Author

alex404 commented Oct 16, 2021

Thanks for your reply and general advice. I haven't actually tried to run the multi-agent simulations yet because we haven't needed them for our project, but it's good to hear that you think they should generally be supported without using your fork. If you have the time, I think it would be helpful to update the README.md, because right now it strongly encourages new users to use your environment.yml and VizDoom fork, which isn't viable for people with the newest generation of cards and doesn't seem strictly necessary.

Anyway, running

python -m sample_factory.algorithms.appo.train_appo --env=doom_benchmark --algo=APPO --env_frameskip=4 --use_rnn=True --num_workers=36 --num_envs_per_worker=24 --num_policies=1 --ppo_epochs=1 --rollout=32 --recurrence=32 --batch_size=8192 --wide_aspect_ratio=False --experiment=doom_5950x_RTX3090_bench --policy_workers_per_policy=2

for half an hour or so resulted in the program reporting FPS: 100948.4. While running, it maxes out the 16 cores / 32 threads of my CPU and sits at around 50-60% utilization on my GPU.
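
For anyone reproducing this, standard tools are enough to watch utilization during a run, for example:

# GPU utilization, refreshed every second
watch -n 1 nvidia-smi
# per-core CPU utilization
htop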

I'll let you know when I get around to trying out the multi-agent simulations, but I'm pretty swamped for the next month, so it'll be a bit of time.

Thanks again!

@alex-petrenko
Owner

alex-petrenko commented Oct 17, 2021 via email

@alex404
Contributor Author

alex404 commented Oct 17, 2021

FYI, I tried running the same command with --batch_size=2048 and --num_workers=16, and only got FPS: 85642.7

I then tried with --num_workers=32 and --batch_size=2048, and got FPS: 101767.2, a slight boost over what I had before.
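
To summarize the runs so far (the rest of the command was kept the same):

num_workers  batch_size  FPS
36           8192        100948.4
16           2048        85642.7
32           2048        101767.2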

@alex-petrenko
Owner

alex-petrenko commented Oct 18, 2021 via email

@alex404
Contributor Author

alex404 commented Oct 19, 2021

Cool, thanks for the tip!

@alex-petrenko
Owner

I updated the README which now recommends installing VizDoom>=1.1.9 from PyPI. Everything works on it including the multi-agent stuff.

I still found that my environment.yml gives better performance than starting from an empty conda env and installing sample-factory into it. I believe this is due to some performance regressions in the latest versions of MKL or NumPy, so I kept both options for now. Other than performance, everything seems to work with just pip install.
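
If anyone wants to help track down the regression, a reasonable first step would be to diff the installed package versions between the two setups, for example (the env and file names here are just placeholders):

# export the pip-based env and compare against the pinned environment.yml from the repo
conda env export -n sample-factory > env_pip_based.yml
diff env_pip_based.yml environment.yml
# or check the usual suspects directly
python -c "import numpy, torch; print(numpy.__version__, torch.__version__)"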
