[RLlib] Support for mps (Apple Metal) GPUs in torch #28321

mgerstgrasser · 2022-09-07T03:31:08Z

Description

PyTorch on macOS now supports GPU acceleration on Metal GPUs (AMD GPUs on Intel Macs and Apple GPUs on M1/M2) through the mps backend. It would be nice if Ray / RLlib could make use of this.

As far as I understand, the only change needed on the torch side is to use torch.device("mps") instead of torch.device("cuda..."), so this would be a relatively small addition in RLlib's torch_policy_v2. I'm less clear on what would be needed for other parts of Ray to recognise mps devices as GPU resources.

As a side note, in case anyone comes across this looking for GPU support on macOS, it seems this already works for tf2, using tensorflow-metal. Just pip install tensorflow-metal and set framework to tf2 (not just tf), and RLlib should see and use your AMD or Apple GPU.

Use case

It would be nice to have GPU acceleration for quick local debugging sessions.


mgerstgrasser · 2022-09-07T17:42:04Z

As per this discussion, even if Ray doesn't detect Metal GPUs as resources, that's pretty easy to work around. So even just adding support in RLlib itself would be useful. Could we simply try mps devices if no CUDA devices are found? If so, this might just be a couple of additional lines of code.
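The fallback proposed here can be sketched roughly as follows. This is a minimal illustration, not RLlib's actual code: the helper names pick_device_name and detect_device_name are hypothetical, and the sketch assumes a single CUDA device at index 0 is acceptable when CUDA is present.

```python
def pick_device_name(cuda_count: int, mps_available: bool) -> str:
    """Prefer CUDA, then MPS, then CPU (hypothetical helper)."""
    if cuda_count > 0:
        return "cuda:0"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device_name() -> str:
    """Query torch, if it is installed, and apply the fallback order above."""
    try:
        import torch
    except ImportError:
        # Without torch there is nothing to accelerate.
        return "cpu"
    # Older torch builds may not ship the mps backend at all.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_available = mps_backend is not None and mps_backend.is_available()
    return pick_device_name(torch.cuda.device_count(), mps_available)


if __name__ == "__main__":
    # The resulting string can be passed to torch.device(...).
    print(detect_device_name())
```

Keeping the decision in a pure function makes the fallback order easy to check on machines without any GPU.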

visuallization · 2023-02-27T16:36:02Z

GPU support for m1 would be great!

ChaceAshcraft · 2023-05-09T23:56:50Z

Would also like to see this happen!

qazi0 · 2023-05-10T00:28:04Z

Waiting for this too:)

hippotilt · 2023-08-14T15:23:10Z

Same, that would help a lot :)

ersinakinci · 2023-09-25T23:09:00Z

Would love to see this happen!

sams-data · 2023-10-27T18:00:24Z

+1 this would be great

arnaudlenain · 2024-07-15T19:09:41Z

Hey, any update for RLlib/Ray? Anything we can do to help?

duburcqa · 2024-11-11T15:13:25Z

It would be nice to address this issue, and I think it should be quite straightforward. After playing around, it seems the only reason it currently does not work is that RLlib relies on torch.cuda.device_count to check whether the requested GPU index is available. Instead, it should first check whether MPS is available. If so, there is exactly one GPU device. If not, torch.cuda.device_count should be checked.

I'm currently monkey-patching torch to make it work:

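import torch

# Keep a reference to the original function so CUDA machines behave as before.
device_count_orig = torch.cuda.device_count

def device_count():
    # Report exactly one GPU when the MPS backend is available.
    if torch.backends.mps.is_available():
        return 1
    return device_count_orig()

# Apply the patch before RLlib checks the GPU index.
torch.cuda.device_count = device_count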
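For anyone who prefers not to leave torch patched globally, the same idea can be scoped with a context manager that restores the original function on exit. This is only a sketch of the workaround above: the name mps_counted_as_gpu is hypothetical, and the torch module is passed in as an argument so the helper itself has no import-time dependency.

```python
import contextlib


@contextlib.contextmanager
def mps_counted_as_gpu(torch_module):
    """Temporarily make torch.cuda.device_count() report 1 when MPS is available."""
    original = torch_module.cuda.device_count

    def device_count():
        # Same logic as the monkey-patch above.
        if torch_module.backends.mps.is_available():
            return 1
        return original()

    torch_module.cuda.device_count = device_count
    try:
        yield
    finally:
        # Always restore the original function, even if the body raises.
        torch_module.cuda.device_count = original


# Usage (assuming torch is installed):
#   import torch
#   with mps_counted_as_gpu(torch):
#       ...  # build and run the algorithm here
```

Note that Ray runs actors in separate worker processes, so a patch applied only in the driver process may not be visible there.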

sashless · 2024-11-14T17:16:03Z

I was eager to try that and added a GPU to my learner config:

.learners(
    num_gpus_per_learner=1,  # Set this to 1 to enable GPU training.
)
.resources(num_gpus=1)

and added a GPU to my resources

trainable_with_cpu_gpu = tune.with_resources(PPO, {"cpu": 4, "gpu": 1})

tuner = tune.Tuner(
    trainable_with_cpu_gpu,

and requested a GPU in ray.init

ray.init(
    num_gpus=1,

Then I see Logical resource usage: 4.0/10 CPUs, 1.0/1 GPUs when starting the tune job.

Unfortunately it doesn't tune and "idles" in Pending status. What else do I need to do, @duburcqa?
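One way to narrow down a trial stuck in Pending is to add up the logical resources being requested and compare them with what the cluster offers. The sketch below uses the numbers from this comment. It assumes, without confirmation anywhere in this thread, that the learner's GPU is requested in addition to the GPU already reserved for the trial. The helper resource_shortfall is hypothetical and not part of Ray.

```python
from collections import Counter


def resource_shortfall(requests, available):
    """Return, per resource, how far the summed requests exceed what is available."""
    demand = Counter()
    for request in requests:
        for name, amount in request.items():
            demand[name] += amount
    return {
        name: demand[name] - available.get(name, 0)
        for name in demand
        if demand[name] > available.get(name, 0)
    }


# Logical resources reported in the comment: 10 CPUs, 1 GPU.
available = {"CPU": 10, "GPU": 1}
# tune.with_resources(PPO, {"cpu": 4, "gpu": 1})
trial_request = {"CPU": 4, "GPU": 1}
# num_gpus_per_learner=1, assumed here to be a separate request (unconfirmed).
learner_request = {"GPU": 1}

print(resource_shortfall([trial_request, learner_request], available))
```

If that assumption holds, the combined demand is two GPUs against one available, which would leave a request waiting. On a live cluster, ray.cluster_resources() and ray.available_resources() return dictionaries that can be used for the available argument.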

mgerstgrasser added the enhancement label Sep 7, 2022

krfricke added the rllib, air, and P2 labels Sep 7, 2022

visuallization mentioned this issue Feb 24, 2023

Feature: MacOs Support edbeeching/godot_rl_agents#74

Merged

anyscalesam removed the air label Oct 28, 2023
