
Avoid fallback on CPU if no devices are provided #12410

Merged
merged 9 commits into master from ref/avoid_cpu_fallback on Mar 25, 2022

Conversation

rohitgr7
Contributor

@rohitgr7 rohitgr7 commented Mar 22, 2022

What does this PR do?

Addresses: #12160 (comment)
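
A minimal sketch of the behavior change this PR targets, pieced together from the title and the discussion below (the exact exception type is an assumption, not taken from the diff):

    from pytorch_lightning import Trainer

    # Before this PR: requesting the GPU accelerator with an empty/zero devices
    # value silently fell back to running on CPU.
    # After this PR: the same configuration is expected to raise instead.
    for devices in ([], 0, "0"):
        try:
            Trainer(accelerator="gpu", devices=devices)
        except Exception as err:  # exact exception type is an assumption
            print(f"devices={devices!r} raised: {err}")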

Does your PR introduce any breaking changes? If yes, please list them.

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the Review guidelines. In short, see the following bullet-list:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

cc @Borda @justusschock @kaushikb11 @awaelchli @ninginthecloud @akihironitta @rohitgr7

@rohitgr7 rohitgr7 marked this pull request as ready for review March 22, 2022 12:51
@carmocca carmocca added the breaking change label Mar 22, 2022
Contributor

@kaushikb11 kaushikb11 left a comment


@rohitgr7 rohitgr7 requested a review from edenlightning as a code owner March 23, 2022 09:43
@rohitgr7 rohitgr7 requested a review from kaushikb11 March 23, 2022 09:43
@mergify mergify bot added the ready label Mar 23, 2022
@rohitgr7 rohitgr7 enabled auto-merge (squash) March 23, 2022 10:42
@mergify mergify bot added the has conflicts label and removed the ready label Mar 23, 2022
@rohitgr7 rohitgr7 requested a review from carmocca March 23, 2022 14:49
@mergify mergify bot added the ready label and removed the has conflicts label Mar 24, 2022
@mergify mergify bot added the has conflicts label and removed the ready label Mar 25, 2022
@mergify mergify bot added the ready label and removed the has conflicts label Mar 25, 2022
@rohitgr7 rohitgr7 merged commit 48f1710 into master Mar 25, 2022
@rohitgr7 rohitgr7 deleted the ref/avoid_cpu_fallback branch March 25, 2022 15:59
@DuYicong515
Contributor

Hi @rohitgr7, it makes sense to me that Trainer(accelerator="gpu", devices=[]/0/"0") throws instead of falling back to CPU.

However, currently Trainer(gpus=[]/0/"0") still falls back to CPU.
Trainer(accelerator="gpu", gpus=[]/0/"0") will work as "auto" and use all GPU devices when GPUs are available, and throw if no GPU is available.

Even though gpus is being deprecated as a Trainer argument, shall we still make the behaviour consistent here? I feel it's confusing that these arguments behave differently for similar settings.
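
A compact restatement of the three cases described above (a sketch of the behavior as reported in this thread, not independently verified):

    from pytorch_lightning import Trainer

    # Each call considered in isolation:

    Trainer(gpus=0)
    # -> still silently falls back to CPU (documented behavior)

    # Trainer(accelerator="gpu", devices=0)
    # -> raises after this PR instead of falling back

    # Trainer(accelerator="gpu", gpus=0)
    # -> currently acts like devices="auto": uses all GPUs when available,
    #    raises when no GPU is available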

@awaelchli
Contributor

However, currently Trainer(gpus=[]/0/"0") still falls back to CPU.

That's ok, it is documented here.

Trainer(accelerator="gpu", gpus=[]/0/"0") will work as "auto" and use all GPU devices when GPUs are available

Agreed, this is probably not intended and should be changed.

@@ -506,6 +506,15 @@ def test_accelerator_cpu(_):
trainer = Trainer(accelerator="cpu", gpus=1)


@mock.patch("torch.cuda.is_available", return_value=False)
Contributor

@kaushikb11 kaushikb11 Apr 6, 2022


@rohitgr7

Here, we assumed the accelerator is not available. What if it's available and the user passes devices="0"/0/[]?

It leads to errors like this:

    self._parallel_devices = self.accelerator.get_parallel_devices(self._devices_flag)
  File "/home/jovyan/pytorch-lightning/pytorch_lightning/accelerators/gpu.py", line 82, in get_parallel_devices
    return [torch.device("cuda", i) for i in devices]
TypeError: 'NoneType' object is not iterable

It is addressed in #12633
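
A minimal reproduction sketch of the failure described above (assumes a machine where torch.cuda.is_available() returns True and the Lightning version current at the time of this thread):

    from pytorch_lightning import Trainer

    # With a GPU present, an "empty" devices value slipped past the availability
    # check and left the internal devices flag as None, so
    # GPUAccelerator.get_parallel_devices() iterated over None and raised:
    Trainer(accelerator="gpu", devices=0)
    # TypeError: 'NoneType' object is not iterable   (fixed by #12633)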

Labels
accelerator, breaking change, ready, trainer: connector

8 participants