guard against None in pytorch get_xla_supported_devices #9572

ckchow · 2021-09-16T21:01:55Z

What does this PR do?

Guard against None when checking whether TPUs are available. This happens, for example, if you use Google's standard deep learning containers on CPU-only machines.

Fixes #9552
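A minimal sketch of the kind of guard described above, not the exact diff in this PR. The lookup is passed in as a parameter so the example runs without torch_xla installed; `tpu_devices_exist`, `cpu_only_lookup`, `tpu_lookup` and the device names are hypothetical, and the only behaviour assumed is the one stated later in this thread (the lookup returns None when no device is found).

```python
from typing import Callable, List, Optional


def tpu_devices_exist(get_devices: Callable[[str], Optional[List[str]]]) -> bool:
    """Return True only if the lookup reports at least one TPU device.

    `get_devices` stands in for get_xla_supported_devices, which (per this
    thread) returns None when no matching device is found.
    """
    devices = get_devices("TPU")
    # Guard against None first, then against a (hypothetical) empty list.
    return devices is not None and len(devices) > 0


# Stand-in lookups so the sketch runs without torch_xla installed.
def cpu_only_lookup(kind: str) -> Optional[List[str]]:
    return None  # CPU-only container: nothing found


def tpu_lookup(kind: str) -> Optional[List[str]]:
    return ["xla:0", "xla:1"]  # device names are illustrative only


print(tpu_devices_exist(cpu_only_lookup))  # False
print(tpu_devices_exist(tpu_lookup))  # True
```

Without the None check, calling `len()` on the CPU-only result would raise a TypeError, which is the failure mode the description refers to.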

Does your PR introduce any breaking changes? If yes, please list them.

None

Before submitting

Was this discussed/approved via a GitHub issue? (not for typos and docs)
Did you read the contributor guideline, Pull Request section?
Did you make sure your PR does only one thing, instead of bundling different changes together?
Did you make sure to update the documentation with your changes? (if necessary)
Did you write any new necessary tests? (not for typos and docs)
Did you verify new and existing tests pass locally with your changes?
Did you list all the breaking changes introduced by this pull request?
Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:

Is this pull request ready for review? (if not, please submit in draft mode)
Check that all items from Before submitting are resolved
Make sure the title is self-explanatory and the description concisely explains the PR
Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

ckchow · 2021-09-16T21:04:41Z

I'm actually not sure how to write a proper test for this. Seems like it would have to pull an XLA container.

kaushikb11 · 2021-09-16T21:47:28Z

> I'm actually not sure how to write a proper test for this. Seems like it would have to pull an XLA container.

We have TPU CI tests in place, but for this we would need to pull an XLA container with no TPUs attached. Could you please add a comment about this in the code?

Do you think the "guard against None and return an empty list" logic should be added to the XLA API itself? cc: @JackCaoG

SeanNaren

Thanks for this! At the very least, I think we should add a TODO saying that this needs to be tested in an XLA-supported container without TPUs!
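Until such a container is available in CI, the "no TPUs attached" case can be approximated with a mock. The sketch below is only an illustration: `FakeXlaModel` is a hypothetical stand-in for the XLA model module, `tpu_available` is an illustrative version of the guarded check, and mocking does not replace the real-container test requested above.

```python
from unittest import mock


class FakeXlaModel:
    """Hypothetical stand-in for the XLA model module (not the real one)."""

    @staticmethod
    def get_xla_supported_devices(devkind=None):
        raise NotImplementedError("patched in the test below")


def tpu_available(xm) -> bool:
    # Illustrative version of the guarded check discussed in this PR.
    devices = xm.get_xla_supported_devices("TPU")
    return devices is not None and len(devices) > 0


def test_no_tpu_attached():
    # Simulate an XLA container without TPUs: the lookup returns None.
    with mock.patch.object(
        FakeXlaModel, "get_xla_supported_devices", return_value=None
    ) as lookup:
        assert tpu_available(FakeXlaModel) is False
    lookup.assert_called_once_with("TPU")


test_no_tpu_attached()
```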

kaushikb11

Blocking this merge for now. I have asked the XLA team whether this change could instead be introduced on the pt/xla side.

JackCaoG · 2021-09-17T17:31:45Z

@kaushikb11 get_xla_supported_devices will never return an empty list, so maybe we can just check whether get_xla_supported_devices('TPU') != None

ckchow · 2021-09-17T17:41:34Z

I feel like somebody might change get_xla_supported_devices to return empty lists in the future (i.e. if I were a contributor to that codebase I would feel tempted to do so), so I think checking for None and len() > 0 explicitly is safer. I guess it's also correct to check tpus != None and tpus, but I always forget that empty lists are falsy in Python, so that feels a little unexpected.
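To make the comparison concrete, here is a small sketch of the three checks mentioned in the last two comments, evaluated on the three possible results. The function names are hypothetical, and the empty-list case is the hypothetical future behaviour, not what pt/xla returns today.

```python
def check_not_none(tpus):
    # The simplest form suggested above: only rules out None.
    return tpus is not None


def check_explicit(tpus):
    # Explicit form: rules out None and an empty list.
    return tpus is not None and len(tpus) > 0


def check_truthy(tpus):
    # Relies on None and empty lists both being falsy.
    return bool(tpus is not None and tpus)


for value in (None, [], ["xla:0"]):
    print(repr(value), check_not_none(value), check_explicit(value), check_truthy(value))
```

The three checks agree on None and on a non-empty list. They differ only on an empty list, where the plain None comparison would report that a TPU exists.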

JackCaoG · 2021-09-17T17:53:48Z

I could try to change the pt/xla code to return an empty list, but:

That would also require me to change a couple of other places in pt/xla.
It is BC-breaking: any users or frameworks that explicitly check for != None would also have to change.

So I would prefer to keep the "return None if no device is found" logic.
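The backward-compatibility concern can be illustrated with a short sketch. Both lookup functions and the downstream caller are hypothetical; they only model the current contract described in this thread (None when nothing is found) and the proposed alternative (an empty list).

```python
from typing import Callable, List, Optional


def lookup_current() -> Optional[List[str]]:
    return None  # behaviour described in this thread: no device found


def lookup_changed() -> List[str]:
    return []  # hypothetical changed behaviour: empty list instead of None


def downstream_has_device(lookup: Callable[[], Optional[List[str]]]) -> bool:
    # A downstream caller written against the current contract.
    return lookup() is not None


print(downstream_has_device(lookup_current))  # False
print(downstream_has_device(lookup_changed))  # True, although no device exists
```

A caller that only compares against None would silently start reporting a device after such a change, which is why the change would have to be coordinated with downstream users.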

pytorch_lightning/utilities/xla_device.py

tchaton

LGTM!

kaushikb11 · 2021-10-12T11:27:15Z

Thanks @ckchow for the contribution! We always look forward to improving our compatibility with TPUs! :)

codecov · 2021-10-12T11:45:28Z

Codecov Report

Merging #9572 (a3e9139) into master (b530b7a) will decrease coverage by 4%.
The diff coverage is 0%.

@@           Coverage Diff           @@
##           master   #9572    +/-   ##
=======================================
- Coverage      93%     89%    -4%     
=======================================
  Files         178     178            
  Lines       15652   15650     -2     
=======================================
- Hits        14508   13899   -609     
- Misses       1144    1751   +607
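As a quick sanity check of the table, the rounded percentages can be recomputed from the hit and miss counts. This sketch assumes coverage is simply hits divided by total lines, rounded to the nearest percent; Codecov's exact rounding is not stated in the report.

```python
def coverage_percent(hits: int, misses: int) -> int:
    # Coverage as a whole-number percentage of covered lines.
    lines = hits + misses
    return round(100 * hits / lines)


master = coverage_percent(14508, 1144)  # 15652 lines on master
pr = coverage_percent(13899, 1751)  # 15650 lines with this PR
print(master, pr, pr - master)  # 93 89 -4
```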

Chris Chow added 2 commits September 16, 2021 13:46

guard against None in pytorch get_xla_supported_devices

9e45358

changelog

2edaad9

ckchow marked this pull request as ready for review September 16, 2021 21:04

ckchow requested review from awaelchli, Borda, carmocca, justusschock, kaushikb11, SeanNaren, tchaton and williamFalcon as code owners September 16, 2021 21:04

awaelchli added the accelerator: tpu and feature labels Sep 17, 2021

awaelchli added this to the v1.5 milestone Sep 17, 2021

awaelchli added the bug label and removed the feature label Sep 17, 2021

awaelchli modified the milestones: v1.5, v1.4.x Sep 17, 2021

awaelchli approved these changes Sep 17, 2021

SeanNaren approved these changes Sep 17, 2021

mergify bot added the ready and has conflicts labels Sep 17, 2021

kaushikb11 suggested changes Sep 17, 2021

mergify bot removed the ready label Sep 17, 2021

carmocca reviewed Sep 17, 2021

pytorch_lightning/utilities/xla_device.py (outdated)

Merge branch 'master' into cchow/xla-devices-guard

8af18d0

tchaton requested a review from rohitgr7 as a code owner October 12, 2021 11:10

mergify bot removed the has conflicts label Oct 12, 2021

update on comments

a3e9139

tchaton approved these changes Oct 12, 2021

tchaton requested review from kaushikb11 and carmocca October 12, 2021 11:16

tchaton enabled auto-merge (squash) October 12, 2021 11:17

rohitgr7 approved these changes Oct 12, 2021

kaushikb11 approved these changes Oct 12, 2021

mergify bot added the ready label Oct 12, 2021

tchaton merged commit f14a47a into Lightning-AI:master Oct 12, 2021

rohitgr7 pushed a commit to Tshimanga/pytorch-lightning that referenced this pull request Oct 18, 2021

guard against None in pytorch get_xla_supported_devices (Lightning-AI…

487473b

…#9572) Co-authored-by: Chris Chow <[email protected]> Co-authored-by: thomas chaton <[email protected]>
