Patch for device placement (Reduce host and device syncs) #17334
The main idea is to reduce host and device sync.
# https://pytorch.org/docs/stable/notes/cuda.html#id5
ctx = torch.cuda.stream(torch.cuda.Stream()) if torch.cuda.is_available() else nullcontext()
Should we avoid using the stream if not running on CUDA? Even if there are no known side effects, for sanity I would prefer that.
Good point, yes. We can check the root device type instead.
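A minimal sketch of what checking the root device type could look like, as opposed to checking `torch.cuda.is_available()`. The helper name `stream_context_for` and its string argument are assumptions for illustration, not code from this PR; in practice the argument would presumably come from something like `root_device.type`.

```python
from contextlib import nullcontext


def stream_context_for(device_type: str):
    """Return a side-stream context on CUDA, otherwise a no-op context.

    `stream_context_for` is a hypothetical helper, not part of the PR.
    """
    if device_type != "cuda":
        # Non-CUDA root device: do not touch torch.cuda at all,
        # even if a GPU happens to be visible on the machine.
        return nullcontext()
    import torch  # imported lazily so the non-CUDA path stays CUDA-free
    return torch.cuda.stream(torch.cuda.Stream())


# Usage on a CPU root device: the body runs under a plain no-op context.
with stream_context_for("cpu"):
    pass
```

Keying on the device type means a CPU run on a machine that also has a GPU never creates a CUDA stream, which is the concern raised in the review comment.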
What does this PR do?
Fixes #<issue_number>
Before submitting
PR review
Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:
Reviewer checklist