Transformers GPU Support #4987
Conversation
Walkthrough
The changes primarily involve modifications to device allocation handling for models and processors in fiftyone/utils/transformers.py.
Actionable comments posted: 3
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (1)
- fiftyone/utils/transformers.py (12 hunks)
🔇 Additional comments (1)
fiftyone/utils/transformers.py (1)
Line range hint 1-1089: Implementation successfully adds GPU support.
The changes effectively implement GPU support for transformer models, aligning with the PR objectives. The implementation consistently moves models to GPU when available, which should improve inference performance as intended.
Verify GPU usage with:
✅ Verification successful
The search results confirm that CUDA support is consistently implemented across the transformers module and other related modules in the codebase. Let me verify the device handling implementation further:
GPU support implementation verified successfully
The implementation properly handles GPU support across the transformers module with:
- Consistent device detection using torch.cuda.is_available()
- Proper model movement to GPU using .to(device) for all model operations
- Correct tensor operations on the target device
- Consistent pattern of device handling across different transformer model types
The implementation follows PyTorch best practices for GPU utilization and maintains consistency with the rest of the codebase's GPU support patterns.
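As a minimal sketch of this pattern (the ViT checkpoint and dummy image below are illustrative, not taken from the PR), assuming a recent transformers release with AutoImageProcessor:

```python
import torch
import transformers
from PIL import Image

# Detect the target device once
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the model and move it to the target device
processor = transformers.AutoImageProcessor.from_pretrained("google/vit-base-patch16-224")
model = transformers.AutoModel.from_pretrained("google/vit-base-patch16-224").to(device)
model.eval()

# Preprocessing stays on CPU; only the resulting tensors are moved to the device
image = Image.new("RGB", (224, 224))
inputs = processor(images=image, return_tensors="pt").to(device)

with torch.no_grad():
    outputs = model(**inputs)  # runs on the GPU when one is available
```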
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if CUDA operations are properly implemented
rg "cuda" --type py

Length of output: 1885
Script:
#!/bin/bash
# Check if device property is properly used with CUDA tensors
rg "\.to\(.*device" --type py

Length of output: 1984
-        self.device = (
-            "cuda" if next(self.model.parameters()).is_cuda else "cpu"
-        )
+        self.device = "cuda" if torch.cuda.is_available() else "cpu"
🛠️ Refactor suggestion
Refactor repeated device determination logic.
The device determination logic, device = "cuda" if torch.cuda.is_available() else "cpu", is duplicated across multiple locations. Consider centralizing it in a utility function to improve maintainability and reduce code duplication.
Create a utility function at the module level:
def get_device():
    return "cuda" if torch.cuda.is_available() else "cpu"
Then replace all occurrences with:
self.device = get_device()
# or
device = get_device()
Also applies to: 499-499, 584-584, 696-696, 751-751, 824-824, 878-878, 932-932, 1087-1087
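A slightly fuller sketch of the suggested utility (the optional device override argument is an assumption for illustration, not part of the PR):

```python
import torch


def get_device(device=None):
    """Return the requested device, else "cuda" when available, else "cpu"."""
    # The explicit override is hypothetical; the suggestion above only
    # centralizes the cuda-vs-cpu check
    if device is not None:
        return device

    return "cuda" if torch.cuda.is_available() else "cpu"


# Example usage inside a model wrapper's __init__
# self.device = get_device()
# self.model = self.model.to(self.device)
```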
model = transformers.AutoModel.from_pretrained(config.name_or_path).to(
    device
)
Add error handling for CUDA memory allocation.
When loading models to GPU, consider adding error handling for CUDA out of memory scenarios. This is particularly important for large transformer models.
Add try-catch block around model loading:
try:
    model = transformers.AutoModel.from_pretrained(config.name_or_path).to(device)
except torch.cuda.OutOfMemoryError:
    logger.warning("CUDA out of memory. Falling back to CPU")
    device = "cpu"
    model = transformers.AutoModel.from_pretrained(config.name_or_path).to(device)
Also applies to: 698-699, 826-827, 881-882, 934-935
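Since the same fallback would be needed at each of the call sites listed above, one hypothetical way to avoid repeating the try/except is a small loader helper (the function name and logger setup are illustrative, not part of the PR):

```python
import logging

import torch
import transformers

logger = logging.getLogger(__name__)


def load_model_with_fallback(name_or_path, device, model_cls=transformers.AutoModel):
    """Load a model onto ``device``, falling back to CPU on CUDA OOM."""
    try:
        return model_cls.from_pretrained(name_or_path).to(device), device
    except torch.cuda.OutOfMemoryError:
        logger.warning("CUDA out of memory. Falling back to CPU")
        torch.cuda.empty_cache()  # release any partially-allocated GPU memory
        return model_cls.from_pretrained(name_or_path).to("cpu"), "cpu"
```

Each call site would then become model, device = load_model_with_fallback(config.name_or_path, device).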
return transformers.AutoProcessor.from_pretrained(name_or_path).to(
    self.device
)
Remove unnecessary processor device movement.
Moving the processor to the device with .to(device) is unnecessary, since the processor performs CPU-bound preprocessing. Doing so could needlessly consume GPU memory.
Remove the device movement:
-return transformers.AutoProcessor.from_pretrained(name_or_path).to(
- self.device
-)
+return transformers.AutoProcessor.from_pretrained(name_or_path)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-return transformers.AutoProcessor.from_pretrained(name_or_path).to(
-    self.device
-)
+return transformers.AutoProcessor.from_pretrained(name_or_path)
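For context, a minimal sketch of the intended split, with the processor kept on CPU and only its output tensors moved to the model's device (the CLIP checkpoint and dummy inputs are illustrative):

```python
import torch
import transformers
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"

# The processor object itself never moves to the GPU
processor = transformers.AutoProcessor.from_pretrained("openai/clip-vit-base-patch32")
model = transformers.AutoModel.from_pretrained("openai/clip-vit-base-patch32").to(device)

# Only the tensors produced by the processor are sent to the model's device
inputs = processor(
    text=["a photo of a dog"],
    images=Image.new("RGB", (224, 224)),
    return_tensors="pt",
).to(device)

with torch.no_grad():
    outputs = model(**inputs)
```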
lgtm!
What changes are proposed in this pull request?
Adding GPU support to transformers utils
How is this patch tested? If it is not, please explain why.
Run this notebook tutorial with a GPU: https://docs.voxel51.com/integrations/huggingface.html#batch-inference
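For a quick local smoke test without the full notebook, something along these lines should exercise the GPU path (the quickstart dataset and ViT checkpoint are illustrative choices):

```python
import torch
import transformers

import fiftyone.zoo as foz

# Small sample dataset to keep the test fast
dataset = foz.load_zoo_dataset("quickstart", max_samples=10)

# Raw Hugging Face models can be passed directly to apply_model()
model = transformers.AutoModelForImageClassification.from_pretrained(
    "google/vit-base-patch16-224"
)

print("CUDA available:", torch.cuda.is_available())

# With this PR, inference should run on the GPU when one is available
dataset.apply_model(model, label_field="vit_predictions")
```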
Release Notes
Is this a user-facing change that should be mentioned in the release notes?
Yes. Give a description of this change to be included in the release notes for FiftyOne users.
(Details in 1-2 sentences. You can just refer to another PR with a description
if this PR is part of a larger change.)
What areas of FiftyOne does this PR affect?
fiftyone Python library changes

Summary by CodeRabbit
New Features
Bug Fixes