[pipeline] A simple fix for half-precision & 8bit models
#21479
Changes from 7 commits
53052d8
bdcfb26
dab8d80
420940a
62cf7df
9bbbaea
8d730f8
80e50c9
8714b5e
e5b3dc0
adf3ca4
6bea432
e57d8f8
23f0608
e80be11
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -749,7 +749,7 @@ def __init__( | |
| framework: Optional[str] = None, | ||
| task: str = "", | ||
| args_parser: ArgumentHandler = None, | ||
| device: Union[int, str, "torch.device"] = -1, | ||
| device: Union[int, str, "torch.device"] = None, | ||
| torch_dtype: Optional[Union[str, "torch.dtype"]] = None, | ||
| binary_output: bool = False, | ||
| **kwargs, | ||
|
|
@@ -769,18 +769,41 @@ def __init__( | |
| self.device = device | ||
| elif isinstance(device, str): | ||
| self.device = torch.device(device) | ||
| elif device < 0: | ||
| elif device is None or device < 0: | ||
| self.device = torch.device("cpu") | ||
| else: | ||
| self.device = torch.device(f"cuda:{device}") | ||
| else: | ||
| self.device = device | ||
| self.device = device if device is not None else -1 | ||
| self.torch_dtype = torch_dtype | ||
| self.binary_output = binary_output | ||
|
|
||
| # Special handling | ||
| if self.framework == "pt" and self.device.type != "cpu": | ||
| self.model = self.model.to(self.device) | ||
| if self.framework == "pt" and device is not None: | ||
| self.model = self.model.to(device=self.device) | ||
|
Contributor
Don't mix The proposed change I made was at least explicit about its default value.
Contributor
Author
Yeah, I somehow didn't fully consider your proposal in #21479 (comment). I think it's wiser to replace my changes with yours! |
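To make the default-value question in this thread concrete, here is a minimal sketch of normalizing a `device` argument whose default is `None`. It mirrors the branches in the hunk above but returns plain device strings instead of `torch.device` objects, so it runs without PyTorch. The helper name `normalize_device` is hypothetical and is not part of `transformers`.

```python
from typing import Optional, Union


def normalize_device(device: Optional[Union[int, str]] = None) -> str:
    """Hypothetical helper: map a pipeline-style `device` argument to a device string."""
    if device is None:
        # No device requested: fall back to CPU explicitly.
        return "cpu"
    if isinstance(device, str):
        # Strings such as "cpu" or "cuda:1" are passed through unchanged.
        return device
    if device < 0:
        # Negative integers are the legacy way of asking for CPU.
        return "cpu"
    # Non-negative integers are treated as CUDA device indices.
    return f"cuda:{device}"
```

With this shape, `None` and `-1` both resolve to CPU, but only `None` signals that the caller expressed no preference, which is the distinction the later `hf_device_map` handling relies on.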
||
|
|
||
| hf_device_map = getattr(self.model, "hf_device_map", None) | ||
| if hf_device_map is not None: | ||
|
Contributor
There's probably a way to structure code where this is written only once. I think directly in Essentially, when users use And to be even purer, we could modify the
Contributor
Author
💯 on this @Narsil |
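One possible way to keep the `hf_device_map` handling in a single place is sketched below. This is an illustration only, not the code that was merged. The names `main_device` and `pick_device` are hypothetical, devices are plain strings so the sketch runs without PyTorch, and the choice to let an explicitly passed device win (after a warning) is an assumption.

```python
import warnings
from typing import Dict, Optional, Union

Placement = Union[int, str]  # e.g. 0, "cpu", "disk"


def main_device(hf_device_map: Dict[str, Placement]) -> str:
    """Hypothetical helper: pick the first non-CPU, non-disk placement, else CPU."""
    accelerators = [d for d in hf_device_map.values() if d not in ("cpu", "disk")]
    if not accelerators:
        # Everything is on CPU and/or offloaded to disk.
        return "cpu"
    first = accelerators[0]
    # Integer placements are assumed to be CUDA indices, as in the diff.
    return f"cuda:{first}" if isinstance(first, int) else str(first)


def pick_device(device: Optional[str], hf_device_map: Optional[Dict[str, Placement]]) -> str:
    """Hypothetical helper: resolve the pipeline device in one place."""
    if hf_device_map is None:
        return device if device is not None else "cpu"
    if device is not None:
        # Assumption: warn, then honor the explicit argument.
        warnings.warn(
            "The model was loaded with a device map; you should not also pass a device.",
            UserWarning,
        )
        return device
    return main_device(hf_device_map)
```

Unlike the set comparison in the diff, this version also returns `"cpu"` for a map that contains only `"disk"` entries instead of failing on an empty list.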
||
| logger.warning( | ||
| "The model has been loaded with `accelerate` using `device_map=xxx` in `from_pretrained`" | ||
| " method, you should not pass a device when initializing your pipeline." | ||
| ) | ||
|
|
||
| if device is None and self.framework == "pt": | ||
| # `accelerate` device map | ||
| hf_device_map = getattr(self.model, "hf_device_map", None) | ||
| if hf_device_map is not None: | ||
| # Take the main device used by `accelerate`. | ||
| # adapted from: https://github.com/huggingface/transformers/pull/21479#issuecomment-1420833512 | ||
| if set(hf_device_map.values()) == {"cpu"} or set(hf_device_map.values()) == {"cpu", "disk"}: | ||
| accelerate_device = torch.device("cpu") | ||
| else: | ||
| main_device = [d for d in hf_device_map.values() if d not in ["cpu", "disk"]][0] | ||
| accelerate_device = torch.device(f"cuda:{main_device}") | ||
|
|
||
| self.device = accelerate_device | ||
| else: | ||
| self.device = torch.device("cpu") | ||
|
|
||
| # Update config with task specific parameters | ||
| task_specific_params = self.model.config.task_specific_params | ||
|
|
@@ -1048,8 +1071,8 @@ def __call__(self, inputs, *args, num_workers=None, batch_size=None, **kwargs): | |
| self.call_count += 1 | ||
| if self.call_count > 10 and self.framework == "pt" and self.device.type == "cuda": | ||
| warnings.warn( | ||
| "You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a" | ||
| " dataset", | ||
| "You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please" | ||
| " use a dataset", | ||
| UserWarning, | ||
| ) | ||
|
|
||
|
|
||
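The `__call__` hunk above warns once a pipeline has been invoked more than ten times on a CUDA device. A minimal sketch of that call-counting pattern, detached from the real `Pipeline` class, could look like the following. The class name `CallCounter` and the `threshold` parameter are hypothetical and exist only for illustration.

```python
import warnings


class CallCounter:
    """Hypothetical stand-in for the pipeline's call-count check."""

    def __init__(self, device_type: str, threshold: int = 10) -> None:
        self.device_type = device_type
        self.threshold = threshold
        self.call_count = 0

    def record_call(self) -> None:
        # Count every call; warn on each call past the threshold when on GPU.
        self.call_count += 1
        if self.call_count > self.threshold and self.device_type == "cuda":
            warnings.warn(
                "You seem to be using the pipelines sequentially on GPU."
                " In order to maximize efficiency please use a dataset",
                UserWarning,
            )
```

As in the diff, the check fires on every call past the threshold; Python's default warning filter then typically shows the message only once per call site.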
Maybe use a smaller example and say in a note the user can replace it by BLOOM?
Sure!