[OPT] Run test in lower precision on GPU #17353
Changes from 6 commits
db27c99
77b2f6e
a533e06
cccd907
6002a65
f72ae0d
3db3ac2
8f1cc48
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -268,17 +268,22 @@ def _long_tensor(tok_lst): | |
| @require_torch | ||
| class OPTModelIntegrationTests(unittest.TestCase): | ||
| @slow | ||
| @unittest.skipIf(torch_device != "cpu", "Cannot make deterministic on GPU") | ||
|
Contributor
Author
Getting different logits on GPU depending on the PyTorch version (1.10+cu11.0 vs. 1.11+cu11.4), and results also differ between CPU and GPU. It seems to be deterministic only on CPU. It's not because the weights are saved and loaded in FP16: I checked that the same happens when the weights are stored and loaded in FP32.
Contributor
Author
Relaxing the tolerance to 1e-2 so that the tests pass on GPU. I think that's the best we can do.
Contributor
Author
Interesting observation:
Contributor
Getting logits to match exactly is very difficult at times. Perhaps use a much longer input and check that the softmax output matches. After all, logit fluctuations don't matter in the end, but a 100% softmax match does.
Contributor
If you use a very high tolerance, it becomes questionable whether the test is doing anything. |
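To make the softmax suggestion in this thread concrete, here is a minimal, dependency-free sketch. The logits are made-up values (an assumption for illustration, not output from OPT), and the helper names `softmax` and `max_abs_diff` are hypothetical. In the actual test one would more likely call `torch.softmax` and `torch.allclose` on the model output.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length lists."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical logits for one token position, as two devices might produce them.
logits_a = [2.0, -1.0, 0.5, 3.0]
logits_b = [2.003, -1.002, 0.498, 2.997]

probs_a = softmax(logits_a)
probs_b = softmax(logits_b)

logit_gap = max_abs_diff(logits_a, logits_b)
prob_gap = max_abs_diff(probs_a, probs_b)

# The predicted token is the same if the argmax of the probabilities agrees.
same_argmax = probs_a.index(max(probs_a)) == probs_b.index(max(probs_b))

print(f"max logit gap: {logit_gap:.4f}")
print(f"max softmax gap: {prob_gap:.4f}")
print(f"same argmax: {same_argmax}")
```

For these particular values the gap between the probabilities is smaller than the gap between the logits, and the predicted token is unchanged. Whether that holds for real model output would need to be checked on the actual logits.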
||
| def test_inference_no_head(self): | ||
| # model is not deterministic on GPU, not sure why | ||
|
Collaborator
(some) non-deterministic is mentioned The diff
Contributor
Author
Actually, my comment is not great: the model is deterministic on GPU; it's just that results differ depending on the CUDA version, and CUDA != CPU.
patrickvonplaten marked this conversation as resolved.
Outdated
|
||
| model = OPTModel.from_pretrained("facebook/opt-350m").to(torch_device) | ||
| input_ids = _long_tensor([[0, 31414, 232, 328, 740, 1140, 12695, 69, 46078, 1588, 2]]) | ||
|
|
||
| with torch.no_grad(): | ||
| output = model(input_ids=input_ids).last_hidden_state | ||
|
|
||
| expected_shape = torch.Size((1, 11, 512)) | ||
| self.assertEqual(output.shape, expected_shape) | ||
| expected_slice = torch.tensor( | ||
| [[-0.2867, -1.9256, -0.3062], [-1.2711, -0.1337, -0.1897], [0.4109, 0.1187, -1.3142]], device=torch_device | ||
| [[-0.2873, -1.9242, -0.3059], [-1.2738, -0.1333, -0.1877], [0.4116, 0.1192, -1.3107]], | ||
| device=torch_device, | ||
| ) | ||
| self.assertTrue(torch.allclose(output[:, :3, :3], expected_slice, atol=1e-3)) | ||
| assert_tensors_close(output[0, :3, :3], expected_slice, atol=1e-3) | ||
|
|
||
|
|
||
| @require_torch | ||
|
|
||
Cleaned this up a bit. Think we should try to align it as much as possible to Bart here
Thanks :), I think I have to update the TF code based on that.
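As a rough check on the tolerance discussion in this PR, the sketch below compares the two `expected_slice` values that appear in the diff against each other. It uses plain Python rather than `torch.allclose`, so it only looks at the absolute difference and ignores the small `rtol` term that `torch.allclose` adds. The helper name `max_abs_diff` is hypothetical, and the diff does not say which device or PyTorch version produced each slice.

```python
# The two expected slices as they appear in the diff (removed and added lines).
old_slice = [
    [-0.2867, -1.9256, -0.3062],
    [-1.2711, -0.1337, -0.1897],
    [0.4109, 0.1187, -1.3142],
]
new_slice = [
    [-0.2873, -1.9242, -0.3059],
    [-1.2738, -0.1333, -0.1877],
    [0.4116, 0.1192, -1.3107],
]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two 2D lists."""
    return max(
        abs(x - y)
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


gap = max_abs_diff(old_slice, new_slice)

# Would the two slices count as equal under each absolute tolerance?
passes_at_1e_3 = gap <= 1e-3
passes_at_1e_2 = gap <= 1e-2

print(f"max abs diff: {gap:.4f}")
print(f"within atol=1e-3: {passes_at_1e_3}")
print(f"within atol=1e-2: {passes_at_1e_2}")
```

The two slices differ by more than 1e-3 but by less than 1e-2, which is consistent with the concern raised in review: a tolerance loose enough to accept both sets of values also accepts differences in the third decimal place.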