ROCm changes on uvm. #1142
@@ -19,7 +19,7 @@

 if open_source:
     # pyre-ignore[21]
-    from test_utils import gpu_available, gpu_unavailable
+    from test_utils import gpu_available, gpu_unavailable, skipIfRocm
 else:
     from fbgemm_gpu.test.test_utils import gpu_available, gpu_unavailable

@@ -80,6 +80,7 @@ def test_enum(self) -> None:
         # pyre-ignore[16]
         assert cudaMemoryAdvise.cudaMemAdviseSetAccessedBy.value == 5

+    @skipIfRocm
     @unittest.skipIf(*gpu_unavailable)
     @given(
         sizes=st.lists(

Review comments on the added `@skipIfRocm`:

Why are all these unit tests skipped?

It complains "out of memory" on some of the ROCm devices. We're investigating it.

The issue is that the tests allocate a large amount of memory (>1 TB). We are checking whether this is valid.

Is there any progress here?

Sorry, not yet. I've submitted a JIRA ticket internally, but it may take some time.

Got it. Thanks for the updates.

@@ -123,6 +124,7 @@ def test_cudaMemPrefetchAsync(self, sizes: List[int], vanilla: bool) -> None:

         torch.cuda.synchronize(torch.device("cuda:0"))

+    @skipIfRocm
     @unittest.skipIf(*gpu_unavailable or torch.cuda.device_count() < 2)
     @given(
         sizes=st.lists(

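A note on the `skipIf` line in the hunk above, offered as a reading of Python's call syntax rather than a statement about the PR's intent: in a call, `*` applies to the whole expression that follows it, so `*gpu_unavailable or torch.cuda.device_count() < 2` unpacks the result of the `or`. Assuming `gpu_unavailable` is a non-empty `(condition, reason)` tuple, which its unpacking into `skipIf` suggests, the tuple is always truthy and the device-count check is never evaluated. The sketch below uses hypothetical stand-ins (`gpu_unavailable`, `device_count`, `skip_args`, and the reason strings are assumptions, not code from the repository).

```python
# Hypothetical stand-ins for the values the real test gets from test_utils and torch.
gpu_unavailable = (False, "CUDA is not available")  # assumption: a GPU is present
device_count = 1                                    # assumption: only one device


def skip_args(*args):
    """Return the positional arguments a call such as unittest.skipIf would receive."""
    return args


# Same shape as the decorator in the diff: `or` is evaluated inside the starred
# expression, and a non-empty tuple is truthy, so the comparison is never reached.
as_written = skip_args(*gpu_unavailable or device_count < 2)

# One possible explicit form: combine the two conditions first, then pass a reason.
explicit = skip_args(
    gpu_unavailable[0] or device_count < 2,
    "requires at least two GPUs",  # hypothetical reason string
)

print(as_written)  # the original tuple, unchanged
print(explicit)    # the first element is now True for a single-device machine
```

With these stand-in values, the first form would not skip on a single-GPU machine, while the explicit form would.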
@@ -154,6 +156,7 @@ def test_uvm_to_device(self, sizes: List[int], vanilla: bool) -> None:
         assert torch.ops.fbgemm.uvm_storage(second_t)
         assert second_t.device == device_prototype.device

+    @skipIfRocm
     @unittest.skipIf(*gpu_unavailable)
     @given(
         sizes=st.lists(

@@ -183,6 +186,7 @@ def test_uvm_slice(self, sizes: List[int], vanilla: bool) -> None:
         assert torch.ops.fbgemm.is_uvm_tensor(uvm_slice)
         assert torch.ops.fbgemm.uvm_storage(cpu_slice)

+    @skipIfRocm
     @unittest.skipIf(*gpu_unavailable)
     @given(
         sizes=st.lists(

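The definition of `skipIfRocm` is not part of the diff shown here. As a rough illustration of how such a decorator could work, the sketch below assumes that ROCm builds of PyTorch set `torch.version.hip` to a non-`None` value. The names `running_on_rocm`, `skip_when`, and `skip_if_rocm` are hypothetical and are not taken from `test_utils`.

```python
import functools
import unittest


def running_on_rocm() -> bool:
    """Best-effort check; assumes ROCm builds of PyTorch set torch.version.hip."""
    try:
        import torch
    except ImportError:
        return False
    return getattr(torch.version, "hip", None) is not None


def skip_when(predicate, reason):
    """Build a decorator that skips the wrapped test when predicate() is true."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # Evaluate lazily, at call time, so importing the test module stays cheap.
            if predicate():
                raise unittest.SkipTest(reason)
            return fn(*args, **kwargs)
        return wrapper
    return decorator


# Hypothetical equivalent of the bare @skipIfRocm used in the diff.
skip_if_rocm = skip_when(
    running_on_rocm,
    "skipped on ROCm while the out-of-memory failures are investigated",
)
```

Raising `unittest.SkipTest` inside the wrapper reports the test as skipped rather than failed, which matches the stated intent of disabling these tests on ROCm until the memory issue is resolved.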
Sorry, I missed this in the first review pass. I might just be missing the context, but why do we need to update the quantization test?

Our discussions require us to test specific values of `num_rows` and `num_columns` that are outside the standard range tested here. The old implementation of this file took `num_rows` and `num_columns` as arguments. This change brings some of that functionality back.