Move PyTorch test into standalone tests #6397
Conversation
It doesn't need to be internal. Also simplified to use only public API, updated the expected correctness, and avoided the need to have cuda present on the system to test for cuda output (since we can cross-compile to generate the C++ output anywhere).
test/correctness/pytorch.cpp
Outdated
Internal::ensure_no_file_exists(pytorch_out);

std::vector<Argument> args{alpha, beta};
buf.compile_to({{Output::pytorch_wrapper, pytorch_out}}, args, "test2", Target("host-cuda-user_context"));
What is "host-cuda" going to do on systems that don't support cuda? I guess it just won't turn on any flags after failing to find the cuda drivers.
Can we just use something explicit like linux-x86-64-cuda?
What is "host-cuda" going to do on systems that don't support cuda?
I tested it on my MacBook -- which doesn't support cuda -- and the test worked fine.
...but it's failing weirdly on Windows :-/

EDIT: \n vs \r\n, grr
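The \n vs \r\n failure above is a classic golden-file pitfall on Windows. One common workaround is to normalize line endings before comparing expected and actual output; a minimal sketch (hypothetical helper, not part of this test):

```cpp
#include <cassert>
#include <string>

// Collapse Windows CRLF line endings to LF so golden-output comparisons
// are line-ending agnostic across platforms.
std::string normalize_newlines(const std::string &s) {
    std::string out;
    out.reserve(s.size());
    for (size_t i = 0; i < s.size(); ++i) {
        // Drop a '\r' that is immediately followed by '\n'.
        if (s[i] == '\r' && i + 1 < s.size() && s[i + 1] == '\n') {
            continue;
        }
        out += s[i];
    }
    return out;
}
```

With this, `normalize_newlines(actual) == normalize_newlines(expected)` holds regardless of which platform generated the file.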
I think the issue I encountered occurs when CUDA libraries are installed on the system but no CUDA-capable device exists. Can't we just skip the test on machines without CUDA?
I don't think we need cuda to be present to test the cuda cross-compile. AFAICT it wasn't required for the internal test, either -- it was just written that way.
This is not really testing a pure cross-compile though, because it's "host" cuda, so some assumptions are being made by sniffing the host system. It's failing gracefully enough when cuda doesn't exist, but I'd be more comfortable if it were some explicit target so that we're not relying on graceful fallback behavior when "host-cuda" makes no sense. I think "linux-x86-64-cuda" or some other explicit cuda-using target would be better.
I guess I don't understand, because I don't know what assumptions are being made that are relevant here; CodeGen_PyTorch only cares about whether the given Target has CUDA set or not -- it never attempts to make any checks of CUDA capabilities. This is 100% AOT compilation. Sure, using a fixed target like...
...sigh, we call calculate_host_cuda_capability().
Yeah, just resolving the Target "host-cuda" into what it really means requires loading the cuda drivers to see what cuda capabilities the host has. |
It's the cuda-equivalent of resolving "host" by checking cpuid on the host system. |
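The distinction being discussed can be sketched with a toy check (illustrative only; Halide's real Target parser is considerably more involved): a target string containing the "host" token asks the library to sniff the machine it is running on, while a fully explicit triple does not.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Toy sketch: does this target string request host sniffing?
// Target strings are dash-separated tokens, e.g. "linux-x86-64-cuda".
bool needs_host_sniffing(const std::string &target) {
    std::stringstream ss(target);
    std::string tok;
    while (std::getline(ss, tok, '-')) {
        if (tok == "host") {
            return true;  // "host" means "please sniff my host system"
        }
    }
    return false;
}
```

Under this sketch, "host-cuda" would trigger sniffing while "linux-x86-64-cuda" would not, which is the crux of the disagreement above.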
Bots are green.
test/correctness/pytorch.cpp
Outdated
@@ -149,11 +148,21 @@ inline int test1_th_(float _alpha, int32_t _beta, at::Tensor &_buf) {
}

{
// No: parsing a target string with 'cuda' in it will always call `calculate_host_cuda_capability()`,
If this is right, it's a bug. It's supposed to only call calculate_host_cuda_capability if the target string contains both "host" and "cuda". The target string I suggested earlier shouldn't call that function.
"host" means "please sniff my host system".
The conditions under which we call it are:
is_host &&
t.has_feature(Target::CUDA) &&
!t.has_feature(Target::CUDACapability30) &&
!t.has_feature(Target::CUDACapability32) &&
!t.has_feature(Target::CUDACapability35) &&
!t.has_feature(Target::CUDACapability50) &&
!t.has_feature(Target::CUDACapability61) &&
!t.has_feature(Target::CUDACapability70) &&
!t.has_feature(Target::CUDACapability75) &&
!t.has_feature(Target::CUDACapability80)
So neither "linux-x86-64-cuda" nor "host-cuda-cuda_capability_30" will call that function.
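The gating quoted above can be mirrored in a small self-contained sketch (this models the logic described, not Halide's actual source): the host's CUDA capability is queried only when the target is "host", has the CUDA feature, and carries no explicit cuda_capability_* feature.

```cpp
#include <cassert>
#include <set>
#include <string>

// Sketch of the condition quoted above: query the host's CUDA drivers only
// when the target is "host", requests CUDA, and specifies no explicit
// capability. Features are modeled as a set of lowercase feature strings.
bool should_query_host_cuda_capability(bool is_host,
                                       const std::set<std::string> &features) {
    if (!is_host || features.count("cuda") == 0) {
        return false;
    }
    // Any explicit capability feature suppresses the driver query.
    for (const char *cap : {"cuda_capability_30", "cuda_capability_32",
                            "cuda_capability_35", "cuda_capability_50",
                            "cuda_capability_61", "cuda_capability_70",
                            "cuda_capability_75", "cuda_capability_80"}) {
        if (features.count(cap) > 0) {
            return false;
        }
    }
    return true;
}
```

This matches the conclusion in the thread: "host-cuda" queries the drivers, while "linux-x86-64-cuda" (not host) and "host-cuda-cuda_capability_30" (explicit capability) do not.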
You are correct! I missed that subtlety. I'll update the test and add comments.