
Move PyTorch test into standalone tests #6397

Merged: 5 commits into master on Nov 9, 2021

Conversation

steven-johnson (Contributor)

It doesn't need to be internal. Also simplified it to use only the public API, updated the expected output for correctness, and avoided the need to have CUDA present on the system to test CUDA output (since we can cross-compile to generate the C++ output anywhere).
Internal::ensure_no_file_exists(pytorch_out);

std::vector<Argument> args{alpha, beta};
buf.compile_to({{Output::pytorch_wrapper, pytorch_out}}, args, "test2", Target("host-cuda-user_context"));
abadams (Member), commenting on the diff above:

What is "host-cuda" going to do on systems that don't support cuda? I guess it just won't turn on any flags after failing to find the cuda drivers.

Can we just use something explicit like linux-x86-64-cuda?
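For illustration, a minimal sketch of the explicit-target variant being suggested here, reusing the buf, args, and pytorch_out names from the diff above (the exact target string is an assumption, not necessarily what the PR adopts):

// Sketch only: cross-compile to an explicit CUDA target instead of "host-cuda".
Target explicit_cuda("x86-64-linux-cuda-user_context");
buf.compile_to({{Output::pytorch_wrapper, pytorch_out}}, args, "test2", explicit_cuda);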

steven-johnson (Contributor, Author)

> What is "host-cuda" going to do on systems that don't support cuda?

I tested it on my MacBook -- which doesn't support cuda -- and the test worked fine.

steven-johnson (Contributor, Author), Nov 8, 2021

...but it's failing weirdly on Windows :-/

EDIT: \n vs \r\n, grr

shoaibkamil (Contributor)

I think the issue I encountered occurs when CUDA libraries are installed on the system but no CUDA-capable device exists. Can't we just skip the test on machines without -cuda in the target?
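For reference, a skip-guard of the kind suggested here might look like the sketch below (hypothetical; the PR ultimately kept the cross-compile approach rather than skipping):

#include "Halide.h"
#include <cstdio>
using namespace Halide;

int main(int argc, char **argv) {
    // Hypothetical skip-guard: bail out of the test when the active target
    // doesn't enable the CUDA feature.
    Target t = get_target_from_environment();
    if (!t.has_feature(Target::CUDA)) {
        printf("[SKIP] Target does not enable CUDA.\n");
        return 0;
    }
    // ... rest of the test would run here ...
    printf("Success!\n");
    return 0;
}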

steven-johnson (Contributor, Author)

I don't think we need cuda to be present to test the cuda crosscompile. AFAICT it wasn't required for the internal test, either -- it was just written that way.

abadams (Member) commented Nov 8, 2021

This is not really testing a pure cross-compile though, because it's "host" cuda, so some assumptions are being made by sniffing the host system. It's failing gracefully enough when cuda doesn't exist, but I'd be more comfortable if it were some explicit target so that we're not relying on graceful fallback behavior when "host-cuda" makes no sense. I think "linux-x86-64-cuda" or some other explicit cuda-using target would be better.

steven-johnson (Contributor, Author)

> This is not really testing a pure cross-compile though, because it's "host" cuda, so some assumptions are being made by sniffing the host system.

I guess I don't understand, because I don't know what assumptions are being made that are relevant here; CodeGen_PyTorch only cares about whether the given Target has CUDA set or not -- it never attempts to query CUDA capabilities. This is 100% AOT compilation. Sure, using a fixed target like linux-x86-64-cuda would make things more predictable, but I don't understand why what we're doing is wrong in any fashion.
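To make the claim concrete, the decision being described is just a feature-flag check on the Target, roughly of this shape (a paraphrase for illustration, not the actual CodeGen_PyTorch source):

#include "Halide.h"
using namespace Halide;

// Paraphrased sketch: the PyTorch wrapper codegen only asks whether the Target
// carries the CUDA feature flag; it never probes the host's drivers or devices.
bool wants_cuda_wrapper(const Target &t) {
    return t.has_feature(Target::CUDA);
}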

steven-johnson (Contributor, Author)

...sigh, we call calculate_host_cuda_capability() no matter what, and this fails on arm systems because reasons. Standby.

abadams (Member) commented Nov 8, 2021

Yeah, just resolving the Target "host-cuda" into what it really means requires loading the cuda drivers to see what cuda capabilities the host has.

abadams (Member) commented Nov 8, 2021

It's the cuda-equivalent of resolving "host" by checking cpuid on the host system.
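In public-API terms, the analogy is roughly the following (a minimal illustration, not code from this PR):

#include "Halide.h"
#include <cstdio>
using namespace Halide;

int main() {
    // get_host_target() probes the machine it runs on (e.g. cpuid on x86) to
    // pick CPU features; resolving "host-cuda" adds an analogous probe of the
    // CUDA driver to pick a cuda_capability feature.
    Target host = get_host_target();
    printf("%s\n", host.to_string().c_str());
    return 0;
}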

steven-johnson (Contributor, Author)

Bots are green.

@@ -149,11 +148,21 @@ inline int test1_th_(float _alpha, int32_t _beta, at::Tensor &_buf) {
}

{
// No: parsing a target string with 'cuda' in it will always call `calculate_host_cuda_capability()`,
abadams (Member), commenting on the diff above:

If this is right, it's a bug. It's supposed to only call calculate_host_cuda_capability if the target string contains both "host" and "cuda". The target string I suggested earlier shouldn't call that function.

"host" means "please sniff my host system".

abadams (Member)

The conditions under which we call it are:

(is_host &&
 t.has_feature(Target::CUDA) &&
 !t.has_feature(Target::CUDACapability30) &&
 !t.has_feature(Target::CUDACapability32) &&
 !t.has_feature(Target::CUDACapability35) &&
 !t.has_feature(Target::CUDACapability50) &&
 !t.has_feature(Target::CUDACapability61) &&
 !t.has_feature(Target::CUDACapability70) &&
 !t.has_feature(Target::CUDACapability75) &&
 !t.has_feature(Target::CUDACapability80))

So neither "linux-x86-64-cuda" nor "host-cuda-cuda_capability_30" will call that function.

steven-johnson (Contributor, Author)

> It's supposed to only call calculate_host_cuda_capability if the target string contains both "host" and "cuda"

You are correct! I missed that subtlety. I'll update the test and add comments.
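A plausible shape for the updated call, pinning an explicit CUDA capability so that parsing the target string never queries the host (hypothetical; the merged commit may differ in the capability chosen):

// Hypothetical: with an explicit cuda_capability feature present, parsing a
// "host-cuda-..." target no longer calls calculate_host_cuda_capability().
buf.compile_to({{Output::pytorch_wrapper, pytorch_out}}, args, "test2",
               Target("host-cuda-cuda_capability_50-user_context"));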

steven-johnson merged commit b021f87 into master on Nov 9, 2021.
steven-johnson deleted the srj/pytorch-test branch on November 9, 2021 at 21:25.