CUDAExtension Tests Failed #99

Open
somefunAgba opened this issue Aug 28, 2024 · 1 comment

somefunAgba commented Aug 28, 2024

Hi @zou3519, @goldsborough, @drisspg,
  • OS: Win 11
  • PyTorch version: 2.4
  • How you installed PyTorch (conda, pip, source): pip
  • Python version: 3.10.6
  • CUDA/cuDNN version: 12.4
  • GPU models and configuration: NVIDIA GeForce RTX 3050
  • GCC version (if compiling from source):
  1. The README says to benchmark Python vs. C++ vs. CUDA with:
python test/benchmark.py

But benchmark.py is missing from the repo.

  2. The README says to run the extension tests with:
python test/test_extension.py

Outcome: the CPU tests pass, but the CUDA tests fail.
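
For reference, a quick sanity check with standard torch APIs, confirming PyTorch itself sees the GPU (so the failures below concern the extension's kernels, not the driver):

import torch
print(torch.__version__)          # 2.4.x per the setup above
print(torch.version.cuda)         # CUDA version the wheel was built against
print(torch.cuda.is_available())  # expected True given the RTX 3050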

  • Error messages and/or stack traces of the bug

To see which functions were able to compute values, I added the following prints to the _test_correctness and _test_gradients functions:

print(device, result)
print(device, out)
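
For context, here is a minimal sketch of where those prints sit; the helper layout is assumed from extension-cpp's test/test_extension.py and may differ in detail:

def _test_correctness(self, device):
    samples = self.sample_inputs(device)
    for args in samples:
        result = extension_cpp.ops.mymuladd(*args)
        print(device, result)  # added: shows which device produced values
        expected = reference_muladd(*args)  # pure-PyTorch reference from the test file
        torch.testing.assert_close(result, expected)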

ERROR MESSAGE

FAILED (errors=4)
R:\ai535py\tfcns\supplmat_asgm_nips [main ≡ +12 ~4 -0 !]> & R:/Programs/Python/Python310/python.exe r:/ai535py/tfcns/supplmat_asgm_nips/cppext/extension-cpp/test/test_extension.py
Fail to import hypothesis in common_utils, tests are not derandomized
.Ecpu tensor([1.0100, 0.7536, 0.4906])
cpu tensor([0.9194, 3.1691, 3.1769, 5.2353, 2.6696, 2.2265, 2.8924, 2.5661, 3.1713,
        3.3098, 1.5024, 2.8705, 3.8594, 3.1391, 3.4359, 2.2678, 3.3913, 5.5138,
        5.0064, 3.1829])
cpu tensor([-122.9332, -123.1868, -123.1781, -123.0917, -122.7786, -121.0263,
        -124.1490, -122.3633, -122.7848, -123.0149, -123.0245, -121.5838,
        -123.4751, -120.8471, -123.1614, -122.9177, -123.1612, -123.0329,
        -124.1176, -123.4624])
cpu tensor([[-0.8069,  1.7519, -0.1634],
        [ 0.9399, -1.4778, -0.5409]])
.Ecpu tensor([1.0100, 0.7536, 0.4906],
       grad_fn=<GeneratedBackwardFor_extension_cpp_mymuladd_defaultBackward>)
cpu tensor([0.9194, 3.1691, 3.1769, 5.2353, 2.6696, 2.2265, 2.8924, 2.5661, 3.1713,
        3.3098, 1.5024, 2.8705, 3.8594, 3.1391, 3.4359, 2.2678, 3.3913, 5.5138,
        5.0064, 3.1829],
       grad_fn=<GeneratedBackwardFor_extension_cpp_mymuladd_defaultBackward>)
cpu tensor([-122.9332, -123.1868, -123.1781, -123.0917, -122.7786, -121.0263,
        -124.1490, -122.3633, -122.7848, -123.0149, -123.0245, -121.5838,
        -123.4751, -120.8471, -123.1614, -122.9177, -123.1612, -123.0329,
        -124.1176, -123.4624],
       grad_fn=<GeneratedBackwardFor_extension_cpp_mymuladd_defaultBackward>)
cpu tensor([[-0.8069,  1.7519, -0.1634],
        [ 0.9399, -1.4778, -0.5409]],
       grad_fn=<GeneratedBackwardFor_extension_cpp_mymuladd_defaultBackward>)
.E.E
======================================================================
ERROR: test_opcheck_cuda (__main__.TestMyAddOut)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "R:\Programs\Python\Python310\lib\site-packages\torch\testing\_internal\optests\generate_tests.py", line 660, in opcheck
    tester(op, args, kwargs)
  File "R:\Programs\Python\Python310\lib\site-packages\torch\testing\_internal\optests\generate_tests.py", line 60, in safe_schema_check
    result = op(*args, **kwargs)
  File "R:\Programs\Python\Python310\lib\site-packages\torch\_ops.py", line 667, in __call__
    return self_._op(*args, **kwargs)
  File "R:\Programs\Python\Python310\lib\site-packages\torch\_subclasses\schema_check_mode.py", line 156, in __torch_dispatch__
    out = func(*args, **kwargs)
  File "R:\Programs\Python\Python310\lib\site-packages\torch\_ops.py", line 667, in __call__
    return self_._op(*args, **kwargs)
NotImplementedError: Could not run 'extension_cpp::myadd_out' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'extension_cpp::myadd_out' is only available for these backends: [CPU, Meta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradMPS, AutogradXPU, AutogradHPU, AutogradLazy, AutogradMeta, Tracer, AutocastCPU, AutocastXPU, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

CPU: registered at extension_cpp\csrc\muladd.cpp:75 [kernel]
Meta: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\MetaFallbackKernel.cpp:23 [backend fallback]
BackendSelect: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\BackendSelectFallbackKernel.cpp:3 [backend fallback]
Python: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:153 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\DynamicLayer.cpp:497 [backend fallback]
Functionalize: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\FunctionalizeFallbackKernel.cpp:349 [backend fallback]
Named: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\NamedRegistrations.cpp:7 [backend fallback]
Conjugate: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\ConjugateFallback.cpp:17 [backend fallback]
Negative: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\NegateFallback.cpp:18 [backend fallback]
ZeroTensor: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\ZeroTensorFallback.cpp:86 [backend fallback]
ADInplaceOrView: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:86 [backend fallback]
AutogradOther: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:53 [backend fallback]
AutogradCPU: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:57 [backend fallback]
AutogradCUDA: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:65 [backend fallback]
AutogradXLA: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:69 [backend fallback]
AutogradMPS: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:77 [backend fallback]
AutogradXPU: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:61 [backend fallback]
AutogradHPU: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:90 [backend fallback]
AutogradLazy: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:73 [backend fallback]
AutogradMeta: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:81 [backend fallback]
Tracer: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\autograd\TraceTypeManual.cpp:297 [backend fallback]
AutocastCPU: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\autocast_mode.cpp:209 [backend fallback]
AutocastXPU: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\autocast_mode.cpp:351 [backend fallback]
AutocastCUDA: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\autocast_mode.cpp:165 [backend fallback]
FuncTorchBatched: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\LegacyBatchingRegistrations.cpp:731 [backend fallback]
BatchedNestedTensor: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\LegacyBatchingRegistrations.cpp:758 [backend fallback]
FuncTorchVmapMode: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\VmapModeRegistrations.cpp:27 [backend fallback]
Batched: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\LegacyBatchingRegistrations.cpp:1075 [backend fallback]
VmapMode: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\TensorWrapper.cpp:207 [backend fallback]
PythonTLSSnapshot: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:161 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\DynamicLayer.cpp:493 [backend fallback]
PreDispatch: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:165 [backend fallback]
PythonDispatcher: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:157 [backend fallback]


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "R:\Programs\Python\Python310\lib\site-packages\torch\testing\_internal\common_utils.py", line 2744, in wrapper
    method(*args, **kwargs)
  File "r:\ai535py\tfcns\supplmat_asgm_nips\cppext\extension-cpp\test\test_extension.py", line 114, in test_opcheck_cuda
    self._opcheck("cuda")
  File "r:\ai535py\tfcns\supplmat_asgm_nips\cppext\extension-cpp\test\test_extension.py", line 107, in _opcheck
    opcheck(torch.ops.extension_cpp.myadd_out.default, args)
  File "R:\Programs\Python\Python310\lib\site-packages\torch\testing\_internal\optests\generate_tests.py", line 664, in opcheck
    raise OpCheckError(
torch.testing._internal.optests.generate_tests.OpCheckError: opcheck(op, ...): test_schema failed with Could not run 'extension_cpp::myadd_out' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'extension_cpp::myadd_out' is only available for these backends: [CPU, Meta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradMPS, AutogradXPU, AutogradHPU, AutogradLazy, AutogradMeta, Tracer, AutocastCPU, AutocastXPU, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

[backend registration listing identical to the one above]
 (scroll up for stack trace)

To execute this test, run the following from the base repo dir:
     python test\test_extension.py -k TestMyAddOut.test_opcheck_cuda

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

======================================================================
ERROR: test_correctness_cuda (__main__.TestMyMulAdd)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "R:\Programs\Python\Python310\lib\site-packages\torch\testing\_internal\common_utils.py", line 2744, in wrapper
    method(*args, **kwargs)
  File "r:\ai535py\tfcns\supplmat_asgm_nips\cppext\extension-cpp\test\test_extension.py", line 43, in test_correctness_cuda
    self._test_correctness("cuda")
  File "r:\ai535py\tfcns\supplmat_asgm_nips\cppext\extension-cpp\test\test_extension.py", line 33, in _test_correctness
    result = extension_cpp.ops.mymuladd(*args)
  File "R:\Programs\Python\Python310\lib\site-packages\extension_cpp\ops.py", line 9, in mymuladd
    return torch.ops.extension_cpp.mymuladd.default(a, b, c)
  File "R:\Programs\Python\Python310\lib\site-packages\torch\_ops.py", line 667, in __call__
    return self_._op(*args, **kwargs)
  File "R:\Programs\Python\Python310\lib\site-packages\torch\_library\autograd.py", line 98, in autograd_impl
    result = Generated.apply(*args, Metadata(keyset, keyword_only_args))  # type: ignore[attr-defined]
  File "R:\Programs\Python\Python310\lib\site-packages\torch\autograd\function.py", line 574, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "R:\Programs\Python\Python310\lib\site-packages\torch\_library\autograd.py", line 40, in forward
    result = op.redispatch(keyset & _C._after_autograd_keyset, *args, **kwargs)
  File "R:\Programs\Python\Python310\lib\site-packages\torch\_ops.py", line 672, in redispatch
    return self_._handle.redispatch_boxed(keyset, *args, **kwargs)
NotImplementedError: Could not run 'extension_cpp::mymuladd' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'extension_cpp::mymuladd' is only available for these backends: [CPU, Meta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastXPU, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

CPU: registered at extension_cpp\csrc\muladd.cpp:75 [kernel]
Meta: registered at /dev/null:154 [kernel]
BackendSelect: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\BackendSelectFallbackKernel.cpp:3 [backend fallback]
Python: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:153 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\DynamicLayer.cpp:497 [backend fallback]
Functionalize: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\FunctionalizeFallbackKernel.cpp:349 [backend fallback]
Named: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\NamedRegistrations.cpp:7 [backend fallback]
Conjugate: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\ConjugateFallback.cpp:17 [backend fallback]
Negative: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\NegateFallback.cpp:18 [backend fallback]
ZeroTensor: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\ZeroTensorFallback.cpp:86 [backend fallback]
ADInplaceOrView: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:86 [backend fallback]
AutogradOther: registered at /dev/null:48 [autograd kernel]
AutogradCPU: registered at /dev/null:48 [autograd kernel]
AutogradCUDA: registered at /dev/null:48 [autograd kernel]
AutogradHIP: registered at /dev/null:48 [autograd kernel]
AutogradXLA: registered at /dev/null:48 [autograd kernel]
AutogradMPS: registered at /dev/null:48 [autograd kernel]
AutogradIPU: registered at /dev/null:48 [autograd kernel]
AutogradXPU: registered at /dev/null:48 [autograd kernel]
AutogradHPU: registered at /dev/null:48 [autograd kernel]
AutogradVE: registered at /dev/null:48 [autograd kernel]
AutogradLazy: registered at /dev/null:48 [autograd kernel]
AutogradMTIA: registered at /dev/null:48 [autograd kernel]
AutogradPrivateUse1: registered at /dev/null:48 [autograd kernel]
AutogradPrivateUse2: registered at /dev/null:48 [autograd kernel]
AutogradPrivateUse3: registered at /dev/null:48 [autograd kernel]
AutogradMeta: registered at /dev/null:48 [autograd kernel]
AutogradNestedTensor: registered at /dev/null:48 [autograd kernel]
Tracer: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\autograd\TraceTypeManual.cpp:297 [backend fallback]
AutocastCPU: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\autocast_mode.cpp:209 [backend fallback]
AutocastXPU: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\autocast_mode.cpp:351 [backend fallback]
AutocastCUDA: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\autocast_mode.cpp:165 [backend fallback]
FuncTorchBatched: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\LegacyBatchingRegistrations.cpp:731 [backend fallback]
BatchedNestedTensor: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\LegacyBatchingRegistrations.cpp:758 [backend fallback]
FuncTorchVmapMode: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\VmapModeRegistrations.cpp:27 [backend fallback]
Batched: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\LegacyBatchingRegistrations.cpp:1075 [backend fallback]
VmapMode: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\TensorWrapper.cpp:207 [backend fallback]
PythonTLSSnapshot: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:161 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\DynamicLayer.cpp:493 [backend fallback]
PreDispatch: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:165 [backend fallback]
PythonDispatcher: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:157 [backend fallback]


To execute this test, run the following from the base repo dir:
     python test\test_extension.py -k TestMyMulAdd.test_correctness_cuda

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

======================================================================
ERROR: test_gradients_cuda (__main__.TestMyMulAdd)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "R:\Programs\Python\Python310\lib\site-packages\torch\testing\_internal\common_utils.py", line 2744, in wrapper
    method(*args, **kwargs)
  File "r:\ai535py\tfcns\supplmat_asgm_nips\cppext\extension-cpp\test\test_extension.py", line 64, in test_gradients_cuda
    self._test_gradients("cuda")
  File "r:\ai535py\tfcns\supplmat_asgm_nips\cppext\extension-cpp\test\test_extension.py", line 49, in _test_gradients
    out = extension_cpp.ops.mymuladd(*args)
  File "R:\Programs\Python\Python310\lib\site-packages\extension_cpp\ops.py", line 9, in mymuladd
    return torch.ops.extension_cpp.mymuladd.default(a, b, c)
  File "R:\Programs\Python\Python310\lib\site-packages\torch\_ops.py", line 667, in __call__
    return self_._op(*args, **kwargs)
  File "R:\Programs\Python\Python310\lib\site-packages\torch\_library\autograd.py", line 98, in autograd_impl
    result = Generated.apply(*args, Metadata(keyset, keyword_only_args))  # type: ignore[attr-defined]
  File "R:\Programs\Python\Python310\lib\site-packages\torch\autograd\function.py", line 574, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "R:\Programs\Python\Python310\lib\site-packages\torch\_library\autograd.py", line 40, in forward
    result = op.redispatch(keyset & _C._after_autograd_keyset, *args, **kwargs)
  File "R:\Programs\Python\Python310\lib\site-packages\torch\_ops.py", line 672, in redispatch
    return self_._handle.redispatch_boxed(keyset, *args, **kwargs)
NotImplementedError: Could not run 'extension_cpp::mymuladd' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'extension_cpp::mymuladd' is only available for these backends: [CPU, Meta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastXPU, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

[backend registration listing identical to the one above]


To execute this test, run the following from the base repo dir:
     python test\test_extension.py -k TestMyMulAdd.test_gradients_cuda

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

======================================================================
ERROR: test_opcheck_cuda (__main__.TestMyMulAdd)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "R:\Programs\Python\Python310\lib\site-packages\torch\testing\_internal\optests\generate_tests.py", line 660, in opcheck
    tester(op, args, kwargs)
  File "R:\Programs\Python\Python310\lib\site-packages\torch\testing\_internal\optests\generate_tests.py", line 60, in safe_schema_check
    result = op(*args, **kwargs)
  File "R:\Programs\Python\Python310\lib\site-packages\torch\_ops.py", line 667, in __call__
    return self_._op(*args, **kwargs)
  File "R:\Programs\Python\Python310\lib\site-packages\torch\_library\autograd.py", line 98, in autograd_impl
    result = Generated.apply(*args, Metadata(keyset, keyword_only_args))  # type: ignore[attr-defined]
  File "R:\Programs\Python\Python310\lib\site-packages\torch\autograd\function.py", line 574, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "R:\Programs\Python\Python310\lib\site-packages\torch\_library\autograd.py", line 40, in forward
    result = op.redispatch(keyset & _C._after_autograd_keyset, *args, **kwargs)
  File "R:\Programs\Python\Python310\lib\site-packages\torch\_ops.py", line 672, in redispatch
    return self_._handle.redispatch_boxed(keyset, *args, **kwargs)
  File "R:\Programs\Python\Python310\lib\site-packages\torch\_subclasses\schema_check_mode.py", line 156, in __torch_dispatch__
    out = func(*args, **kwargs)
  File "R:\Programs\Python\Python310\lib\site-packages\torch\_ops.py", line 667, in __call__
    return self_._op(*args, **kwargs)
NotImplementedError: Could not run 'extension_cpp::mymuladd' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'extension_cpp::mymuladd' is only available for these backends: [CPU, Meta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastXPU, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

[backend registration listing identical to the one above]


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "R:\Programs\Python\Python310\lib\site-packages\torch\testing\_internal\common_utils.py", line 2744, in wrapper
    method(*args, **kwargs)
  File "r:\ai535py\tfcns\supplmat_asgm_nips\cppext\extension-cpp\test\test_extension.py", line 78, in test_opcheck_cuda
    self._opcheck("cuda")
  File "r:\ai535py\tfcns\supplmat_asgm_nips\cppext\extension-cpp\test\test_extension.py", line 71, in _opcheck
    opcheck(torch.ops.extension_cpp.mymuladd.default, args)
  File "R:\Programs\Python\Python310\lib\site-packages\torch\testing\_internal\optests\generate_tests.py", line 664, in opcheck
    raise OpCheckError(
torch.testing._internal.optests.generate_tests.OpCheckError: opcheck(op, ...): test_schema failed with Could not run 'extension_cpp::mymuladd' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'extension_cpp::mymuladd' is only available for these backends: [CPU, Meta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastXPU, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

[backend registration listing identical to the one above]
 (scroll up for stack trace)

To execute this test, run the following from the base repo dir:
     python test\test_extension.py -k TestMyMulAdd.test_opcheck_cuda

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

----------------------------------------------------------------------
Ran 8 tests in 1.676s

FAILED (errors=4)
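
Reading the traces, each failing op is registered only for the CPU backend (CPU: registered at extension_cpp\csrc\muladd.cpp:75) and has no CUDA kernel entry, which suggests the CUDA sources were never compiled into the installed extension. One way to confirm this from Python, using torch's internal dispatcher dump (an internal API, so subject to change across versions):

import torch
import extension_cpp  # importing loads the extension and registers its ops

# Print the dispatch table for one of the failing ops. A working CUDA
# build would include a "CUDA: registered at ..." kernel line; per the
# errors above, only the CPU kernel appears.
print(torch._C._dispatch_dump("extension_cpp::mymuladd"))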

somefunAgba (Author) commented

I tried building the extension in a new folder, and this is what I get when using CUDAExtension. CppExtension works perfectly, though.

Traceback (most recent call last):
  File "r:\tut_cpp_extension\test\test.py", line 10, in <module>
    import extension_cpp
  File "R:\Programs\Python\Python310\lib\site-packages\extension_cpp\__init__.py", line 2, in <module>
    from . import _C, ops
ImportError: DLL load failed while importing _C: A dynamic link library (DLL) initialization routine failed.
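
On Windows (Python 3.8+), one possible cause of this DLL initialization failure is that the loader resolves the wrong CUDA runtime DLLs, or none at all, when _C is imported. A speculative workaround sketch; the CUDA path below is an assumption for a default 12.4 install and must be adjusted to the actual location:

import os

# Hypothetical workaround: put the CUDA 12.4 runtime DLLs on the DLL
# search path before the extension's _C module loads. The path is an
# assumed default install location.
os.add_dll_directory(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\bin")

import extension_cpp  # _C's CUDA dependencies can now be resolved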
