Example: https://github.com/pytorch/xla/runs/39089457932, https://github.com/pytorch/xla/runs/39089458228
Error excerpt from Google Cloud Build:
```
Loading: 1 packages loaded
Analyzing: target @xla//xla/pjrt/c:pjrt_c_api_gpu_plugin.so (2 packages loaded, 0 targets configured)
WARNING: Download from https://mirror.bazel.build/github.com/bazelbuild/platforms/releases/download/0.0.9/platforms-0.0.7.tar.gz failed: class java.io.FileNotFoundException GET returned 404 Not Found
Analyzing: target @xla//xla/pjrt/c:pjrt_c_api_gpu_plugin.so (36 packages loaded, 9 targets configured)
Analyzing: target @xla//xla/pjrt/c:pjrt_c_api_gpu_plugin.so (36 packages loaded, 9 targets configured)
Analyzing: target @xla//xla/pjrt/c:pjrt_c_api_gpu_plugin.so (36 packages loaded, 9 targets configured)
Analyzing: target @xla//xla/pjrt/c:pjrt_c_api_gpu_plugin.so (100 packages loaded, 730 targets configured)
Analyzing: target @xla//xla/pjrt/c:pjrt_c_api_gpu_plugin.so (202 packages loaded, 7959 targets configured)
Analyzing: target @xla//xla/pjrt/c:pjrt_c_api_gpu_plugin.so (236 packages loaded, 18188 targets configured)
INFO: Analyzed target @xla//xla/pjrt/c:pjrt_c_api_gpu_plugin.so (239 packages loaded, 20620 targets configured).
INFO: Found 1 target...
[0 / 951] [Prepa] BazelWorkspaceStatusAction stable-status.txt ... (10 actions, 0 running)
[2,169 / 7,504] Compiling llvm/lib/Support/VersionTuple.cpp [for tool]; 0s remote-cache ... (31 actions, 0 running)
[5,563 / 10,195] Compiling xla/service/gpu/kernels/topk_kernel_bfloat16.cu.cc; 0s remote-cache ... (31 actions, 0 running)
[6,304 / 12,193] Compiling xla/service/cpu/runtime_single_threaded_matmul_f16.cc; 0s local, remote-cache ... (46 actions, 35 running)
[6,711 / 12,239] Compiling xla/service/cpu/runtime_single_threaded_matmul_f16.cc; 1s local, remote-cache ... (56 actions, 48 running)
[6,974 / 12,467] Compiling xla/service/cpu/runtime_single_threaded_matmul_f16.cc; 2s local, remote-cache ... (58 actions, 52 running)
ERROR: /root/.cache/bazel/_bazel_root/2ba57cc32d8c1f12152416615363d16d/external/xla/xla/stream_executor/cuda/BUILD:1896:11: Compiling xla/stream_executor/cuda/tma_util.cc failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @xla//xla/stream_executor/cuda:tma_util) external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/k8-opt/bin/external/xla/xla/stream_executor/cuda/_objs/tma_util/tma_util.pic.d ... (remaining 158 arguments skipped)
In file included from external/xla/xla/stream_executor/cuda/tma_util.cc:16:
external/xla/xla/stream_executor/cuda/tma_util.h:25:16: error: ‘CUtensorMapDataType’ was not declared in this scope
   25 | absl::StatusOr<CUtensorMapDataType> GetTensorMapDataType(int element_size);
      |                ^~~~~~~~~~~~~~~~~~~
external/xla/xla/stream_executor/cuda/tma_util.h:25:35: error: template argument 1 is invalid
   25 | absl::StatusOr<CUtensorMapDataType> GetTensorMapDataType(int element_size);
      |                                   ^
external/xla/xla/stream_executor/cuda/tma_util.h:27:1: error: ‘CUtensorMapSwizzle’ does not name a type
   27 | CUtensorMapSwizzle GetTensorMapSwizzle(TmaDescriptor::TmaSwizzle swizzle);
      | ^~~~~~~~~~~~~~~~~~
external/xla/xla/stream_executor/cuda/tma_util.h:29:1: error: ‘CUtensorMapL2promotion’ does not name a type
   29 | CUtensorMapL2promotion GetTensorMapL2Promotion(
      | ^~~~~~~~~~~~~~~~~~~~~~
external/xla/xla/stream_executor/cuda/tma_util.h:32:1: error: ‘CUtensorMapFloatOOBfill’ does not name a type
   32 | CUtensorMapFloatOOBfill GetTensorMapFloatOOBFill(
      | ^~~~~~~~~~~~~~~~~~~~~~~
external/xla/xla/stream_executor/cuda/tma_util.h:35:1: error: ‘CUtensorMapInterleave’ does not name a type
   35 | CUtensorMapInterleave GetTensorMapInterleave(
      | ^~~~~~~~~~~~~~~~~~~~~
external/xla/xla/stream_executor/cuda/tma_util.cc:26:16: error: ‘CUtensorMapDataType’ was not declared in this scope; did you mean ‘GetTensorMapDataType’?
   26 | absl::StatusOr<CUtensorMapDataType> GetTensorMapDataType(int element_size) {
      |                ^~~~~~~~~~~~~~~~~~~
      |                GetTensorMapDataType
external/xla/xla/stream_executor/cuda/tma_util.cc:26:35: error: template argument 1 is invalid
   26 | absl::StatusOr<CUtensorMapDataType> GetTensorMapDataType(int element_size) {
      |                                   ^
external/xla/xla/stream_executor/cuda/tma_util.cc: In function ‘int stream_executor::gpu::GetTensorMapDataType(int)’:
external/xla/xla/stream_executor/cuda/tma_util.cc:29:14: error: ‘CU_TENSOR_MAP_DATA_TYPE_UINT8’ was not declared in this scope
   29 |       return CU_TENSOR_MAP_DATA_TYPE_UINT8;
      |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/xla/xla/stream_executor/cuda/tma_util.cc:31:14: error: ‘CU_TENSOR_MAP_DATA_TYPE_UINT16’ was not declared in this scope
   31 |       return CU_TENSOR_MAP_DATA_TYPE_UINT16;
      |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/xla/xla/stream_executor/cuda/tma_util.cc:33:14: error: ‘CU_TENSOR_MAP_DATA_TYPE_UINT32’ was not declared in this scope
   33 |       return CU_TENSOR_MAP_DATA_TYPE_UINT32;
      |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/xla/xla/stream_executor/cuda/tma_util.cc:35:14: error: ‘CU_TENSOR_MAP_DATA_TYPE_UINT64’ was not declared in this scope
   35 |       return CU_TENSOR_MAP_DATA_TYPE_UINT64;
      |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/xla/xla/stream_executor/cuda/tma_util.cc:37:40: error: cannot convert ‘absl::lts_20230802::Status’ to ‘int’ in return
   37 |   return absl::InvalidArgumentError(
      |          ~~~~~~~~~~~~~~~~~~~~~~~~~~^
      |                                    |
      |                                    absl::lts_20230802::Status
   38 |       absl::StrFormat("unsupported element size: %d", element_size));
      |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/xla/xla/stream_executor/cuda/tma_util.cc: At global scope:
external/xla/xla/stream_executor/cuda/tma_util.cc:42:1: error: ‘CUtensorMapSwizzle’ does not name a type
   42 | CUtensorMapSwizzle GetTensorMapSwizzle(TmaDescriptor::TmaSwizzle swizzle) {
      | ^~~~~~~~~~~~~~~~~~
external/xla/xla/stream_executor/cuda/tma_util.cc:55:1: error: ‘CUtensorMapL2promotion’ does not name a type
   55 | CUtensorMapL2promotion GetTensorMapL2Promotion(
      | ^~~~~~~~~~~~~~~~~~~~~~
external/xla/xla/stream_executor/cuda/tma_util.cc:69:1: error: ‘CUtensorMapFloatOOBfill’ does not name a type
   69 | CUtensorMapFloatOOBfill GetTensorMapFloatOOBFill(
      | ^~~~~~~~~~~~~~~~~~~~~~~
external/xla/xla/stream_executor/cuda/tma_util.cc:79:1: error: ‘CUtensorMapInterleave’ does not name a type
   79 | CUtensorMapInterleave GetTensorMapInterleave(
      | ^~~~~~~~~~~~~~~~~~~~~
Target @xla//xla/pjrt/c:pjrt_c_api_gpu_plugin.so failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 45.902s, Critical Path: 4.05s
INFO: 7037 processes: 2405 remote cache hit, 4611 internal, 21 local.
FAILED: Build did NOT complete successfully
INFO: Streaming build results to: https://source.cloud.google.com/results/invocations/7d4b02f3-cb2b-4ea6-85bd-b93f1ac969dd
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
    main()
  File "/usr/local/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
    json_out['return_val'] = hook(**hook_input['kwargs'])
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
    return hook(config_settings)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/pip-build-env-yehbnrwl/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 334, in get_requires_for_build_wheel
    return self._get_build_requires(config_settings, requirements=[])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/pip-build-env-yehbnrwl/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 304, in _get_build_requires
    self.run_setup()
  File "/tmp/pip-build-env-yehbnrwl/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 320, in run_setup
    exec(code, locals())
  File "<string>", line 11, in <module>
  File "/src/pytorch/xla/plugins/cuda/../../build_util.py", line 67, in bazel_build
    subprocess.check_call(bazel_argv, stdout=sys.stdout, stderr=sys.stderr)
  File "/usr/local/lib/python3.11/subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['bazel', 'build', '@xla//xla/pjrt/c:pjrt_c_api_gpu_plugin.so', '--symlink_prefix=/src/pytorch/xla/plugins/cuda/bazel-', '--config=remote_cache', '--config=cuda', '--remote_default_exec_properties=cache-silo-key=cache-silo-amd64-cuda-17']' returned non-zero exit status 1.
error: subprocess-exited-with-error
```
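For context on the tail of the traceback: the pip build hook ends up in `build_util.py`'s `bazel_build`, where `subprocess.check_call` raises `CalledProcessError` as soon as bazel exits non-zero, which pip then reports as `subprocess-exited-with-error`. A minimal sketch of that propagation (simplified and hypothetical; the real `bazel_build` builds its own argument list):

```python
import subprocess
import sys


def bazel_build(bazel_argv):
    # Simplified stand-in for pytorch/xla's build_util.py bazel_build:
    # check_call raises CalledProcessError on any non-zero exit status.
    subprocess.check_call(bazel_argv, stdout=sys.stdout, stderr=sys.stderr)


# Simulate a failing build command (a stand-in for the real bazel invocation).
try:
    bazel_build([sys.executable, "-c", "raise SystemExit(1)"])
except subprocess.CalledProcessError as e:
    # Mirrors the final line of the traceback above.
    print(f"Command returned non-zero exit status {e.returncode}.")
```

This is why the underlying compile error only appears in the streamed bazel output, not in the Python exception itself: `check_call` carries only the command and exit status.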
This looks like a CUDA version problem. Maybe we need to update OpenXLA to a version that supports CUDA 12.8.
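If bumping the pinned OpenXLA revision is the fix, the pin lives in pytorch/xla's bazel configuration. A hypothetical Starlark sketch of what such a pin looks like (the actual rule attributes in the repo, e.g. `sha256` and local patches, differ, and the commit below is a placeholder, not a real hash):

```python
# Hypothetical WORKSPACE fragment; pytorch/xla's real xla pin has more attributes.
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

# Placeholder: a newer openxla/xla commit whose TMA code compiles against CUDA 12.8.
xla_hash = "<openxla-commit-with-cuda-12.8-support>"

http_archive(
    name = "xla",
    strip_prefix = "xla-" + xla_hash,
    urls = ["https://github.com/openxla/xla/archive/" + xla_hash + ".tar.gz"],
)
```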
ysiraichi