Conversation
The `-target` parameter of the offload wrapper was renamed to `-host` a while back on the SYCL branch. This fixes one of the issues seen in intel#5087
| // CHECK: llc{{.*}}amdgpu-openmp-toolchain-{{.*}}-gfx906-linked-{{.*}}.bc" "-mtriple=amdgcn-amd-amdhsa" "-mcpu=gfx906" "-filetype=obj" "-o"{{.*}}amdgpu-openmp-toolchain-{{.*}}-gfx906-{{.*}}.o" | ||
| // CHECK: lld{{.*}}"-flavor" "gnu" "--no-undefined" "-shared" "-o"{{.*}}amdgpu-openmp-toolchain-{{.*}}.out" "{{.*}}amdgpu-openmp-toolchain-{{.*}}-gfx906-{{.*}}.o" | ||
| // CHECK: clang-offload-wrapper{{.*}}"-target" "x86_64-unknown-linux-gnu" "-o" "{{.*}}a-{{.*}}.bc" {{.*}}amdgpu-openmp-toolchain-{{.*}}.out" | ||
| // CHECK: clang-offload-wrapper{{.*}}"-host" "x86_64-unknown-linux-gnu" "-o" "{{.*}}a-{{.*}}.bc" {{.*}}amdgpu-openmp-toolchain-{{.*}}.out" |
There was a problem hiding this comment.
Would we want to double-check the device target here? (a non-insisting suggestion)
| // CHECK: clang-offload-wrapper{{.*}}"-host" "x86_64-unknown-linux-gnu" "-o" "{{.*}}a-{{.*}}.bc" {{.*}}amdgpu-openmp-toolchain-{{.*}}.out" | |
| // CHECK: clang-offload-wrapper{{.*}}"-host" "x86_64-unknown-linux-gnu" "-o" "{{.*}}a-{{.*}}.bc" {{.*}}"-target=amdgcn-amd-amdhsa" {{.*}}amdgpu-openmp-toolchain-{{.*}}.out" |
There was a problem hiding this comment.
I suggest we don't make this change in sycl branch to minimize the divergence with the main branch.
We can consider improving the test in llorg instead.
@sndmitriev, do you plan to upstream this change to llorg? |
Shouldn't it be done by the author of that change? |
I thought that you as code owner should know about that. Do know know who made this change in the |
I think this option was renamed in the following commit 0738099 |
@kbobrovs, when do you plan to commit this patch to llorg? We need this to reduce the maintenance cost for the |
|
@npmiller, thanks a lot for fixing this issue! |
I did not have such plans until now. Will see if I can do this sometime in January. |
* upstream/sycl: (725 commits) [SYCL] Translate ZE_RESULT_ERROR_INVALID_ARGUMENT error code from L0 RT (intel#5122) [SYCL][L0][Plugin] Call ZeCommandQueueCreate on demand (intel#5109) [SYCL] Switch to using blocking USM free for OpenCL GPU (intel#4928) [CI] Disable pack and upload steps (intel#5119) [SYCL] Disable submission of AssertInfoCopier for FPGA (intel#4780) [SYCL][SPIRV] Implement islessgreater with FOrdNotEqual instead (intel#5076) [SYCL] Fix typo in the name of the host-visible pool (intel#5073) [SYCL] Only call shutdown when DLL is being unloaded, not when process is terminating (intel#4983) [SYCL][CUDA][PI] Fix infinite loop when parallel_for range exceeds INT_MAX (intel#5095) [SYCL] Translate out-of-memory error codes from L0 RT (intel#5107) [SYCL] Fix a few warnings during build scripts configuration (intel#5082) [SYCL] Fix amdgpu openmp test (intel#5103) [SYCL] [FPGA] Create experimental headers for FPGA latency control (intel#5066) [SYCL][CUDA] Don't enqueue an event wait on same CUDA stream (intel#5099) Remove PR disable template (intel#5102) [BuildBot]Uplift CPU/FPGAEMU RT version (intel#5078) [SYCL] Fix the test to not depend on a specific line. (intel#5092) [CI] Provide libclc targets to build and test (intel#5091) Fix build of `check-llvm-spirv` target after 8f8001a Force opt to use new pass manager in pr52289 test after c34d157 ...
* upstream/sycl: [CI] Add container users to video group (intel#5101) [CI] More typo fixes in Nightly build (intel#5088) Revert "[CI] Disable pack and upload steps (intel#5119)" (intel#5125) [SYCL] Translate ZE_RESULT_ERROR_INVALID_ARGUMENT error code from L0 RT (intel#5122) [SYCL][L0][Plugin] Call ZeCommandQueueCreate on demand (intel#5109) [SYCL] Switch to using blocking USM free for OpenCL GPU (intel#4928) [CI] Disable pack and upload steps (intel#5119) [SYCL] Disable submission of AssertInfoCopier for FPGA (intel#4780) [SYCL][SPIRV] Implement islessgreater with FOrdNotEqual instead (intel#5076) [SYCL] Fix typo in the name of the host-visible pool (intel#5073) [SYCL] Only call shutdown when DLL is being unloaded, not when process is terminating (intel#4983) [SYCL][CUDA][PI] Fix infinite loop when parallel_for range exceeds INT_MAX (intel#5095) [SYCL] Translate out-of-memory error codes from L0 RT (intel#5107) [SYCL] Fix a few warnings during build scripts configuration (intel#5082) [SYCL] Fix amdgpu openmp test (intel#5103) [SYCL] [FPGA] Create experimental headers for FPGA latency control (intel#5066) [SYCL][CUDA] Don't enqueue an event wait on same CUDA stream (intel#5099) Remove PR disable template (intel#5102) [BuildBot]Uplift CPU/FPGAEMU RT version (intel#5078)
|
@bader, no, not yet unfortunately. It is quite a bit of work, no resources for now. |
The
-targetparameter of the offload wrapper was renamed to-hostawhile back on the SYCL branch.
This fixes one of the issues seen in #5087