[SYCL] Switch to using blocking USM free for OpenCL GPU #4928
Merged
dm-vodopyanov merged 6 commits into intel:sycl on Dec 10, 2021
Whenever a kernel is enqueued on a GPU, the GPU driver records the state of all USM pointers that might be used indirectly. Because of this, these pointers must not be freed until the kernel has finished executing. This change addresses the problem for OpenCL by using a blocking version of free; Level Zero already handles it by deferring USM release.
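To illustrate why the free has to block, here is a minimal host-only sketch. It does not use the SYCL runtime or OpenCL at all: `ToyQueue`, `Node`, and `runDemo` are hypothetical names, and "kernels" are simply stored and run later to stand in for asynchronous execution on a device. It is only meant to show the ordering that a blocking free enforces, not how the runtime implements it.

```cpp
#include <cassert>
#include <cstdlib>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical toy queue: "kernels" are stored and only run when the queue is
// drained, which stands in for asynchronous execution on a device.
class ToyQueue {
public:
  void enqueue(std::function<void()> kernel) {
    pending_.push_back(std::move(kernel));
  }

  // Run every kernel that has been enqueued so far.
  void finish() {
    while (!pending_.empty()) {
      std::function<void()> kernel = std::move(pending_.front());
      pending_.pop_front();
      kernel();
    }
  }

  // Blocking free: drain outstanding work first, then release the memory.
  void blockingFree(void *ptr) {
    finish();
    std::free(ptr);
  }

private:
  std::deque<std::function<void()>> pending_;
};

// The kernel is only handed a Node; it reaches `payload` indirectly.
struct Node {
  int *payload;
};

// Returns the value the kernel read through the indirect pointer.
int runDemo() {
  ToyQueue queue;
  int *payload = static_cast<int *>(std::malloc(sizeof(int)));
  if (payload == nullptr)
    return -1;
  *payload = 42;
  Node node{payload};
  int observed = -1;

  queue.enqueue([&node, &observed] { observed = *node.payload; });

  // The kernel has not run yet. A plain std::free(payload) at this point would
  // leave it reading freed memory; the blocking variant drains the queue first.
  queue.blockingFree(payload);
  return observed;
}
```

In this model, `runDemo()` returns the value the kernel read, because `blockingFree` lets the pending kernel finish before the allocation is released.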
Author (Contributor): /summary:run
Author (Contributor): Test failures are caused by an OpenCL CPU runtime bug, so I'm changing this pull request to draft for now.
Author (Contributor): Since it'll be a while until the OpenCL CPU runtime is uplifted in CI, I've limited the changes in this PR to OpenCL GPU. @smaslov-intel @againull please take a look.
Author (Contributor): @smaslov-intel Applied what we discussed yesterday. Please take a look.
Author (Contributor): /verify with intel/llvm-test-suite#561
Author (Contributor): /verify with intel/llvm-test-suite#561
alexbatashev added a commit to alexbatashev/llvm that referenced this pull request on Dec 11, 2021:
* upstream/sycl: (725 commits)
  [SYCL] Translate ZE_RESULT_ERROR_INVALID_ARGUMENT error code from L0 RT (intel#5122)
  [SYCL][L0][Plugin] Call ZeCommandQueueCreate on demand (intel#5109)
  [SYCL] Switch to using blocking USM free for OpenCL GPU (intel#4928)
  [CI] Disable pack and upload steps (intel#5119)
  [SYCL] Disable submission of AssertInfoCopier for FPGA (intel#4780)
  [SYCL][SPIRV] Implement islessgreater with FOrdNotEqual instead (intel#5076)
  [SYCL] Fix typo in the name of the host-visible pool (intel#5073)
  [SYCL] Only call shutdown when DLL is being unloaded, not when process is terminating (intel#4983)
  [SYCL][CUDA][PI] Fix infinite loop when parallel_for range exceeds INT_MAX (intel#5095)
  [SYCL] Translate out-of-memory error codes from L0 RT (intel#5107)
  [SYCL] Fix a few warnings during build scripts configuration (intel#5082)
  [SYCL] Fix amdgpu openmp test (intel#5103)
  [SYCL] [FPGA] Create experimental headers for FPGA latency control (intel#5066)
  [SYCL][CUDA] Don't enqueue an event wait on same CUDA stream (intel#5099)
  Remove PR disable template (intel#5102)
  [BuildBot]Uplift CPU/FPGAEMU RT version (intel#5078)
  [SYCL] Fix the test to not depend on a specific line. (intel#5092)
  [CI] Provide libclc targets to build and test (intel#5091)
  Fix build of `check-llvm-spirv` target after 8f8001a
  Force opt to use new pass manager in pr52289 test after c34d157
  ...
alexbatashev added a commit to alexbatashev/llvm that referenced this pull request on Dec 12, 2021:
* upstream/sycl:
  [CI] Add container users to video group (intel#5101)
  [CI] More typo fixes in Nightly build (intel#5088)
  Revert "[CI] Disable pack and upload steps (intel#5119)" (intel#5125)
  [SYCL] Translate ZE_RESULT_ERROR_INVALID_ARGUMENT error code from L0 RT (intel#5122)
  [SYCL][L0][Plugin] Call ZeCommandQueueCreate on demand (intel#5109)
  [SYCL] Switch to using blocking USM free for OpenCL GPU (intel#4928)
  [CI] Disable pack and upload steps (intel#5119)
  [SYCL] Disable submission of AssertInfoCopier for FPGA (intel#4780)
  [SYCL][SPIRV] Implement islessgreater with FOrdNotEqual instead (intel#5076)
  [SYCL] Fix typo in the name of the host-visible pool (intel#5073)
  [SYCL] Only call shutdown when DLL is being unloaded, not when process is terminating (intel#4983)
  [SYCL][CUDA][PI] Fix infinite loop when parallel_for range exceeds INT_MAX (intel#5095)
  [SYCL] Translate out-of-memory error codes from L0 RT (intel#5107)
  [SYCL] Fix a few warnings during build scripts configuration (intel#5082)
  [SYCL] Fix amdgpu openmp test (intel#5103)
  [SYCL] [FPGA] Create experimental headers for FPGA latency control (intel#5066)
  [SYCL][CUDA] Don't enqueue an event wait on same CUDA stream (intel#5099)
  Remove PR disable template (intel#5102)
  [BuildBot]Uplift CPU/FPGAEMU RT version (intel#5078)
Whenever a kernel is enqueued on GPU, the GPU driver records the state
of all USM pointers that might be used in an indirect fashion. Because
of this, these pointers cannot be freed until the execution of the kernel
is finished.
This change addresses this problem for OpenCL by using a blocking version
of free, while Level Zero already handles this by deferring USM release.
The change is temporarily limited to OpenCL GPU until a bug in OpenCL CPU
runtime is resolved.
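
The resulting policy can be summarised with a small sketch. The enums and `chooseFreeStrategy` below are hypothetical names invented for illustration and are not taken from the runtime sources; the sketch only encodes what the message above states: blocking free for OpenCL GPU, and the regular free path elsewhere (including Level Zero, which defers the release itself).

```cpp
#include <cassert>

// Hypothetical enums for illustration only; not the runtime's own types.
enum class Backend { OpenCL, LevelZero, Other };
enum class DeviceType { CPU, GPU, Other };
enum class FreeStrategy { Blocking, Regular };

// Blocking free is selected only for OpenCL GPU. OpenCL CPU stays on the
// regular path until the CPU runtime bug is resolved, and Level Zero stays on
// the regular path because it already defers USM release.
constexpr FreeStrategy chooseFreeStrategy(Backend backend, DeviceType device) {
  return (backend == Backend::OpenCL && device == DeviceType::GPU)
             ? FreeStrategy::Blocking
             : FreeStrategy::Regular;
}
```

Keeping the decision in one place like this would make it a one-line change to extend blocking free to OpenCL CPU once the runtime bug is fixed.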