[SYCL] Subbuffer CopyBack Data - (REVIEW) #2085
optimization so that when (only) a subbuffer is used, the entire buffer is not copied back. Signed-off-by: Chris Perkins <[email protected]>
q.wait();

// change readBuffer behind its back.
clear_arr(baseData);
The SYCL 2020 provisional spec says, on page 123, second paragraph: "2. A buffer can be constructed with associated host memory and a default buffer allocator. The buffer will use this host memory for its full lifetime, but the contents of this host memory are unspecified for the lifetime of the buffer. If the host memory is modified by the host, or mapped to another buffer or image during the lifetime of this buffer, then the results are undefined. The initial contents of the buffer will be the contents of the host memory at the time of construction."
Does this mean the test has undefined behaviour per the SYCL spec, because it doesn't use an accessor to access and modify host data that is still in use by a buffer?
I am also wondering whether it is still undefined behaviour to read the host data when its buffer is only accessed via read accessors throughout its lifetime. The spec says that modifying it is undefined behaviour, but it doesn't state whether my use case is allowed.
It's UB to access host data passed to the buffer c'tor until the buffer is destroyed.
@cperkinsintel Please, rework the test to avoid UB.
if (Mode == access::mode::read || Mode == access::mode::discard_write ||
    Mode == access::mode::discard_read_write)
Why isn't discard_write mode a write mode? According to the SYCL spec:
access::mode::discard_write | Write-only access. Previous contents discarded
This isn't an exported function. The question isAWriteMode is answering here is "Is this mode one that would have written and that we now need to worry about?". And even though discard_writes do write, we don't need to worry about them: they don't need to be propagated.
I think the function should be renamed then, because the current name is very confusing.
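To make the naming concern concrete, here is a minimal sketch of what a more descriptive predicate could look like. It assumes the quoted condition returns false for the three listed modes; the enum is a local stand-in for the SYCL 1.2.1 access modes, and the name `mustPropagateWrites` is hypothetical, not something from the patch.

```cpp
// Local stand-in for the SYCL 1.2.1 access modes (assumption: this mirrors
// access::mode, but it is not the real SYCL type).
enum class mode {
  read,
  write,
  read_write,
  discard_write,
  discard_read_write,
  atomic
};

// Hypothetical replacement name for isAWriteMode. It answers "did this mode
// write data that still has to be propagated?", following the explanation in
// this thread: read never writes, and the discard modes are treated as not
// needing propagation.
constexpr bool mustPropagateWrites(mode m) {
  return !(m == mode::read || m == mode::discard_write ||
           m == mode::discard_read_write);
}
```

With a name along these lines, `mustPropagateWrites(mode::discard_write)` returning false no longer reads as a claim that discard_write is not a write mode.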
Signed-off-by: Chris Perkins <[email protected]>
First round of the comments.
@@ -60,81 +60,91 @@ class buffer {
  buffer(const range<dimensions> &bufferRange,
         const property_list &propList = {})
      : Range(bufferRange) {
    size_t SizeInBytes = get_count() * sizeof(T);
- size_t SizeInBytes = get_count() * sizeof(T);
+ const size_t SizeInBytes = get_count() * sizeof(T);
make_unique_ptr<detail::SYCLMemObjAllocatorHolder<AllocatorT>>());
impl->recordBufferUsage(((void *)this), SizeInBytes, 0, false);
Please, add a comment for what literal values stand for, like /*ImprovePerformance2x*/false
IsSubBuffer = rhs.IsSubBuffer;
OffsetInBytes = rhs.OffsetInBytes;
Please, move to initializer list.
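A minimal sketch of the requested change, assuming the two assignments sit in the body of a constructor taking `rhs`. The struct name `SubBufferInfo` is hypothetical and stands in for whichever class owns these members in the patch.

```cpp
#include <cstddef>

// Hypothetical holder for the two members touched in the diff.
struct SubBufferInfo {
  bool IsSubBuffer;
  std::size_t OffsetInBytes;

  constexpr SubBufferInfo(bool IsSub, std::size_t Offset)
      : IsSubBuffer(IsSub), OffsetInBytes(Offset) {}

  // Members are initialized directly in the member initializer list instead
  // of being default-initialized and then assigned in the constructor body.
  constexpr SubBufferInfo(const SubBufferInfo &rhs)
      : IsSubBuffer(rhs.IsSubBuffer), OffsetInBytes(rhs.OffsetInBytes) {}
};
```

The constructors are marked constexpr only so the sketch can be checked at compile time; that is not a suggestion for the real class.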
};

// need to track information about a sub/buffer,
// even after its destruction, we may need to know about it.
Why would we want to know about it after sub/buffer destruction?
The sub-buffers handle the copy-back of their own data. They will have been destroyed before the buffer_impl is destroyed. And when it is destroyed it needs to know if it has to initiate a copy back or not. So before it can answer that question it needs to know if there were any sub-buffers and if they actually performed any copy operations (i.e., had any write accessors declared).
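The bookkeeping described here can be sketched roughly as follows. This is an illustration under assumptions, not the code in the patch: `UsageRecord`, `UsageTracker`, `markWritten`, and `baseNeedsFullCopyBack` are hypothetical names, and the decision rule (the base copies everything back only if the base itself was written) is one reading of the explanation above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record of one buffer or sub-buffer sharing the allocation.
struct UsageRecord {
  const void *Owner;         // identity of the (sub-)buffer object
  std::size_t SizeInBytes;
  std::size_t OffsetInBytes; // 0 for the base buffer
  bool IsSubBuffer;
  bool HadWriteAccessor;
};

// Owned by the shared implementation object, so records outlive the
// sub-buffers that created them.
class UsageTracker {
public:
  void record(const void *Owner, std::size_t SizeInBytes,
              std::size_t OffsetInBytes, bool IsSubBuffer) {
    assert(SizeInBytes > 0 && "this sketch does not track empty buffers");
    Records.push_back(
        UsageRecord{Owner, SizeInBytes, OffsetInBytes, IsSubBuffer, false});
  }

  // Called when a write accessor is created on the given (sub-)buffer.
  void markWritten(const void *Owner) {
    for (auto &R : Records)
      if (R.Owner == Owner)
        R.HadWriteAccessor = true;
  }

  // Assumed rule: sub-buffers already copied back their own ranges, so the
  // base only needs a full copy back if it was written through itself.
  bool baseNeedsFullCopyBack() const {
    for (const auto &R : Records)
      if (!R.IsSubBuffer && R.HadWriteAccessor)
        return true;
    return false;
  }

private:
  std::vector<UsageRecord> Records;
};
```

Because the records live in the tracker rather than in the sub-buffer objects, the base can still answer the question after every sub-buffer has been destroyed. A call such as `T.record(this, SizeInBytes, /*OffsetInBytes=*/0, /*IsSubBuffer=*/false)` also shows one way to annotate the literal arguments.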
@@ -108,6 +108,7 @@ class __SYCL_EXPORT stream_impl {
  GlobalOffsetAccessorT accessGlobalOffset(handler &CGH) {
    auto OffsetSubBuf = buffer<char, 1>(Buf, id<1>(0), range<1>(OffsetSize));
    auto ReinterpretedBuf = OffsetSubBuf.reinterpret<unsigned, 1>(range<1>(1));
    ReinterpretedBuf.set_write_back(false); // Buf handles write back.
Why do you need this change? The spec says:
The reinterpreted SYCL buffer that is constructed must behave as though it were a copy of the SYCL buffer...
Right. And it's a copy of a sub-buffer which, with this new addition, would initiate a copy back on its destruction. The stream implementation is using two accessors, one on the base buffer, and one on the sub-buffer/reinterp. And the reinterp is going out of scope at the end of this function. We don't need its destructor to initiate a copy back; that's already handled by the base in this case.
I've been thinking about changing how this new sub-buffer dtor code works with reinterpreted buffers, such that if a sub-buffer is reinterpreted, then we just go back to the existing system where everything is handled by the base buffer.
for (int i = offset; i < offset + subbuf_size; ++i)
  assert(vec[i] == (i < offset + offset_inside_subbuf ? i * 10 : i * -10) &&
         "Invalid result in 1d sub buffer");
for (int i = 0; i < size; ++i) {
Does the test fail without the patch?
void ensureNoUnecessaryCopyBack(queue &q) {

  std::cout << "start ensureNoUnecessaryCopyBack" << std::endl;
Please, include iostream if this is used.
std::cout << "start ensureNoUnecessaryCopyBack" << std::endl;

//allocate memory
int *baseData = (int *)(malloc(total * sizeof(int)));
Please use std::vector or std::array.
// But we don't care about the write buffer. We only care that the read buffer was NOT copied-back.
{ // closure
  //setup and clear memory
  setup_arr(baseData); // [0, 1, 2, ..., total]
Please, use std::iota instead.
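Taken together, the two suggestions (std::vector instead of malloc, std::iota instead of setup_arr) could look something like the sketch below. It assumes setup_arr fills the array with consecutive integers starting at 0; the helper name `makeSequentialData` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical replacement for the malloc + setup_arr pair: the vector owns
// the storage, so no free() is needed, and std::iota does the fill.
std::vector<int> makeSequentialData(std::size_t total) {
  std::vector<int> data(total);
  std::iota(data.begin(), data.end(), 0); // 0, 1, ..., total - 1
  assert(data.empty() || data.back() == static_cast<int>(total) - 1);
  return data;
}
```

The test could then pass `baseData.data()` wherever the raw pointer was used.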
@sergey-semenov Could you review?
This version is the "shortened" one, in that it only deals with copy-backs (not limiting sub-buffer memcpy, for example). So I think it might be a bit overengineered for what it is. I'm going to make a pass at simplifying it.