Implements event_notify using safe operations #87

Merged
merged 2 commits into from
Nov 17, 2015

vchuravy
Member

@vchuravy vchuravy commented Nov 7, 2015

Fixes #86.

julia> using OpenCL

julia> code_llvm(OpenCL.event_notify, (OpenCL.CL_event, OpenCL.CL_int,Ptr{Ptr{Void}}))

define void @julia_event_notify_21517(i8*, i32, i8**) {
top:
  %3 = load i8** %2, align 1
  %4 = getelementptr i8** %2, i64 1
  %5 = load i8** %4, align 1
  %6 = bitcast i8* %5 to i8**
  %7 = getelementptr i8** %2, i64 2
  %8 = load i8** %7, align 1
  %9 = bitcast i8* %8 to i32*
  store i8* %0, i8** %6, align 1
  store i32 %1, i32* %9, align 1
  call void inttoptr (i64 140256579765648 to void (i8*)*)(i8* inreg %3)
  ret void
}

evt_id = Ref{CL_event}(0)
status = Ref{CL_int}(0)
cb = Base.SingleAsyncWork(data -> callback(evt_id[], status[]))
ptrs = [cb.handle, Base.unsafe_convert(Ptr, evt_id), Base.unsafe_convert(Ptr, status)]
Member Author

We need to make sure that this doesn't get GC'd, so I am contemplating adding an ObjectIdDict to CLEvent. @jakebolewski Any better ideas?

Member Author

vchuravy commented Nov 7, 2015

A problem that I see with this implementation is that each callback has only one set of return values, but it is entirely possible that event_notify is called multiple times before or while data -> callback(evt_id[], status[]) runs, thus overwriting the return values.
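To make the overwrite concern concrete, here is a minimal sketch in current Julia syntax (not the 0.4 API used in this PR). It only illustrates the bookkeeping: plain integer tuples stand in for the event id and status, and `slot`, `queue`, `seen_slot`, and `seen_queue` are hypothetical names. A real foreign-thread callback could not safely call `put!`, so the queue here shows the idea of one record per notification rather than a drop-in fix.

```julia
# Hypothetical stand-ins: (event id, status) as plain integers.

# One shared slot, as in the snippet under review:
# every notification overwrites the previous one.
slot = Ref{Tuple{Int,Int}}((0, 0))
for status in (3, 2, 1, 0)
    slot[] = (1, status)
end
seen_slot = [slot[]]          # only the last notification survives

# One record per notification: nothing is lost if the consumer runs late.
queue = Channel{Tuple{Int,Int}}(8)
for status in (3, 2, 1, 0)
    put!(queue, (1, status))  # buffered, so this does not block here
end
close(queue)
seen_queue = collect(queue)   # all four notifications, in order
```

With a single slot, a consumer that wakes up late sees only the final status; with per-notification records it can replay every status change.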

Member Author

vchuravy commented Nov 7, 2015

This fix only works on Julia v0.4. @jakebolewski Any ideas how to achieve the same with 0.3?

I currently don't have a computer with OpenCL around so I am not quite sure why the tests are hanging on the events. Will see if I can try it out during work.

vchuravy mentioned this pull request Nov 7, 2015

Member Author

vchuravy commented Nov 8, 2015

I am running into a weird segfault:

julia -i -L test_ev.jl

signal (11): Segmentation fault
jl_field_index at /usr/bin/../lib/julia/libjulia.so (unknown line)
jl_f_get_field at /usr/bin/../lib/julia/libjulia.so (unknown line)
unknown function (ip: 0x7f62ab466459)
jl_apply_generic at /usr/bin/../lib/julia/libjulia.so (unknown line)
unknown function (ip: 0x7f62ab46554c)
unknown function (ip: 0x7f62ab464480)
jl_apply_generic at /usr/bin/../lib/julia/libjulia.so (unknown line)
unknown function (ip: 0x7f62ab463d31)
jl_apply_generic at /usr/bin/../lib/julia/libjulia.so (unknown line)
unknown function (ip: 0x7f62ab45dd49)
unknown function (ip: 0x7f62ab4603e0)
jl_apply_generic at /usr/bin/../lib/julia/libjulia.so (unknown line)
unknown function (ip: 0x7f62ab45a51a)
unknown function (ip: 0x7f62ab45ab46)
jl_apply_generic at /usr/bin/../lib/julia/libjulia.so (unknown line)
unknown function (ip: 0x7f62ab45a0bb)
jl_apply_generic at /usr/bin/../lib/julia/libjulia.so (unknown line)
unknown function (ip: 0x7f62aeaf6e28)
unknown function (ip: 0x7f62aeaf7724)
jl_apply_generic at /usr/bin/../lib/julia/libjulia.so (unknown line)
julia_notify_21330 at  (unknown line)
jlcall_notify_21330 at  (unknown line)
task_done_hook at task.jl:145
jl_apply_generic at /usr/bin/../lib/julia/libjulia.so (unknown line)
unknown function (ip: 0x7f62aeb57756)
unknown function (ip: 0x7f62aeb5836c)
unknown function (ip: (nil))
[1]    9941 segmentation fault (core dumped)  julia -i -L test_ev.jl

Where:

using OpenCL
const cl = OpenCL

ctx = cl.create_some_context()
q = cl.CmdQueue(ctx)

callback_called = false
function test_callback(evt, status)
  global callback_called = true  # `global` is needed; a bare assignment would only create a local
  println("Test Callback $evt, $status")
end

usr_evt = cl.UserEvent(ctx)
cl.enqueue_wait_for_events(q, usr_evt)
mkr_evt = cl.enqueue_marker(q)

cl.add_callback(mkr_evt, test_callback)

cl.complete(usr_evt)

julia> code_llvm(OpenCL.event_notify, (OpenCL.CL_event, OpenCL.CL_int,Ptr{Void}))

define void @julia_event_notify_21517(i8*, i32, i8*) {
top:
  %3 = bitcast i8* %2 to i8**
  %4 = load i8** %3, align 1
  %5 = getelementptr i8* %2, i64 8
  %6 = bitcast i8* %5 to i8**
  %7 = load i8** %6, align 1
  %8 = bitcast i8* %7 to i8**
  %9 = getelementptr i8* %2, i64 16
  %10 = bitcast i8* %9 to i8**
  %11 = load i8** %10, align 1
  %12 = bitcast i8* %11 to i32*
  store i8* %0, i8** %8, align 1
  store i32 %1, i32* %12, align 1
  call void inttoptr (i64 140438188430736 to void (i8*)*)(i8* inreg %4)
  ret void
}

@vtjnash Do you have any insight into what I might be doing wrong, or whether this is the right direction?
The issue I am facing is how to pass data from the callback back to the original Julia process.

p_status = Base.unsafe_convert(Ptr{CL_int}, r_status)

cb = Base.SingleAsyncWork(data -> begin
println("Received callback")

println isn't really allowed here. The SingleAsyncWork callback is run directly from libuv's callback, so anything that may try to block (such as the flush at the end of printing) could result in a recursive call into the libuv event loop. A better pattern is for the SingleAsyncWork callback to notify a Condition variable and to do the real work on a thread that monitors that variable (a PR to Base to replace the callback with a Condition variable would be a useful improvement to this API).

Member Author

So instead I would have:

cond = Condition()
cb = Base.SingleAsyncWork(data -> notify(cond))
@async begin
  wait(cond)
  callback(r_evt_id[], r_status[])
end
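For readers on current Julia, where `Base.SingleAsyncWork` no longer exists, the same "wake a waiting task, do the work there" pattern can be sketched with `Base.AsyncCondition`. This is an illustration under assumptions, not the code merged in this PR: `result` and `worker` are hypothetical names, and the `ccall` to `uv_async_send` stands in for what a native callback would do with the handle it was given.

```julia
cond = Base.AsyncCondition()
result = Ref{Int}(0)

# The waiting task does the real (possibly blocking) work.
worker = @async begin
    wait(cond)        # blocks only this task, not the event loop
    result[] = 42     # stand-in for calling the user-supplied callback
end

yield()  # give the worker a chance to reach wait(cond)

# Stand-in for the native callback: it does nothing except wake the task.
ccall(:uv_async_send, Cint, (Ptr{Cvoid},), cond.handle)

wait(worker)
```

The callback side stays minimal and non-blocking, while printing or any other blocking operation happens inside `worker`.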

Member Author

Could it be that the main issue I am running into is that I am trying to write memory in a protected address space?

Member Author

vchuravy commented Nov 9, 2015

@vtjnash Could you take another look?

This is my most promising attempt yet and the callback compiles to:

julia> code_llvm(OpenCL.event_notify, (OpenCL.CL_event, OpenCL.CL_int,Ptr{Void}))

define void @julia_event_notify_21521(i8*, i32, i8*) {
top:
  %3 = bitcast i8* %2 to %_EventCB*
  %4 = bitcast i8* %2 to i8**
  %5 = load i8** %4, align 8
  %6 = insertvalue %_EventCB undef, i8* %5, 0
  %7 = insertvalue %_EventCB %6, i8* %0, 1
  %8 = insertvalue %_EventCB %7, i32 %1, 2
  store %_EventCB %8, %_EventCB* %3, align 1
  call void inttoptr (i64 139700456842640 to void (i8*)*)(i8* inreg %5)
  ret void
}

@jakebolewski
Member

I believe this solution should work, but should there be a corresponding unpreserve_callback? Otherwise, registered callbacks would never get GC'd.
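A minimal sketch of such a preserve/unpreserve pair, written for current Julia (where `IdDict` replaced `ObjectIdDict`). The registry name `preserved_callbacks` and the `preserve_callback` helper are hypothetical; only the name `unpreserve_callback` comes from the comment above, and the PR's actual implementation may differ.

```julia
# Hypothetical global registry: holding a reference here keeps the
# callback state rooted so the GC cannot collect it while native code
# may still use it.
const preserved_callbacks = IdDict{Any,Bool}()

function preserve_callback(cb)
    preserved_callbacks[cb] = true
    return cb
end

# Drop the root once the callback has fired, so it can be collected.
function unpreserve_callback(cb)
    delete!(preserved_callbacks, cb)
    return nothing
end

cb_state = Ref{Int}(0)                          # stand-in for callback state
preserve_callback(cb_state)
was_rooted = haskey(preserved_callbacks, cb_state)
unpreserve_callback(cb_state)
is_rooted = haskey(preserved_callbacks, cb_state)
```

Calling `unpreserve_callback` from the task that dispatches the user callback would release each entry once it is no longer needed.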

@vchuravy
Member Author

@jakebolewski Thanks. Before this goes in we would need to bump the minimal version to 0.4 so I will start a PR for that.

@jakebolewski
Member

I think dropping support for 0.4 is reasonable. Some parts of the API can be simplified with call overloading.

@vchuravy
Member Author

I hope you meant dropping support for 0.3 ;)

@vchuravy
Member Author

Rebased and squashed. I will start fixing the callback in context.jl because, based on the docs [1], the callback can/will happen asynchronously.

[1] https://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/clCreateContext.html

The callback itself should only use basic, thread-safe operations and
should not block.
The result from the callback is stored in a reference to an immutable,
and a waiting task is notified to read the result value and dispatch the
user-supplied callback.
Member Author

@jakebolewski Take a look at this, and if you agree we can merge. After fixing the OOM Travis error we can go ahead and tag a new release.

vchuravy added a commit that referenced this pull request Nov 17, 2015
Implements event_notify using safe operations
vchuravy merged commit 712ba87 into master Nov 17, 2015