Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid using cfunction runtime closures in iteration callback #812

Merged
merged 7 commits into from
Feb 10, 2021

Conversation

jmert
Copy link
Contributor

@jmert jmert commented Jan 24, 2021

Necessary on e.g. aarch64 architecture where @cfunction runtime closures aren't supported (see JuliaLang/julia#27174). Fixes #811.

In the second commit I've added some documentation to make it clearer what the callback function's form must be, but I'm not entirely happy with the wording yet.

Closes: #813 #811

src/api_helpers.jl Outdated Show resolved Hide resolved
src/api_helpers.jl Outdated Show resolved Hide resolved
@musm
Copy link
Member

musm commented Feb 5, 2021

The aarch64 bug is unfortunate, as this complicates the implementation. I'm ok with this workaround as we should strive to have as broad support as possible, but I think we should add a breadcrumb that reminds us "revert after julia JuliaLang/julia#27174" is resolved, in the source so we don't forget to revert back after the Julia bug is fixed.

fptr = @cfunction($f, herr_t, (hid_t, Ptr{Cchar}, Ptr{H5A_info_t}, Ptr{Cvoid}))
h5a_iterate(obj_id, idx_type, order, idxref, fptr, C_NULL)
fptr = @cfunction(h5a_iterate_helper, herr_t, (hid_t, Ptr{Cchar}, Ptr{H5A_info_t}, Ref{Any}))
userf = Ref{Any}(f)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it matter that this is a Ref instead of a Ptr{Any} thereby eliminating the unsafe_convert below? Presumably Ref is the better option here since f is a method defined on the Julia end, which more closely aligns with the general usage of Ref.

Copy link

@vchuravy vchuravy Feb 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are two issues here:

  1. Functions are immutable and thus you can't use Base.pointer_from_objref.
  2. Ideally you would want the signature of the ccall to be ::Any and thus have Julia take care of the implicit conversion

I am a bit sketched out by the Ptr{Any} which then is converted to Any on the callee side, but I think Ref{Any} in ccall/ cfunction has special handling.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm a bit unsure of the dance of Ptr{Any}, Ref{Any}, Any (and having put this up over two weeks ago, I forget what some of my problems/conclusions were). I think it's the special-cased handling mentioned in the Ref docstring that was throwing me off for a while, and this is what I got to work:

As a special case, setting T = Any will instead cause the creation of a pointer to the reference itself when converted to a Ptr{Any} (a jl_value_t const* const* if T is immutable, else a jl_value_t *const *). When converted to a Ptr{Cvoid}, it will still return a pointer to the data region as for any other T.

Since @vtjnash just commented here, though, would you mind giving an expert opinion on what the cleanest way to do this would be?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simply put, you need to use the same expression in both places. That can be Ref{T} or Any (or perhaps even Ref{Any}, but that seems like usually an odd choice)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I simplified the signature in the @cfunction call and then the argument being passed through to the low-level h5(a|l)_iterate. (Not sure how I skipped over that particular variation a couple of weeks ago considering it is the simple option...)

If I'm understanding correctly, though, I do still need the userf = Ref{Any}(f) line so that the closure has a pointer address for the ccall to make use of.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need that? You've used cfunction with Any, so you must use ccall with Any also

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without the Ref container, there are MethodErrors due to no unsafe_convert{::Type{Ptr{Nothing}}, ::<closure type>).

diff --git a/src/api_helpers.jl b/src/api_helpers.jl
index f5ba65e..864230b 100644
--- a/src/api_helpers.jl
+++ b/src/api_helpers.jl
@@ -68,15 +68,7 @@ julia> HDF5.h5a_iterate(obj, HDF5.H5_INDEX_NAME, HDF5.H5_ITER_INC) do loc, name,
 function h5a_iterate(@nospecialize(f), obj_id, idx_type, order, idx = 0)
     idxref = Ref{hsize_t}(idx)
     fptr = @cfunction(h5a_iterate_helper, herr_t, (hid_t, Ptr{Cchar}, Ptr{H5A_info_t}, Any))
-    userf = Ref{Any}(f)
-    if VERSION < v"1.6.0-DEV.1038"
-        # unsafe_convert(Ptr{Cvoid}, ::RefValue{Any}) returns pointer to RefValue instead
-        # of data --- see JuliaLang/julia#37591
-        userfptr = unsafe_load(Ptr{Ptr{Cvoid}}(unsafe_convert(Ptr{Any}, userf)))
-        GC.@preserve userf h5a_iterate(obj_id, idx_type, order, idxref, fptr, userfptr)
-    else
-        GC.@preserve userf h5a_iterate(obj_id, idx_type, order, idxref, fptr, userf)
-    end
+    h5a_iterate(obj_id, idx_type, order, idxref, fptr, f)
     return idxref[]
 end
 
@@ -204,15 +196,7 @@ julia> HDF5.h5l_iterate(hfile, HDF5.H5_INDEX_NAME, HDF5.H5_ITER_INC) do group, n
 function h5l_iterate(@nospecialize(f), group_id, idx_type, order, idx = 0)
     idxref = Ref{hsize_t}(idx)
     fptr = @cfunction(h5l_iterate_helper, herr_t, (hid_t, Ptr{Cchar}, Ptr{H5L_info_t}, Any))
-    userf = Ref{Any}(f)
-    if VERSION < v"1.6.0-DEV.1038"
-        # unsafe_convert(Ptr{Cvoid}, ::RefValue{Any}) returns pointer to RefValue instead
-        # of data --- see JuliaLang/julia#37591
-        userfptr = unsafe_load(Ptr{Ptr{Cvoid}}(unsafe_convert(Ptr{Any}, userf)))
-        GC.@preserve userf h5l_iterate(group_id, idx_type, order, idxref, fptr, userfptr)
-    else
-        GC.@preserve userf h5l_iterate(group_id, idx_type, order, idxref, fptr, userf)
-    end
+    h5l_iterate(group_id, idx_type, order, idxref, fptr, f)
     return idxref[]
 end
julia> using HDF5
[ Info: Precompiling HDF5 [f67ccb44-e63f-5c2f-98bd-6dc0ccc4ba2f]

julia> hid = h5open("/tmp/test.h5", "w");

julia> hid["test"] = 1;

julia> hid
Error showing value of type HDF5.File:
ERROR: MethodError: no method matching unsafe_convert(::Type{Ptr{Nothing}}, ::HDF5.var"#10#14"{IOContext{IOBuffer}, String, HDF5.var"#depth_check#13"{Int64, IOContext{IOBuffer}, Int64, Bool}, Int64, String, String})
Closest candidates are:
  unsafe_convert(::Union{Type{Ptr{Nothing}}, Type{Ptr{Base.Libc.FILE}}}, ::Base.Libc.FILE) at libc.jl:94
  unsafe_convert(::Type{Ptr{T}}, ::SharedArrays.SharedArray{T, N} where N) where T at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/SharedArrays/src/SharedArrays.jl:354
  unsafe_convert(::Type{Ptr{T}}, ::SharedArrays.SharedArray) where T at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/SharedArrays/src/SharedArrays.jl:355
  ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry, I misunderstood and answered the wrong question.

We've been following the C api, so the ccall wrapper function's are declared with void* / Ptr{Cvoid} arguments. Are you saying we should change that to Any instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is the wrong decl.

there are MethodErrors

This is intentional, as otherwise you're missing the GC root (as detected by the miscompilation assert)

src/HDF5.jl Show resolved Hide resolved
@vtjnash
Copy link
Contributor

vtjnash commented Feb 8, 2021

The old version was also slower. This is the much preferred form for interacting with C libraries.

src/api_helpers.jl Outdated Show resolved Hide resolved
@jmert
Copy link
Contributor Author

jmert commented Feb 9, 2021

Oh, CI does not like that latest change. I can confirm I get similar segfaults on Julia v1.5 but not v1.6-rc1 (where I was initially testing).

@jmert
Copy link
Contributor Author

jmert commented Feb 9, 2021

Are the CI failures in v1.5 and earlier a case of hitting the pointer conversion described/fixed in JuliaLang/julia#37591? If I @run HDF5.show_tree(hid) with Debugger.jl, instead of the segfault I get an ERROR: MethodError: objects of type Base.RefValue{Any} are not callable.

@musm
Copy link
Member

musm commented Feb 9, 2021

Will merge later today sans objections

@musm musm merged commit ef228d5 into JuliaIO:master Feb 10, 2021
@jmert jmert deleted the closure_callbacks branch February 10, 2021 15:01
@musm musm mentioned this pull request Feb 11, 2021
if VERSION < v"1.6.0-DEV.1038"
# unsafe_convert(Ptr{Cvoid}, ::RefValue{Any}) returns pointer to RefValue instead
# of data --- see JuliaLang/julia#37591
userfptr = unsafe_load(Ptr{Ptr{Cvoid}}(unsafe_convert(Ptr{Any}, userf)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly this is causing issues on 1.4

Assertion failed: isSpecialPtr(V->getType()), file /cygdrive/d/buildbot/worker/package_win64/build/src/llvm-late-gc-lowering.cpp, line 773

signal (22): SIGABRT
in expression starting at C:\Users\Mus\.julia\dev\HDF5\test\plain.jl:5
crt_sig_handler at /cygdrive/d/buildbot/worker/package_win64/build/src\signals-win.c:92

exec_options at .\client.jl:264
_start at .\client.jl:484
jfptr__start_2087.clone_1 at C:\Users\Mus\AppData\Local\Programs\Julia\Julia-1.4.2\lib\julia\sys.dll (unknown line)
unknown function (ip: 00000000004017E1)
unknown function (ip: 0000000000401BD6)
unknown function (ip: 00000000004013DE)
unknown function (ip: 000000000040151A)
BaseThreadInitThunk at C:\WINDOWS\System32\KERNEL32.DLL (unknown line)
RtlUserThreadStart at C:\WINDOWS\SYSTEM32\ntdll.dll (unknown line)
Allocations: 19101740 (Pool: 19098255; Big: 3485); GC: 20
...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Avoid creating cfunction closures Support for AArch64
4 participants