-
Notifications
You must be signed in to change notification settings - Fork 141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add support for read-only s3 virtual driver #1010
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the pull request.
We need to pass a pointer to H5Pset_fapl_ros3
. Perhaps we should box the struct in a Ref or concretely as a RefValue
.
src/drivers/drivers.jl
Outdated
This is the read-only virtual driver that enables access to HDF5 objects stored in AWS S3 | ||
""" | ||
struct ROS3 <: Driver | ||
fa::API.H5FD_ros3_fapl_t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably should be a Base.RefValue{API.H5FD_ros3_fapl_t}
. It can then be converted to a pointer by ccall
.
@mkitti thank you for the helpful review! Do you have a recommendation for a unit test for this? We need a public HDF5 file hosted on S3.. It appears that some of the builds do not pass raising
suggesting that these |
You might need to register this lazily like the mpio driver rather than during |
The modifications to functions.jl directly are not going to work. They will be removed the next time someone calls the API generator. I will provide a suggestion t as an alternate fix. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are some suggestions.
Please regenerate functions.jl.
Co-authored-by: Mark Kittisopikul <[email protected]>
Co-authored-by: Mark Kittisopikul <[email protected]>
test/ros3.jl
Outdated
h5open("http://s3.us-east-2.amazonaws.com/hdf5ros3/GMODO-SVM01.h5"; driver=HDF5.Drivers.ROS3()) do f | ||
@test keys(f) == ["All_Data", "Data_Products"] | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Can you add a test to see if you can actually read data correctly?
docs/src/api_bindings.md
Outdated
@@ -477,6 +477,7 @@ h5pl_size | |||
- [`h5p_get_fapl_mpio32`](@ref h5p_get_fapl_mpio32) | |||
- [`h5p_get_fapl_mpio64`](@ref h5p_get_fapl_mpio64) | |||
- [`h5p_get_fapl_multi`](@ref h5p_get_fapl_multi) | |||
- [`h5p_get_fapl_ros`](@ref h5p_get_fapl_ros) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... I think we may need to the patch the generator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed. the name should be h5p_get_fapl_ros3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a bind exception for ros3 in cc6625e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few changes. If you could just comment out the failing tests and mark it for later that would be fine. It would be even better if you could narrow the situation to remove the test / error case.
This is getting closer. I need to examine the lazy loading mechanism more closely. Basically it should only try to load the driver when the user tries to use it rather than eagerly loading it during initialization. |
src/drivers/ros3.jl
Outdated
return (ROS3 ∘ API.H5FD_ros3_fapl_t)( | ||
version, | ||
auth, | ||
Base.unsafe_convert(Cstring, region), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be a regular convert or does it have to be an unsafe_convert
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It needs to be an unsafe_convert
:
julia> convert(Cstring, "hello")
ERROR: MethodError: Cannot `convert` an object of type String to an object of type Cstring
Closest candidates are:
convert(::Type{Cstring}, ::Union{Ptr{Int8}, Ptr{Nothing}, Ptr{UInt8}}) at ~/src/julia-1.7.1/share/julia/base/c.jl:167
convert(::Type{T}, ::T) where T at ~/src/julia-1.7.1/share/julia/base/essentials.jl:218
Stacktrace:
[1] top-level scope
@ REPL[55]:1
julia> Base.unsafe_convert(Cstring, "hello")
Cstring(0x00007fc5c5e06838)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- you need to first call
cconvert(Cstring, region)
(this converts toString
, and ensures it is nul-terminated) - you then need to keep a reference to the result of this whenever you pass it to a
ccall
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably the best option is to add them as extra fields to the ROS3 struct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't we just do unsafe_convert(Cstring, String(region))
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I missed the original point of this all. Please correct me if I'm still wrong in my thinking
So the reason we need this is because we use the ROS3 instance in further calls (specifically h5p_set_fapl_ros3
h5p_get_fapl_ros3
) and those need to convert the outward driver struct ROS3
to the internal libhdf5 struct H5FD_ros3_fapl_t
and right now there's a mismatch in that ROS3
doesn't agree with the layout of H5FD_ros3_fapl_t
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to convert the Julia struct to the C struct. Part of the issue is that the Julia struct and its Strings could be garbage collected. Meanwhile, the Cstring is an unmanaged pointer to a null terminated character array.
Anyways, it turns out that the C struct definition here is wrong. The correct definition is
typedef struct H5FD_ros3_fapl_t {
int32_t version;
hbool_t authenticate;
char aws_region[H5FD_ROS3_MAX_REGION_LEN + 1];
char secret_id[H5FD_ROS3_MAX_SECRET_ID_LEN + 1];
char secret_key[H5FD_ROS3_MAX_SECRET_KEY_LEN + 1];
} H5FD_ros3_fapl_t;
In Julia, that would be
struct H5FD_ros3_fapl_t
version::Int32
authenticate::Bool
aws_region::NTuple{H5FD_ROS3_MAX_REGION_LEN + 1,Cchar}
secret_id ::NTuple{H5FD_ROS3_MAX_SECRET_ID_LEN + 1, Cchar}
secret_key::NTuple{H5FD_ROS3_MAX_SECRET_KEY_LEN + 1, Cchar};
end
I've started to rewrite this. Also, I think we need to check our definition of hbool_t
. Currently it is UInt32
, but I think it probably should be UInt8
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's probably not the first time there has been a mismatch in definitions. Do you want to push your changes on this branch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I will later today. I just started it on a different computer than I am on right now. I didn't quite get the tests to pass yet on the branch I was working on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I quickly saw this comment on the hbool issue aldanor/hdf5-rust#28
but perhaps it doesn't affect us since we dont' use msvc
I think this is about ready now. Is it? |
@simonbyrne , could you take a look at this pull request? |
@gszep is there any reason this is still in draft form? |
I think we should move forward with this. |
@@ -0,0 +1,45 @@ | |||
""" | |||
ROS3() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update the docstring signature
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added additonal signatures below this line.
Do we have a way to test this? |
There is a test already, but without authentication. There seems to be an example here: |
ROS3(region::AbstractString, id::AbstractString, key::AbstractString) = | ||
ROS3(1, true, region, id, key) | ||
ROS3() = ROS3(1, false, "", "", "") | ||
_ntuple_to_string(x) = unsafe_string(Ptr{Cchar}(pointer_from_objref(Ref(x))), length(x)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't convert the ntuple to a string back for me
_ntuple_to_string(x) = unsafe_string(Ptr{Cchar}(pointer_from_objref(Ref(x))), length(x))
x = "Dfsddfs"
a = ntuple(i -> i <= length(x) ? Cchar(x[i]) : Cchar(0x0), 10 + 1)
julia> _ntuple_to_string(a)
"\xc8\x18\xb2\n\0\0\0\0X\xd0\xc3"
Shouldn't this actually look like something like the following
function _ntuple_to_string(buf)
len = findfirst(iszero, buf) -1
unsafe_string(Base.unsafe_convert(Ptr{Int8}, collect(buf[1:len])))
end
Looks like this can also be written as
function _ntuple_to_string(buf)
len = findfirst(iszero, buf) -1
unsafe_string(Base.unsafe_convert(Ptr{Int8}, collect(buf)), len)
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unsafe_string already searches for a null itself.
julia> function _ntuple_to_string(x)
r = Ref(x)
GC.@preserve r unsafe_string(Ptr{Cchar}(pointer_from_objref(r)))
end
_ntuple_to_string (generic function with 1 method)
julia> _ntuple_to_string(a)
"Dfsddfs"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also probably should use ncodeunits
and codeunits
like you suggested before:
#818 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use FixedString
, which is already in the package?
HDF5.jl/src/typeconversions.jl
Lines 99 to 105 in 0120cb2
struct FixedString{N,PAD} | |
data::NTuple{N,UInt8} | |
end | |
Base.length(::Type{FixedString{N,PAD}}) where {N,PAD} = N | |
Base.length(str::FixedString) = length(typeof(str)) | |
pad(::Type{FixedString{N,PAD}}) where {N,PAD} = PAD | |
pad(x::T) where {T<:FixedString} = pad(T) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FixedString is not an AbstractString.
I started working on a version of InlineStrings.jl called NStrings.jl over the weekend that basically is a full blown general implementation of this and is an AbstractString.
This is basically an implementation of this idea:
JuliaStrings/InlineStrings.jl#6
I'll publish that repo soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was actually about to suggest moving it to a separate repo! Nice!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initial version of NTuple based Strings is here:
https://github.com/mkitti/NStrings.jl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current NTuple based Strings are now here and registered:
https://github.com/mkitti/StaticStrings.jl
Let's merge just merge this. I'll follow up with a StaticStrings.jl pull request. |
Related to #552; We would like the equivalent functionality as h5py