Scaffolding to support generating arbitrary filter pipelines #23
…ialize, Deserialize
This doesn't quite work yet, the |
Whoa. Yeah, that turned out way better than I was expecting when we discussed possible approaches on the Thursday call. It really shines when looking at the proptest definitions as well.
Most things I had were either minor style nits or general questions that don't need answering for this PR. The only real thing I saw was that c_uchar vs c_uint question.
Otherwise, +1 once CI is green.
}
if let Some(reinterpret_datatype) = reinterpret_datatype {
    let c_datatype =
        reinterpret_datatype.capi_enum() as std::ffi::c_uchar;
Is std::ffi::c_uchar correct here? I would have expected u32 or std::ffi::c_uint.
Surprisingly yes - it expects just one byte inside of libtiledb
Well that's awkward. As a note to future selves, I tracked this down to the fact that most of our uses of Datatype reflect the public C API typedef enum { ... } tiledb_datatype_t definition, which is apparently backed by a u32 according to bindgen. However, Filter::set_option pipes the void* all the way to CompressionFilter::set_option_impl, which casts it to a tiledb::sm::Datatype. That type is defined as enum class Datatype : uint8_t { ... }, which means it really does want a single byte in this case.
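For future readers, here is a minimal sketch of why the width matters. It does not call libtiledb: read_option_as_u8 is a hypothetical stand-in for a callee that takes a void* and reads it as a one-byte enum, and both pass_* helpers are illustrative names only.

```rust
use std::ffi::{c_uchar, c_void};

/// Hypothetical stand-in for the C++ side: it receives an untyped pointer
/// and reads exactly one byte from it, like a cast to a uint8_t-backed enum.
fn read_option_as_u8(value: *const c_void) -> u8 {
    // SAFETY: every caller in this sketch passes a pointer to at least
    // one readable, initialized byte.
    unsafe { *(value as *const u8) }
}

/// Narrows the value to one byte first, so the pointee has exactly the
/// width the callee reads.
fn pass_as_byte(datatype: u32) -> u8 {
    let c_datatype = datatype as c_uchar;
    read_option_as_u8(&c_datatype as *const c_uchar as *const c_void)
}

/// Passes a pointer to the full 32-bit value. The callee still reads only
/// the first byte in memory, so the result depends on the host byte order.
fn pass_as_u32(datatype: u32) -> u8 {
    read_option_as_u8(&datatype as *const u32 as *const c_void)
}
```

On a little-endian host both helpers happen to return the same byte for small values, which is how a u32 could appear to work. On a big-endian host pass_as_u32 would read the most significant byte instead, so narrowing to c_uchar is the version that matches what the callee reads.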
@@ -67,6 +68,12 @@ impl<'ctx> FilterList<'ctx> {
        }
    }

    pub fn to_vec(&self) -> TileDBResult<Vec<Filter<'ctx>>> {
        (0..self.get_num_filters()?)
            .map(|f| self.get_filter(f))
Huh. TIL that collect() returns the first error or the whole list. That's awesome!
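A small standalone sketch of that behavior, using only the standard library. get_item and items are hypothetical names standing in for a fallible per-index getter and a to_vec-style wrapper; they are not part of this crate.

```rust
/// Hypothetical fallible lookup standing in for a per-index getter.
fn get_item(index: u32) -> Result<u32, String> {
    if index < 3 {
        Ok(index * 10)
    } else {
        Err(format!("index {index} out of range"))
    }
}

/// Collects every item, or stops at and returns the first error.
/// This works because Result<Vec<T>, E> implements FromIterator for
/// iterators that yield Result<T, E>.
fn items(count: u32) -> Result<Vec<u32>, String> {
    (0..count).map(get_item).collect()
}
```

With these definitions, items(3) yields Ok with all three values, while items(5) yields the error for index 3 and never evaluates index 4.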
The missing piece from Attribute::eq was the filter pipeline. To get this in there with the testing infrastructure we have and/or will need, we need Debug, Eq, Serialize and Deserialize for FilterList and Filter.

That turns out to require a bunch of things:

- … sys crate with minimal dependencies, I moved Datatype to the api crate where it can #[derive(Deserialize, Serialize)]. This is also a big step towards keeping ffi:: out of the public APIs.
- enum FilterData is really useful because it can #[derive(PartialEq)], which helps prevent us developers from making mistakes. Now those mistakes can live in Filter::create and/or Filter::filter_data instead.
- FilterData::transform_datatype is an internal C API, but we need to use it for property-based testing to make sure that we construct valid filter pipelines. The output of each filter in the pipeline needs to be a valid input to the next one.
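The pipeline validity rule can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the Datatype and FilterData variants, the Option-returning signature of transform_datatype, and the pipeline_output helper are all hypothetical and far simpler than the real types.

```rust
/// Hypothetical, heavily reduced datatype enum for this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Datatype {
    Int32,
    Float32,
    UInt8,
}

/// Hypothetical filter description. Deriving PartialEq means two filters
/// compare field by field without hand-written comparison code.
#[derive(Clone, Debug, PartialEq)]
enum FilterData {
    /// Accepts any input and leaves the datatype unchanged.
    Compression { level: i32 },
    /// Accepts only Float32 input and emits UInt8 (illustrative rule).
    FloatToBytes,
}

impl FilterData {
    /// Returns the output datatype for `input`, or None if this filter
    /// cannot accept `input` (assumed signature for this sketch).
    fn transform_datatype(&self, input: Datatype) -> Option<Datatype> {
        match self {
            FilterData::Compression { .. } => Some(input),
            FilterData::FloatToBytes => match input {
                Datatype::Float32 => Some(Datatype::UInt8),
                _ => None,
            },
        }
    }
}

/// Threads the datatype through each filter in order. Returns None as soon
/// as one filter rejects the output of the previous one.
fn pipeline_output(input: Datatype, filters: &[FilterData]) -> Option<Datatype> {
    filters
        .iter()
        .try_fold(input, |datatype, filter| filter.transform_datatype(datatype))
}
```

A property-based test strategy could use a check like pipeline_output to keep only generated pipelines where every filter accepts the datatype produced by the one before it.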