Test which writes data and then reads it back for an arbitrary dense array schema and subarray #98
Conversation
… for other things too
tiledb/api/src/range.rs
```rust
pub fn check_datatype(&self, datatype: Datatype) -> TileDBResult<()> {
    check_datatype!(self, datatype);
    Ok(())
}
```

```rust
/// Merges this range into the `other` list of sorted, non-overlapping ranges and returns the result.
```
I added this stub while thinking about "do multiple writes in sequence and read back to make sure everything checks out", but decided to restrict the scope of this PR. Will remove it.
Looks fine to me. +1 to merge once that test is fixed.
```rust
let dimension_params = DimensionRequirements {
    datatype: Some(dimension_type),
    extent_limit: {
        // Dth root of T is the same as T^(1/D)
```
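The arithmetic in that comment can be sketched as follows. This is a hypothetical standalone helper (not the crate's actual code): to keep the total number of cells in a tile at or below some limit T across D dimensions, each dimension's extent can be capped at the D-th root of T, i.e. T^(1/D).

```rust
// Hypothetical helper illustrating the extent-limit arithmetic: cap each
// dimension's extent so the product over D dimensions stays <= T.
fn extent_limit(total_cell_limit: f64, num_dimensions: u32) -> f64 {
    // Dth root of T is the same as T^(1/D)
    total_cell_limit.powf(1.0 / num_dimensions as f64)
}

fn main() {
    // With T = 1024 cells and D = 2 dimensions, each extent is capped at ~32,
    // since 32 * 32 = 1024.
    let limit = extent_limit(1024.0, 2);
    assert!((limit - 32.0).abs() < 1e-9);
    println!("{}", limit);
}
```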
Nice!
```rust
($field:expr, $data:pat, $then:expr) => {
    typed_field_data_go!($field, _DT, $data, $then, $then)
};
($lexpr:expr, $rexpr:expr, $DT:ident, $lpat:pat, $rpat:pat, $same_type:expr, $else:expr) => {{
```
Not insisting you change this, but I'm not very fond of how these are all smashed together in one macro. I've been staring at this for a good five minutes trying to match clauses up to follow the recursive bits and invocation logic, and then I realized this third variant isn't related to the other two. Personally I'd just make it its own macro with a more descriptive name.
Also, if possible, I'd move the helper variant to the top so that it's easier to compare their argument lists rather than attempting to memorize them while scrolling back and forth.
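The suggestion could look roughly like this. All names here are invented for illustration; this is not the crate's actual macro, just a sketch of pulling the unrelated two-expression variant out into its own descriptively named macro:

```rust
// Hypothetical sketch: the two-value variant split out of
// `typed_field_data_go!` into its own macro with its own name.
macro_rules! typed_pair_go {
    ($lexpr:expr, $rexpr:expr, $lpat:pat, $rpat:pat, $same_type:expr, $else:expr) => {{
        match ($lexpr, $rexpr) {
            ($lpat, $rpat) => $same_type,
            _ => $else,
        }
    }};
}

// Toy stand-in for the crate's typed field data.
#[allow(dead_code)]
enum Field {
    Int(i64),
    Float(f64),
}

fn main() {
    // Both sides match the Int pattern, so the `$same_type` arm runs.
    let sum = typed_pair_go!(
        Field::Int(1),
        Field::Int(2),
        Field::Int(l),
        Field::Int(r),
        l + r,
        -1
    );
    assert_eq!(sum, 3);
    println!("{}", sum); // prints 3
}
```

With the variant split out, each macro has exactly one argument shape, so a reader no longer has to disambiguate which clause an invocation hits.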
It kinda reminds me of the `proptest!` macro, which has like 20 variations whose intended use cases are impossible to deduce; you just have to find examples to suss out how to use it properly.
It's pretty odd that doc comments are not even allowed inside of `macro_rules!`. I'm quite happy to shuffle some pieces around to make it easier to understand.
This is the failure caused by our use of tiledb 2.21, which has a difference in
Here's
our filter strategy sets things up with the newer code of
Ah right right. I'll get the CI updated to use core nightlies and we should be able to get this fixed and merged today.
I need to not forget - add the "non-empty domain" checks to the test.
This ended up being quite a doozy!
A dense write requires data input for each of the attributes as well as the subarray that is being written to. The strategy which generates a dense write therefore generates an initial set of cells (`query::strategy::Cells`) for some "bounding subarray", and the shrinking strategy is to narrow the range of cells covered by the subarray. This is done by computing the offset of the "current" subarray compared to the "bounding" subarray and then slicing the cells.
There is also a shrinking strategy for the `Cells` themselves. Note that this is not used by this PR, but it will be used for sparse writes, where the dimension coordinates are part of the cell data. This strategy explores the space by dividing the cells into N pieces and running the test N times, once with each chunk removed. Each of these iterations tells us whether a chunk was necessary for the failure. Then we recurse down to just the pieces which were needed for failure.
This pull request exposed several bugs in our filter pipeline strategy specifically, but also some arithmetic issues in other parts. There are lots of incidental fixes and changes to strategies to thread around more requirements.