-
Notifications
You must be signed in to change notification settings - Fork 84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add time series check #1426
add time series check #1426
Conversation
* rmv s3 utils * rmv corresponding tests * update tutorial to use latest dandi tools instead
This approach is good for now. @oruebel and I discussed adding a general We can refactor these checks (as well as the other ones currently in various other |
src/pynwb/base.py
Outdated
if ( | ||
isinstance(kwargs["data"], (np.ndarray, list)) | ||
and isinstance(kwargs["timestamps"], (np.ndarray, list)) | ||
and not len(kwargs["data"]) == len(kwargs["timestamps"]) | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For these shape-based checks, I think it may be useful to use get_data_shape from HDMF. I think one advantage of using get_data_shape
is that we don't need to deal with differences between lists, ndarray, h5py.Dataset, or DataChunkIterator etc. for each of the checks, but those details are handled in get_data_shape
. Another benefit is that I think it would allow us to cover more cases than just the nd.array and list case that is covered by the current check, but it should also work for DataChunkIterator in the cases, e.g., when maxshape is set on the DataChunkIterator. Using the get_data_shape
function, the check would I think look something like this:
from hdmf.utils import get_data_shape
...
data_shape = get_data_shape(data=kwargs["data"], strict_no_data_load=True)
timestamps_shape = get_data_shape(data=kwargs["data"], strict_no_data_load=True)
if( (data_shape is not None and timestamps_shape is not None) # check that the shape is known
and (len(data_shape) > 0 and len(timestamps_shape) > 0) # check for scalars. Should never happen
and (data_shape[0] is not None and timestaps_shape[0] is not None) # check that the length of the first dimension is known
and (data_shape[0] != timestamps_shape[0]) # check that the data and timestamps match
)
warn(...)
I agree. This approach is fine for now. I think the overhead to refactor this later should be reasonable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me
Please add an entry to the changelog and then we are good to merge. |
Check if the length of data is equal to the length of timestamps