Automatically migrate TimeIntervals.timeseries to use TimeSeriesReferenceVectorData #1390
…d for convenience and backward compatability of TimeIntervals
… to TimeSeriesReferenceVectorData
…ess rather than on set
Codecov Report
@@            Coverage Diff             @@
##              dev    #1390      +/-   ##
==========================================
+ Coverage   77.47%   78.18%   +0.70%
==========================================
  Files          37       37
  Lines        2735     2759      +24
  Branches      455      461       +6
==========================================
+ Hits         2119     2157      +38
+ Misses        535      522      -13
+ Partials       81       80       -1
Interesting approach. I'm surprised that changing A couple comments:
Yeah, Python allows you to do a lot of strange things that usually shouldn't be done ;-) It works in this case mainly because
Good point.
Yes. That is how I originally did it in this commit fe87879. Maybe the better place would be to do this in the ObjectMapper when getting the constructor arguments. In this way, the whole build process should already be done and the swap just happens right before the
I don't think this is an issue. The datatypes in HDF5 are identical for |
Yeah, I agree - the better place to do this would be in the |
I can give it a try. We can always scrap this PR if we decide on another option. I just wanted to see if this "creative" way of changing types could work.
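For readers unfamiliar with the trick under discussion, here is a minimal sketch of swapping an object's class at runtime by assigning to `__class__`. The two classes below are simplified stand-ins written for this illustration, not the actual hdmf/pynwb classes, and the `describe` method is hypothetical. The swap is safe in this sketch only because the subclass adds behavior and no new instance state.

```python
class VectorData:
    """Stand-in for a generic table column (hypothetical, not the hdmf class)."""

    def __init__(self, name, data):
        self.name = name
        self.data = data


class TimeSeriesReferenceVectorData(VectorData):
    """Stand-in subclass that adds methods but no extra instance attributes."""

    def describe(self):
        # Hypothetical helper, only here to show that the new methods become available.
        return f"{self.name}: {len(self.data)} time series references"


# An object created as the old, generic type (e.g., when reading an older file).
column = VectorData(name="timeseries", data=[(0, 5, "ts1"), (5, 5, "ts2")])

# Swap the class in place. The instance attributes are untouched and
# TimeSeriesReferenceVectorData.__init__ is NOT run by this assignment.
column.__class__ = TimeSeriesReferenceVectorData

print(isinstance(column, TimeSeriesReferenceVectorData))
print(column.describe())
```

Because the new class's `__init__` is never called, any setup that the target class would normally perform at construction time has to be done separately after the swap.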
I think it may be simpler to just do it in For what it's worth, here is my attempt at doing the migration in the ObjectMapper. I didn't see a way to get the values for the constructor arguments that are not overwritten. It looks like this is all handled in the
That approach isn't terrible, especially if we make some changes to the I think we can combine yours and mine. In my approach, I also updated the builders just in case that was needed later for write/append but I'm not sure that is necessary. So something like:

@DynamicTableMap.constructor_arg('columns')
def columns_carg(self, builder, manager):
    # handle case when a TimeIntervals is read with a non-TimeSeriesReferenceVectorData "timeseries" column
    timeseries_builder = builder.get('timeseries')
    if timeseries_builder.attributes['neurodata_type'] != 'TimeSeriesReferenceVectorData':
        # override builder attributes
        timeseries_builder.attributes['neurodata_type'] = 'TimeSeriesReferenceVectorData'
        timeseries_builder.attributes['namespace'] = 'core'
        # construct new columns list
        columns = list()
        for dset_builder in builder.datasets.values():
            dset_obj = manager.construct(dset_builder)  # these have already been constructed
            # go through only the column datasets and replace the 'timeseries' column class in-place
            if isinstance(dset_obj, VectorData):
                if dset_obj.name == 'timeseries':
                    dset_obj.__class__ = TimeSeriesReferenceVectorData
                columns.append(dset_obj)
        return columns
    return None  # do not override
30a4a75 updates the PR to do the conversion in
Sure, I think that should work. Generally speaking, doing this in the ObjectMapper seems slightly trickier to do but I think its the more appropriate place for this sort of thing, rather than cluttering user-facing classes with backwards compatibility code. One thing I noticed is that it seems that |
@rly I updated the code as you suggested to do the migration of the builders and container in the ObjectMapper only. After a few minor fixes (the code failed when TimeIntervals.timeseries was missing), this seems to work fine. Thanks for your patience in iterating on possible solutions, but I think what we have here now seems like an elegant solution.
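The failure mode mentioned above (no timeseries column present) can be illustrated with a small, self-contained sketch. This is not the code from the PR: plain dictionaries stand in for the dataset builders, and the function name `migrate_timeseries_attributes` is hypothetical. It only shows the guard logic, namely skipping the migration when the column is absent or already has the new type.

```python
TARGET_TYPE = "TimeSeriesReferenceVectorData"


def migrate_timeseries_attributes(datasets):
    """Retag the 'timeseries' entry if it exists and still has an old type.

    `datasets` maps a column name to its attribute dict. This is a simplified
    stand-in for the dataset builders, not the hdmf builder API.
    Returns True if the attributes were changed, False otherwise.
    """
    attrs = datasets.get("timeseries")
    if attrs is None:
        # The optional column is missing, so there is nothing to migrate.
        return False
    if attrs.get("neurodata_type") == TARGET_TYPE:
        # Already written with the new type, so leave it untouched.
        return False
    attrs["neurodata_type"] = TARGET_TYPE
    attrs["namespace"] = "core"
    return True


# Three cases: old-style column, already-migrated column, and no column at all.
old_file = {"timeseries": {"neurodata_type": "VectorData", "namespace": "hdmf-common"}}
new_file = {"timeseries": {"neurodata_type": TARGET_TYPE, "namespace": "core"}}
no_column = {"start_time": {"neurodata_type": "VectorData", "namespace": "hdmf-common"}}

print(migrate_timeseries_attributes(old_file))
print(migrate_timeseries_attributes(new_file))
print(migrate_timeseries_attributes(no_column))
```

Checking for the missing column first keeps the attribute lookup from ever running on `None`.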
I stand corrected; this seems to work correctly, and I added an assert in the unit tests to make sure. @rly You mentioned at some point that you had a test to also test against previous files that use VectorData. Can you add that test to this PR? I've tested against both the updated and current schema and it seems fine, but I've done this only by changing the schema by hand. It would be good to have an actual test for this.
…vals.timeseries work
This is good to go.
…der in case we need to add custom init functionality in the future
Motivation
Fix NeurodataWithoutBorders/nwb-schema#484. This PR automatically migrates the TimeIntervals.timeseries column to use the new TimeSeriesReferenceVectorData container class. It does so by patching the __class__ at runtime in the TimeIntervalsMap ObjectMapper class when the TimeIntervals object is being built on read. Since TimeIntervals.timeseries and TimeSeriesReferenceVectorData have the same schema and memory layout, swapping the class (i.e., object type) at runtime, after the fact, should be fine.
Advantages
Disadvantages
Major changes
VectorData for TimeIntervals.timeseries)
TimeIntervals and TimeSeriesReferenceVectorData and TimeSeriesReference
Checklist
flake8 from the source directory.