[Feature]: Add an object path as a way to uniquely identify an object in the API #1108
Comments
@h-mayorquin I think this was a point of a bit of confusion during the meeting. I believe the way @rly was using the terms is:
I want to differentiate three things:
I don't know of good terminology to differentiate between them. I think we can use 2 for 3. Right now, do both zarr and hdf5 have the same "file-like" structure? If so, I feel that would be the simplest thing to do.
@h-mayorquin can you clarify what the difference between 3 and 2 is, and why it is needed?
@oruebel I expect that we don't need a distinction between 2 and 3, but we might have a backend that does not have a file structure like hdf5 and zarr? The path within the zarr and hdf5 backends might differ for some objects? I want to emphasize that it should be a backend-independent concept, hence the distinction. Does that make sense?
@h-mayorquin we may have a backend that does not internally use the "/" syntax for Group membership in its Python API, but any backend must enable the HDMF primitives, which means it must have the concept of a Group, so it would be mappable to this syntax. Unless there is a good reason not to, I would like to propose we use the HDF5/Zarr path as the unique identifier.
Totally agree with that.
The real reference here is the mapping to schema, i.e., the path the object will have in the Builder structure. For HDF5/Zarr the path in the file and in the Builder hierarchy are identical. All I'm trying to say is that, even for non-hierarchical backend stores, we can determine that path from the schema.
Ah, got you, thanks for the explanation! That's great to hear.
It would be great to have something that can be used to specify an object within the nwbfile that is both unique and independent of the backend. An abstraction that can be used is that of paths, so I am imagining an API that could look like this:
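As a purely illustrative sketch of a path-based lookup (not an existing pynwb API), the idea can be shown with a plain nested dictionary standing in for the file hierarchy. The helper name `get_object_by_path` and the example layout are hypothetical:

```python
from collections.abc import Mapping
from typing import Any


def get_object_by_path(root: Mapping, path: str) -> Any:
    """Resolve a "/"-separated path against a nested mapping (hypothetical helper)."""
    current: Any = root
    for part in path.strip("/").split("/"):
        if not part:
            continue  # tolerate "/" on its own or doubled slashes
        if not isinstance(current, Mapping) or part not in current:
            raise KeyError(f"No object at path {path!r} (failed at {part!r})")
        current = current[part]
    return current


# Stand-in hierarchy; the names are illustrative only.
nwbfile_like = {
    "acquisition": {"MySeries": {"data": [1, 2, 3]}},
}

print(get_object_by_path(nwbfile_like, "/acquisition/MySeries/data"))
```

The same path string would resolve to the same object in any file with this layout, which is the property the path is meant to provide.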
Use cases
In contrast to the object_id, which uniquely identifies an object within a single NWBFile, the location identifies an object in a way that remains the same across different sessions. This can be used for:
Previous or Similar Art
This function was implemented in neuroconv:
https://github.com/catalystneuro/neuroconv/blob/47a066ca8c58b88064bfecee90cfcfc70409d135/src/neuroconv/tools/nwb_helpers/_configuration_models/_base_dataset_io.py#L28-L44
And it produces output like this:
Then the function was ported to pynwb:
https://github.com/NeurodataWithoutBorders/pynwb/blob/2259bede338f2f202229bda0af15d7e3cea47369/src/pynwb/base.py#L290-L324
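The linked helpers are not reproduced here. As a rough, self-contained sketch of the general idea (follow the parent links up to the root and join the names with "/"), assuming a hypothetical `Node` class rather than the real pynwb/HDMF containers:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for a container with a name and a parent."""
    name: str
    parent: Optional["Node"] = None


def object_path(node: Node) -> str:
    """Build a "/"-separated path by following parent links up to the root."""
    parts = []
    current = node
    while current.parent is not None:  # the root itself contributes no name
        parts.append(current.name)
        current = current.parent
    return "/" + "/".join(reversed(parts))


# Illustrative hierarchy; the names are assumptions, not a real file layout.
root = Node("root")
acquisition = Node("acquisition", parent=root)
series = Node("MySeries", parent=acquisition)
data = Node("data", parent=series)

print(object_path(data))
```

Real containers may need extra handling that this sketch ignores, for example objects whose parent link does not mirror their location in the file.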
Complexities
The fact that hdf5 and zarr might have different paths than the pynwb API can be confusing. An example that @rly pointed out is the electrical series.
Other considerations
I probably missed some subtleties from today's discussion, so I am tagging people here so they can correct my mistakes: @rly @bendichter @CodyCBakerPhD