nwb_schema_2.6.0 (#1636)
* Check nwb_version on read (#1612)

* Added NWBHDF5IO.nwb_version property and check for version on NWBHDF5IO.read
* Updated icephys tests to skip version check when writing non NWBFile container
* Add tests for NWB version check on read
* Add unit tests for NWBHDF5IO.nwb_version property
* Updated changelog

Co-authored-by: Ryan Ly <[email protected]>
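A minimal usage sketch of the new version property and read-time check (the file path below is illustrative):

    from pynwb import NWBHDF5IO

    with NWBHDF5IO("example.nwb", mode="r") as io:
        print(io.nwb_version)   # NWB format version information stored in the file
        nwbfile = io.read()     # now raises a more informative error for unsupported versions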

* Bump setuptools from 65.4.1 to 65.5.1 (#1614)

Bumps [setuptools](https://github.com/pypa/setuptools) from 65.4.1 to 65.5.1.
- [Release notes](https://github.com/pypa/setuptools/releases)
- [Changelog](https://github.com/pypa/setuptools/blob/main/CHANGES.rst)
- [Commits](pypa/setuptools@v65.4.1...v65.5.1)

---
updated-dependencies:
- dependency-name: setuptools
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* modify export.rst to have proper links to the NWBFile API docs (#1615)

* Create project_action.yml (#1617)

* Create project_action.yml

* Update project_action.yml

* Update project_action.yml

* Update project_action.yml (#1620)

* Update project_action.yml (#1623)

* Project action (#1626)

* Create project_action.yml

* Update project_action.yml

* Update project_action.yml

* Update project_action.yml

* Show recommended usage for hdf5plugin in tutorial (#1630)

* Show recommended usage for hdf5plugin in tutorial

* Update docs/gallery/advanced_io/h5dataio.py

* Update docs/gallery/advanced_io/h5dataio.py

Co-authored-by: Heberto Mayorquin <[email protected]>

Co-authored-by: Ben Dichter <[email protected]>
Co-authored-by: Heberto Mayorquin <[email protected]>

* Update iterative write and parallel I/O tutorial (#1633)

* Update iterative write tutorial
* Update doc makefiles to clean up files created by the advanced io tutorial
* Fix #1514: Update parallel I/O tutorial to use H5DataIO instead of DataChunkIterator to set up data for parallel write
* Update changelog
* Fix flake8
* Fix broken external links
* Update make.bat
* Update CHANGELOG.md
* Update plot_iterative_write.py
* Update docs/gallery/advanced_io/plot_iterative_write.py

Co-authored-by: Ryan Ly <[email protected]>

* Update project_action.yml (#1632)

* nwb_schema_2.6.0

* Update CHANGELOG.md

* remove

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Oliver Ruebel <[email protected]>
Co-authored-by: Ryan Ly <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Ben Dichter <[email protected]>
Co-authored-by: Heberto Mayorquin <[email protected]>
6 people authored Jan 17, 2023
1 parent 27902b7 commit e3381a0
Showing 17 changed files with 264 additions and 142 deletions.
34 changes: 34 additions & 0 deletions .github/workflows/project_action.yml
@@ -0,0 +1,34 @@
name: Add issues to Development Project Board

on:
  issues:
    types:
      - opened

jobs:
  add-to-project:
    name: Add issue to project
    runs-on: ubuntu-latest
    steps:
      - name: GitHub App token
        id: generate_token
        uses: tibdex/[email protected]
        with:
          app_id: ${{ secrets.APP_ID }}
          private_key: ${{ secrets.APP_PEM }}

      - name: Add to Developer Board
        env:
          TOKEN: ${{ steps.generate_token.outputs.token }}
        uses: actions/[email protected]
        with:
          project-url: https://github.com/orgs/NeurodataWithoutBorders/projects/7
          github-token: ${{ env.TOKEN }}

      - name: Add to Community Board
        env:
          TOKEN: ${{ steps.generate_token.outputs.token }}
        uses: actions/[email protected]
        with:
          project-url: https://github.com/orgs/NeurodataWithoutBorders/projects/8
          github-token: ${{ env.TOKEN }}
23 changes: 14 additions & 9 deletions CHANGELOG.md
@@ -8,26 +8,31 @@
- Add `Subject.age__reference` field. @bendichter ([#1540](https://github.com/NeurodataWithoutBorders/pynwb/pull/1540))
- `IntracellularRecordingsTable.add_recording`: the `electrode` arg is now optional, and is automatically populated from the stimulus or response.
[#1597](https://github.com/NeurodataWithoutBorders/pynwb/pull/1597)
- Add module `pynwb.testing.mock.icephys` and corresponding tests. @bendichter
- Added module `pynwb.testing.mock.icephys` and corresponding tests. @bendichter
[1595](https://github.com/NeurodataWithoutBorders/pynwb/pull/1595)
- Remove redundant object mapper code. @rly [#1600](https://github.com/NeurodataWithoutBorders/pynwb/pull/1600)
- Fix pending deprecations and issues in CI. @rly [#1594](https://github.com/NeurodataWithoutBorders/pynwb/pull/1594)
- Removed redundant object mapper code. @rly [#1600](https://github.com/NeurodataWithoutBorders/pynwb/pull/1600)
- Fixed pending deprecations and issues in CI. @rly [#1594](https://github.com/NeurodataWithoutBorders/pynwb/pull/1594)
- Added ``NWBHDF5IO.nwb_version`` property to get the NWB version from an NWB HDF5 file @oruebel [#1612](https://github.com/NeurodataWithoutBorders/pynwb/pull/1612)
- Updated ``NWBHDF5IO.read`` to check NWB version before read and raise more informative error if an unsupported version is found @oruebel [#1612](https://github.com/NeurodataWithoutBorders/pynwb/pull/1612)
- Updated NWB Schema to 2.6.0 @mavaylon1 [#1636](https://github.com/NeurodataWithoutBorders/pynwb/pull/1636)

### Documentation and tutorial enhancements:
- Adjusted [ecephys tutorial](https://pynwb.readthedocs.io/en/stable/tutorials/domain/ecephys.html) to create fake data with proper dimensions @bendichter [#1581](https://github.com/NeurodataWithoutBorders/pynwb/pull/1581)
- Refactored testing documentation, including addition of section on ``pynwb.testing.mock`` submodule. @bendichter
[#1583](https://github.com/NeurodataWithoutBorders/pynwb/pull/1583)
- Update round trip tutorial to the newer ``NWBH5IOMixin`` and ``AcquisitionH5IOMixin`` classes. @bendichter
- Updated round trip tutorial to the newer ``NWBH5IOMixin`` and ``AcquisitionH5IOMixin`` classes. @bendichter
[#1586](https://github.com/NeurodataWithoutBorders/pynwb/pull/1586)
- More informative error message for common installation error. @bendichter, @rly
- Added more informative error message for common installation error. @bendichter, @rly
[#1591](https://github.com/NeurodataWithoutBorders/pynwb/pull/1591)
- Update citation for PyNWB in docs and duecredit to use the eLife NWB paper. @oruebel [#1604](https://github.com/NeurodataWithoutBorders/pynwb/pull/1604)
- Fix docs build warnings due to use of hardcoded links. @oruebel [#1604](https://github.com/NeurodataWithoutBorders/pynwb/pull/1604)
- Updated citation for PyNWB in docs and duecredit to use the eLife NWB paper. @oruebel [#1604](https://github.com/NeurodataWithoutBorders/pynwb/pull/1604)
- Fixed docs build warnings due to use of hardcoded links. @oruebel [#1604](https://github.com/NeurodataWithoutBorders/pynwb/pull/1604)
- Updated the [iterative write tutorial](https://pynwb.readthedocs.io/en/stable/tutorials/advanced_io/iterative_write.html) to reference the new ``GenericDataChunkIterator`` functionality and use the new ``H5DataIO.dataset`` property to simplify the custom I/O section. @oruebel [#1633](https://github.com/NeurodataWithoutBorders/pynwb/pull/1633)
- Updated the [parallel I/O tutorial](https://pynwb.readthedocs.io/en/stable/tutorials/advanced_io/parallelio.html) to use the new ``H5DataIO.dataset`` feature to set up an empty dataset for parallel write. @oruebel [#1633](https://github.com/NeurodataWithoutBorders/pynwb/pull/1633)

### Bug fixes
- Add shape constraint to `PatchClampSeries.data`. @bendichter
- Added shape constraint to `PatchClampSeries.data`. @bendichter
[#1596](https://github.com/NeurodataWithoutBorders/pynwb/pull/1596)
- Update the [images tutorial](https://pynwb.readthedocs.io/en/stable/tutorials/domain/images.html) to provide example usage of an ``IndexSeries``
- Updated the [images tutorial](https://pynwb.readthedocs.io/en/stable/tutorials/domain/images.html) to provide example usage of an ``IndexSeries``
with a reference to ``Images``. @bendichter [#1602](https://github.com/NeurodataWithoutBorders/pynwb/pull/1602)
- Fixed an issue with the `tox` tool when upgrading to tox 4. @rly [#1608](https://github.com/NeurodataWithoutBorders/pynwb/pull/1608)

3 changes: 2 additions & 1 deletion docs/Makefile
@@ -9,6 +9,7 @@ PAPER =
BUILDDIR = _build
SRCDIR = ../src
RSTDIR = source
GALLERYDIR = gallery
PKGNAME = pynwb

# Internal variables.
@@ -45,7 +46,7 @@ help:
@echo " apidoc to build RST from source code"

clean:
-rm -rf $(BUILDDIR)/* $(RSTDIR)/$(PKGNAME)*.rst $(RSTDIR)/tutorials
-rm -rf $(BUILDDIR)/* $(RSTDIR)/$(PKGNAME)*.rst $(RSTDIR)/tutorials $(GALLERYDIR)/advanced_io/*.npy $(GALLERYDIR)/advanced_io/*.nwb

html:
$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html
2 changes: 1 addition & 1 deletion docs/gallery/advanced_io/h5dataio.py
@@ -234,7 +234,7 @@

wrapped_data = H5DataIO(
data=data,
compression=hdf5plugin.Zstd().filter_id,
**hdf5plugin.Zstd(clevel=3), # set the compression and compression_opts parameters
allow_plugin_filters=True,
)
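
For context, a minimal sketch of how such a wrapped dataset is typically attached to an NWB container (standard PyNWB usage; the TimeSeries parameters below are illustrative):

    import numpy as np
    import hdf5plugin
    from hdmf.backends.hdf5.h5_utils import H5DataIO
    from pynwb import TimeSeries

    data = np.random.rand(1000, 32)
    wrapped_data = H5DataIO(
        data=data,
        **hdf5plugin.Zstd(clevel=3),   # expands to the compression / compression_opts keywords
        allow_plugin_filters=True,
    )
    ts = TimeSeries(name="compressed_series", data=wrapped_data, unit="V", rate=1000.0)
    # nwbfile.add_acquisition(ts), then write with NWBHDF5IO as usual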

29 changes: 8 additions & 21 deletions docs/gallery/advanced_io/parallelio.py
@@ -30,7 +30,7 @@
# from dateutil import tz
# from pynwb import NWBHDF5IO, NWBFile, TimeSeries
# from datetime import datetime
# from hdmf.data_utils import DataChunkIterator
# from hdmf.backends.hdf5.h5_utils import H5DataIO
#
# start_time = datetime(2018, 4, 25, 2, 30, 3, tzinfo=tz.gettz('US/Pacific'))
# fname = 'test_parallel_pynwb.nwb'
@@ -40,9 +40,11 @@
# # write in parallel but we do not write any data
# if rank == 0:
# nwbfile = NWBFile('aa', 'aa', start_time)
# data = DataChunkIterator(data=None, maxshape=(4,), dtype=np.dtype('int'))
# data = H5DataIO(shape=(4,),
# maxshape=(4,),
# dtype=np.dtype('int'))
#
# nwbfile.add_acquisition(TimeSeries('ts_name', description='desc', data=data,
# nwbfile.add_acquisition(TimeSeries(name='ts_name', description='desc', data=data,
# rate=100., unit='m'))
# with NWBHDF5IO(fname, 'w') as io:
# io.write(nwbfile)
@@ -58,24 +60,9 @@
# print(io.read().acquisition['ts_name'].data[rank])
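
####################
# For completeness, a sketch of how each rank might write its own element before the read-back shown
# above (assuming ``mpi4py`` provides the communicator; ``fname`` and ``'ts_name'`` are the same as above):
#
# .. code-block:: python
#
#     from mpi4py import MPI
#
#     with NWBHDF5IO(fname, 'a', comm=MPI.COMM_WORLD) as io:
#         nwbfile = io.read()
#         nwbfile.acquisition['ts_name'].data[rank] = rank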

####################
# To specify details about chunking, compression and other HDF5-specific I/O options,
# we can wrap data via ``H5DataIO``, e.g,
#
# .. code-block:: python
#
# data = H5DataIO(DataChunkIterator(data=None, maxshape=(100000, 100),
# dtype=np.dtype('float')),
# chunks=(10, 10), maxshape=(None, None))
# .. note::
#
# would initialize your dataset with a shape of (100000, 100) and maxshape of (None, None)
# and your own custom chunking of (10, 10).

####################
# Disclaimer
# ----------------
# Using :py:class:`hdmf.backends.hdf5.h5_utils.H5DataIO` we can also specify further
# details about the data layout, e.g., via the chunking and compression parameters.
#
# External links included in the tutorial are being provided as a convenience and for informational purposes only;
# they do not constitute an endorsement or an approval by the authors of any of the products, services or opinions of
# the corporation or organization or individual. The authors bear no responsibility for the accuracy, legality or
# content of the external site or for that of subsequent links. Contact the external site for answers to questions
# regarding its content.
docs/gallery/advanced_io/plot_iterative_write.py
@@ -42,6 +42,7 @@
# * **Data generators** Data generators are in many ways similar to data streams, except that the
#   data is typically generated locally and programmatically rather than coming from an external
#   data source.
#
# * **Sparse data arrays** A challenge in reducing the storage size of sparse arrays is that, while
#   the data array (e.g., a matrix) may be large, only a few values are set. To avoid the storage overhead
#   of storing the full array we can employ (in HDF5) a combination of chunking, compression, and
@@ -71,6 +72,13 @@
# This is useful for buffered I/O operations, e.g., to improve performance by accumulating data in memory and
# writing larger blocks at once.
#
# * :py:class:`~hdmf.data_utils.GenericDataChunkIterator` is a semi-abstract version of a
# :py:class:`~hdmf.data_utils.AbstractDataChunkIterator` that automatically handles the selection of
# buffer regions and resolves communication of compatible chunk regions. Users specify chunk
# and buffer shapes or sizes and the iterator will manage how to break the data up for write.
# For further details, see the
# :hdmf-docs:`GenericDataChunkIterator tutorial <tutorials/plot_generic_data_chunk_tutorial.html>`.
#
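
####################
# As a brief illustration, a minimal sketch of subclassing :py:class:`~hdmf.data_utils.GenericDataChunkIterator`
# to iterate over an in-memory array (the class and variable names here are illustrative only):
#
# .. code-block:: python
#
#     import numpy as np
#     from hdmf.data_utils import GenericDataChunkIterator
#
#     class ArrayChunkIterator(GenericDataChunkIterator):
#         def __init__(self, array, **kwargs):
#             self._array = array           # store the data before calling super().__init__
#             super().__init__(**kwargs)    # e.g., buffer_gb or buffer_shape can be passed here
#
#         def _get_data(self, selection):
#             return self._array[selection]
#
#         def _get_maxshape(self):
#             return self._array.shape
#
#         def _get_dtype(self):
#             return self._array.dtype
#
#     data = ArrayChunkIterator(np.random.rand(2000, 384), buffer_gb=0.1)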

####################
# Iterative Data Write: API
@@ -107,11 +115,15 @@
from pynwb import NWBHDF5IO


def write_test_file(filename, data):
def write_test_file(filename, data, close_io=True):
"""
Simple helper function to write an NWBFile with a single timeseries containing data
:param filename: String with the name of the output file
:param data: The data of the timeseries
:param close_io: Close and destroy the NWBHDF5IO object used for writing (default=True)
:returns: None if close_io==True, otherwise return the NWBHDF5IO object used for writing
"""

# Create a test NWBfile
@@ -133,7 +145,11 @@ def write_test_file(filename, data):
# Write the data to file
io = NWBHDF5IO(filename, 'w')
io.write(nwbfile)
io.close()
if close_io:
io.close()
del io
io = None
return io


####################
@@ -196,12 +212,6 @@ def iter_sin(chunk_length=10, max_chunks=100):
str(data.dtype)))

####################
# ``[Out]:``
#
# .. code-block:: python
#
# maxshape=(None, 10), recommended_data_shape=(1, 10), dtype=float64
#
# As we can see :py:class:`~hdmf.data_utils.DataChunkIterator` automatically recommends
# in its ``maxshape`` that the first dimension of our array should be unlimited (``None``) and the second
# dimension be ``10`` (i.e., the length of our chunk). Since :py:class:`~hdmf.data_utils.DataChunkIterator`
@@ -216,8 +226,11 @@ def iter_sin(chunk_length=10, max_chunks=100):
# :py:class:`~hdmf.data_utils.DataChunkIterator` assumes that our generator yields in **consecutive order** a
# **single** complete element along the **first dimension** of our array (i.e., iterate over the first
# axis and yield one-element-at-a-time). This behavior is useful in many practical cases. However, if
# this strategy does not match our needs, then you can alternatively implement our own derived
# :py:class:`~hdmf.data_utils.AbstractDataChunkIterator`. We show an example of this next.
# this strategy does not match our needs, then using :py:class:`~hdmf.data_utils.GenericDataChunkIterator`
# or implementing your own derived :py:class:`~hdmf.data_utils.AbstractDataChunkIterator` may be more
# appropriate. We show an example of how to implement your own :py:class:`~hdmf.data_utils.AbstractDataChunkIterator`
# next. See the :hdmf-docs:`GenericDataChunkIterator tutorial <tutorials/plot_generic_data_chunk_tutorial.html>` as
# part of the HDMF documentation for details on how to use :py:class:`~hdmf.data_utils.GenericDataChunkIterator`.
#


@@ -387,26 +400,6 @@ def maxshape(self):
print(" Reduction : %.2f x" % (expected_size / file_size_largechunks_compressed))

####################
# ``[Out]:``
#
# .. code-block:: python
#
# 1) Sparse Matrix Size:
# Expected Size : 8000000.00 MB
# Occupied Size : 0.80000 MB
# 2) NWB HDF5 file (no compression):
# File Size : 0.89 MB
# Reduction : 9035219.28 x
# 3) NWB HDF5 file (with GZIP compression):
# File Size : 0.88847 MB
# Reduction : 9004283.79 x
# 4) NWB HDF5 file (large chunks):
# File Size : 80.08531 MB
# Reduction : 99893.47 x
# 5) NWB HDF5 file (large chunks with compression):
# File Size : 1.14671 MB
# Reduction : 6976450.12 x
#
# Discussion
# ^^^^^^^^^^
#
@@ -490,7 +483,7 @@ def maxshape(self):
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#
# Note, we here use a generator for simplicity but we could equally well also implement our own
# :py:class:`~hdmf.data_utils.AbstractDataChunkIterator`.
# :py:class:`~hdmf.data_utils.AbstractDataChunkIterator` or use :py:class:`~hdmf.data_utils.GenericDataChunkIterator`.


def iter_largearray(filename, shape, dtype='float64'):
@@ -553,15 +546,6 @@ def iter_largearray(filename, shape, dtype='float64'):
else:
print("ERROR: Mismatch between data")


####################
# ``[Out]:``
#
# .. code-block:: python
#
# Success: All data values match


####################
# Example: Convert arrays stored in multiple files
# -----------------------------------------------------
@@ -705,46 +689,37 @@ def maxshape(self):
#
from hdmf.backends.hdf5.h5_utils import H5DataIO

write_test_file(filename='basic_alternative_custom_write.nwb',
data=H5DataIO(data=np.empty(shape=(0, 10), dtype='float'),
maxshape=(None, 10), # <-- Make the time dimension resizable
chunks=(131072, 2), # <-- Use 2MB chunks
compression='gzip', # <-- Enable GZip compression
compression_opts=4, # <-- GZip aggression
shuffle=True, # <-- Enable shuffle filter
fillvalue=np.nan # <-- Use NAN as fillvalue
)
)
# Use H5DataIO to specify how to set up the dataset in the file
dataio = H5DataIO(
    shape=(0, 10),            # Initial shape. If the shape is known then set to full shape
    dtype=np.dtype('float'),  # dtype of the dataset
    maxshape=(None, 10),      # Make the time dimension resizable
    chunks=(131072, 2),       # Use 2MB chunks
    compression='gzip',       # Enable GZip compression
    compression_opts=4,       # GZip aggression
    shuffle=True,             # Enable shuffle filter
    fillvalue=np.nan          # Use NAN as fillvalue
)

# Write a test NWB file with our dataset and keep the NWB file (i.e., the NWBHDF5IO object) open
io = write_test_file(
    filename='basic_alternative_custom_write.nwb',
    data=dataio,
    close_io=False
)

####################
# Step 2: Get the dataset(s) to be updated
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#
from pynwb import NWBHDF5IO # noqa

io = NWBHDF5IO('basic_alternative_custom_write.nwb', mode='a')
nwbfile = io.read()
data = nwbfile.get_acquisition('synthetic_timeseries').data

# Let's check what the data looks like
print("Shape %s, Chunks: %s, Maxshape=%s" % (str(data.shape), str(data.chunks), str(data.maxshape)))

####################
# ``[Out]:``
#
# .. code-block:: python
#
# Shape (0, 10), Chunks: (131072, 2), Maxshape=(None, 10)
#

####################
# Step 3: Implement custom write
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#
# Let's check what the data looks like before we write
print("Before write: Shape= %s, Chunks= %s, Maxshape=%s" %
(str(dataio.dataset.shape), str(dataio.dataset.chunks), str(dataio.dataset.maxshape)))

data.resize((8, 10)) # <-- Allocate the space with need
data[0:3, :] = 1 # <-- Write timesteps 0,1,2
data[3:6, :] = 2 # <-- Write timesteps 3,4,5, Note timesteps 6,7 are not being initialized
dataio.dataset.resize((8, 10)) # <-- Allocate space. Only needed if we didn't set the initial shape large enough
dataio.dataset[0:3, :] = 1 # <-- Write timesteps 0,1,2
dataio.dataset[3:6, :] = 2 # <-- Write timesteps 3,4,5, Note timesteps 6,7 are not being initialized
io.close() # <-- Close the file


@@ -756,20 +731,13 @@ def maxshape(self):

io = NWBHDF5IO('basic_alternative_custom_write.nwb', mode='a')
nwbfile = io.read()
data = nwbfile.get_acquisition('synthetic_timeseries').data
print(data[:])
dataset = nwbfile.get_acquisition('synthetic_timeseries').data
print("After write: Shape= %s, Chunks= %s, Maxshape=%s" %
(str(dataset.shape), str(dataset.chunks), str(dataset.maxshape)))
print(dataset[:])
io.close()

####################
# ``[Out]:``
#
# .. code-block:: python
#
# [[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
# [ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
# [ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
# [ 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
# [ 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
# [ 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
# [ nan nan nan nan nan nan nan nan nan nan]
# [ nan nan nan nan nan nan nan nan nan nan]]
# We allocated our data to be ``shape=(8, 10)`` but we only wrote data to the first 6 rows of the
# array. As expected, we therefore see our ``fillvalue`` of ``nan`` in the last two rows of the data.
#
2 changes: 1 addition & 1 deletion docs/gallery/general/read_basics.py
@@ -331,7 +331,7 @@
# object and accessing its attributes, but it may be useful to explore the data in a
# more interactive, visual way.
#
# You can use `NWBWidgets <https://github.com/NeurodataWithoutBorders/nwb-jupyter-widgets>`_,
# You can use `NWBWidgets <https://github.com/NeurodataWithoutBorders/nwbwidgets>`_,
# a package containing interactive widgets for visualizing NWB data,
# or you can use the `HDFView <https://www.hdfgroup.org/downloads/hdfview>`_
# tool, which can open any generic HDF5 file, which an NWB file is.
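#
# For example, a minimal sketch of the NWBWidgets entry point (assuming the package is installed and the
# code is run in a Jupyter notebook; ``nwbfile`` is the object returned by ``io.read()``):
#
# .. code-block:: python
#
#     from nwbwidgets import nwb2widget
#
#     nwb2widget(nwbfile)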