Documentation updates: data processing sections (#1093)
* re-render ek60 echodata example, add ek80 and azfp examples

* small wording changes

* add echopype version

* update power/angle data beam dim description

* add consolidate to api.rst

* try to link add_splitbeam_angle

* try to link consolidate.add_splitbeam_angle

* start overhaul data proc page in new notebook, bibtext does not work

* add cal_params and env_params sections

* remove old data processing page

* add interfacing ecs file

* re-run notebook

* tweak text and sections

* fix echopype version print out

* add metrics subpackage into docs

* fix: process subpackage is now removed

* separate out section for sonarnetcdf4 adaptation, add nan-padding fig

* actually add adaptation page, update toc

* add image for echo_range and revise text

* add allowable cal params for each instrument type

* add docs/images

* reorganize data processing pages

* fix typo

* push up new sets of data-proc files
leewujung authored Aug 21, 2023
1 parent 7c0c818 commit 29422d0
Showing 16 changed files with 833 additions and 134 deletions.
8 changes: 5 additions & 3 deletions docs/source/_config.yml
@@ -18,8 +18,8 @@ execute:
# targetname: book.tex

# Add a bibtex file so that we can create citations
# bibtex_bibfiles:
# - references.bib
bibtex_bibfiles:
- references.bib

# Information about where the book exists on the web
repository:
@@ -39,13 +39,15 @@ sphinx:
'sphinx_automodapi.automodapi',
'numpydoc',
# 'sphinx.ext.autodoc',
'sphinxcontrib.bibtex',
'sphinx.ext.intersphinx',
'sphinx.ext.mathjax',
'sphinx.ext.ifconfig',
'sphinx.ext.githubpages',
'sphinxcontrib.mermaid'
]
# config:
config:
bibtex_reference_style: author_year
# # https://github.com/executablebooks/jupyter-book/issues/1186
# # https://sphinx-book-theme.readthedocs.io/en/latest/customize/sidebar-secondary.html
# html_theme_options:
10 changes: 7 additions & 3 deletions docs/source/_toc.yml
@@ -12,13 +12,17 @@ parts:
chapters:
- file: convert
- file: open-converted
- file: process
- file: data-format
title: Data formats
title: Raw data formats
sections:
- file: data-format-sonarnetcdf4
- file: data-format-raw
- file: data-format-processed
- file: data-format-5to6
- file: data-proc
title: Data processing
sections:
- file: data-proc-func
- file: data-proc-additional
- file: viz
- file: processing-levels
title: Processing levels
17 changes: 16 additions & 1 deletion docs/source/api.rst
@@ -8,7 +8,7 @@ API components that most users will interact with.
In echopype versions prior to 0.5.0, the API in this page focused
on the ``convert`` and ``process`` subpackages. See the
`0.4.1 API page <https://echopype.readthedocs.io/en/v0.4.1/api.html>`_
if you're using a previous release. That workflow is being deprecated.
if you're using a previous release. That workflow is now removed.

**Content**

@@ -72,6 +72,13 @@ commongrid
:no-inheritance-diagram:
:no-heading:

consolidate
^^^^^^^^^^^

.. automodapi:: echopype.consolidate
:no-inheritance-diagram:
:no-heading:

qc
^^^

@@ -86,6 +93,14 @@ mask
:no-inheritance-diagram:
:no-heading:

metrics
^^^^^^^

.. automodapi:: echopype.metrics
:no-inheritance-diagram:
:no-heading:


Utilities
---------

86 changes: 0 additions & 86 deletions docs/source/data-format-processed.ipynb

This file was deleted.

60 changes: 23 additions & 37 deletions docs/source/data-format-raw.ipynb
@@ -16257,71 +16257,57 @@
"metadata": {},
"source": [
"(data-format:power-angle-complex)=\n",
"## Data from different echosounders\n",
"\n",
"## Data from different echosounders"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Power/Angle data\n",
"\n",
"For echosounder setups using single-beam transducers, only the echo power (or intensity) data are available and these data are stored in the variable `backscatter_r` (the `r` in the suffix means the real part of the signal). This is the case for data from the AZFP echosounder or EK60/EK80 echosounder paired with single-beam transducers (see below for more details on EK80 data).\n",
"\n",
"For echosounder setups using split-beam transducers, the echo power data are similarly stored in the variable `backscatter_r`, but with the additional split-beam angle data for each sample (along `range_sample`) stored in variables `angle_alongship` and `angle_athwartship`. This is the case for data from the EK60 echosounder or the EK80 echosounder configured to store power/angle data.\n",
"\n",
"All the above data variables (`backscatter_r`, `angle_alongship`, `angle_athwartship`) use the gridded representation with dimensions `(channel, range_sample, ping_time, beam)`. Here, the length of the `beam` dimension equals to 1. This length is intuitive for single-beam data. For split-beam data, the length of this dimension is 1, because the power/angle data are already in a derived form from the split-beam transducer sectors. All data are stored in the `Sonar/Beam_group1` group.\n",
"For echosounder setups using single-beam transducers, only the echo power (or intensity) data are available and these data are stored in the data variable `backscatter_r` (the `r` in the suffix refers to the real part of the signal). This is the case for data from:\n",
"- the AZFP echosounder\n",
"- the EK60 echosounder paired with single-beam transducers\n",
"- the EK80 echosounder paired with single-beam transducers and configured to transmit narrowband signals and store data in power/angle format.\n",
"\n",
"### Complex data\n",
"\n",
"A deviation from the above is the case when the raw _complex_ samples are recorded by EK80 echosounders paired with split-beam transducers. In this case, both `backscatter_r` and `backscatter_i` variables exist and contain the real and imaginary part of the echo waveform data, respectively. These vairables are with dimension `(channel, range_sample, ping_time, beam)` as before, but the length of the `beam` dimension can be 3 or 4, depending on the specific transducer used in the setup. The `angle_alongship` and `angle_athwartship` variables are not present in such files.\n",
"For echosounder setups using split-beam transducers, the echo power data are similarly stored in the variable `backscatter_r`, but with the additional split-beam angle data stored in data variables `angle_alongship` and `angle_athwartship`. This is the case for data from:\n",
"- the EK60 echosounder paired with split-beam transducers\n",
"- the EK80 echosounder paired with split-beam transducers and configured to transmit narrowband signals and store data in power/angle format.\n",
"\n",
":::{Note}\n",
"It is possible for power/angle data and complex data to coexist in files collected by EK80 echosounders, since each frequency channel can be configured separately. In this case, the complex data are stored in the `Sonar/Beam_group1` group and the power/angle data are stored in the `Sonar/Beam_group2` group. This is scenario 3 in the above example EK80 data section.\n",
":::\n"
"All the above data variables (`backscatter_r`, `angle_alongship`, `angle_athwartship`) use the gridded representation with dimensions `(channel, range_sample, ping_time)` and are stored in the `Sonar/Beam_group1` group."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"(data-format:multfreq-organization)=\n",
"## Organization of multi-frequency data\n",
"\n",
"Echopype follows the [ICES SONAR-netCDF4 convention ver.1](http://www.ices.dk/sites/pub/Publication%20Reports/Cooperative%20Research%20Report%20(CRR)/CRR341.pdf) when possible. However, to fully leverage the power of label-aware manipulation provided by the [xarray](https://docs.xarray.dev/en/stable/) library and enhance coherence of data representation for scientific echosounders, the echopype developers have made decisions to deviate from the convention in key aspects.\n",
"\n",
"One significant change is on the organization of multi-frequency data. Echopype implements a data model that optimizes data access and filtering (“slicing”) efficiency and usability at the expense of potentially increased file storage.\n",
"\n",
"The convention defines that data variables, such as `backscatter_r`, from each sonar beam (i.e. frequency channel or transducers for typical scientific echosounder) are stored based on a one-dimensional ragged array structure that uses a custom variable-length vector data type (`sample_t`) and `ping_time` as its coordinate dimensions. In addition, each frequency channel is stored in a separate netCDF4 group (`Sonar/Beam_group1`, `Sonar/Beam_group2`, ...).\n",
"### Complex data\n",
"\n",
"Echopype restructures this multi-group ragged array representation into a single-group, 4-dimensional gridded representation, with dimensions `(channel, range_sample, ping_time, beam)` across all channels. Here, the `ping_time` and `beam` dimensions follow the convention definition, whereas the `channel` and `range_sample` (along-range sample number) dimensions are echopype-specific modifications. Data from each frequency channel are mapped along the `channel` dimension, and echo data from each ping are mapped along the `range_sample` dimension. These consolidated, uniform multi-channel (or multi-frequency) [`DataArrays`](https://docs.xarray.dev/en/stable/generated/xarray.DataArray.html) are stored in `Sonar/Beam_group1`, `Sonar/Beam_group2`, and potentially other such groups (`Sonar/Beam_group3`, etc.) in the netCDF data model.\n",
"\n"
"A deviation from the above is the case when the raw _complex_ samples are recorded by EK80 echosounders paired with split-beam transducers. In this case, both `backscatter_r` and `backscatter_i` variables exist and contain the real and imaginary parts of the echo waveform data, respectively. These variables have dimensions `(channel, range_sample, ping_time, beam)`, and the length of the `beam` dimension can be 3 or 4 depending on the specific transducer used. The `angle_alongship` and `angle_athwartship` variables are not present in such files, but they can be computed and added to the calibrated Sv dataset using [](echopype.consolidate.add_splitbeam_angle)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
":::{Note}\n",
"Due to flexibility in echosounder settings, there can potentially be unequal number of samples along sonar range (i.e., length of the `range_sample` dimension) across different `ping_time` or `channel`. Echopype addresses this by padding `NaN` for pings or channels with fewer samples to maintain the uniform shape of the 4-dimensional gridded representation.\n",
"\n",
"The `NaN` padding approach could consume large amount of memory in some specific cases due to the echosounder setup. This is an issue we are actively working on. See [#1070](https://github.com/OSOceanAcoustics/echopype/pull/1070) for detail.\n",
":::\n",
"\n",
"<!-- Below is a comparison of data representations defined in the convention and in echopype.\n",
"\n",
"### ADD FIGURE -->"
"It is possible for power/angle data and complex data to coexist in files collected by EK80 echosounders, since each frequency channel can be configured separately. In this case, the complex data are stored in the `Sonar/Beam_group1` group and the power/angle data are stored in the `Sonar/Beam_group2` group. This is scenario 3 in the above example EK80 data section.\n",
":::"
]
},
{
"cell_type": "code",
"execution_count": null,
"cell_type": "markdown",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"celltoolbar": "Tags",
"kernelspec": {
"display_name": "Python [conda env:echopype]",
"display_name": "ep-dev-20230705",
"language": "python",
"name": "conda-env-echopype-py"
"name": "python3"
},
"language_info": {
"codemirror_mode": {
@@ -16333,7 +16319,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
"version": "3.10.12"
}
},
"nbformat": 4,
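As a reading aid for the power/angle and complex data descriptions in `data-format-raw.ipynb` above, the rules can be summarized in a small sketch. The helper `classify_beam_group` and the `EXPECTED_DIMS` table are hypothetical names introduced here for illustration; they are not part of echopype. The sketch only encodes what the notebook text states about which variables are present in each case.

```python
from typing import Iterable

# Dimensions the notebook text describes for each kind of data (illustrative table).
EXPECTED_DIMS = {
    "power": ("channel", "range_sample", "ping_time"),
    "power/angle": ("channel", "range_sample", "ping_time"),
    "complex": ("channel", "range_sample", "ping_time", "beam"),
}


def classify_beam_group(variables: Iterable[str]) -> str:
    """Guess the kind of echo data in a beam group from its variable names.

    Hypothetical helper: it only checks for the variable names mentioned in
    the documentation text and does not inspect any actual data.
    """
    names = set(variables)
    if "backscatter_r" not in names:
        return "unknown"
    if "backscatter_i" in names:
        # Real and imaginary parts both present: raw complex samples (EK80).
        return "complex"
    if {"angle_alongship", "angle_athwartship"} <= names:
        # Power plus split-beam angles: split-beam transducer, power/angle format.
        return "power/angle"
    # Power only: single-beam transducer setups such as AZFP.
    return "power"


kind = classify_beam_group(["backscatter_r", "angle_alongship", "angle_athwartship"])
print(kind, EXPECTED_DIMS[kind])
```

In practice the variable names would come from a beam group of a converted dataset; how that group is accessed is left out here because it depends on the echopype version in use.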
74 changes: 74 additions & 0 deletions docs/source/data-format-sonarnetcdf4.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"(data-format:sonarnetcdf4-adaptation)=\n",
"# Adaptation of SONAR-netCDF4 convention"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Echopype follows the [ICES SONAR-netCDF4 convention ver.1](http://www.ices.dk/sites/pub/Publication%20Reports/Cooperative%20Research%20Report%20(CRR)/CRR341.pdf) when possible. However, to fully leverage the power of label-aware manipulation provided by the [xarray](https://docs.xarray.dev/en/stable/) library and enhance coherence of data representation for scientific echosounders, the echopype developers have made decisions to deviate from the convention in key aspects."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"(data-format:multfreq-organization)=\n",
"## Organization of multi-frequency data\n",
"\n",
"One important Echopype adaptation is the organization of multi-frequency data. Echopype implements a data structure that optimizes data access and filtering (“slicing”) efficiency and usability at the expense of potentially increased file storage.\n",
"\n",
"Specifically, the SONAR-netCDF4 convention defines that data variables, such as `backscatter_r`, from each sonar beam (i.e., a frequency channel or transducer for typical scientific echosounders) are stored based on a one-dimensional ragged array structure that uses a custom variable-length vector data type (`sample_t`) and `ping_time` as its coordinate dimension. In addition, each frequency channel is stored in a separate netCDF4 group (`Sonar/Beam_group1`, `Sonar/Beam_group2`, ...).\n",
"\n",
"Echopype restructures this multi-group ragged array representation into a single-group, 3-dimensional (`(channel, range_sample, ping_time)`) or 4-dimensional (`(channel, range_sample, ping_time, beam)`) gridded representation across all channels. Here:\n",
"- the `ping_time` dimension follows the convention definition\n",
"- the `beam` dimension, when it exists, maps to the different sectors of split-beam transducers\n",
"- the `channel` and `range_sample` (along-range sample number) dimensions are echopype-specific modifications\n",
"\n",
"Data from each frequency channel are mapped along the `channel` dimension, and echo data from each ping are mapped along the `range_sample` dimension. These consolidated, uniform multi-channel (or multi-frequency) [`DataArrays`](https://docs.xarray.dev/en/stable/generated/xarray.DataArray.html) are stored in `Sonar/Beam_group1`, `Sonar/Beam_group2`, and potentially other such groups (`Sonar/Beam_group3`, etc.) in the netCDF data model.\n",
"\n",
"See [](data-format:power-angle-complex) for detail on core variables that store the echo data and the number of dimensions, which varies depending on the instrument setup."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## NaN-padding\n",
"Due to the flexibility in echosounder configuration, there can be an unequal number of samples along sonar range (i.e., length of the `range_sample` dimension) across different `ping_time` or `channel` values. Echopype addresses this by padding with `NaN` for pings or channels with fewer samples to maintain the uniform shape of a 3- or 4-dimensional gridded representation."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below is a comparison of data representations defined in (**A**) the SONAR-netCDF4 convention and in (**B**) echopype, where the gray cells represent NaN-padded cells. This sketch illustrates the case of 3-dimensional gridded data such as `backscatter_r` from AZFP and EK60 data, or EK80 power/angle data.\n",
"\n",
"![](./images/beam_dim_v5-01.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
":::{Note}\n",
"The `NaN` padding approach could consume a large amount of memory in some specific cases due to the echosounder setup. This is an issue we are actively working on. See [#1070](https://github.com/OSOceanAcoustics/echopype/pull/1070) for details.\n",
":::"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
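The NaN-padding idea described in `data-format-sonarnetcdf4.ipynb` above can be illustrated with a minimal, standard-library-only sketch. This is not echopype's implementation: `pad_to_grid` is a hypothetical helper, the input is assumed to be nested lists indexed as `[channel][ping][sample]`, and the nesting order is chosen for convenience rather than to match the dimension order given in the notebook.

```python
import math


def pad_to_grid(pings_by_channel):
    """Pad ragged per-channel, per-ping sample lists with NaN into a uniform grid.

    Assumption for this sketch: pings_by_channel[c][p] is a list of samples
    whose length may differ between pings and channels.
    """
    # The grid must fit the channel with the most pings and the longest ping.
    n_ping = max((len(pings) for pings in pings_by_channel), default=0)
    n_sample = max(
        (len(ping) for pings in pings_by_channel for ping in pings), default=0
    )
    grid = []
    for pings in pings_by_channel:
        # Pad each existing ping out to n_sample samples.
        rows = [list(ping) + [math.nan] * (n_sample - len(ping)) for ping in pings]
        # Add all-NaN pings for channels that recorded fewer pings.
        rows += [[math.nan] * n_sample for _ in range(n_ping - len(pings))]
        grid.append(rows)
    return grid


# Two channels: the first has two pings of unequal length, the second has one ping.
ragged = [[[1.0, 2.0, 3.0], [4.0]], [[5.0, 6.0]]]
grid = pad_to_grid(ragged)
for channel in grid:
    print(channel)
```

Every channel in the result has the same number of pings and every ping the same number of samples, which is what allows the data to be held in a single gridded array. It also shows where the memory cost noted above comes from: padded cells take up space even though they carry no measurements.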