From 34071470caef512f537f4fb8c73e22aabfa56d13 Mon Sep 17 00:00:00 2001 From: Emilio Mayorga Date: Sun, 27 Aug 2023 07:01:27 -0700 Subject: [PATCH] Updates to Raw data format pages (#1134) * docs: update SONAR-netCDF4 v1 url * docs: fix open_raw api link * docs: Generalize the name and title of data-format-5to6 page * docs: Add section describing 0.7.1 to 0.8.0 major changes * docs: Add stub that will include brief reference to the echopype-checker package and repo * docs: add section on echopype-checker * docs: remove obsolete process.rst * tweak wording for v0.8.0 changes section * change v1 to version 1 --------- Co-authored-by: Wu-Jung Lee --- docs/source/_toc.yml | 2 +- docs/source/data-format-5to6.ipynb | 4040 ------------------- docs/source/data-format-changes.ipynb | 4086 ++++++++++++++++++++ docs/source/data-format-sonarnetcdf4.ipynb | 162 +- docs/source/data-format.md | 6 +- docs/source/process.rst | 237 -- docs/source/processing-levels.md | 2 +- 7 files changed, 4183 insertions(+), 4352 deletions(-) delete mode 100644 docs/source/data-format-5to6.ipynb create mode 100644 docs/source/data-format-changes.ipynb delete mode 100644 docs/source/process.rst diff --git a/docs/source/_toc.yml b/docs/source/_toc.yml index 4eb61e57b..3df4e1413 100644 --- a/docs/source/_toc.yml +++ b/docs/source/_toc.yml @@ -17,7 +17,7 @@ parts: sections: - file: data-format-sonarnetcdf4 - file: data-format-raw - - file: data-format-5to6 + - file: data-format-changes - file: data-proc title: Data processing sections: diff --git a/docs/source/data-format-5to6.ipynb b/docs/source/data-format-5to6.ipynb deleted file mode 100644 index 85db13a46..000000000 --- a/docs/source/data-format-5to6.ipynb +++ /dev/null @@ -1,4040 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "decent-individual", - "metadata": {}, - "source": [ - "(data-format:v0.5.x-to-v0.6.0-changes)=\n", - "# v0.5.x to v0.6.0 changes" - ] - }, - { - "cell_type": "markdown", - "id": "overhead-mandate", - 
"metadata": {}, - "source": [ - "In order to enhance the compliance of echopype-generated datasets to the SONAR-netCDF4 version 1 convention, a number of changes were introduced in echopype v0.6.0 that create incompatibilities with the data structure used in previous versions.\n", - "\n", - "To ease the transition, the [`open_converted`](function:open-converted) function is able to open files previously converted using echopype v0.5.x (0.5.0 to 0.5.6) into the v0.6.0 data format, encapsulated in the `EchoData` object." - ] - }, - { - "cell_type": "markdown", - "id": "early-mechanism", - "metadata": {}, - "source": [ - "Key changes involved renaming and restructuring a couple of groups, and renaming some coordinates and data variables, as summarized below:\n", - "\n", - "| Type | v0.5.x | v0.6.0 | Rationale and notes |\n", - "| :--------- | :-------------- | :--------------------- | :----------------------- |\n", - "| Group | `Beam` | `Sonar/Beam_group1` | Convention compliance |\n", - "| Group | `Beam_power` | `Sonar/Beam_group2` | Convention compliance |\n", - "| Group | `Vendor` | `Vendor_specific` | Convention compliance |\n", - "| Coordinate | `frequency` | `channel` | Accommodate channels with duplicated frequencies. The new variable `frequency_nominal` was introduced |\n", - "| Coordinate | `range_bin` | `range_sample` | Better intuitive understanding of data |\n", - "| Coordinate | `quadrant` | `beam` | Convention compliance. This `Beam_groupX` coordinate was added when it did not exist. |\n", - "| Coordinate | `location_time` | `time1` | Convention compliance. In `Platform` group |\n", - "| Coordinate | `mru_time` | `time2` | Convention compliance. In `Platform` group |\n", - "| Variable | `heave` | `vertical_offset` | Convention compliance. In `Platform` group |\n", - "| Variable | `src_filenames` | `source_filenames` | Convention compliance. In `Provenance` group. 
Also converted from global attribute to variable |\n", - "\n", - "Other changes included:\n", - "- Adding previously missing, mandatory convention variables. When no data are available to populate them, these are filled with null (`NaN`) values.\n", - "- Moving variables from one group to another, particularly from the Beam groups to `Platform` and `Vendor`. These variables were not typically not part of the convention.\n", - "- The Beam_groupX `beamwidth_receive_athwartship` and `beamwidth_transmit_athwartship` variables were consolidated into `beamwidth_twoway_athwartship` because the EK60 and EK80 echosounders do not store one-way transmit or receive beam widths. Likewise for `beamwidth_receive_alongship` and `beamwidth_transmit_alongship`.\n", - "\n", - "More details, including Pull Requests and discussions related to these changes, can be found in the [Release notes](whats-new.html#v0-6-0-2022-may-26)." - ] - }, - { - "cell_type": "markdown", - "id": "touched-concentration", - "metadata": {}, - "source": [ - "## Convert old files to v0.6.0 format" - ] - }, - { - "cell_type": "markdown", - "id": "fuzzy-sociology", - "metadata": {}, - "source": [ - "To convert data files from v0.5.x to v0.6.0 format, simply open the old files and re-save them:\n", - "\n", - "```python\n", - "import echopype as ep\n", - "# open old v0.5.x file and convert it into a v0.6.0-format EchoData object\n", - "ed = ep.open_converted(\"old_format_file.nc\")\n", - "ed.to_netcdf(\"new_format_file.nc\")\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "stone-trash", - "metadata": {}, - "source": [ - "## v0.5.x data format" - ] - }, - { - "cell_type": "markdown", - "id": "naughty-breast", - "metadata": {}, - "source": [ - "Below we provide a sample of the v0.5.x data format via a printout of the previous `EchoData` object.\n", - "\n", - "Compare this with the [v0.6.0 `EchoData` object](data-format:echodata-object) to see the changes listed in the table above." 
- ] - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "typical-piece", - "metadata": { - "execution": { - "iopub.execute_input": "2022-05-26T18:02:02.167924Z", - "iopub.status.busy": "2022-05-26T18:02:02.166085Z", - "iopub.status.idle": "2022-05-26T18:02:02.287977Z", - "shell.execute_reply": "2022-05-26T18:02:02.287273Z", - "shell.execute_reply.started": "2022-05-26T18:02:02.167819Z" - }, - "tags": [ - "remove-input" - ] - }, - "outputs": [ - { - "data": { - "text/html": [ - "\n", - "
\n", - "
\n", - "
EchoData: standardized raw data from Internal Memory
\n", - "
\n", - "
    \n", - " \n", - "
  • \n", - " \n", - " \n", - "
    \n", - "
    \n", - "
      \n", - "
      \n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "
      <xarray.Dataset>\n",
      -       "Dimensions:  ()\n",
      -       "Data variables:\n",
      -       "    *empty*\n",
      -       "Attributes:\n",
      -       "    conventions:                 CF-1.7, SONAR-netCDF4-1.0, ACDD-1.3\n",
      -       "    keywords:                    EK60\n",
      -       "    sonar_convention_authority:  ICES\n",
      -       "    sonar_convention_name:       SONAR-netCDF4\n",
      -       "    sonar_convention_version:    1.0\n",
      -       "    summary:                     EK60 raw file s3://ncei-wcsd-archive/data/ra...\n",
      -       "    title:                       2017 Pacific Hake Acoustic Trawl Survey\n",
      -       "    date_created:                2017-07-28T18:16:19Z\n",
      -       "    survey_name:                 

      \n", - "
    \n", - "
    \n", - "
  • \n", - " \n", - "
  • \n", - " \n", - " \n", - "
    \n", - "
    \n", - "
      \n", - "
      \n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "
      <xarray.Dataset>\n",
      -       "Dimensions:                 (frequency: 3, ping_time: 529)\n",
      -       "Coordinates:\n",
      -       "  * frequency               (frequency) float64 1.8e+04 3.8e+04 1.2e+05\n",
      -       "  * ping_time               (ping_time) datetime64[ns] 2017-07-28T18:16:19.31...\n",
      -       "Data variables:\n",
      -       "    absorption_indicative   (frequency, ping_time) float64 0.002822 ... 0.03259\n",
      -       "    sound_speed_indicative  (frequency, ping_time) float64 1.481e+03 ... 1.48...

      \n", - "
    \n", - "
    \n", - "
  • \n", - " \n", - "
  • \n", - " \n", - " \n", - "
    \n", - "
    \n", - "
      \n", - "
      \n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "
      <xarray.Dataset>\n",
      -       "Dimensions:        (location_time: 2165, frequency: 3, ping_time: 529)\n",
      -       "Coordinates:\n",
      -       "  * location_time  (location_time) datetime64[ns] 2017-07-28T18:16:21.4759997...\n",
      -       "  * frequency      (frequency) float64 1.8e+04 3.8e+04 1.2e+05\n",
      -       "  * ping_time      (ping_time) datetime64[ns] 2017-07-28T18:16:19.313999872 ....\n",
      -       "Data variables:\n",
      -       "    latitude       (location_time) float64 dask.array<chunksize=(2165,), meta=np.ndarray>\n",
      -       "    longitude      (location_time) float64 dask.array<chunksize=(2165,), meta=np.ndarray>\n",
      -       "    sentence_type  (location_time) <U3 dask.array<chunksize=(2165,), meta=np.ndarray>\n",
      -       "    pitch          (frequency, ping_time) float64 dask.array<chunksize=(3, 529), meta=np.ndarray>\n",
      -       "    roll           (frequency, ping_time) float64 dask.array<chunksize=(3, 529), meta=np.ndarray>\n",
      -       "    heave          (frequency, ping_time) float64 dask.array<chunksize=(3, 529), meta=np.ndarray>\n",
      -       "    water_level    (frequency, ping_time) float64 dask.array<chunksize=(3, 529), meta=np.ndarray>\n",
      -       "Attributes:\n",
      -       "    platform_type:       Research vessel\n",
      -       "    platform_name:       Bell M. Shimada\n",
      -       "    platform_code_ICES:  315

      \n", - "
    \n", - "
    \n", - "
  • \n", - " \n", - "
  • \n", - " \n", - " \n", - "
    \n", - "
    \n", - "
      \n", - "
      \n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "
      <xarray.Dataset>\n",
      -       "Dimensions:        (location_time: 22037)\n",
      -       "Coordinates:\n",
      -       "  * location_time  (location_time) datetime64[ns] 2017-07-28T18:16:19.3140003...\n",
      -       "Data variables:\n",
      -       "    NMEA_datagram  (location_time) <U73 '$SDVLW,5050.149,N,5050.149,N' ... '$...\n",
      -       "Attributes:\n",
      -       "    description:  All NMEA sensor datagrams

      \n", - "
    \n", - "
    \n", - "
  • \n", - " \n", - "
  • \n", - " \n", - " \n", - "
    \n", - "
    \n", - "
      \n", - "
      \n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "
      <xarray.Dataset>\n",
      -       "Dimensions:  ()\n",
      -       "Data variables:\n",
      -       "    *empty*\n",
      -       "Attributes:\n",
      -       "    conversion_software_name:     echopype\n",
      -       "    conversion_software_version:  0.5.6\n",
      -       "    conversion_time:              2022-05-26T18:01:56Z\n",
      -       "    src_filenames:                s3://ncei-wcsd-archive/data/raw/Bell_M._Shi...\n",
      -       "    duplicate_ping_times:         0

      \n", - "
    \n", - "
    \n", - "
  • \n", - " \n", - "
  • \n", - " \n", - " \n", - "
    \n", - "
    \n", - "
      \n", - "
      \n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "
      <xarray.Dataset>\n",
      -       "Dimensions:  ()\n",
      -       "Data variables:\n",
      -       "    *empty*\n",
      -       "Attributes:\n",
      -       "    sonar_manufacturer:      Simrad\n",
      -       "    sonar_model:             ER60\n",
      -       "    sonar_serial_number:     \n",
      -       "    sonar_software_name:     \n",
      -       "    sonar_software_version:  2.4.3\n",
      -       "    sonar_type:              echosounder

      \n", - "
    \n", - "
    \n", - "
  • \n", - " \n", - "
  • \n", - " \n", - " \n", - "
    \n", - "
    \n", - "
      \n", - "
      \n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "
      <xarray.Dataset>\n",
      -       "Dimensions:                         (frequency: 3, ping_time: 529,\n",
      -       "                                     range_bin: 3957)\n",
      -       "Coordinates:\n",
      -       "  * frequency                       (frequency) float64 1.8e+04 3.8e+04 1.2e+05\n",
      -       "  * ping_time                       (ping_time) datetime64[ns] 2017-07-28T18:...\n",
      -       "  * range_bin                       (range_bin) int64 0 1 2 3 ... 3954 3955 3956\n",
      -       "Data variables: (12/30)\n",
      -       "    channel_id                      (frequency) <U37 'GPT  18 kHz 009072058c8...\n",
      -       "    beam_type                       (frequency) int64 1 1 1\n",
      -       "    beamwidth_receive_alongship     (frequency) float64 10.9 6.81 6.58\n",
      -       "    beamwidth_receive_athwartship   (frequency) float64 10.82 6.85 6.52\n",
      -       "    beamwidth_transmit_alongship    (frequency) float64 10.9 6.81 6.58\n",
      -       "    beamwidth_transmit_athwartship  (frequency) float64 10.82 6.85 6.52\n",
      -       "    ...                              ...\n",
      -       "    data_type                       (frequency, ping_time) float64 3.0 ... 3.0\n",
      -       "    count                           (frequency, ping_time) float64 3.957e+03 ...\n",
      -       "    offset                          (frequency, ping_time) float64 0.0 ... 0.0\n",
      -       "    transmit_mode                   (frequency, ping_time) float64 0.0 ... 0.0\n",
      -       "    angle_athwartship               (frequency, ping_time, range_bin) float64 ...\n",
      -       "    angle_alongship                 (frequency, ping_time, range_bin) float64 ...\n",
      -       "Attributes:\n",
      -       "    beam_mode:              vertical\n",
      -       "    conversion_equation_t:  type_3

      \n", - "
    \n", - "
    \n", - "
  • \n", - " \n", - "
  • \n", - " \n", - " \n", - "
    \n", - "
    \n", - "
      \n", - "
      \n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "
      <xarray.Dataset>\n",
      -       "Dimensions:           (frequency: 3, pulse_length_bin: 5)\n",
      -       "Coordinates:\n",
      -       "  * frequency         (frequency) float64 1.8e+04 3.8e+04 1.2e+05\n",
      -       "  * pulse_length_bin  (pulse_length_bin) int64 0 1 2 3 4\n",
      -       "Data variables:\n",
      -       "    sa_correction     (frequency, pulse_length_bin) float64 0.0 -0.7 ... -0.3\n",
      -       "    gain_correction   (frequency, pulse_length_bin) float64 20.3 22.95 ... 26.55\n",
      -       "    pulse_length      (frequency, pulse_length_bin) float64 0.000512 ... 0.00...

      \n", - "
    \n", - "
    \n", - "
  • \n", - " \n", - "
\n", - "
\n", - " " - ], - "text/plain": [ - "EchoData: standardized raw data from Internal Memory\n", - " > top: (Top-level) contains metadata about the SONAR-netCDF4 file format.\n", - " > environment: (Environment) contains information relevant to acoustic propagation through water.\n", - " > platform: (Platform) contains information about the platform on which the sonar is installed.\n", - " > nmea: (Platform/NMEA) contains information specific to the NMEA protocol.\n", - " > provenance: (Provenance) contains metadata about how the SONAR-netCDF4 version of the data were obtained.\n", - " > sonar: (Sonar) contains specific metadata for the sonar system.\n", - " > beam: (Beam) contains backscatter data and other beam or channel-specific data.\n", - " > vendor: (Vendor specific) contains vendor-specific information about the sonar and the data." - ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "ed" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "traditional-advance", - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "celltoolbar": "Tags", - "kernelspec": { - "display_name": "Python [conda env:oldep]", - "language": "python", - "name": "conda-env-oldep-py" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.13" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/docs/source/data-format-changes.ipynb b/docs/source/data-format-changes.ipynb new file mode 100644 index 000000000..8e1bdb8d8 --- /dev/null +++ b/docs/source/data-format-changes.ipynb @@ -0,0 +1,4086 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "87f7ff50", + "metadata": {}, + "source": [ + "# Changes in recent versions" + ] + }, + { + "cell_type": "markdown", + "id": "21267048", + 
"metadata": {}, + "source": [ + "\n", + "The underlying raw converted data structure used by echopype (the [`EchoData`](data-format:echodata-object) object) has undergone revisions over time. While in most echopype releases these changes are relatively small, versions 0.8.0 and 0.6.0 incorporated significant changes, with implications for backward compatibility. Here we describe the main changes in each of these major releases. Please refer to [What's new](whats-new.md) for more complete details on specific changes that impacted the data structure." + ] + }, + { + "cell_type": "markdown", + "id": "e0df3a8d", + "metadata": {}, + "source": [ + "(data-format:v0.7.1-to-v0.8.0-changes)=\n", + "## v0.7.1 to v0.8.0 changes" + ] + }, + { + "cell_type": "markdown", + "id": "679d9570", + "metadata": {}, + "source": [ + "Changes introduced in version 0.8.0 were carried out to incorporate missing variables that are mandated by SONAR-netCDF4 v1 and to implement adaptations to the convention in a more consistent fashion across variables and instrument types. Some of these changes modified or reverted decisions implemented in version 0.6.0 (below) that were later found to have large impacts on performance and usability.\n", + "\n", + "Highlights include:\n", + "- Remove `beam` and `ping_time` dimensions from a whole netCDF4 group or individual variables when they were determined to be not required for specific instrument types. The `beam` dimension is now dropped from all `Sonar/Beam_groupX` groups _except for_ EK80 complex samples, where both `backscatter_r` and `backscatter_i` exist and the `beam` dimension represents different sectors of split-beam transducers. 
The `ping_time` dimension is retained only with variables that are known to potentially vary with time in the instrument types supported by echopype.\n", + "- In `Sonar/Beam_groupX` groups: Standardize the use of `transmit_frequency_start` and `transmit_frequency_stop`, where they were previously missing or the names being used (`frequency_start` and `frequency_end`) were not the ones specified by the convention.\n", + "- In the `Platform` group: Implement variables based on the convention more consistently across instrument types:\n", + " - Assign default values to variables when no such variables are found in the raw data file\n", + " - Revise the dimensions of each variable to be consistent across instrument types, with dimensions deemed unnecessary dropped from some variables.\n", + "- In the `Provenance` group: Add new attributes `combination_*` to the \"combined\" `EchoData` object, mirroring the convention-based attributes `conversion_*`.\n", + "- In the `Vendor_specific` group: Move filter coefficients and decimation factor from attributes to variables in EK80, to facilitate consistent provenance tracking during `combine_echodata` operations.\n", + "- Improve the presence and use of variable attributes throughout `EchoData` groups.\n", + "\n", + "Version 0.8.0 does not incorporate the capability to read files converted by previous versions of echopype. We recommend using `open_raw` to re-convert the raw data files." 
+ ] + }, + { + "cell_type": "markdown", + "id": "decent-individual", + "metadata": {}, + "source": [ + "(data-format:v0.5.x-to-v0.6.0-changes)=\n", + "## v0.5.x to v0.6.0 changes" + ] + }, + { + "cell_type": "markdown", + "id": "overhead-mandate", + "metadata": {}, + "source": [ + "In order to enhance the compliance of echopype-generated datasets to the SONAR-netCDF4 version 1 convention, a number of changes were introduced in echopype v0.6.0 that create incompatibilities with the data structure used in previous versions.\n", + "\n", + "To ease the transition, the [`open_converted`](function:open-converted) function is able to open files previously converted using echopype v0.5.x (0.5.0 to 0.5.6) into the v0.6.0 data format, encapsulated in the `EchoData` object." + ] + }, + { + "cell_type": "markdown", + "id": "early-mechanism", + "metadata": {}, + "source": [ + "Key changes involved renaming and restructuring a couple of groups, and renaming some coordinates and data variables, as summarized below:\n", + "\n", + "| Type | v0.5.x | v0.6.0 | Rationale and notes |\n", + "| :--------- | :-------------- | :--------------------- | :----------------------- |\n", + "| Group | `Beam` | `Sonar/Beam_group1` | Convention compliance |\n", + "| Group | `Beam_power` | `Sonar/Beam_group2` | Convention compliance |\n", + "| Group | `Vendor` | `Vendor_specific` | Convention compliance |\n", + "| Coordinate | `frequency` | `channel` | Accommodate channels with duplicated frequencies. The new variable `frequency_nominal` was introduced |\n", + "| Coordinate | `range_bin` | `range_sample` | Better intuitive understanding of data |\n", + "| Coordinate | `quadrant` | `beam` | Convention compliance. This `Beam_groupX` coordinate was added when it did not exist. |\n", + "| Coordinate | `location_time` | `time1` | Convention compliance. In `Platform` group |\n", + "| Coordinate | `mru_time` | `time2` | Convention compliance. 
In `Platform` group |\n", + "| Variable | `heave` | `vertical_offset` | Convention compliance. In `Platform` group |\n", + "| Variable | `src_filenames` | `source_filenames` | Convention compliance. In `Provenance` group. Also converted from global attribute to variable |\n", + "\n", + "Other changes included:\n", + "- Adding previously missing, mandatory convention variables. When no data are available to populate them, these are filled with null (`NaN`) values.\n", + "- Moving variables from one group to another, particularly from the Beam groups to `Platform` and `Vendor`. These variables were typically not part of the convention.\n", + "- The Beam_groupX `beamwidth_receive_athwartship` and `beamwidth_transmit_athwartship` variables were consolidated into `beamwidth_twoway_athwartship` because the EK60 and EK80 echosounders do not store one-way transmit or receive beam widths. Likewise for `beamwidth_receive_alongship` and `beamwidth_transmit_alongship`.\n", + "\n", + "More details, including Pull Requests and discussions related to these changes, can be found in the [Release notes](whats-new.html#v0-6-0-2022-may-26)." 
+ ] + }, + { + "cell_type": "markdown", + "id": "touched-concentration", + "metadata": {}, + "source": [ + "### Convert old files to v0.6.0 format" + ] + }, + { + "cell_type": "markdown", + "id": "fuzzy-sociology", + "metadata": {}, + "source": [ + "To convert data files from v0.5.x to v0.6.0 format, simply open the old files and re-save them:\n", + "\n", + "```python\n", + "import echopype as ep\n", + "# open old v0.5.x file and convert it into a v0.6.0-format EchoData object\n", + "ed = ep.open_converted(\"old_format_file.nc\")\n", + "ed.to_netcdf(\"new_format_file.nc\")\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "stone-trash", + "metadata": {}, + "source": [ + "### v0.5.x data format" + ] + }, + { + "cell_type": "markdown", + "id": "naughty-breast", + "metadata": {}, + "source": [ + "Below we provide a sample of the v0.5.x data format via a printout of the previous `EchoData` object.\n", + "\n", + "Compare this with the [v0.6.0 `EchoData` object](data-format:echodata-object) to see the changes listed in the table above." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "typical-piece", + "metadata": { + "execution": { + "iopub.execute_input": "2022-05-26T18:02:02.167924Z", + "iopub.status.busy": "2022-05-26T18:02:02.166085Z", + "iopub.status.idle": "2022-05-26T18:02:02.287977Z", + "shell.execute_reply": "2022-05-26T18:02:02.287273Z", + "shell.execute_reply.started": "2022-05-26T18:02:02.167819Z" + }, + "tags": [ + "remove-input" + ] + }, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "
\n", + "
\n", + "
EchoData: standardized raw data from Internal Memory
\n", + "
\n", + "
    \n", + " \n", + "
  • \n", + " \n", + " \n", + "
    \n", + "
    \n", + "
      \n", + "
      \n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "
      <xarray.Dataset>\n",
      +              "Dimensions:  ()\n",
      +              "Data variables:\n",
      +              "    *empty*\n",
      +              "Attributes:\n",
      +              "    conventions:                 CF-1.7, SONAR-netCDF4-1.0, ACDD-1.3\n",
      +              "    keywords:                    EK60\n",
      +              "    sonar_convention_authority:  ICES\n",
      +              "    sonar_convention_name:       SONAR-netCDF4\n",
      +              "    sonar_convention_version:    1.0\n",
      +              "    summary:                     EK60 raw file s3://ncei-wcsd-archive/data/ra...\n",
      +              "    title:                       2017 Pacific Hake Acoustic Trawl Survey\n",
      +              "    date_created:                2017-07-28T18:16:19Z\n",
      +              "    survey_name:                 

      \n", + "
    \n", + "
    \n", + "
  • \n", + " \n", + "
  • \n", + " \n", + " \n", + "
    \n", + "
    \n", + "
      \n", + "
      \n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "
      <xarray.Dataset>\n",
      +              "Dimensions:                 (frequency: 3, ping_time: 529)\n",
      +              "Coordinates:\n",
      +              "  * frequency               (frequency) float64 1.8e+04 3.8e+04 1.2e+05\n",
      +              "  * ping_time               (ping_time) datetime64[ns] 2017-07-28T18:16:19.31...\n",
      +              "Data variables:\n",
      +              "    absorption_indicative   (frequency, ping_time) float64 0.002822 ... 0.03259\n",
      +              "    sound_speed_indicative  (frequency, ping_time) float64 1.481e+03 ... 1.48...

      \n", + "
    \n", + "
    \n", + "
  • \n", + " \n", + "
  • \n", + " \n", + " \n", + "
    \n", + "
    \n", + "
      \n", + "
      \n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "
      <xarray.Dataset>\n",
      +              "Dimensions:        (location_time: 2165, frequency: 3, ping_time: 529)\n",
      +              "Coordinates:\n",
      +              "  * location_time  (location_time) datetime64[ns] 2017-07-28T18:16:21.4759997...\n",
      +              "  * frequency      (frequency) float64 1.8e+04 3.8e+04 1.2e+05\n",
      +              "  * ping_time      (ping_time) datetime64[ns] 2017-07-28T18:16:19.313999872 ....\n",
      +              "Data variables:\n",
      +              "    latitude       (location_time) float64 dask.array<chunksize=(2165,), meta=np.ndarray>\n",
      +              "    longitude      (location_time) float64 dask.array<chunksize=(2165,), meta=np.ndarray>\n",
      +              "    sentence_type  (location_time) <U3 dask.array<chunksize=(2165,), meta=np.ndarray>\n",
      +              "    pitch          (frequency, ping_time) float64 dask.array<chunksize=(3, 529), meta=np.ndarray>\n",
      +              "    roll           (frequency, ping_time) float64 dask.array<chunksize=(3, 529), meta=np.ndarray>\n",
      +              "    heave          (frequency, ping_time) float64 dask.array<chunksize=(3, 529), meta=np.ndarray>\n",
      +              "    water_level    (frequency, ping_time) float64 dask.array<chunksize=(3, 529), meta=np.ndarray>\n",
      +              "Attributes:\n",
      +              "    platform_type:       Research vessel\n",
      +              "    platform_name:       Bell M. Shimada\n",
      +              "    platform_code_ICES:  315

      \n", + "
    \n", + "
    \n", + "
  • \n", + " \n", + "
  • \n", + " \n", + " \n", + "
    \n", + "
    \n", + "
      \n", + "
      \n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "
      <xarray.Dataset>\n",
      +              "Dimensions:        (location_time: 22037)\n",
      +              "Coordinates:\n",
      +              "  * location_time  (location_time) datetime64[ns] 2017-07-28T18:16:19.3140003...\n",
      +              "Data variables:\n",
      +              "    NMEA_datagram  (location_time) <U73 '$SDVLW,5050.149,N,5050.149,N' ... '$...\n",
      +              "Attributes:\n",
      +              "    description:  All NMEA sensor datagrams

      \n", + "
    \n", + "
    \n", + "
  • \n", + " \n", + "
  • \n", + " \n", + " \n", + "
    \n", + "
    \n", + "
      \n", + "
      \n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "
      <xarray.Dataset>\n",
      +              "Dimensions:  ()\n",
      +              "Data variables:\n",
      +              "    *empty*\n",
      +              "Attributes:\n",
      +              "    conversion_software_name:     echopype\n",
      +              "    conversion_software_version:  0.5.6\n",
      +              "    conversion_time:              2022-05-26T18:01:56Z\n",
      +              "    src_filenames:                s3://ncei-wcsd-archive/data/raw/Bell_M._Shi...\n",
      +              "    duplicate_ping_times:         0

      \n", + "
    \n", + "
    \n", + "
  • \n", + " \n", + "
  • \n", + " \n", + " \n", + "
    \n", + "
    \n", + "
      \n", + "
      \n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "
      <xarray.Dataset>\n",
      +              "Dimensions:  ()\n",
      +              "Data variables:\n",
      +              "    *empty*\n",
      +              "Attributes:\n",
      +              "    sonar_manufacturer:      Simrad\n",
      +              "    sonar_model:             ER60\n",
      +              "    sonar_serial_number:     \n",
      +              "    sonar_software_name:     \n",
      +              "    sonar_software_version:  2.4.3\n",
      +              "    sonar_type:              echosounder

      \n", + "
    \n", + "
    \n", + "
  • \n", + " \n", + "
  • \n", + " \n", + " \n", + "
    \n", + "
    \n", + "
      \n", + "
      \n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "
      <xarray.Dataset>\n",
      +              "Dimensions:                         (frequency: 3, ping_time: 529,\n",
      +              "                                     range_bin: 3957)\n",
      +              "Coordinates:\n",
      +              "  * frequency                       (frequency) float64 1.8e+04 3.8e+04 1.2e+05\n",
      +              "  * ping_time                       (ping_time) datetime64[ns] 2017-07-28T18:...\n",
      +              "  * range_bin                       (range_bin) int64 0 1 2 3 ... 3954 3955 3956\n",
      +              "Data variables: (12/30)\n",
      +              "    channel_id                      (frequency) <U37 'GPT  18 kHz 009072058c8...\n",
      +              "    beam_type                       (frequency) int64 1 1 1\n",
      +              "    beamwidth_receive_alongship     (frequency) float64 10.9 6.81 6.58\n",
      +              "    beamwidth_receive_athwartship   (frequency) float64 10.82 6.85 6.52\n",
      +              "    beamwidth_transmit_alongship    (frequency) float64 10.9 6.81 6.58\n",
      +              "    beamwidth_transmit_athwartship  (frequency) float64 10.82 6.85 6.52\n",
      +              "    ...                              ...\n",
      +              "    data_type                       (frequency, ping_time) float64 3.0 ... 3.0\n",
      +              "    count                           (frequency, ping_time) float64 3.957e+03 ...\n",
      +              "    offset                          (frequency, ping_time) float64 0.0 ... 0.0\n",
      +              "    transmit_mode                   (frequency, ping_time) float64 0.0 ... 0.0\n",
      +              "    angle_athwartship               (frequency, ping_time, range_bin) float64 ...\n",
      +              "    angle_alongship                 (frequency, ping_time, range_bin) float64 ...\n",
      +              "Attributes:\n",
      +              "    beam_mode:              vertical\n",
      +              "    conversion_equation_t:  type_3

      \n", + "
    \n", + "
    \n", + "
  • \n", + " \n", + "
  • \n", + " \n", + " \n", + "
    \n", + "
    \n", + "
      \n", + "
      \n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "
      <xarray.Dataset>\n",
      +              "Dimensions:           (frequency: 3, pulse_length_bin: 5)\n",
      +              "Coordinates:\n",
      +              "  * frequency         (frequency) float64 1.8e+04 3.8e+04 1.2e+05\n",
      +              "  * pulse_length_bin  (pulse_length_bin) int64 0 1 2 3 4\n",
      +              "Data variables:\n",
      +              "    sa_correction     (frequency, pulse_length_bin) float64 0.0 -0.7 ... -0.3\n",
      +              "    gain_correction   (frequency, pulse_length_bin) float64 20.3 22.95 ... 26.55\n",
      +              "    pulse_length      (frequency, pulse_length_bin) float64 0.000512 ... 0.00...

      \n", + "
    \n", + "
    \n", + "
  • \n", + " \n", + "
\n", + "
\n", + " " + ], + "text/plain": [ + "EchoData: standardized raw data from Internal Memory\n", + " > top: (Top-level) contains metadata about the SONAR-netCDF4 file format.\n", + " > environment: (Environment) contains information relevant to acoustic propagation through water.\n", + " > platform: (Platform) contains information about the platform on which the sonar is installed.\n", + " > nmea: (Platform/NMEA) contains information specific to the NMEA protocol.\n", + " > provenance: (Provenance) contains metadata about how the SONAR-netCDF4 version of the data were obtained.\n", + " > sonar: (Sonar) contains specific metadata for the sonar system.\n", + " > beam: (Beam) contains backscatter data and other beam or channel-specific data.\n", + " > vendor: (Vendor specific) contains vendor-specific information about the sonar and the data." + ] + }, + "execution_count": 2, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "ed" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "traditional-advance", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "celltoolbar": "Tags", + "kernelspec": { + "display_name": "Python [conda env:oldep]", + "language": "python", + "name": "conda-env-oldep-py" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.13" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/source/data-format-sonarnetcdf4.ipynb b/docs/source/data-format-sonarnetcdf4.ipynb index 012897b49..907fc5034 100644 --- a/docs/source/data-format-sonarnetcdf4.ipynb +++ b/docs/source/data-format-sonarnetcdf4.ipynb @@ -1,74 +1,94 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "(data-format:sonarnetcdf4-adaptation)=\n", - "# Adaptation of SONAR-netCDF4 
convention" - ] + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "(data-format:sonarnetcdf4-adaptation)=\n", + "# Adaptation of SONAR-netCDF4 convention" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Echopype follows the [ICES SONAR-netCDF4 convention ver.1](https://ices-library.figshare.com/articles/report/The_SONAR-netCDF4_convention_for_sonar_data_Version_1_0/18624056) when possible. However, to fully leverage the power of label-aware manipulation provided by the [xarray](https://docs.xarray.dev/en/stable/) library and enhance coherence of data representation for scientific echosounders, the echopype developers have made decisions to deviate from the convention in key aspects." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "(data-format:multfreq-organization)=\n", + "## Organization of multi-frequency data\n", + "\n", + "One important Echopype adaptation is the organization of multi-frequency data. Echopype implements a data structure that optimizes data access and filtering (“slicing”) efficiency and usability at the expense of potentially increased file storage.\n", + "\n", + "Specifically, the SONAR-netCDF4 convention defines that data variables, such as `backscatter_r`, from each sonar beam (i.e. frequency channel or transducer for a typical scientific echosounder) are stored based on a one-dimensional ragged array structure that uses a custom variable-length vector data type (`sample_t`) and `ping_time` as its coordinate dimensions. In addition, each frequency channel is stored in a separate netCDF4 group (`Sonar/Beam_group1`, `Sonar/Beam_group2`, ...).\n", + "\n", + "Echopype restructures this multi-group ragged array representation into a single-group, 3-dimensional (`(channel, range_sample, ping_time)`) or 4-dimensional (`(channel, range_sample, ping_time, beam)`) gridded representation across all channels. 
Here:\n", + "- the `ping_time` dimension follows the convention definition\n", + "- the `beam` dimension, when it exists, maps to the different sectors of split-beam transducers\n", + "- the `channel` and `range_sample` (along-range sample number) dimensions are echopype-specific modifications\n", + "\n", + "Data from each frequency channel are mapped along the `channel` dimension, and echo data from each ping are mapped along the `range_sample` dimension. These consolidated, uniform multi-channel (or multi-frequency) [`DataArrays`](https://docs.xarray.dev/en/stable/generated/xarray.DataArray.html) are stored in `Sonar/Beam_group1`, `Sonar/Beam_group2`, and potentially other such groups (`Sonar/Beam_group3`, etc.) in the netCDF data model.\n", + "\n", + "See [](data-format:power-angle-complex) for details on the core variables that store the echo data and the number of dimensions, which varies depending on the instrument setup." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## NaN-padding\n", + "Due to the flexibility in echosounder configuration, there can be an unequal number of samples along sonar range (i.e., length of the `range_sample` dimension) across different `ping_time` or `channel` values. Echopype addresses this by padding with `NaN` the pings or channels that have fewer samples, to maintain the uniform shape of a 3- or 4-dimensional gridded representation." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Below is a comparison of data representations defined in (**A**) the SONAR-netCDF4 convention and in (**B**) echopype, where the gray cells represent NaN-padded cells. 
This sketch illustrates the case of 3-dimensional gridded data such as `backscatter_r` from AZFP and EK60 data, or EK80 power/angle data.\n", + "\n", + "![](./images/beam_dim_v5-01.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ":::{Note}\n", + "The `NaN` padding approach could consume a large amount of memory in some specific cases due to the echosounder setup. This is an issue we are actively working on. See [#1070](https://github.com/OSOceanAcoustics/echopype/pull/1070) for details.\n", + ":::" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "(data-format:compliance)=\n", + "## Verifying compliance" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Ongoing echopype development creates a need to ensure that new modifications do not break the convention-based data structure unexpectedly, and that deliberate modifications are implemented consistently across instrument types. To assist with this need, we are developing a lightweight package that will verify the adherence of an `EchoData` object instance to the echopype adaptation of SONAR-netCDF4 version 1. The repository for this new companion package, [**echopype-checker**](https://github.com/OSOceanAcoustics/echopype-checker/), currently contains a brief description of the package goals and operation as well as Jupyter notebooks that illustrate its use with specific raw data files." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [] + } + ], + "metadata": { + "language_info": { + "name": "python" + }, + "orig_nbformat": 4 }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Echopype follows the [ICES SONAR-netCDF4 convention ver.1](http://www.ices.dk/sites/pub/Publication%20Reports/Cooperative%20Research%20Report%20(CRR)/CRR341.pdf) when possible. 
However, to fully leverage the power of label-aware manipulation provided by the [xarray](https://docs.xarray.dev/en/stable/) library and enhance coherence of data representation for scientific echosounders, the echopype developers have made decisions to deviate from the convention in key aspects." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "(data-format:multfreq-organization)=\n", - "## Organization of multi-frequency data\n", - "\n", - "One important Echopype adaptation is the organization of multi-frequency data. Echopype implements a data structure that optimizes data access and filtering (“slicing”) efficiency and usability at the expense of potentially increased file storage.\n", - "\n", - "Specifically, the SONAR-netCDF4 convention defines that data variables, such as `backscatter_r`, from each sonar beam (i.e. frequency channel or transducers for typical scientific echosounder) are stored based on a one-dimensional ragged array structure that uses a custom variable-length vector data type (`sample_t`) and `ping_time` as its coordinate dimensions. In addition, each frequency channel is stored in a separate netCDF4 group (`Sonar/Beam_group1`, `Sonar/Beam_group2`, ...).\n", - "\n", - "Echopype restructures this multi-group ragged array representation into a single-group, 3-dimension (`(channel, range_sample, ping_time)`) or 4-dimensional (`(channel, range_sample, ping_time, beam)`) gridded representation across all channels. Here:\n", - "- the `ping_time` dimension follows the convention definition\n", - "- the `beam` dimension, when exists, maps to the different sectors of split-beam transducers\n", - "- the `channel` and `range_sample` (along-range sample number) dimensions are echopype-specific modifications\n", - "\n", - "Data from each frequency channel are mapped along the `channel` dimension, and echo data from each ping are mapped along the `range_sample` dimension. 
These consolidated, uniform multi-channel (or multi-frequency) [`DataArrays`](https://docs.xarray.dev/en/stable/generated/xarray.DataArray.html) are stored in `Sonar/Beam_group1`, `Sonar/Beam_group2`, and potentially other such groups (`Sonar/Beam_group3`, etc.) in the netCDF data model.\n", - "\n", - "See [](data-format:power-angle-complex) for detail on core variables that store the echo data and the number of dimensions, which varies depending on the instrument setup." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## NaN-padding\n", - "Due to the flexibility in echosounder configuration, there can potentially be unequal number of samples along sonar range (i.e., length of the `range_sample` dimension) across different `ping_time` or `channel`. Echopype addresses this by padding `NaN` for pings or channels with fewer samples to maintain the uniform shape of a 3- or 4-dimensional gridded representation." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Below is a comparison of data representations defined in (**A**) the SONAR-netCDF4 convention and in (**B**) echopype, where the gray cells represent NaN-padded cells. This sketch illustrates the case of 3-dimensional gridded data such as `backscatter_r` from AZFP and EK60 data, or EK80 power/angle data.\n", - "\n", - "![](./images/beam_dim_v5-01.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - ":::{Note}\n", - "The `NaN` padding approach could consume large amount of memory in some specific cases due to the echosounder setup. This is an issue we are actively working on. 
See [#1070](https://github.com/OSOceanAcoustics/echopype/pull/1070) for detail.\n", - ":::" - ] - } - ], - "metadata": { - "language_info": { - "name": "python" - }, - "orig_nbformat": 4 - }, - "nbformat": 4, - "nbformat_minor": 2 + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/docs/source/data-format.md b/docs/source/data-format.md index c1d0ca3e7..4b912b58f 100644 --- a/docs/source/data-format.md +++ b/docs/source/data-format.md @@ -2,6 +2,8 @@ The diversity of raw data file formats generated by different sonar systems is a major obstacle toward integrative analysis of ocean sonar data at large scales. Echopype addresses this problem by providing tools for converting and standardizing data from manufacturer-specific formats into the **[netCDF](https://www.unidata.ucar.edu/software/netcdf/) data model**, so that data from different instruments are interoperable. NetCDF is the [current defacto standard](https://clouds.eos.ubc.ca/~phil/courses/parallel_python/02_xarray_zarr.html) in climate research and is supported by many powerful Python packages for efficient computation, of which echopype takes advantage extensively. -The section [**Adaptation of SONAR-netCDF4 convention**](data-format:sonarnetcdf4-adaptation) details echopype's adaptation of the [ICES SONAR-netCDF4 convention version 1](http://www.ices.dk/sites/pub/Publication%20Reports/Cooperative%20Research%20Report%20(CRR)/CRR341.pdf) to create standardized data that fully leverage the power of label-aware manipulation and enhance coherence of data representation for scientific echosounders. +See the following sections: -The section [**Raw converted data**](data-format:raw-data) provides instrument-specific examples on raw data unpacked and organized ("converted" via [](echopype.open_raw)) into standardized `EchoData` objects. 
+- [**Adaptation of SONAR-netCDF4 convention**](data-format:sonarnetcdf4-adaptation) details echopype's adaptation of the [ICES SONAR-netCDF4 convention version 1](https://ices-library.figshare.com/articles/report/The_SONAR-netCDF4_convention_for_sonar_data_Version_1_0/18624056) to create standardized data that fully leverage the power of label-aware manipulation and enhance coherence of data representation for scientific echosounders. +- [**Raw converted data**](data-format:raw-data) describes the standardized [`EchoData` object](data-format:echodata-object) and provides instrument-specific examples on raw data unpacked and organized ("converted" via [`open_raw`](echopype.open_raw)) into `EchoData` objects. +- [**Changes in recent versions**](data-format-changes.md) describes significant changes in data structures in recent major versions of echopype. diff --git a/docs/source/process.rst b/docs/source/process.rst deleted file mode 100644 index 8147f4e28..000000000 --- a/docs/source/process.rst +++ /dev/null @@ -1,237 +0,0 @@ -Data processing -=============== - - -Functionality -------------- - -- EK60 and AZFP narrowband echosounders: - - - Calibration and echo-integration to obtain - volume backscattering strength (Sv) from power data. - - Simple noise removal by removing data points (set to ``NaN``) below - an adaptively estimated noise floor [1]_. - - Binning and averaging to obtain mean volume backscattering strength (MVBS) - from the calibrated data. - - Compute mean volume backscattering strength (MVBS) based - on either the number of pings and sample intervals - (the ``range_sample`` dimension in the dataset) or a - specified ping time interval and range interval in - physics units (seconds and meters, respectively). - -- EK80 and EA640 broadband echosounders: - - - Calibration based on pulse compression output in the - form of average over frequency. - - The same noise removal and MVBS computation functionality available - to the narrowband echosounders. 
- - -The steps for performing these analyses are summarized below: - -- Calibration: - - .. code-block:: python - - import echopype as ep - nc_path = './converted_files/file.nc' # path to a converted nc file - echodata = ep.open_converted(nc_path) # create an EchoData object - ds_Sv = ep.calibrate.compute_Sv(echodata) # obtain a dataset containing Sv, echo_range, and - # the calibration and environmental parameters - -- Reduce data by computing MVBS: - - .. code-block:: python - - # Reduce data based on physical units - ds_MVBS = ep.commongrid.compute_MVBS( - ds_Sv, # calibrated Sv dataset - range_meter_bin=20, # bin size to average along echo_range in meters - ping_time_bin='20S' # bin size to average along ping_time in seconds - ) - - # Reduce data based on sample number - ds_MVBS = ep.commongrid.compute_MVBS_index_binning( - ds_Sv, # calibrated Sv dataset - range_sample_num=30, # number of sample bins to average along the range_sample dimensionm - ping_num=5 # number of pings to average - ) - -- Noise removal: - - .. code-block:: python - - # Remove noise - ds_Sv_clean = ep.clean.remove_noise( # obtain a denoised Sv dataset - ds_Sv, # calibrated Sv dataset - range_sample_num=30, # number of samples along the range_sample dimension for estimating noise - ping_num=5, # number of pings for estimating noise - ) - -.. attention:: - - The ``clean`` and ``commongrid`` subpackages were introduced in version 0.7.0. - They contain functions previously found in the deprecated ``preprocess`` subpackage; - ``preprocess`` was removed in version 0.8.0. - -The functions in the ``calibrate`` subpackage take in an ``EchoData`` object, -which is essentially a container for multiple xarray ``Dataset`` instances, -and return a single xarray ``Dataset`` containing the calibrated backscatter -quantities and the samples' corresponding range in meters. 
-The input and output of all functions in the ``clean`` and ``commongrid`` -subpackages are xarray ``Dataset`` instances, with the input being a ``Dataset`` -containing ``Sv`` and ``echo_range`` generated from calibration. - -The ``calibrate``, ``clean`` and ``commongrid`` functions do not save the calculation results to disk, -but the returned xarray ``Dataset`` can be saved using native xarray methods -such as ``to_netcdf`` and ``to_zarr``. - -For example, to save the Sv and MVBS results to disk: - -.. code-block:: python - - ds_Sv.to_netcdf('file_Sv.nc') - ds_MVBS.to_netcdf('file_MVBS.nc') - - -.. note:: - - Echopype's data processing functionality is being developed actively. - Be sure to check back here often! - - -Environmental parameters ------------------------- - -Environmental parameters, including temperature, salinity and pressure, are -critical in biological interpretation of ocean sonar data. They influence: - -- Transducer calibration, through seawater absorption. This influence is - frequency-dependent, and the higher the frequency the more sensitive the - calibration is to the environmental parameters. - -- Sound speed, which impacts the conversion from temporal resolution - (of each data sample) to spatial resolution, i.e. the sonar observation - range changes with sound speed. - -By default, echopype uses the following for calibration: - -- EK60 and EK80: Environmental parameters saved with the raw data files. - For EK60, instrument operators may enter temperature and salinity values into the - `Simrad EK60 software's Environment dialog - `_ - and the Simrad software will calculate sound speed and sound absorption; - alternatively, sound speed may be entered directly. - Only sound speed and sound absorption are saved into the raw file. - -- AZFP: Salinity and pressure provided by the user, - and temperature recorded at the instrument. 
- -Seawater sound absorption and sound speed may be recalculated with echopype if -more accurate in-situ environmental parameters are available. -To update these values, pass the environmental parameters -as a dictionary while calling ``ep.calibrate.compute_Sv()``: - -.. code-block:: python - - env_params = { - 'temperature': 8, # temperature in degree Celsius - 'salinity': 30, # salinity in PSU - 'pressure': 50, # pressure in dbar - } - ds_Sv = ep.calibrate.compute_Sv(echodata, env_params=env_params) - -These values will be used in calculating sound speed, -sound absorption, and the thickness of each sonar sample, -which is used in calculating the range (``echo_range``). -The updated values can be retrieved with: - -.. code-block:: python - - ds_Sv['sound_absorption'] # absorption in [dB/m] - ds_Sv['sound_speed'] # sound speed in [m/s] - ds_Sv['echo_range'] # echo_range for each sonar sample in [m] - - -For EK60 and EK80 data, echopype updates -the sound speed using the formula from Mackenzie (1981) [2]_ and -seawater absorption using the formula from Ainslie and McColm (1981) [3]_. - -For AZFP data, echopype updates the sound speed and seawater absorption -using the formulae provided by the manufacturer ASL Environmental Sciences. - - -Calibration parameters ----------------------- - -*Calibration* here refers to the calibration of transducers on an -echosounder, which finds the mapping between the voltage signal -recorded by the echosounder and the actual (physical) acoustic pressure -received at the transducer. This mapping is critical in deriving biological -quantities from acoustic measurements, such as estimating biomass. -More detail about the calibration procedure can be found in [4]_. - -Echopype by default uses calibration parameters stored in the converted -files along with the backscatter measurements and other metadata parsed -from the raw data file. 
-However, since careful calibration is often done separately from the -data collection phase of the field work, accurate calibration parameters -are often supplied in the post-processing stage. -Currently echopype allows users to overwrite the following calibration parameters: - -- EK60 and EK80: ``sa_correction``, ``gain_correction``, and ``equivalent_beam_angle`` - -- AZFP: ``EL``, ``DS``, ``TVR``, ``VTX``, ``Sv_offset``, and ``equivalent_beam_angle`` - - -As an example, to reset the equivalent beam angle for all frequencies, -specify ``cal_params`` while calling the calibration functions: - -.. code-block:: python - - import xarray as xr - equivalent_beam_angle = xr.DataArray( # set all channels at once - [-17.47, -20.77, -21.13, -20.4, -30], - dims=['frequency'], - coords=[echodata.beam.frequency] - ) - cal_params = { - 'equivalent_beam_angle': equivalent_beam_angle - } - ds_Sv = ep.calibrate.compute_Sv(echodata, cal_params=cal_params) - -To reset the equivalent beam angle for 18 kHz only, one can do: - -.. code-block:: python - - # set value for 18 kHz only - echodata.beam.equivalent_beam_angle.loc[dict(frequency=18000)] = 18.02 - - -References ----------- - -.. [1] De Robertis A, Higginbottoms I. (2007) A post-processing technique to - estimate the signal-to-noise ratio and remove echosounder background noise. - `ICES J. Mar. Sci. 64(6): 1282–1291. `_ - -.. [2] Mackenzie K. (1981) Nine‐term equation for sound speed in the oceans. - `J. Acoust. Soc. Am. 70(3): 806-812 `_ - -.. [3] Ainslie MA, McColm JG. (1998) A simplified formula for viscous and - chemical absorption in sea water. - `J. Acoust. Soc. Am. 103(3): 1671-1672 `_ - -.. [4] Demer DA, Berger L, Bernasconi M, Bethke E, Boswell K, Chu D, Domokos R, - et al. (2015) Calibration of acoustic instruments. `ICES Cooperative Research Report No. - 1. 133 pp. `_ - - -.. 
TODO: Need to specify the changes we made from AZFP Matlab code to here: - In the Matlab code, users set temperature/salinity parameters in - AZFP_parameters.m and run that script first before doing unpacking. - Here we require users to unpack raw data first into netCDF, and then - set temperature/salinity in the process subpackage if they want to perform - calibration. This is cleaner and less error prone, because the param - setting step is separated from the raw data unpacking, so user-defined - params are not in the unpacked files. diff --git a/docs/source/processing-levels.md b/docs/source/processing-levels.md index c25610fd5..37d566c77 100644 --- a/docs/source/processing-levels.md +++ b/docs/source/processing-levels.md @@ -19,7 +19,7 @@ The `echopype` team is developing a clearly defined progression of data processi - as sets of individual converted files as originally segmented into arbitrary time ranges during sensor file creation, or - compiled into larger granules corresponding to logical deployment intervals. -- **L1A**: Raw L0 data converted to a standardized, open format with geographic coordinates (latitude & longitude) included. Includes other ancillary information extracted from sensor-generated L0 data or other external sources. May include environmental information such as temperature, salinity and pressure. Use of the SONAR-netcDF4 v1 convention is strongly recommended. +- **L1A**: Raw L0 data converted to a standardized, open format with geographic coordinates (latitude & longitude) included. Includes other ancillary information extracted from sensor-generated L0 data or other external sources. May include environmental information such as temperature, salinity and pressure. Use of the SONAR-netCDF4 version 1 convention is strongly recommended. - **L1B**: L1A data with quality-control steps applied, such as time-coordinate corrections that enforce strictly increasing, non-duplicate timestamps. ### Level 2 (L2)