diff --git a/docs/source/data-proc-additional.ipynb b/docs/source/data-proc-additional.ipynb index 51d230d3e..be2b2f5cb 100644 --- a/docs/source/data-proc-additional.ipynb +++ b/docs/source/data-proc-additional.ipynb @@ -1,106 +1,91 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "(data-proc:additional)=\n", - "# Additional information for processed data\n", - "\n", - "This page provide information on some aspects of processed data that may require additional explanation to fully understand the representation and underlying operations." - ] + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "(data-proc:additional)=\n", + "# Additional information for processed data\n", + "\n", + "This page provide information on some aspects of processed data that may require additional explanation to fully understand the representation and underlying operations." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "(data-proc:echo-range)=\n", + "## Range of echo samples" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The calibration operation in `compute_Sv` generates a new data variable, `echo_range`, which is the physically meaningful range (in meters) of the echo samples.\n", + "`echo_range` is compute from the `range_sample` coordinate (which contains 0-based indices of the digitized sample numbers of the received echoes) of the raw data in combination with the `sample_interval` in the `Sonar/Beam_groupX` group and sound speed either stored in the raw data file or provided by the user.\n", + "Echopype assumes a constant sound and does not currently support the use a sound speed profile." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "The computation for `echo_range` ($r$) is:\n", + "\n", + "$$\n", + "r = N \\Delta r = N c \\tau / 2\n", + "$$\n", + "\n", + "where $\\Delta r$ is the along-range \"length\" of each sample, $N$ is the index number in `range_sample`, $\\tau$ is the `sample_interval`, and $c$ is sound speed. The factor of 2 is due to the round-trip travel from the transmitter to the scatterer and back to the receiver." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Because `sample_interval` can be different for different channels, the resulting ``echo_range`` arrays can be different across channel even if they have the same number of digitized samples. This is illustrated in the sketch below, in which (**A**) shows the dimensions of the variable `backscatter_r` and (**B**) shows the varying values of `echo_range` that change depending on `sample_interval`. In this example, the `sample_interval` of the first channel is twice of that of the second and the third channel.\n", + "\n", + "![](./images/echo_range.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [] + } + ], + "metadata": { + "celltoolbar": "Tags", + "interpreter": { + "hash": "a292767406182d99a2458e67c2d2e96b524510c4a2166b4b423439fe75c32190" + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.7" + } }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "(data-proc:format)=\n", - "## Format of processed data" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Once raw data (represented by the `EchoData` objects) are calibrated (via [`compute_Sv`](echopype.calibrate.compute_Sv)), the calibrated data and the outputs of all subsequent [processing functions](data-process:funcionalities) are generic [xarray Datasets](https://docs.xarray.dev/en/stable/user-guide/data-structures.html#dataset).\n", - "We currently do not follow any specific conventions for processed data, but we retain provenance information in the dataset, including the [data processing levels](./processing-levels.md).\n", - "However, whether and how data variables used in the processing will be stored remain to be determined." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "(data-proc:echo-range)=\n", - "## Range of echo samples" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The calibration operation in `compute_Sv` generates a new data variable, `echo_range`, which is the physically meaningful range (in meters) of the echo samples.\n", - "`echo_range` is compute from the `range_sample` coordinate (which contains 0-based indices of the digitized sample numbers of the received echoes) of the raw data in combination with the `sample_interval` in the `Sonar/Beam_groupX` group and sound speed either stored in the raw data file or provided by the user.\n", - "Echopype assumes a constant sound and does not currently support the use a sound speed profile." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "The computation for `echo_range` ($r$) is:\n", - "$$\n", - "r = N \\Delta r = N c \\tau / 2\n", - "$$\n", - "where $\\Delta r$ is the along-range \"length\" of each sample, $N$ is the indice number in `range_sample`, $\\tau$ is the `sample_interval`, and $c$ is sound speed. The factor of 2 is due to the round-trip travel from the transmitter to the scatterer and back to the receiver." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Because `sample_interval` can be different for different channels, the resulting ``echo_range`` arrays can be different across channel even if they have the same number of digitized samples. This is illustrated in the sketch below, in which (**A**) shows the dimensions of the variable `backscatter_r` and (**B**) shows the varying values of `echo_range` that change depending on `sample_interval`. In this example, the `sample_interval` of the first channel is twice of that of the second and the third channel.\n", - "\n", - "![](./images/echo_range.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [] - } - ], - "metadata": { - "celltoolbar": "Tags", - "interpreter": { - "hash": "a292767406182d99a2458e67c2d2e96b524510c4a2166b4b423439fe75c32190" - }, - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.7" - } - }, - "nbformat": 4, - "nbformat_minor": 4 + "nbformat": 4, + "nbformat_minor": 4 } diff --git a/docs/source/data-proc-func.ipynb b/docs/source/data-proc-func.ipynb index 026939225..c48aa3459 100644 --- a/docs/source/data-proc-func.ipynb +++ b/docs/source/data-proc-func.ipynb @@ -1,533 +1,533 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "(data-proc:functions)=\n", - "# Data processing functionalities" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Summary of subpacakges\n", - "\n", - "Echopype currently contains the following data processing subpacakges:\n", - "- [`calibrate`](echopype.calibrate): Transform raw instrument data into physicaly meaningful quantities, such as the volume backscattering strength (Sv)\n", - " - Narrowband Sv:\n", - " AZFP, EK60, and EK80 narrowband (\"CW\") mode transmission\n", - " - Broadband Sv average over the signal frequency band: EK80 broadband (\"FM\") mode transmission \n", - "- [`consolidate`](echopype.consolidate): Add additional information such as the geospatial locations, split-beam angle, etc to calibrated datasets. The source of information include:\n", - " - Data already stored in the `EchoData` object, and hence already stored in the raw instrument data files\n", - " - Quantities that can be computed from data stored in the `EchoData` object, e.g., for EK80 broadband data, split-beam angle can be computed from the complex samples\n", - " - External dataset, e.g., AZFP echosounders do not record geospatial data, but typically there is a companion dataset with GPS information\n", - "- [`clean`](echopype.clean): Reduce variabilities in backscatter data by perform noise removal operations. Currently contains only a simple noise removal function implementing the algorithm in {cite}`DeRobertis2007_noise`.\n", - "- [`commongrid`](echopype.commongrid): Enhance the spatial and temporal coherence of data. Currently contain functions to compute mean volume backscattering strength (MVBS) and nautical areal backscattering coefficients (NASC) that both result in gridded data at uniform spatial and temporal intervals.\n", - "- [`qc`](echopype.qc): Handle unexpected irregularities in the data. Currently contains only functions to handle timestamp reversal in EK60/EK80 raw data.\n", - "- [`mask`](echopype.mask): Create or apply mask to segment data\n", - "- [`metrics`](echopype.metrics): Calculate simple summary statistics from data\n", - "\n", - "" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - ":::{Note}\n", - "The ``clean`` and ``commongrid`` subpackages were introduced in version 0.7.0.\n", - "They contain functions previously found in the deprecated ``preprocess`` subpackage;\n", - "``preprocess`` was removed in version 0.8.0.\n", - ":::" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "tags": [ - "remove-cell" - ] - }, - "outputs": [], - "source": [ - "import echopype as ep" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The use of these processing functions are summarized below:" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Calibration" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "nc_path = './converted_files/file.nc' # path to a converted nc file\n", - "echodata = ep.open_converted(nc_path) # create an EchoData object\n", - "ds_Sv = ep.calibrate.compute_Sv(echodata) # obtain a dataset containing Sv, echo_range, and\n", - " # the calibration and environmental parameters\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - ":::{Note}\n", - "The calibration functions can require different input arguments depending on the echosounder instrument (`sonar_model`). See [](data-processing:calibration) for detail.\n", - ":::" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Nosie removal" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Remove noise\n", - "ds_Sv_clean = ep.clean.remove_noise( # obtain a denoised Sv dataset\n", - " ds_Sv, # calibrated Sv dataset\n", - " range_sample_num=30, # number of samples along the range_sample dimension for estimating noise\n", - " ping_num=5, # number of pings for estimating noise\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Reduce data by computing MVBS" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Reduce data based on physical units\n", - "ds_MVBS = ep.commongrid.compute_MVBS(\n", - " ds_Sv, # calibrated Sv dataset\n", - " range_meter_bin=20, # bin size to average along echo_range in meters\n", - " ping_time_bin='20S' # bin size to average along ping_time in seconds\n", - ")\n", - "\n", - "# Reduce data based on sample number\n", - "ds_MVBS = ep.commongrid.compute_MVBS_index_binning(\n", - " ds_Sv, # calibrated Sv dataset\n", - " range_sample_num=30, # number of sample bins to average along the range_sample dimensionm\n", - " ping_num=5 # number of pings to average\n", - ")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Saving results\n", - "\n", - "Typically echopype functions do not save the calculation resuls to disk, but the returned xarray `Dataset` can be saved using native xarray method such as `to_netcdf` and `to_zarr`. For the `EchoData` object we have implemented identical methods to allow saving raw converted data to disk. For example:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "ed = ep.open_raw(\"PATH_TO_RAW_FILE\", sonar_model=\"SONAR_MODEL\")\n", - "ed.to_zarr(\"RAW_CONVERTED_FILENAME.zarr\") # save converted raw data to Zarr format\n", - "\n", - "# Some processing functions that results in an xarray Dataset ds_Sv\n", - "ds_Sv.to_netcdf(\"PROCESSED_FILENAME.nc\") # save data to netCDF format" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "(data-proc:calibration)=\n", - "## Parameter considerations for calibrating data" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Calibration transforms raw data collected by the instrument into physically meaningful units by incorporating:\n", - "- Calibration parameters (`cal_params`) instrinsic to the instrument and its settings\n", - "- Environmental parameters (`env_params`), such as sound speed and absorption coefficient due to the physical environment\n", - "\n", - "Echopype also requires correct input argument combinations to calibrate data, due to intrinsic differences of the echosounders.\n", - "\n", - "These considerations are explained below.\n", - "\n", - "For more information on calibration, see {cite}`Demer2015`." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Input arguments" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### AZFP and EK60 data\n", - "\n", - "For data from the AZFP and EK60 echosounders, since these instruments can only transmit narrowband data, you do not have to specify any additional argument when calibrating data, and the following should always work:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "ed = ep.open_raw(\"PATH_TO_RAW_FILE\", sonar_model=\"EK60\")\n", - "ds_Sv = ep.calibrate.compute_Sv(ed)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### EK80 data\n", - "For data from the EK80 echosounder, since both narrowband and broadband transmissions are possible for different channels, and for narrowband transmissions the data can be stored in two forms, you need to specify argument combinations corresponding to the data you want to calibrate.\n", - "\n", - "For computing band-averaged Sv from broadband EK80 data, use:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "ds_Sv = ep.calibrate.compute_Sv(ed, waveform_mode=\"BB\", encode_mode=\"complex\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Here, `waveform_mode=\"BB\"` indicates that the data you want to calibrate are the channels set to do broadband transmissions. `encode_mode=\"complex\"` indicates that these data are stored as complex samples. The function will raise an error if there are no broadband data found in the provided `EchoData` object (`ed`)." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "For computing narrowband Sv from narrowband EK80 data:\n", - "\n", - "- If the data is stored as complex samples, use\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "ds_Sv = ep.calibrate.compute_Sv(ed, waveform_mode=\"CW\", encode_mode=\"complex\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "- If the data is stored as power/angle samples (this is the format that is equivalent to EK60 data), use" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "ds_Sv = ep.calibrate.compute_Sv(ed, waveform_mode=\"CW\", encode_mode=\"power\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Calibration parameters\n", - "\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Echopype by default uses calibration parameters stored in the converted files. However, since careful calibration is often done separately from the data collection phase of the field work, accurate calibration parameters are often supplied in the post-processing stage. This can be done via passing in a dictionary `cal_params` containing calibration parameters needed to overwrite values stored in the raw data files like below:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "ds_Sv = ed.calibrate.compute_Sv(\n", - " ed,\n", - " cal_params={\n", - " sa_correction: sa_vals,\n", - " equivalent_beam_angle: eba_vals,\n", - " }\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Calibration parameters are instrument specific, and currently we support the following:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "EK60_CAL_DICT = {\n", - " \"sa_correction\",\n", - " \"gain_correction\",\n", - " \"equivalent_beam_angle\",\n", - " \"angle_offset_alongship\",\n", - " \"angle_offset_athwartship\",\n", - " \"angle_sensitivity_alongship\",\n", - " \"angle_sensitivity_athwartship\",\n", - " \"beamwidth_alongship\",\n", - " \"beamwidth_athwartship\",\n", - "}\n", - "\n", - "EK80_CAL_DICT = {\n", - " \"sa_correction\",\n", - " \"gain_correction\",\n", - " \"equivalent_beam_angle\",\n", - " \"angle_offset_alongship\",\n", - " \"angle_offset_athwartship\",\n", - " \"angle_sensitivity_alongship\",\n", - " \"angle_sensitivity_athwartship\",\n", - " \"beamwidth_alongship\",\n", - " \"beamwidth_athwartship\",\n", - " \"impedance_transducer\", # z_et\n", - " \"impedance_transceiver\", # z_er\n", - " \"receiver_sampling_frequency\",\n", - "}\n", - "\n", - "AZFP_CAL_DICT = {\n", - " \"EL\",\n", - " \"DS\",\n", - " \"TVR\",\n", - " \"VTX\",\n", - " \"equivalent_beam_angle\",\n", - " \"Sv_offset\",\n", - "}" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "One could choose to bypass the above by updating only specific values in the `EchoData` object (calibration parameters are in the `Sonar/Beam_groupX` and the `Vendor_specifc` group). This may be useful if only one value needs to be updated among all channels." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Environmental parameters" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Environmental parameters, including temperature, salinity, pressure, and pH, are critical in processing raw data collected by the echosounders. They influence calibration through sound speed and absorption coefficients, both central to calibrating raw data to physically meaningful quantities:\n", - "- Absorption coefficients are frequency-dependent and determine the compensation applied to the raw data to compensate for the transmission loss resulted from absortion due to seawater. The higher the frequency, the stronger the absorption and the more sensitive of the calibrated data values are to absorption.\n", - "- Sound speed impacts the calculation of range at each echo samples (`echo_range` in the output of `compute_Sv`) recorded at discrete time intervals (`sample_interval` in `ed[\"Sonar/Beam_groupX\"]`), which in turn affects the compensation of absorption and the transmission loss resulted from spreading loss." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Default behavior\n", - "\n", - "By default, echopype uses environmental parameters stored in the `EchoData` object parsed from the raw file and users need to supply any required parameters that are missing from the data file.\n", - "- EK60 and EK80: Environmental parameters stored in the `EchoData` object parsed from the raw file. For these instruments, users may enter environmental parameter values into the software, where the sound speed and absorption coefficients are automatically calculated and stored; alternatively, sound speed may be entered manually. Note only sound speed and sound absorption are saved into the raw file. See relevant sections of the reference manuals for [EK60](https://www.simrad.online/ek60/ref_english/default.htm?startat=/ek60/ref_english/xxx_para_environment.html) and [EK80](https://www.simrad.online/ek80/ref_en/GUID-A8BD0AEA-3442-4511-BD35-4E9EAD8137CC.html#GUID-A8BD0AEA-3442-4511-BD35-4E9EAD8137CC) for detail.\n", - "- AZFP: Users need to supply at least salinity and pressure as those are not recorded by the instrument. By default echopype uses temperature values stored in the `EchoData` object, but as these are recorded by the instrument and do not necessarily capture the averaged properties of the water column, users are recommended to supply custom temperature values.\n", - "\n", - "\n", - "\n", - "Echopype does not currently handle calculation based on a sound speed profile." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Formulae used\n", - "\n", - "When custom environmental values are supplied, echopype calculates sound speed and absorption coefficients based on the following:\n", - "- EK60 and EK80:\n", - " - sound speed: {cite}`MacKenzie1981`\n", - " - absorption coefficients: {cite}`MacKenzie1981`\n", - "- AZFP: formulae provided in the AZFP Operator's Manual from ASL Environmental Sciences" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To supply custom environmetnal parameters, use input argument `env_params`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "env_params = {\n", - " \"temperature\": 8, # temperature in degree Celsius\n", - " \"salinity\": 30, # salinity in PSU\n", - " \"pressure\": 50, # pressure in dbar\n", - "}\n", - "ds_Sv = ep.calibrate.compute_Sv(ed, env_params=env_params)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Sanity checks\n", - "\n", - "Values of the calibration and environmental parameters, as well as the resultant sound speed and absorption coefficints are stored in the output of `compute_Sv` for easy access and verification:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Check environmental parameters used in compute_Sv\n", - "print(\n", - " f\"Temperature used is: {ds_Sv[\"temperature\"]}\"\n", - " f\"Salinity used is: {ds_Sv[\"salinity\"]}\",\n", - " f\"Pressure used is: {ds_Sv[\"pressure\"]}\",\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Check absorption coefficients computed and used\n", - "ds_Sv['sound_absorption'] # absorption coefficients [dB/m]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Check sound speed absorption coefficients computed and used in compute_Sv\n", - "ds_Sv['sound_speed'] # sound speed [m/s]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Interfacing Echoview ECS file\n", - "\n", - "Echopype is able to interface with [Echoview ECS file](https://support.echoview.com/WebHelp/Files,_Filesets_And_Variables/About_ECS_files.htm) for EK60 and EK80 echosounder starting from v0.7.0. The ECS file contains specifications for both calibration parameters and environmental parameters. When an ECS file is provided, parameters in `cal_params` and `env_params`, if provided, are ignored. To provide an ECS file, use:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "ds_Sv = ep.calibrate.compute_Sv(ed, ecs_file=\"PATH_TO_ECS_FILE\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - ":::{Note}\n", - "The vocalbury currently implemented was assembled from example files we have access to and may not include all aliases. We have connected with Echoview and will add all aliases in a future release, as well as ECS support for the AZFP echosounder.\n", - ":::" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "celltoolbar": "Tags", - "interpreter": { - "hash": "a292767406182d99a2458e67c2d2e96b524510c4a2166b4b423439fe75c32190" - }, - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.12" - } - }, - "nbformat": 4, - "nbformat_minor": 4 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "(data-proc:functions)=\n", + "# Data processing functionalities" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Summary of subpacakges\n", + "\n", + "Echopype currently contains the following data processing subpacakges:\n", + "- [`calibrate`](echopype.calibrate): Transform raw instrument data into physicaly meaningful quantities, such as the volume backscattering strength (Sv)\n", + " - Narrowband Sv:\n", + " AZFP, EK60, and EK80 narrowband (\"CW\") mode transmission\n", + " - Broadband Sv average over the signal frequency band: EK80 broadband (\"FM\") mode transmission \n", + "- [`consolidate`](echopype.consolidate): Add additional information such as the geospatial locations, split-beam angle, etc to calibrated datasets. The source of information include:\n", + " - Data already stored in the `EchoData` object, and hence already stored in the raw instrument data files\n", + " - Quantities that can be computed from data stored in the `EchoData` object, e.g., for EK80 broadband data, split-beam angle can be computed from the complex samples\n", + " - External dataset, e.g., AZFP echosounders do not record geospatial data, but typically there is a companion dataset with GPS information\n", + "- [`clean`](echopype.clean): Reduce variabilities in backscatter data by perform noise removal operations. Currently contains only a simple noise removal function implementing the algorithm in {cite}`DeRobertis2007_noise`.\n", + "- [`commongrid`](echopype.commongrid): Enhance the spatial and temporal coherence of data. Currently contain functions to compute mean volume backscattering strength (MVBS) and nautical areal backscattering coefficients (NASC) that both result in gridded data at uniform spatial and temporal intervals.\n", + "- [`qc`](echopype.qc): Handle unexpected irregularities in the data. Currently contains only functions to handle timestamp reversal in EK60/EK80 raw data.\n", + "- [`mask`](echopype.mask): Create or apply mask to segment data\n", + "- [`metrics`](echopype.metrics): Calculate simple summary statistics from data\n", + "\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ":::{Note}\n", + "The ``clean`` and ``commongrid`` subpackages were introduced in version 0.7.0.\n", + "They contain functions previously found in the deprecated ``preprocess`` subpackage;\n", + "``preprocess`` was removed in version 0.8.0.\n", + ":::" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "tags": [ + "remove-cell" + ] + }, + "outputs": [], + "source": [ + "import echopype as ep" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The use of these processing functions are summarized below:" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Calibration" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nc_path = './converted_files/file.nc' # path to a converted nc file\n", + "echodata = ep.open_converted(nc_path) # create an EchoData object\n", + "ds_Sv = ep.calibrate.compute_Sv(echodata) # obtain a dataset containing Sv, echo_range, and\n", + " # the calibration and environmental parameters\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ":::{Note}\n", + "The calibration functions can require different input arguments depending on the echosounder instrument (`sonar_model`). See [](data-processing:calibration) for detail.\n", + ":::" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Noise removal" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Remove noise\n", + "ds_Sv_clean = ep.clean.remove_noise( # obtain a denoised Sv dataset\n", + " ds_Sv, # calibrated Sv dataset\n", + " range_sample_num=30, # number of samples along the range_sample dimension for estimating noise\n", + " ping_num=5, # number of pings for estimating noise\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Reduce data by computing MVBS" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Reduce data based on physical units\n", + "ds_MVBS = ep.commongrid.compute_MVBS(\n", + " ds_Sv, # calibrated Sv dataset\n", + " range_meter_bin=20, # bin size to average along echo_range in meters\n", + " ping_time_bin='20S' # bin size to average along ping_time in seconds\n", + ")\n", + "\n", + "# Reduce data based on sample number\n", + "ds_MVBS = ep.commongrid.compute_MVBS_index_binning(\n", + " ds_Sv, # calibrated Sv dataset\n", + " range_sample_num=30, # number of sample bins to average along the range_sample dimensionm\n", + " ping_num=5 # number of pings to average\n", + ")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Saving results\n", + "\n", + "Typically echopype functions do not save the calculation resuls to disk, but the returned xarray `Dataset` can be saved using native xarray method such as `to_netcdf` and `to_zarr`. For the `EchoData` object we have implemented identical methods to allow saving raw converted data to disk. For example:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "\n", + "ed = ep.open_raw(\"PATH_TO_RAW_FILE\", sonar_model=\"SONAR_MODEL\")\n", + "ed.to_zarr(\"RAW_CONVERTED_FILENAME.zarr\") # save converted raw data to Zarr format\n", + "\n", + "# Some processing functions that results in an xarray Dataset ds_Sv\n", + "ds_Sv.to_netcdf(\"PROCESSED_FILENAME.nc\") # save data to netCDF format" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "(data-proc:calibration)=\n", + "## Parameter considerations for calibrating data" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Calibration transforms raw data collected by the instrument into physically meaningful units by incorporating:\n", + "- Calibration parameters (`cal_params`) instrinsic to the instrument and its settings\n", + "- Environmental parameters (`env_params`), such as sound speed and absorption coefficient due to the physical environment\n", + "\n", + "Echopype also requires correct input argument combinations to calibrate data, due to intrinsic differences of the echosounders.\n", + "\n", + "These considerations are explained below.\n", + "\n", + "For more information on calibration, see {cite}`Demer2015`." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Input arguments" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### AZFP and EK60 data\n", + "\n", + "For data from the AZFP and EK60 echosounders, since these instruments can only transmit narrowband data, you do not have to specify any additional argument when calibrating data, and the following should always work:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ed = ep.open_raw(\"PATH_TO_RAW_FILE\", sonar_model=\"EK60\")\n", + "ds_Sv = ep.calibrate.compute_Sv(ed)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### EK80 data\n", + "For data from the EK80 echosounder, since both narrowband and broadband transmissions are possible for different channels, and for narrowband transmissions the data can be stored in two forms, you need to specify argument combinations corresponding to the data you want to calibrate.\n", + "\n", + "For computing band-averaged Sv from broadband EK80 data, use:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ds_Sv = ep.calibrate.compute_Sv(ed, waveform_mode=\"BB\", encode_mode=\"complex\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here, `waveform_mode=\"BB\"` indicates that the data you want to calibrate are the channels set to do broadband transmissions. `encode_mode=\"complex\"` indicates that these data are stored as complex samples. The function will raise an error if there are no broadband data found in the provided `EchoData` object (`ed`)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For computing narrowband Sv from narrowband EK80 data:\n", + "\n", + "- If the data is stored as complex samples, use\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ds_Sv = ep.calibrate.compute_Sv(ed, waveform_mode=\"CW\", encode_mode=\"complex\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "- If the data is stored as power/angle samples (this is the format that is equivalent to EK60 data), use" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ds_Sv = ep.calibrate.compute_Sv(ed, waveform_mode=\"CW\", encode_mode=\"power\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Calibration parameters\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Echopype by default uses calibration parameters stored in the converted files. However, since careful calibration is often done separately from the data collection phase of the field work, accurate calibration parameters are often supplied in the post-processing stage. This can be done via passing in a dictionary `cal_params` containing calibration parameters needed to overwrite values stored in the raw data files like below:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ds_Sv = ed.calibrate.compute_Sv(\n", + " ed,\n", + " cal_params={\n", + " sa_correction: sa_vals,\n", + " equivalent_beam_angle: eba_vals,\n", + " }\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Calibration parameters are instrument specific, and currently we support the following:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "EK60_CAL_DICT = {\n", + " \"sa_correction\",\n", + " \"gain_correction\",\n", + " \"equivalent_beam_angle\",\n", + " \"angle_offset_alongship\",\n", + " \"angle_offset_athwartship\",\n", + " \"angle_sensitivity_alongship\",\n", + " \"angle_sensitivity_athwartship\",\n", + " \"beamwidth_alongship\",\n", + " \"beamwidth_athwartship\",\n", + "}\n", + "\n", + "EK80_CAL_DICT = {\n", + " \"sa_correction\",\n", + " \"gain_correction\",\n", + " \"equivalent_beam_angle\",\n", + " \"angle_offset_alongship\",\n", + " \"angle_offset_athwartship\",\n", + " \"angle_sensitivity_alongship\",\n", + " \"angle_sensitivity_athwartship\",\n", + " \"beamwidth_alongship\",\n", + " \"beamwidth_athwartship\",\n", + " \"impedance_transducer\", # z_et\n", + " \"impedance_transceiver\", # z_er\n", + " \"receiver_sampling_frequency\",\n", + "}\n", + "\n", + "AZFP_CAL_DICT = {\n", + " \"EL\",\n", + " \"DS\",\n", + " \"TVR\",\n", + " \"VTX\",\n", + " \"equivalent_beam_angle\",\n", + " \"Sv_offset\",\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "One could choose to bypass the above by updating only specific values in the `EchoData` object (calibration parameters are in the `Sonar/Beam_groupX` and the `Vendor_specifc` group). This may be useful if only one value needs to be updated among all channels." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Environmental parameters" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Environmental parameters, including temperature, salinity, pressure, and pH, are critical in processing raw data collected by the echosounders. They influence calibration through sound speed and absorption coefficients, both central to calibrating raw data to physically meaningful quantities:\n", + "- Absorption coefficients are frequency-dependent and determine the compensation applied to the raw data to compensate for the transmission loss resulted from absortion due to seawater. The higher the frequency, the stronger the absorption and the more sensitive of the calibrated data values are to absorption.\n", + "- Sound speed impacts the calculation of range at each echo samples (`echo_range` in the output of `compute_Sv`) recorded at discrete time intervals (`sample_interval` in `ed[\"Sonar/Beam_groupX\"]`), which in turn affects the compensation of absorption and the transmission loss resulted from spreading loss." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Default behavior\n", + "\n", + "By default, echopype uses environmental parameters stored in the `EchoData` object parsed from the raw file and users need to supply any required parameters that are missing from the data file.\n", + "- EK60 and EK80: Environmental parameters stored in the `EchoData` object parsed from the raw file. For these instruments, users may enter environmental parameter values into the software, where the sound speed and absorption coefficients are automatically calculated and stored; alternatively, sound speed may be entered manually. Note only sound speed and sound absorption are saved into the raw file. See relevant sections of the reference manuals for [EK60](https://www.simrad.online/ek60/ref_english/default.htm?startat=/ek60/ref_english/xxx_para_environment.html) and [EK80](https://www.simrad.online/ek80/ref_en/GUID-A8BD0AEA-3442-4511-BD35-4E9EAD8137CC.html#GUID-A8BD0AEA-3442-4511-BD35-4E9EAD8137CC) for detail.\n", + "- AZFP: Users need to supply at least salinity and pressure as those are not recorded by the instrument. By default echopype uses temperature values stored in the `EchoData` object, but as these are recorded by the instrument and do not necessarily capture the averaged properties of the water column, users are recommended to supply custom temperature values.\n", + "\n", + "\n", + "\n", + "Echopype does not currently handle calculation based on a sound speed profile." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Formulae used\n", + "\n", + "When custom environmental values are supplied, echopype calculates sound speed and absorption coefficients based on the following:\n", + "- EK60 and EK80:\n", + " - sound speed: {cite}`MacKenzie1981`\n", + " - absorption coefficients: {cite}`MacKenzie1981`\n", + "- AZFP: formulae provided in the AZFP Operator's Manual from ASL Environmental Sciences" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To supply custom environmetnal parameters, use input argument `env_params`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "env_params = {\n", + " \"temperature\": 8, # temperature in degree Celsius\n", + " \"salinity\": 30, # salinity in PSU\n", + " \"pressure\": 50, # pressure in dbar\n", + "}\n", + "ds_Sv = ep.calibrate.compute_Sv(ed, env_params=env_params)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Sanity checks\n", + "\n", + "Values of the calibration and environmental parameters, as well as the resultant sound speed and absorption coefficints are stored in the output of `compute_Sv` for easy access and verification:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Check environmental parameters used in compute_Sv\n", + "print(\n", + " f\"Temperature used is: {ds_Sv[\"temperature\"]}\"\n", + " f\"Salinity used is: {ds_Sv[\"salinity\"]}\",\n", + " f\"Pressure used is: {ds_Sv[\"pressure\"]}\",\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Check absorption coefficients computed and used\n", + "ds_Sv['sound_absorption'] # absorption coefficients [dB/m]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Check sound speed absorption coefficients computed and used in compute_Sv\n", + "ds_Sv['sound_speed'] # sound speed [m/s]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Interfacing Echoview ECS file\n", + "\n", + "Echopype is able to interface with [Echoview ECS file](https://support.echoview.com/WebHelp/Files,_Filesets_And_Variables/About_ECS_files.htm) for EK60 and EK80 echosounder starting from v0.7.0. The ECS file contains specifications for both calibration parameters and environmental parameters. When an ECS file is provided, parameters in `cal_params` and `env_params`, if provided, are ignored. To provide an ECS file, use:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ds_Sv = ep.calibrate.compute_Sv(ed, ecs_file=\"PATH_TO_ECS_FILE\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ":::{Note}\n", + "The vocalbury currently implemented was assembled from example files we have access to and may not include all aliases. We have connected with Echoview and will add all aliases in a future release, as well as ECS support for the AZFP echosounder.\n", + ":::" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "celltoolbar": "Tags", + "interpreter": { + "hash": "a292767406182d99a2458e67c2d2e96b524510c4a2166b4b423439fe75c32190" + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 4 } diff --git a/docs/source/data-proc.md b/docs/source/data-proc.md index 7b719c682..60678f8e9 100644 --- a/docs/source/data-proc.md +++ b/docs/source/data-proc.md @@ -5,3 +5,10 @@ Echopype data processing funcionalities are structured into different subpackage The section [**Data processing functionalities**](data-proc:functions) provides information for current processing functions and their usage. The section [**Additional information for processed data**](data-proc:additional) provides on some aspects of processed data that may require additional explanation to fully understand the representation and underlying operations. + +(data-proc:format)= +## Format of processed data + +Once raw data (represented by the `EchoData` objects) are calibrated (via [`compute_Sv`](echopype.calibrate.compute_Sv)), the calibrated data and the outputs of all subsequent [processing functions](data-process:funcionalities) are generic [xarray Datasets](https://docs.xarray.dev/en/stable/user-guide/data-structures.html#dataset). +We currently do not follow any specific conventions for processed data, but we retain provenance information in the dataset, including the [data processing levels](./processing-levels.md). +However, whether and how data variables used in the processing will be stored remain to be determined. diff --git a/docs/source/images/beam_dim_v5-01.png b/docs/source/images/beam_dim_v5-01.png index 7e0c5c244..27a19ea1d 100644 Binary files a/docs/source/images/beam_dim_v5-01.png and b/docs/source/images/beam_dim_v5-01.png differ