Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix 'Division by Zero' error when converting EK80 files without Sound Velocity Profile Depth value(s) #1381

Merged
merged 6 commits into from
Sep 4, 2024

Conversation

ctuguinay
Copy link
Collaborator

When trying to save an Echodata object to Zarr that was missing the Sound Velocity Profile Depth coordinate in the Environment group, I got the following error:

{
	"name": "ZeroDivisionError",
	"message": "division by zero",
	"stack": "---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
Cell In[6], line 1
----> 1 ed.to_zarr(\"test.zarr\")

File ~/echopype/echopype/echodata/echodata.py:602, in EchoData.to_zarr(self, save_path, compress, overwrite, parallel, output_storage_options, consolidated, **kwargs)
    577 \"\"\"Save content of EchoData to zarr.
    578 
    579 Parameters
   (...)
    598     xarray's documentation for a list of all possible arguments.
    599 \"\"\"
    600 from ..convert.api import to_file
--> 602 return to_file(
    603     echodata=self,
    604     engine=\"zarr\",
    605     save_path=save_path,
    606     compress=compress,
    607     overwrite=overwrite,
    608     parallel=parallel,
    609     output_storage_options=output_storage_options,
    610     consolidated=consolidated,
    611     **kwargs,
    612 )

File ~/echopype/echopype/convert/api.py:88, in to_file(echodata, engine, save_path, compress, overwrite, parallel, output_storage_options, **kwargs)
     86     else:
     87         logger.info(f\"saving {output_file}\")
---> 88     _save_groups_to_file(
     89         echodata,
     90         output_path=io.sanitize_file_path(
     91             file_path=output_file, storage_options=output_storage_options
     92         ),
     93         engine=engine,
     94         compress=compress,
     95         **kwargs,
     96     )
     98 # Link path to saved file with attribute as if from open_converted
     99 echodata.converted_raw_path = output_file

File ~/echopype/echopype/convert/api.py:118, in _save_groups_to_file(echodata, output_path, engine, compress, **kwargs)
    108 io.save_file(
    109     echodata[\"Top-level\"],
    110     path=output_path,
   (...)
    114     **kwargs,
    115 )
    117 # Environment group
--> 118 io.save_file(
    119     echodata[\"Environment\"],  # TODO: chunking necessary?
    120     path=output_path,
    121     mode=\"a\",
    122     engine=engine,
    123     group=\"Environment\",
    124     compression_settings=COMPRESSION_SETTINGS[engine] if compress else None,
    125     **kwargs,
    126 )
    128 # Platform group
    129 io.save_file(
    130     echodata[\"Platform\"],  # TODO: chunking necessary? time1 and time2 (EK80) only
    131     path=output_path,
   (...)
    136     **kwargs,
    137 )

File ~/echopype/echopype/utils/io.py:78, in save_file(ds, path, mode, engine, group, compression_settings, **kwargs)
     76         if isinstance(ds[var].data, DaskArray):
     77             ds[var] = ds[var].chunk(enc.get(\"chunks\", {}))
---> 78     ds.to_zarr(store=path, mode=mode, group=group, encoding=encoding, **kwargs)
     79 else:
     80     raise ValueError(f\"{engine} is not a supported save format\")

File ~/miniforge3/envs/echopype/lib/python3.9/site-packages/xarray/core/dataset.py:2549, in Dataset.to_zarr(self, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options, zarr_version, write_empty_chunks, chunkmanager_store_kwargs)
   2404 \"\"\"Write dataset contents to a zarr group.
   2405 
   2406 Zarr chunks are determined in the following way:
   (...)
   2545     The I/O user guide, with more details and examples.
   2546 \"\"\"
   2547 from xarray.backends.api import to_zarr
-> 2549 return to_zarr(  # type: ignore[call-overload,misc]
   2550     self,
   2551     store=store,
   2552     chunk_store=chunk_store,
   2553     storage_options=storage_options,
   2554     mode=mode,
   2555     synchronizer=synchronizer,
   2556     group=group,
   2557     encoding=encoding,
   2558     compute=compute,
   2559     consolidated=consolidated,
   2560     append_dim=append_dim,
   2561     region=region,
   2562     safe_chunks=safe_chunks,
   2563     zarr_version=zarr_version,
   2564     write_empty_chunks=write_empty_chunks,
   2565     chunkmanager_store_kwargs=chunkmanager_store_kwargs,
   2566 )

...

File ~/miniforge3/envs/echopype/lib/python3.9/site-packages/zarr/indexing.py:181, in SliceDimIndexer.__init__(self, dim_sel, dim_len, dim_chunk_len)
    179 self.dim_chunk_len = dim_chunk_len
    180 self.nitems = max(0, ceildiv((self.stop - self.start), self.step))
--> 181 self.nchunks = ceildiv(self.dim_len, self.dim_chunk_len)

File ~/miniforge3/envs/echopype/lib/python3.9/site-packages/zarr/indexing.py:167, in ceildiv(a, b)
    166 def ceildiv(a, b):
--> 167     return math.ceil(a / b)

ZeroDivisionError: division by zero"
}

Zarr doesn't seem to like it when a coordinate has no associated value(s):

image

So instead of setting sound_velocity_profile_depth to [] in set groups when the parser contains no associated values, I set it to [np.nan].

@ctuguinay ctuguinay marked this pull request as draft August 27, 2024 17:19
@ctuguinay ctuguinay self-assigned this Aug 27, 2024
@ctuguinay ctuguinay added bug Something isn't working data conversion labels Aug 27, 2024
@ctuguinay ctuguinay added this to the v0.9.1 milestone Aug 27, 2024
@ctuguinay ctuguinay marked this pull request as ready for review August 27, 2024 21:20
@codecov-commenter
Copy link

codecov-commenter commented Aug 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.22%. Comparing base (9f56124) to head (c916fe9).
Report is 141 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1381      +/-   ##
==========================================
- Coverage   83.52%   80.22%   -3.31%     
==========================================
  Files          64       18      -46     
  Lines        5686     3089    -2597     
==========================================
- Hits         4749     2478    -2271     
+ Misses        937      611     -326     
Flag Coverage Δ
unittests 80.22% <ø> (-3.31%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ctuguinay
Copy link
Collaborator Author

@leewujung This should be ready for review

leewujung
leewujung previously approved these changes Aug 31, 2024
Copy link
Member

@leewujung leewujung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ctuguinay : This looks good. I agree [np.nan] is better than []. There may be something else we need to change further on this due to the newly discovered Environment datagram variability (only the first one in a .raw file contain the full set of data and the later ones containing only a subset).

The only comment I have is whether we need 3 test files. If these are identical in what the content is like, let's just keep one.

@ctuguinay
Copy link
Collaborator Author

@leewujung Thanks for the review! Just made the change to use just one test (since the content was identical).

Copy link
Member

@leewujung leewujung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks @ctuguinay! I resolved one merge conflict and this should be ready to go!

@ctuguinay ctuguinay merged commit 95b6cb1 into OSOceanAcoustics:main Sep 4, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working data conversion
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants