For EK60 and EK80 add beam and ping_time to Beam_groupX #638

Merged
merged 10 commits into from
Apr 23, 2022

Conversation

b-reyes
Contributor

@b-reyes b-reyes commented Apr 20, 2022

This PR addresses the addition of the dimensions beam and ping_time to variables within Beam_groupX. The specific variables were identified in #520. Note that this PR is only concerned with the sensors EK60 and EK80. The modifications necessary for AZFP will take place in a different PR.

To complete this task, I have added the function beamgroups_to_convention() in echopype/convert/set_groups_base.py. This function depends on several dictionaries I have defined at the top of the file. This function is called at the bottom of set_beam() in the set_groups_ekXX.py files. Note: currently, my intention is to use this function again when I create my code for the conversion from v0.5.x to v0.6.x in open_converted.
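As a rough illustration of the idea (not the actual beamgroups_to_convention() implementation), adding a missing dimension to selected variables can be sketched with xarray's expand_dims. The helper name add_dims, the variable names, and the contents of the name sets below are assumptions made up for this example.

```python
import numpy as np
import xarray as xr


def add_dims(ds, names, dim):
    """Return a copy of ds in which every variable in names gains dimension dim.

    Hypothetical helper for illustration only; it is not echopype code.
    """
    out = ds.copy()
    for name in names:
        if name in out.data_vars and dim not in out[name].dims:
            # Broadcast the variable along the existing coordinate values of dim.
            out[name] = out[name].expand_dims({dim: out[dim].values})
    return out


# Toy stand-in for a Beam_groupX dataset (variable names are assumptions).
ds = xr.Dataset(
    data_vars={
        "beamwidth": (("channel",), np.array([7.0, 7.5])),
        "backscatter_r": (
            ("channel", "ping_time", "range_sample"),
            np.zeros((2, 3, 5)),
        ),
    },
    coords={
        "channel": ["ch1", "ch2"],
        "ping_time": np.array(
            ["2022-04-20T00:00:00", "2022-04-20T00:00:01", "2022-04-20T00:00:02"],
            dtype="datetime64[ns]",
        ),
        "range_sample": np.arange(5),
        "beam": ["1"],
    },
)

# Hypothetical name sets: which variables need which dimensions.
beam_only = {"backscatter_r"}
beam_ping_time = {"beamwidth"}

converted = add_dims(ds, beam_only | beam_ping_time, "beam")
converted = add_dims(converted, beam_ping_time, "ping_time")
```

The original dataset is left untouched because the helper works on a copy, which is convenient if the same routine is reused for both open_raw and a later version conversion.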

Edit:
The above additions cause some issues downstream. For this reason, I also had to change lines in echopype/calibrate/calibrate_ek.py and echopype/visualize/plot.py to account for the added beam or ping_time dimension.

@codecov-commenter

codecov-commenter commented Apr 21, 2022

Codecov Report

Merging #638 (59c4f95) into dev (bdadba5) will decrease coverage by 7.68%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##              dev     #638      +/-   ##
==========================================
- Coverage   78.43%   70.74%   -7.69%     
==========================================
  Files          42       30      -12     
  Lines        3746     3394     -352     
==========================================
- Hits         2938     2401     -537     
- Misses        808      993     +185     
Flag Coverage Δ
unittests 70.74% <100.00%> (-7.69%) ⬇️

Impacted Files Coverage Δ
echopype/calibrate/calibrate_ek.py 80.14% <100.00%> (-13.36%) ⬇️
echopype/convert/set_groups_azfp.py 98.36% <100.00%> (+0.08%) ⬆️
echopype/convert/set_groups_base.py 88.17% <100.00%> (+0.01%) ⬆️
echopype/convert/set_groups_ek60.py 91.58% <100.00%> (+0.32%) ⬆️
echopype/convert/set_groups_ek80.py 96.25% <100.00%> (+0.14%) ⬆️
echopype/echodata/__init__.py 0.00% <0.00%> (-100.00%) ⬇️
echopype/calibrate/__init__.py 0.00% <0.00%> (-100.00%) ⬇️
echopype/echodata/convention/utils.py 0.00% <0.00%> (-100.00%) ⬇️
echopype/echodata/convention/__init__.py 0.00% <0.00%> (-100.00%) ⬇️
echopype/echodata/convention/conv.py 11.11% <0.00%> (-88.89%) ⬇️
... and 23 more

@leewujung
Member

@b-reyes : this is awesome!! The code is very clean!! 😍 My head is not clear enough to review this right now so it has to be tomorrow, but I just want to say thank you for putting so much time and thought into this (and the gigantic #520 discussion thread 🤯).

@leewujung leewujung added this to the 0.6.0 milestone Apr 21, 2022
Member

@leewujung leewujung left a comment

Everything looks good! The only other thing I would suggest is to add tests for the dimensions after set_beam, perhaps best done as a separate test module (script) just for the dimensions?

@emiliom
Collaborator

emiliom commented Apr 21, 2022

Are the changes to calibrate/calibrate_ek.py unrelated to the thrust of this PR? I don't see an obvious relationship to the addition of beam and ping_time dimensions. It's ok with me if they are in fact unrelated, but please make a note of that in your initial comment

@emiliom
Collaborator

emiliom commented Apr 21, 2022

Alright, I'm done. None of my comments involve the substance of your implementation. They're all either clarifications in comments and docstrings, or suggestions for small changes to the coding style largely for consistency.

Since all CI tests are passing (woo-hoo!), I didn't bother pulling in your branch to test locally.

This PR does what it sets out to do, and it does it in a very clean and parsimonious way. As you mentioned, it also sets up reusable machinery for the open_converted 0.5.x > 0.6 conversion, which is fantastic! But I think there's one "cost" to this approach: what goes on in the individual set_groups_SENSOR.set_beam methods will now be fragmented between the sensor modules and what will now be in set_groups_base.py. One could argue that currently there's quite a bit of redundancy across the sensor-specific modules that could probably be abstracted away into common functions in set_groups_base.py or elsewhere. That's something we should revisit in the future. But right now all the sensor-specific actions and details are found in the set_groups_SENSOR.py modules, while set_groups_base.py is free of any sensor specificity (except possibly for the _parse_NMEA method, but see the comment above it). That makes conceptual sense to me. Your scheme will alter that conceptual separation.

I'm not suggesting we change your scheme, though. I see it as a great transitional approach, since it cleanly handles both the addition of the missing dimensions on open_raw and the open_converted 0.5.x > 0.6 conversion. But I think we should revisit this new conceptual blurring between set_groups_SENSOR.py and set_groups_base.py in the future, after 0.6.0.

@emiliom
Collaborator

emiliom commented Apr 21, 2022

Actually, I have two suggestions to create some separation of sensor-specific from generic stuff. I think this would work well for the open_raw aspects in this PR, but I don't have a good sense of the impacts on your planned open_converted changes:

  1. Move the beam_only_names, ping_time_only_names and beam_ping_time_names dicts to the set_groups_SENSOR.py modules, so they're passed as arguments to beamgroups_to_convention.
  2. Eliminate the sonar_model block in beamgroups_to_convention, and instead have the set_groups_SENSOR.py modules pass the appropriate sensor to the beamgroups_to_convention call.
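The separation proposed in the two items above could look roughly like the following. This is a deliberately simplified sketch: datasets are modelled as a plain dict mapping variable names to dimension tuples, and add_convention_dims, SetGroupsSensorA, and the contents of the three name sets are hypothetical stand-ins rather than echopype code.

```python
def add_convention_dims(var_dims, beam_only, ping_time_only, beam_ping_time):
    """Sensor-agnostic step: append missing dimension names to each variable.

    var_dims maps a variable name to a tuple of dimension names.
    """
    out = {}
    for name, dims in var_dims.items():
        new_dims = list(dims)
        if (name in beam_only or name in beam_ping_time) and "beam" not in new_dims:
            new_dims.append("beam")
        if (
            name in ping_time_only or name in beam_ping_time
        ) and "ping_time" not in new_dims:
            new_dims.append("ping_time")
        out[name] = tuple(new_dims)
    return out


class SetGroupsSensorA:
    """Hypothetical sensor module: it owns the sensor-specific name sets."""

    beam_only_names = {"backscatter_r"}
    ping_time_only_names = {"beam_type"}
    beam_ping_time_names = {"beamwidth"}

    def set_beam(self, var_dims):
        # The shared function receives the names as arguments, so it needs
        # no sonar_model branch of its own.
        return add_convention_dims(
            var_dims,
            self.beam_only_names,
            self.ping_time_only_names,
            self.beam_ping_time_names,
        )


result = SetGroupsSensorA().set_beam(
    {
        "backscatter_r": ("channel", "ping_time", "range_sample"),
        "beam_type": ("channel",),
        "beamwidth": ("channel",),
        "other": ("channel",),
    }
)
```

With this layout the generic function never needs to know which sensor it is serving, which is the conceptual separation the suggestion is after.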

@emiliom
Collaborator

emiliom commented Apr 21, 2022

I forgot to comment on downstream impacts on processed data like Sv. It looks like compute_Sv (for example) would return an xarray dataset that retains the beam dimension. I'm inferring that from the changes @b-reyes made to test_echodata.py and test_calibrate.py. @leewujung should compute_Sv and compute_TS be modified so beam is squeezed? For EK80 data having both beam and beam power groups, do these functions currently return a dataset with a length-2 beam dimension?

@b-reyes
Contributor Author

b-reyes commented Apr 21, 2022

Are the changes to calibrate/calibrate_ek.py unrelated to the thrust of this PR? I don't see an obvious relationship to the addition of beam and ping_time dimensions. It's ok with me if they are in fact unrelated, but please make a note of that in your initial comment

@emiliom these changes are required because of the addition of the dimensions. I have added an edit to my initial comment.

@leewujung
Member

Late to the party!

On the potential blurring between set_groups_SENSOR.py and set_groups_base.py: one way to resolve this may be to just move those definitions (the beam_only_names, ping_time_only_names and beam_ping_time_names dicts) into another script, e.g. set_groups_dims_mod.py. These dicts can then be used by beamgroups_to_convention.

This is along the same lines as our core.py, BUT core.py is organized first by SENSOR and then by specific info, whereas here the order is reversed.

@leewujung
Member

I forgot to comment on downstream impacts on processed data like Sv. It looks like compute_Sv (for example) would return an xarray dataset that retains the beam dimension. I'm inferring that from the changes @b-reyes made to test_echodata.py and test_calibrate.py. @leewujung should compute_Sv and compute_TS be modified so beam is squeezed?

Good question. The main reason is that for EK80 complex data (Sonar/Beam_group1), compute_Sv involves a .mean(dim="beam") operation that changes the original length=4 beam dimension to length=1:

* np.abs(pc.mean(dim="beam")) ** 2
and
* np.abs(backscatter_cw.mean(dim="beam")) ** 2

So, there is a conceptual incongruence and potential confusion that takes place here. This is related to how we use the beam dimension to store data from each element of a split-beam transducer.

I wonder if it may be better if we remove the beam dimension for the time being given our current scope (scientific echosounder).


For EK80 data having both beam and beam power groups, do these functions currently return a dataset with a length-2 beam dimension?

No, the complex and power/angle data are calibrated separately in the current compute_Sv setup. Users have to pass in both the waveform_mode and encode_mode to use this function.

waveform_mode : {"CW", "BB"}, optional
    Type of transmit waveform.
    Required only for data from the EK80 echosounder
    and not used with any other echosounder.
    - `"CW"` for narrowband transmission,
      returned echoes recorded either as complex or power/angle samples
    - `"BB"` for broadband transmission,
      returned echoes recorded as complex samples
encode_mode : {"complex", "power"}, optional
    Type of encoded return echo data.
    Required only for data from the EK80 echosounder
    and not used with any other echosounder.
    - `"complex"` for complex samples
    - `"power"` for power/angle samples, only allowed when
      the echosounder is configured for narrowband transmission

Outputs would always just be length=1 along the beam dimension.

@emiliom
Collaborator

emiliom commented Apr 21, 2022

Thanks, @leewujung ! I just realized that this question of mine was non-sensical as written:

For EK80 data having both beam and beam power groups, do these functions currently return a dataset with a length-2 beam dimension?

I was conflating the existence of 2 beam groups with the 4 beam coordinate values. Sigh. But I guess your answer both to this bad question and my earlier comments is the same: at this time, compute_Sv/TS always return Sv/TS dataarrays with length-1 beam dimension. Based on what you've said, I suggest that we squeeze beam out for the returned datasets. It looks like it can be done easily at the very end of _compute_cal. It could be applied on the return statement itself.
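For reference, squeezing a length-1 beam dimension on the returned dataset could look like the sketch below. The function finalize_cal is a hypothetical stand-in for the end of _compute_cal, and the toy dataset is an assumption; only Dataset.squeeze itself is real xarray API.

```python
import numpy as np
import xarray as xr


def finalize_cal(ds):
    """Hypothetical stand-in for the return step of a calibration routine."""
    # drop=True removes the leftover scalar beam coordinate as well.
    return ds.squeeze("beam", drop=True)


# Toy calibrated output with a length-1 beam dimension.
sv = xr.Dataset(
    {"Sv": (("channel", "ping_time", "beam"), np.zeros((2, 3, 1)))},
    coords={"beam": ["1"]},
)

out = finalize_cal(sv)
```

Because the squeeze happens in one shared place, it would apply to every caller of that routine.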

@leewujung can you confirm? If you do, then, @b-reyes please push a new commit with this change. Note that you'll have to undo some of your changes to test_echodata.py and test_calibrate.py.

@b-reyes
Contributor Author

b-reyes commented Apr 21, 2022

Actually, I have two suggestions to create some separation of sensor-specific from generic stuff. I think this would work well for the open_raw aspects in this PR, but I don't have a good sense of the impacts on your planned open_converted changes:

  1. Move the beam_only_names, ping_time_only_names and beam_ping_time_names dicts to the set_groups_SENSOR.py modules, so they're passed as arguments to beamgroups_to_convention.
  2. Eliminate the sonar_model block in beamgroups_to_convention, and instead have the set_groups_SENSOR.py modules pass the appropriate sensor to the beamgroups_to_convention call.

@emiliom I think this could work. It wouldn't have too much of an effect on my plans for open_converted. I would essentially need to have the following in my conversion script:

from ...convert.set_groups_ek60 import SetGroupsEK60
from ...convert.set_groups_ek80 import SetGroupsEK80
from ...convert.set_groups_azfp import SetGroupsAZFP

Using SetGroupsXX I could then get access to the appropriate dictionary. To make this work though, I would need to make these dictionaries be global variables. One more note, if we do item 1., then there is no need to pass sonar_model into beamgroups_to_convention (so item 2. would go away).

@leewujung
Member

For the beam dimension on outputs of compute_Sv:

Based on what you've said, I suggest that we squeeze beam out for the returned datasets. It looks like it can be done easily at the very end of _compute_cal. It could be applied on the return statement itself.

@leewujung can you confirm? If you do, then, @b-reyes please push a new commit with this change. Note that you'll have to undo some of your changes to test_echodata.py and test_calibrate.py.

Yes, it is always length=1 and let's just squeeze it out.

@emiliom
Collaborator

emiliom commented Apr 21, 2022

Yes, it is always length=1 and let's just squeeze it out.

Thanks! @b-reyes : please go ahead and apply the changes I've suggested above. As you'll see, even though the squeeze operation will be implemented in the _compute_cal function, it will apply to both compute_Sv and compute_TS, as expected.

@emiliom
Collaborator

emiliom commented Apr 21, 2022

Using SetGroupsXX I could then get access to the appropriate dictionary. To make this work though, I would need to make these dictionaries be global variables. One more note, if we do item 1., then there is no need to pass sonar_model into beamgroups_to_convention (so item 2. would go away).

Great. I'm glad this is workable.

Please see also @leewujung's alternative approach for accomplishing the same thing. I don't have a strong preference for either my or her approach. You two can decide 😃

@leewujung
Member

I don't have a strong preference either. @b-reyes can decide :)

@b-reyes
Contributor Author

b-reyes commented Apr 22, 2022

Late to the party!

On the potential blurring between set_groups_SENSOR.py and set_groups_base.py: one way to resolve this may be to just move those definitions (the beam_only_names, ping_time_only_names and beam_ping_time_names dicts) into another script, e.g. set_groups_dims_mod.py. These dicts can then be used by beamgroups_to_convention.

This is along the same lines as our core.py, BUT core.py is organized first by SENSOR and then by specific info, whereas here the order is reversed.

@leewujung I definitely see how this could be done and it has the upside of being very simple to implement! I think the only downside to this type of approach is that sensor specific information would be in several places. I kind of like the idea of having one place for all of the information necessary for setting the groups (analogous to self._beamgroups in the sensor specific set_groups_XX.py files).

Since it is up to me ... I guess I will go with @emiliom's approach as it has the upside that all of this information is in one place.

@b-reyes
Contributor Author

b-reyes commented Apr 22, 2022

@emiliom and @leewujung based on the above comment it seems like the conclusion was that compute_Sv would return an xarray dataset that retains the beam dimension and this dimension would have length=1.

However, after running the tests I came across a situation where the parameters sonar_model='EK80', waveform_mode='BB', and encode_mode='complex' produce a beam dimension with length=4. So, it looks like we cannot just squeeze out the beam dimension at the very end of _compute_cal.
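A small sketch of why an unconditional squeeze breaks in this case: xarray refuses to squeeze a named dimension whose length is greater than one. The toy dataset and the guard function squeeze_beam_if_singleton are assumptions for illustration, not echopype code.

```python
import numpy as np
import xarray as xr


def squeeze_beam_if_singleton(ds):
    """Hypothetical guard: squeeze beam only when it has length 1."""
    if ds.sizes.get("beam") == 1:
        return ds.squeeze("beam", drop=True)
    return ds


# Toy output that still carries a length-4 beam dimension.
sv4 = xr.Dataset(
    {"Sv": (("channel", "ping_time", "beam"), np.zeros((2, 3, 4)))},
    coords={"beam": ["1", "2", "3", "4"]},
)

try:
    sv4.squeeze("beam", drop=True)
    raised = False
except ValueError:
    # xarray raises ValueError for a named dimension longer than one.
    raised = True

guarded = squeeze_beam_if_singleton(sv4)
```

A guard like this only avoids the exception; it does not explain why the dimension has length 4, which is the question raised here.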

As this seems like unexpected behavior, how should we handle this?

@leewujung
Member

@b-reyes : Thanks for finding this out! The extra beam dimension you saw in the outputs is there because that dimension was added to other variables, so it got carried through the computation. I'll push up some changes.

@b-reyes
Contributor Author

b-reyes commented Apr 22, 2022

@emiliom and @leewujung I have addressed all of the comments you have suggested. The main portion of code that I modified was changing where the sensor-specific sets for adding the dimensions are located. This was done following @emiliom's suggestions, mainly:

  1. Move the beam_only_names, ping_time_only_names and beam_ping_time_names dicts to the set_groups_SENSOR.py modules, so they're passed as arguments to beamgroups_to_convention.

Via Slack, it was decided that we will not be implementing the squeezing of beam in _compute_cal as suggested in this PR because there is more discussion that is needed. This will be addressed in #644.

beam_ping_time_names are used in set_groups_base and
in converting from v0.5.x to v0.6.0. The values within
these sets are applied to all Sonar/Beam_groupX groups.
"""
Collaborator

@emiliom emiliom Apr 22, 2022

I'm not sure, but the presence of this string block here threw me off. It's like a docstring right after a docstring.

Also, I may be wrong here, but I don't think I've seen "stuff" placed here in a class, outside of __init__. I guess it works! But I would've expected it to be within __init__.

Same goes for SetGroupsEK80 and SetGroupsAZFP.

Member

These are "class variables" that are not instance variables under __init__.

Member

Oh maybe you're just referring to the docstrings. Ha, I think there's some debate on where those go.

Contributor Author

@b-reyes b-reyes Apr 23, 2022

Yes, I know this string looks out of place. I was using it as a multiline comment, but if you prefer, I can use # instead. Regarding the variables: I placed them before __init__, which makes them global variables of the class. The benefit of this placement is that you do not have to initialize the class to use them. If you prefer, we can put them outside of the class at the top of the file.
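To make the distinction concrete, here is a minimal, generic Python sketch of class variables versus instance variables. The class name SetGroupsExample and the set contents are made up for illustration and do not reflect the real SetGroupsEK60 definitions.

```python
class SetGroupsExample:
    # Class variables: defined in the class body, outside __init__, and
    # shared by the class and all of its instances. A '#' comment like this
    # one cannot be mistaken for part of the class docstring.
    beam_only_names = {"backscatter_r", "backscatter_i"}

    def __init__(self, sonar_model):
        # Instance variable: exists only after an object has been created.
        self.sonar_model = sonar_model


# Class variables are reachable without instantiating the class.
names_from_class = SetGroupsExample.beam_only_names

# Instances see the very same object through attribute lookup.
instance = SetGroupsExample("EK60")
shared = instance.beam_only_names is SetGroupsExample.beam_only_names
```

This is what allows a conversion script to import the class and read the name sets directly, without constructing a full set-groups object.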

Collaborator

Thanks, and sorry for my ignorance there. I like @leewujung 's description -- class variables rather than instance variables.

I think I'd vote for using #'s. Otherwise my first instinct was to read that block as part of the class docstring.

Contributor Author

@b-reyes b-reyes Apr 23, 2022

Thanks, and sorry for my ignorance there. I like @leewujung 's description -- class variables rather than instance variables.

Yes, that is a correct and good description.

I think I'd vote for using #'s. Otherwise my first instinct was to read that block as part of the class docstring.

Sounds good, unless @leewujung has any objections, I will go ahead and change this.

Member

Go ahead and change this, and feel free to self-merge!

@emiliom
Collaborator

emiliom commented Apr 22, 2022

I have addressed all of the comments you have suggested. The main portion of code that I modified was changing where the sensor-specific sets for adding the dimensions are located.

Looks good, thanks! I just had one style comment; see my inline comment.

@leewujung
Member

I don't have other comments. I think this is ready to be merged!

@b-reyes b-reyes merged commit 6082ec6 into OSOceanAcoustics:dev Apr 23, 2022
@b-reyes b-reyes deleted the add-beam-dim branch April 23, 2022 02:08
@leewujung leewujung added data conversion data format Anything about data format labels Apr 23, 2022
@b-reyes b-reyes mentioned this pull request May 3, 2022
Labels
data conversion data format Anything about data format
Projects
Status: Done
4 participants