Skip to content

Conversation

@xylar
Copy link
Collaborator

@xylar xylar commented Oct 16, 2016

All geojson files produced by all features scripts except
split_features.py and set_group_name.py now automatically set the
groupName property of the feature collection to 'enterGroupName'.
split_features.py produces individual feature files without a
groupName property (as before). The new script set_group_name.py
is used to set the groupName property of an existing feature file.

The driver scripts setup_MOC_basins.py has been merged into
setup_ocean_basins.py, and the combined script has been updated
to produce fewer output files, each of which has its own group name.

All geojson files produced by all features scripts except
split_features.py and set_group_name.py now automatically set the
groupName property of the feature collection to 'enterGroupName'.
split_features.py produces individual feature files without a
groupName property (as before).  The new script set_group_name.py
is used to set the groupName property of an existing feature file.

The driver scripts setup_MOC_basins.py and setup_ocean_basins.py
have been updated to produce fewer output files, each of which
has its own group name.
@xylar
Copy link
Collaborator Author

xylar commented Oct 16, 2016

@pwolfram and @milenaveneziani, I think there is a problem with the workflow that results from how groupNames are being set after #67. Intuitively, a user might expect to set the groupName property in a file at some point in the workflow (and this could reasonably occur anywhere in the course of the workflow), and then merging additional features into the file could reasonably be expected to leave the groupName unchanged. This is not what happens with #67. Instead, the groupName must be set by each call to any feature manipulation script or else the this property resets (presumably unexpectedly) to unspecifiedGroupName.

This merge seeks to make the workflow more intuitive. The only means to set the groupName is with the set_group_name.py script, which reads from and writes to the same file (unlike all other scripts, so let me know if this bugs you). All other scripts except for split_features.py write features in the "default" way, which means that the groupName remains unchanged if the output file already exists and has a groupName property, and that groupName is given the default value of enterGroupName (to be consistent with the MPAS mask creator tool) either if the output file is newly created or if groupName has (somehow) not been set.

split_features.py produces features without the groupName property, as before.

@xylar
Copy link
Collaborator Author

xylar commented Oct 16, 2016

In the process of updating the driver scripts that currently use groupName, I decided it made sense to try to address the parts of #70 that I understand by consolidating setup_MOC_basins.py into setup_ocean_basins.py. I also needed to addressing #73 by making the consolidated script produce just two two output files oceanBasins.geojson and MOCBasins.geojson. I thought of making these changes a separate PR but it turned out to be impractical to try to make these changes stepwise. In the modified script, I need to set the group name only once for each final output file.

@xylar
Copy link
Collaborator Author

xylar commented Oct 16, 2016

@milenaveneziani, I am not retaining the unmasked MOC basins (before the region below either 6S or 34S is masked out) at all in setup_ocean_basins.py in this PR. My guess is these are never used but I wanted to make sure. I wouldn't think these features would be useful but since they were given a groupName, it seemed like you were maybe using it for something on the MPAS-O side.

Also, is there any point in your workflow when you need the individual basin files (e.g. Arctic.geojson) as opposed to a single feature file, oceanBasins.geojson, with all basins in it?

@xylar
Copy link
Collaborator Author

xylar commented Oct 16, 2016

@milenaveneziani, these changes to the driver scripts may require changes to your workflow on the MPAS-O side. Let me know if you require changes to this PR to make that easier or if I've broken something.

@xylar
Copy link
Collaborator Author

xylar commented Oct 16, 2016

This PR is intended to address #72 and #73, and to partially address #70.

@xylar
Copy link
Collaborator Author

xylar commented Oct 16, 2016

Testing

I have made sure that the combined driver script modified by this PR produce the expected output files with the expected group names.

I made sure that features created by merge_features.py and difference_features.py with new files provided via the -o have the default enterGroupName value for groupName.

I made sure that split_features.py produced features without a groupName property.

@xylar xylar changed the title Change how groupName is set Change how groupName is set and simplify output from ocean basin driver script Oct 16, 2016
@xylar xylar changed the title Change how groupName is set and simplify output from ocean basin driver script Change how groupName is set Oct 16, 2016
@xylar xylar changed the title Change how groupName is set Change how groupName is set; update setup_ocean_basins Oct 16, 2016
@milenaveneziani
Copy link
Collaborator

@xylar: I am going to test this now. Here is what I thought I could do:

  1. run the setup_ocean_basins script and check results
  2. produce a new group of features. To do this, I think I should:
    a) run merge_features to get a single feature with several regions in it
    b) run set_group_name on the merged product to set the groupname for it
    Sounds good?
    Also, let me know if you'd prefer me to do additional testing.

@milenaveneziani
Copy link
Collaborator

milenaveneziani commented Oct 18, 2016

@xylar: I did the tests I mentioned and all seems fine. I created the following gists to show the results:

  1. oceanBasins from setup_ocean_basins: https://gist.github.com/milenaveneziani/f56fcdd82c5e6117172af761a68465fe
  2. MOCBasins from setup_ocean_basins: https://gist.github.com/milenaveneziani/6a27197e2a515d98d56913e29188d0a9
  3. my own MedSea region group, running merge_features several times to create a single feature, and then running set_group_name to assign the group name to the merged feature: https://gist.github.com/milenaveneziani/96b6fdc5e51eea14573ab3d7632b4e2c

@milenaveneziani
Copy link
Collaborator

Should we also allow setup_ocean_basins to add Global_Ocean_65N_to_65S to the MOCBasins group?

@xylar
Copy link
Collaborator Author

xylar commented Oct 18, 2016

@milenaveneziani, sorry, I didn't get to that earlier. I just added Global_Ocean_65N_to_65S. Make sure it works as you expected (see my latest commit, d835d30, for what changed).

@xylar
Copy link
Collaborator Author

xylar commented Oct 18, 2016

@milenaveneziani, it looks like we're having no problem creating the geojson files correctly. Can you make sure the resulting files do what they're supposed to do when they're run through the MPAS mask creator? I just want to make sure we haven't broken regional stats somehow.

@milenaveneziani
Copy link
Collaborator

milenaveneziani commented Oct 18, 2016

Thanks @xylar for adding the Global feature.

I ran the MOCBasin feature through the mask creator tool, using the EC60to30 mesh, and the mask file looks good to me in terms of the variables/structure, etc (the file is on IC, here: /usr/projects/climate/milena/MPAS-git-repositories/MPAS-Tools/grid_gen/mesh_conversion_tools/MOCBasinRegionsMask_oEC60to30.151031.nc).
I remember that @douglasjacobsen had a tool to plot the created masks, but I can't find it at the moment.

Two odd things that I find are:

  1. an ncdump of the 'constituentsRegions' variable gives weird results for what I think is the Global_Ocean_65N_65S feature, and incomplete results of the actual constituents for the other features (incomplete because the string is longer than 64 characters). Specifically, here is what I get:
    ncdump -v regionNames $maskfile
    regionNames =
    "IndoPacific_MOC",
    "Indian_MOC",
    "Atlantic_MOC",
    "Pacific_MOC",
    "Global Ocean 65N to 65S" ;

ncdump -v constituentsRegions $maskfile
constituentsRegions =
"Gulf of Aden; Savu Sea; Solomon Sea; Bass Strait North; Sea of ",
"Gulf of Aden; Andaman or Burma Sea; Laccadive Sea; Gulf of Oman",
"Gulf of Guinea; Ionian Sea; Caribbean Sea; Balearic Sea; Rio de",
"Savu Sea; Solomon Sea; Sea of Okhotsk; Japan Sea; Gulf of Tomin",
"\000/ncks -x -v xtime /lustre/scratch1/turquoise/mpeterse/ACME/case" ;

  1. there are 2 region groups, the MOCBasin group and 'all'. But then maxRegionsInGroup is 5, so 'all' doesn't really point to anything. I guess this is a low priority concern at the moment.

@DieMuhkuh, @q-rai: do you mind taking a look at the mask file I mentioned above and see if you think it's the correct format for the MOC and MHT regional AM to work?

@xylar
Copy link
Collaborator Author

xylar commented Oct 18, 2016

an ncdump of the 'constituentsRegions' variable gives weird results for what I think is the Global_Ocean_65N_65S feature, and incomplete results of the actual constituents for the other features (incomplete because the string is longer than 64 characters).

Interesting. constituentRegions isn't necessarily something that we need to preserve in the NetCDF file, so maybe the 64 character limit isn't a problem. @pwolfram, do you have any suggestions for how to make sure the full provenance propagates to the NetCDF files in this case?

The weird text you're seeing for Global Ocean 65N to 65S is probably because it has no constituentRegions property and the mask creator apparently expects all features to have the same properties. My guess is this is an uninitialized garbage string. @milenaveneziani, could you post this as an issue on MPAS-Tools and assign it to me? Should be easy to fix, I think.

there are 2 region groups, the MOCBasin group and 'all'. But then maxRegionsInGroup is 5, so 'all' doesn't really point to anything. I guess this is a low priority concern at the moment.

That all looks correct to me. The two groups (all and MOCBasin) both contain all 5 regions, which is what I would have expected. @milenaveneziani, did you expect a different behavior?

@milenaveneziani
Copy link
Collaborator

I guess I didn't really know what to expect from 'all'. Is it simply all the regions in all groups besides 'all' itself?

@xylar
Copy link
Collaborator Author

xylar commented Oct 18, 2016

Yes, I would expect all just has all the regions in the file inside it. Assuming all regions have to belong to at least one group (besides all), what you say would be true. Maybe there's also a way to have a region that doesn't belong to any groups besides all, but I don't have enough experience with the mask creator to know.

@milenaveneziani
Copy link
Collaborator

I'd say let us not worry about this now.
I would want to hear from @DieMuhkuh first before merging, to make sure that the maskfile I created is compatible with the MOC am.

@DieMuhkuh
Copy link

@milenaveneziani I looked at the mask file you pointed to and I think this will work for the MOC AM and also for the MHT AM (although Anne might want to confirm the last statement herself).

@milenaveneziani
Copy link
Collaborator

@pwolfram, @xylar: I think we can go ahead and merge this one, don't you think?

@pwolfram
Copy link
Contributor

@milenaveneziani, I'll do some quick testing and assuming everything is ok I'll merge.

@pwolfram
Copy link
Contributor

First, let's make sure the unresolved comments above have been addressed:

@xylar, I missed the comment above:

@pwolfram, do you have any suggestions for how to make sure the full provenance propagates to the NetCDF files in this case?

Did this get addressed? If not, we probably need an issue for it- especially if we merge this without fixing it. Note, however, that I think the 64-limit on the character string may be a consequence of how the netCDF file is defined.

Also, did @q-rai get a change to review this above @DieMuhkuh? I'd rather wait than merge something that is broken, unless we all agree that is ok.

@milenaveneziani
Copy link
Collaborator

@pwolfram: about the first question you ask: No, it was not addressed, but it may be that we don't need to address it, unless we decide to increase the max character limit to much more than 64. Problem is: how much longer can it be? The constituent regions could be many many, so we will never know what limit to use.
About the second question: the mask files should be used in the MOC and MHT am's to be properly tested. But this is not possible at the moment because we don't have the corresponding southern transects (PR #75). So, this is why I was simply asking @DieMuhkuh or @q-rai to check the netcdf file.

@xylar
Copy link
Collaborator Author

xylar commented Oct 21, 2016

I think we can merge. @pwolfram, the NetCDF provenance issue is not something that can be addressed in this repo in any case. It relates to MPAS-Tools, specifically to the mask creator. I don't think the 64-character limit is a NetCDF issue, just a choice made in the mask creator itself. I agree with @milenaveneziani that it would be hard to determine a length that would be sufficient, so maybe best to let it go.

If there are issues with MOC/MHT later, we can address them with a follow-up PR.

@pwolfram
Copy link
Contributor

@milenaveneziani and @xylar, I suppose we will agree to ignore the character overflow question for now. I've posted an issue at MPAS-Dev/MPAS-Tools#145 and it is possible that we could automate selection of the string length if necessary.

@pwolfram
Copy link
Contributor

@xylar, the resultant files from ./driver_scripts/setup_ocean_basins.py are not BFB, e.g., in a comparison with master in directory geometric_features:

┌─[pwolfram][shapiro][~/Documents/MPAS_pull_requests/geometric_features_groupName][14:51][±][ ✗]
└─▪ diff Atlantic_Basin.geojson geometric_features/Atlantic_Basin.geojson  | head -n 50
2a3
>     "groupName": "OceanBasinRegionsGroup", 
7c8
<                 "name": "Atlantic_MOC", 
---
>                 "name": "Atlantic_Basin", 
9c10
<                 "constituents": "Gulf of Guinea; Ionian Sea; Caribbean Sea; Balearic Sea; Rio de La Plata; Tyrrhenian Sea; Bay of Fundy; Davis Strait; Strait of Gibraltar; Kattegat; Aegean Sea; Gulf of Bothnia; Hudson Bay; Norwegian Sea; Gulf of Finland; Inner Seas off the West Coast of Scotland; Labrador Sea; North Sea; Baffin Bay; Skaggerak; Mediterranean Sea - Eastern Basin; English Channel; Celtic Sea; Gulf of Riga; Mediterranean Sea - Western Basin; Greenland Sea; Bay of Biscay; South Atlantic Ocean; Adriatic Sea; Ligurian Sea; Hudson Strait; North Atlantic Ocean; Bristol Channel; Baltic Sea; Alboran Sea; Gulf of Mexico; Irish Sea and St Georges Channel; Gulf of St-Lawrence", 
---
>                 "constituents": "Gulf of Guinea; Caribbean Sea; Rio de La Plata; Bay of Fundy; Davis Strait; Kattegat; Gulf of Bothnia; Hudson Bay; Norwegian Sea; Gulf of Finland; Inner Seas off the West Coast of Scotland; Labrador Sea; North Sea; Baffin Bay; Skaggerak; English Channel; Celtic Sea; Gulf of Riga; Greenland Sea; Bay of Biscay; South Atlantic Ocean; Hudson Strait; North Atlantic Ocean; Bristol Channel; Baltic Sea; Gulf of Mexico; Irish Sea and St Georges Channel; Gulf of St-Lawrence", 
19,20c20,21
<                             28.272636, 
<                             36.736936
---
>                             -61.046455, 
>                             9.576609
23,24c24,25
<                             28.247500, 
<                             36.766109
---
>                             -61.084164, 
>                             9.582218
27,28c28,29
<                             28.235482, 
<                             36.802836
---
>                             -61.197536, 
>                             9.578364
31,32c32,33
<                             28.246109, 
<                             36.830273
---
>                             -61.255982, 
>                             9.588891
35,36c36,37
<                             28.261936, 
<                             36.845127
---
>                             -61.383055, 
>                             9.680273
39,40c40,41
<                             28.284300, 
<                             36.845964
---
>                             -61.407227, 
>                             9.704718
43,44c44,45
<                             28.310836, 
<                             36.825827
---

Even the name changes from Atlantic_Basin to Atlantic_MOC. Are we all ok with allowing changes in output geojson files with this PR? Basin is changed to MOC unilaterally which seems ok. However, it doesn't seem ok that the geometry is being affected.
Full output from the diff comparisons at https://gist.github.com/730bbf10de151f7bcd70458944b2bfed.

Some perspective on this would be really helpful. If you want me to merge without further testing that is ok too.

@xylar
Copy link
Collaborator Author

xylar commented Oct 21, 2016

@pwolfram, I don't understand what you're seeing. I'm going to give you a quick call to discuss.

@pwolfram pwolfram merged commit d835d30 into MPAS-Dev:master Oct 21, 2016
@pwolfram
Copy link
Contributor

We sorted out this issues and a new PR will be submitted soon to handle alphabetization of geojson output in preparation for checking BFB and CI testing.

@pwolfram
Copy link
Contributor

Note, there are some tricky issues with alphabetization of geojson output and we may be back to a custom writer that would require more discussion. @xylar and I will keep this in mind for the future.

@q-rai
Copy link

q-rai commented Oct 23, 2016

@milenaveneziani I think it should be fine as long as there are region groups defined. I tried looking at the file but wtrw isn't letting me connect right now. I'll try again tomorrow.

@milenaveneziani
Copy link
Collaborator

@pwolfram, @DieMuhkuh : my suggestion is to merge this, then test #78 with the meshfiles that we create within the MOC AM, and then merge #78 as well.

@xylar xylar deleted the fix_groupName branch October 24, 2016 14:10
@xylar
Copy link
Collaborator Author

xylar commented Oct 24, 2016

@milenaveneziani, this has already been merged.

@milenaveneziani
Copy link
Collaborator

oh, of course! didn't realize that.. :)
Great.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants