Update feature_write_utils to use json.dump #52
|
@pwolfram, in the process of some work I'm doing, I needed to be able to write out a MultiPoint object, which was not supported by the writer. I did some poking around and found that you can use Please take a look and test these scripts out. I have rebased #49 onto this so please merge this before that one (which I am still testing). I haven't yet fully tested these changes, and they will affect init_step1 of MPAS-O global_ocean runs so we need to be careful that these changes do no harm. I will let you know when I think the PR is ready to merge but I wanted to get your feedback and give you a chance to look over the code earlier rather than later in that process. Obviously, this is cleanup so it's not a high priority. But I will be building other work (in addition to #49) off of this PR. |
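As a minimal sketch of why the standard library helps here: `json.dump` serializes whatever nested dicts and lists it is given, so a MultiPoint geometry needs no special-case writer code. The feature below, its property values, and the output file name are made up for illustration and are not taken from the repository.

```python
import json
import os
import tempfile

# Hypothetical feature; the name and coordinates are illustrative only.
feature = {
    "type": "Feature",
    "properties": {"name": "Example moorings", "object": "point"},
    "geometry": {
        "type": "MultiPoint",
        "coordinates": [[-30.0, 45.0], [-28.5, 46.25]],
    },
}
collection = {"type": "FeatureCollection", "features": [feature]}

# Write to a temporary file, then read it back to confirm the round trip.
out_file_name = os.path.join(tempfile.mkdtemp(), "multipoint.geojson")
with open(out_file_name, "w") as out_file:
    json.dump(collection, out_file, indent=4)

with open(out_file_name) as in_file:
    round_trip = json.load(in_file)
```

The same call would handle any other geometry type, because the writer no longer needs to know how each one is laid out.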
96e9df5 to aa5208d
|
@pwolfram, I was running into an issue where |
7659362 to 56b3ffc
|
@xylar, do you need this within the next day or is early next week ok? |
|
No, please take your time. I'm using this as part of my efforts to solve #44, and I am still either debugging it or making small improvements. Also, I really need to test it on Thanks for checking in. |
|
@pwolfram, I have tested all the scripts in the main directory and they work as I expected with these modifications. (I had trouble with the scripts for splitting at the prime meridian and antimeridian, but that is an issue for another time and unrelated to this PR. If I give them a feature that doesn't need splitting, they write it out correctly.) I have tested nearly all of the Please give this a look when you have time. I would appreciate any feedback you have on coding style since this might be a good chance to address stylistic issues in the A good set of tests might be to merge in a few features and make sure you, too, can use each of the scripts to manipulate the features without running into trouble. If you have any existing workflow that makes use of this repo, you should almost certainly test that out with these changes. Thanks for agreeing to review this. |
|
@xylar, I've started to test this and am finding some potential bugs (or potentially just user error that needs to be clarified in the documentation). Note that this may not be a complete list because I'll test again after we resolve the hard crash listed below.
mkdir test_feature_collection
cd test_feature_collection/
ln -s ../ocean/region/North_Atlantic_Ocean/region.geojson NA_region.geojson
ln -s ../ocean/region/South_Atlantic_Ocean/region.geojson SA_region.geojson
ln -s ../ocean/region/Southern_Ocean/region.geojson SO_region.geojson
cd ..
./merge_features.py -d test_feature_collection/ -o AO_SO.geojson
./combine_features.py -f AO_SO.geojson -n AOSO -o AO_SO_combined.geojson
yields
┌─[pwolfram][shapiro][~/Documents/MPAS_pull_requests/simplify_feature_write_utils][16:25][±][ ✗]
└─▪ ./combine_features.py -f AO_SO.geojson -n AOSO -o AO_SO_combined.geojson
Traceback (most recent call last):
File "./combine_features.py", line 107, in <module>
write_all_features(features, out_file_name, indent=4)
File "/Users/pwolfram/Documents/MPAS_pull_requests/simplify_feature_write_utils/utils/feature_write_utils.py", line 11, in write_all_features
features['features'][index] = check_feature(features['features'][index])
File "/Users/pwolfram/Documents/MPAS_pull_requests/simplify_feature_write_utils/utils/feature_write_utils.py", line 68, in check_feature
outFeature = OrderedDict((('type',feature['type']),
KeyError: 'type'
Regarding the output style, e.g., where the standard approach with tabs represented by I think that it would be better to have lat/lon points on one line as opposed to four but there is definitely something to be said for using the standard json approach, for a variety of reasons. I can't imagine that most people will read/edit the geojson file by hand from the raw text, although they clearly can in either case. I would say both formats have their pros/cons but that using an established library brings the benefit of standardization with the broader community, which in my view is always a good thing. Regarding coding style:
@xylar, this should get us started but please let me know if there is something specific I should look at in the meantime before doing a more comprehensive review following mitigation of some of the issues discussed in this comment. |
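The traceback above shows `check_feature` failing with a bare `KeyError` when a feature dict has no `'type'` key. One possible way to make such failures easier to diagnose, sketched here under assumptions, is to check for the expected keys up front and report all missing ones at once. The function name `check_feature_sketch` and the list of required keys are hypothetical and are not the repository's actual code.

```python
from collections import OrderedDict

# Assumed set of keys, based on the members of a GeoJSON Feature object.
REQUIRED_KEYS = ("type", "properties", "geometry")


def check_feature_sketch(feature):
    """Return an ordered copy of feature, or raise a descriptive ValueError."""
    missing = [key for key in REQUIRED_KEYS if key not in feature]
    if missing:
        raise ValueError(
            "feature is missing required key(s): " + ", ".join(missing))
    # Only the required keys are copied; any other members are dropped.
    return OrderedDict((key, feature[key]) for key in REQUIRED_KEYS)
```

With a check like this, a malformed input would produce a message naming the missing keys rather than a `KeyError` from deep inside the writer.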
56b3ffc to 6414a47
We could do this but it's definitely beyond the scope of this PR, even if we're using this for some clean up. The intended use of |
I agree. I found the behavior of this script unexpected even though I wrote it. This should be included in this PR, so I'll get to it as soon as I can. |
I'll make an issue for this as soon as I can get to it. |
|
I believe I have fixed the type error in |
I don't think we can control the I find that I do edit features by hand myself, for example, in order to create simple masks from other masks or to see why there seem to be redundant or self-intersecting points in a given feature. So readability is definitely important. But I find either the old or the new format approximately equally readable and the new is much easier to maintain. |
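To make the formatting trade-off concrete, here is a small sketch of how the `indent` argument of the standard `json` module lays out coordinates. The geometry fragment is made up for illustration. With `indent=4`, every list element goes on its own line, so a single lon/lat pair spans four lines (opening bracket, two numbers, closing bracket); without `indent`, the whole structure is written on one line.

```python
import json

# Hypothetical geometry fragment with a single lon/lat pair.
geometry = {"coordinates": [[-30.0, 45.0]]}

# Indented output: each list element, including each number, gets its own line.
indented = json.dumps(geometry, indent=4)

# Default output: everything on a single line.
compact = json.dumps(geometry)

print(indented)
print(compact)
```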
Agreed, I'll fix this wherever I find it in the util modules. |
I'm finding |
This makes it easier to support new geometries (such as MultiPoint) that weren't supported by the previous writer. It also means that no file formatting must be handled by calling scripts, since json.dump writes the full structure in one step. For convenience, write_all_features now takes a file name instead of a file pointer. Some gymnastics were required to make sure that properties come before geometry (which makes the files easier to manipulate). Also, the formatting, while similar, is not identical to the previous. write_single_feature has been removed since it can be handled by write_all_features in the only script that uses it. All scripts have been updated to work with the new syntax.
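The approach described in this commit message could look roughly like the sketch below. It assumes the call shape seen in the traceback earlier in the thread (`write_all_features(features, out_file_name, indent=4)`); the function bodies, the `_sketch` suffix, the helper `order_feature`, and the sample feature are all hypothetical rather than the repository's actual code. An `OrderedDict` is used so that `json.dump` emits "properties" before "geometry".

```python
import json
import os
import tempfile
from collections import OrderedDict


def order_feature(feature):
    """Copy a feature so that "properties" is emitted before "geometry"."""
    # Note: members other than these three are dropped in this sketch.
    return OrderedDict((
        ("type", feature["type"]),
        ("properties", feature["properties"]),
        ("geometry", feature["geometry"]),
    ))


def write_all_features_sketch(features, out_file_name, indent=4):
    """Write a feature collection to out_file_name with one json.dump call."""
    ordered = OrderedDict((
        ("type", "FeatureCollection"),
        ("features", [order_feature(f) for f in features["features"]]),
    ))
    with open(out_file_name, "w") as out_file:
        json.dump(ordered, out_file, indent=indent)


# Usage: the input deliberately lists "geometry" first.
sample = {"features": [{
    "geometry": {"type": "Point", "coordinates": [0.0, 0.0]},
    "properties": {"name": "Example point"},
    "type": "Feature",
}]}
sample_file_name = os.path.join(tempfile.mkdtemp(), "features.geojson")
write_all_features_sketch(sample, sample_file_name)

with open(sample_file_name) as in_file:
    written_text = in_file.read()
```

Passing a file name rather than a file pointer keeps the open/close handling in one place, which matches the convenience argument made in the commit message.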
6414a47 to 150ba10
Previously, tag_features.py changed tags in place. Now, it writes out a new file, consistent with all other manipulation scripts.
|
I think I've fixed |
|
@pwolfram, I will make two issues for |
|
By the way @xylar, excellent PR: +185 −333 for code. Less code is almost always better. |
|
@xylar, this is just a nit but can you please update the date / authors in all the scripts you edited? I'm ok with this being a separate commit and that probably would be better because we haven't been keeping close tabs on this like we should have been. There may be more edits and I'll let you know when I'm done reviewing. |
Yep, that was the basic idea. |
|
Okay, I updated the authors and last-modified dates. I agree that is something we want to do a better job of keeping track of. Let me know what else you'd like me to address. |
|
@xylar, does
└─▪ ./difference_features.py -h
usage: difference_features.py [-h] -f FILE1 -m FILE2 [-o PATH]
This script takes a file containing one or more feature definitions, that is
pointed to by the -f flag and a second masking feature definition, pointed
to with the -m flag. The masking features are masked out of (i.e. removed
from) the original feature definitions. The resulting features are placed
in (or appended to) the output file pointed to with the -o flag
(features.geojson by default).
Authors: Xylar Asay-Davis
Last Modified: 02/12/2016
optional arguments:
-h, --help show this help message and exit
-f FILE1, --feature_file FILE1
Single feature file to be clipped
-m FILE2, --mask_file FILE2
Single feature whose overlap with the first feature should be removed
-o PATH, --output PATH
Output file, e.g., features.geojson.
indicate that the file pointed to by "-m" MUST only have a single feature definition? It seems like in general we would want this file to be able to have multiple feature definitions. If so, this is a new feature request that we don't need to do anything with right now and I can submit an issue. |
|
Testing capability above:
./merge_features.py -d iceshelves/ -o iceshelves.geojson
./merge_features.py -f iceshelves/region/Amery_1/region.geojson -o Amery_all.geojson
./merge_features.py -f iceshelves/region/Amery_2/region.geojson -o Amery_all.geojson
./merge_features.py -f iceshelves/region/Amery_3/region.geojson -o Amery_all.geojson
./difference_features.py -f iceshelves.geojson -m Amery_all.geojson
./plot_features.py -f features.geojson
appears to work so we need to edit the doc string for |
|
@pwolfram, I will fix this but these comments are getting way outside the scope of this PR. Let's do a separate PR to address docstrings, coding style, etc. that need to be fixed but that weren't broken by this PR. |
Docstring and help now state correctly that the script supports multiple features to clip and multiple features as masks.
|
@pwolfram, I've updated the docstring and help for |
|
Thanks @xylar. I just want this to be as good as possible but you are right this is getting outside scope. I'm almost done testing... driver scripts work and as far as I can tell everything is working properly. One additional note, outside this PR: driver scripts use the old |

This merge removes code for manually writing out
geojson files and replaces it with a call to json.dump.
This makes it easier to support new geometries (such as MultiPoint)
that weren't supported by the previous writer. It also means that
less file formatting must be handled by scripts before calling
write_all_features, since json.dump writes the full structure in one step.
Internally, some gymnastics were required to make sure that "properties" come
before "geometry" (which makes the files easier to manipulate). Also,
the formatting of the resulting geojson files, while similar, is not identical
to that of the previous writer.
write_single_feature has been removed since it can be handled by
write_all_features in the only script that uses it.
All scripts have been updated to work with the new syntax.