Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct regionId of Sieniawka #24

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

goodhoko
Copy link

@goodhoko goodhoko commented Aug 6, 2020

Polish locality shouldn't belong to German region. This PR changes Sieniawka's region from Saxony to dolnoslaskie.

Note that this PR does not try to correct Sieniawka's geometry which is apparently also wrong as it's placed in Germany. Maybe, that's the cause of the wrong regionId which, I assume, is inferred by intersecting the centroid with region's polygons...?

I'm not sure if creating a PR here is the correct way to contribute.

Polish locality shouldn't belong to German region.
@stepps00
Copy link
Contributor

stepps00 commented Aug 7, 2020

Thank you @goodhoko!

The region_id (and the rest of the hierarchy) is derived from point-in-polygon work, which basically intersects the centroid and appends the "parent" hierarchy to each record. In this case, it looks like the record is purely in the wrong country (Germany), when it should be in Poland.

I need to poke around a bit more, but this may be a duplicate record that we can simply deprecate and flag as "not current". Let me get back to you on this, it would be nice to merge this fix soon.

@goodhoko
Copy link
Author

goodhoko commented Aug 9, 2020

The region_id (and the rest of the hierarchy) is derived from point-in-polygon work, which basically intersects the centroid and appends the "parent" hierarchy to each record. In this case, it looks like the record is purely in the wrong country (Germany), when it should be in Poland.

Thanks for clarification, @stepps00 . This also explains all other cases of this problem I discovered so far. Here's a snippet from our (unfortunately closed source) project that lists WOF ID's of localities that have hierarchies of other countries. (By no means exhaustive.)

It might help you with the poking. .)

# Some localities have wrong region id.
# This is a dictionary of correct regionIds (value) for such localities (key).
WOF_WRONG_REGION_ID = {
    # until https://github.com/whosonfirst-data/whosonfirst-data-admin-pl/pull/24 is resolved
    101841783: 85687301,
    # until https://github.com/whosonfirst-data/whosonfirst-data-admin-pl/pull/25 is resolved
    1293125413: 85687267,
    # German city but has Polish region (I run out of motivation to open PR in WHO's data repos)
    1125884123: 85682487,
    # German city but has French region
    1126027793: 85682491,
}

Could you maybe link the code that does the centroid intersection and hierarchy generation? I imagine this problem could be fixed there.

@stepps00
Copy link
Contributor

stepps00 commented Aug 10, 2020

Could you maybe link the code that does the centroid intersection and hierarchy generation? I imagine this problem could be fixed there.

This code lives in the following repos:

Some documentation about setup, here.

To be clear though, the code likely does not need to be updated. The issues you're seeing are driven by data. Either the existing data needs to be updated to "learn" about the new parent ids/hierarchy information, or we need to tweak centroids in existing records. The tools that intersect centroids/polygons are fairly straightforward.. we've just done quite a bit of updating to our admin data, so running records against the PIP tools again would likely solve some/all of the issues you're seeing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants