Bug: 345 mis-geocoded points found near Kirkwood and College Ave #6

Open
markstos opened this issue Jan 30, 2024 · 2 comments · May be fixed by #8

@markstos

This issue can be found on the published interactive map, but here's a screenshot of the same data loaded into QGIS, which makes the problem more apparent by displaying the primary and secondary roads as a label.

The location is just to the southeast of Kirkwood and College Avenue. So all the labels should read something like "COLLEGE AVE & KIRKWOOD", but instead they show a seemingly random assortment of streets around town, and even I-69.

There are so many points that they can't all be displayed easily here, but in QGIS I counted 345.

image - 2024-01-29T201733 559

I don't think the street names are really random, though. I believe they are all addresses that failed to geocode normally.

For example, I think "MCNUTT CIRCLE" may be the informal drive in front of the building, and I don't expect mile marker notations like "I-69 MM#109" to geocode well either. My guess is that, failing street-level accuracy, the geocoder was allowed to fall back to "city-level" accuracy, with the courthouse being the symbolic point that represents the location of Bloomington, and this intersection being one of the closest to the courthouse.

Sometimes geocoders can be configured to fail if they can't return a precise location, which may have been appropriate here.

This issue was found in master.geojson. The dates of the crashes involved span the whole range from 2003 to 2022.

I reviewed the accuracy of the source data around College Ave & Kirkwood and it looks highly accurate. It appears that the issue was introduced during the re-geocode step here in an attempt to fill in missing coordinates and make others more accurate.
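For anyone who wants to check for this kind of pile-up without QGIS, here is a rough sketch. It assumes the fallback results landed on (nearly) the same coordinate, which I have not confirmed, and that the file is a standard GeoJSON FeatureCollection of Point features. The function name, `min_count`, and `precision` are made up for illustration and are not part of this repo.

```python
import json
from collections import Counter


def count_stacked_points(geojson_text, min_count=50, precision=5):
    """Return [((lon, lat), count), ...] for coordinates shared by many points.

    Hypothetical helper: min_count and precision are arbitrary starting
    values, not thresholds taken from this project.
    """
    data = json.loads(geojson_text)
    counts = Counter()
    for feature in data.get("features", []):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            continue  # skip features with missing or non-point geometry
        lon, lat = geometry["coordinates"][:2]
        # Round so that tiny floating-point differences still group together.
        counts[(round(lon, precision), round(lat, precision))] += 1
    # most_common() sorts by count, largest first.
    return [(coord, n) for coord, n in counts.most_common() if n >= min_count]
```

Running it on the contents of master.geojson and looking at the top few entries should show whether a large cluster sits at a single location near the courthouse.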

@markstos
Author

cc: @carsonology

@markstos
Author

From reading the code, the geocoding was done with the ArcGIS geocoder, which I'm familiar with.

First, I can see that the ArcGIS geocoder has the option to return results as the nearest point on the street (the default) or the center of the roof. That aligns with my theory that the results correspond to the street nearest the courthouse. This is described here, as "location_type": https://developers.arcgis.com/python/api-reference/arcgis.geocoding.html#geocode

Second, each geocoded result includes a field called Addr_Type that describes the accuracy of the match. That's documented here: https://pro.arcgis.com/en/pro-app/latest/help/data/geocoding/what-is-included-in-the-geocoded-results-.htm

One of the options for the accuracy is "Locality", which again aligns with my theory that some points were geocoded generically to "Bloomington".

From reading the geocoding code in this repo, there was no check against the accuracy level of the geocode, so the points were used no matter their level of precision.

So a fix could be to check the Addr_Type of the points returned and decline to use any that aren't particularly accurate.
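A minimal sketch of what that check might look like, not the code in this repo. It assumes the candidates are the list of dicts that `arcgis.geocoding.geocode()` returns, each with a `location` and an `attributes` dict containing `Addr_Type`. The function name and the allow-list of types are my own assumptions; which types count as "accurate enough" is a judgment call for the maintainers.

```python
# Assumption: these Addr_Type values are precise enough for crash locations.
# The list is illustrative; adjust it after reviewing the ArcGIS documentation.
PRECISE_ADDR_TYPES = {"PointAddress", "StreetAddress", "StreetInt"}


def pick_precise_location(candidates, allowed_types=PRECISE_ADDR_TYPES):
    """Return the location of the first sufficiently precise candidate.

    Hypothetical helper. `candidates` is expected to look like the result
    of arcgis.geocoding.geocode(): a list of dicts ordered best match first.
    Returns None when no candidate has an allowed Addr_Type, so the caller
    can keep the original coordinates instead of a city-level fallback.
    """
    for candidate in candidates:
        addr_type = candidate.get("attributes", {}).get("Addr_Type")
        if addr_type in allowed_types:
            return candidate["location"]
    return None
```

With this in place, an input like "I-69 MM#109" that only matches at the "Locality" level would return `None`, and the crash would keep whatever coordinates it had in the source data rather than being moved to the point near the courthouse.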
