Coordinates modelling (add support for WKT to Flatten Tool) #10
Comments
Thanks for laying this out @duncandewhurst. Am coming into this with minimal background so feel free to ignore if out of scope. One thing I noticed is that flatterer seems to deal with geojson, including points and linestrings, quite well already. Here's what an example output csv looks like:
More generally, my view is that for the vast majority of applications that would use the geospatial data, it will be easier to import a GeoJSON file directly, so we shouldn't worry too much about presenting lat/long in an analysis-ready format in a CSV export. That being said, I don't love the idea of leaving the geospatial data out of the CSV export entirely, as this could cause confusion, e.g. if the CSV conversion is used for database imports. So option 1, or something like the flatterer output if that's an option, might be the best bet.
Thanks for the reminder about flatterer, @lgs85. I've opened a separate issue (#14) on deciding what tool to use so that we can keep this issue focused on the desired modelling.
I agree that GeoJSON would be easier for many use cases, but I think it would also be desirable to have the same information available in each publication format. I don't think we should use Flatterer's representation of
The W3C's Spatial Data on the Web Best Practice 8: State how coordinate values are encoded is a useful reference for this issue.
Coming back to this, I see no benefit in representing points as separate long/lat fields and LineStrings as WKT, as i) very few users will want to use just node data, so ii) they will have to parse the WKT for LineStrings anyway, iii) representing both as WKT has the advantage of consistency, which iv) makes conversion and conversion tooling easier. I therefore suggest we represent both points and LineStrings as WKT.
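To illustrate the parsing step that users would face either way, here is a minimal Python sketch that reads a 2D WKT POINT or LINESTRING back into a GeoJSON-style geometry. It is only an illustration under stated assumptions: the function name wkt_to_geojson and the sample values are hypothetical, it ignores other geometry types, Z/M dimensions and EMPTY geometries, and in practice a GIS library would normally do this work.

```python
import re

# Matches "POINT (...)" or "LINESTRING (...)" and captures the coordinate text.
_WKT_PATTERN = re.compile(r"^\s*(POINT|LINESTRING)\s*\((.*)\)\s*$", re.IGNORECASE)


def wkt_to_geojson(wkt):
    """Parse a 2D POINT or LINESTRING WKT string into a GeoJSON geometry dict.

    Hypothetical helper for illustration only; not part of Flatten Tool.
    """
    match = _WKT_PATTERN.match(wkt)
    if match is None:
        raise ValueError(f"unsupported WKT: {wkt!r}")
    kind, body = match.group(1).upper(), match.group(2)
    # Positions are separated by commas; numbers within a position by spaces.
    positions = [[float(n) for n in pair.split()] for pair in body.split(",")]
    if kind == "POINT":
        if len(positions) != 1:
            raise ValueError("a POINT must contain exactly one position")
        return {"type": "Point", "coordinates": positions[0]}
    return {"type": "LineString", "coordinates": positions}


# Sample values are made up for the example.
print(wkt_to_geojson("POINT (-0.174 5.626)"))
print(wkt_to_geojson("LINESTRING (-0.174 5.626, -0.178 5.629)"))
```

The same function handles both geometry types, which is the consistency argument in practice: one code path instead of one for long/lat columns and another for WKT.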
For the Alpha, we'll use the default format provided by Flatten Tool and look to update the tool to provide WKT format in the Beta.
Feedback from the World Bank's infrastructure map team is that WKT is expected for CSV files so I think we do want to use WKT for both point and linestring geometries.
@Bjwebb, I've created a draft PR with updated CSV examples showing what I expect the WKT format to look like. Please could you check that you're happy with it from a Flatten Tool perspective? Edit: Noting that I've replaced the whole
This looks like what I'm expecting.
We plan to reuse GeoJSON's Feature object to represent the physical location of a node (as a Point) and the route of a link between its endpoints (as a LineString) in both the JSON and GeoJSON formats.
Points
If we use Flatten Tool for conversion from the JSON format to the CSV format, Point geometries would be represented as a semi-colon separated list:
This poses two potential problems for users:
There are a couple of possible alternatives, either of which would require some special-casing in Flatten Tool:
Separate fields for longitude and latitude
This seems like the most user-friendly alternative. It is readily supported by QGIS, and presumably other GIS tools, and is equally usable for users who are not using GIS-specific tooling.
Well-known text
This option is readily supported by QGIS, and presumably other GIS tools, but is less usable for users who are not using GIS-specific tooling.
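To make the two alternatives concrete, here is a minimal Python sketch that derives both from a GeoJSON Point. It relies on GeoJSON's coordinate order of longitude then latitude. The function names, the column names longitude and latitude, and the sample coordinates are assumptions made for illustration; this is not Flatten Tool's behaviour.

```python
def point_to_lon_lat(geometry):
    """Split a GeoJSON Point into separate longitude and latitude fields."""
    if geometry["type"] != "Point":
        raise ValueError("expected a Point geometry")
    lon, lat = geometry["coordinates"][:2]  # GeoJSON order: longitude, latitude
    return {"longitude": lon, "latitude": lat}


def point_to_wkt(geometry):
    """Serialise a GeoJSON Point as well-known text: 'POINT (x y)'."""
    if geometry["type"] != "Point":
        raise ValueError("expected a Point geometry")
    lon, lat = geometry["coordinates"][:2]
    return f"POINT ({lon} {lat})"


# Hypothetical node location, used only for illustration.
node_location = {"type": "Point", "coordinates": [-0.174, 5.626]}
print(point_to_lon_lat(node_location))
print(point_to_wkt(node_location))
```

With separate fields the axis order is spelled out in the column names, whereas WKT relies on the reader knowing that the first number is x (longitude) and the second is y (latitude).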
Linestrings
I ran into some problems trying to flatten a GeoJSON LineString in Flatten Tool so I don't know what the default behaviour is. One possibility is a semi-colon separated list of semi-colon separated lists:
Another possibility is a multi-table representation, related by id:
Neither seems particularly desirable in terms of usability. Both would require substantial additional processing to import into GIS tools and the ordering of longitude and latitude is not explicit in either.
In terms of alternatives, separating longitude and latitude into separate fields would only work for the multi-table representation, which would still have significant usability issues. However, well-known text is an option:
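As a rough sketch of the well-known text option for routes, the Python below converts a GeoJSON LineString to WKT and writes it to a CSV row. The function name, the column names id and route, the identifier link-1 and the coordinates are hypothetical and chosen only for illustration. Because a WKT LINESTRING contains commas, the value has to be quoted in the CSV; Python's csv module does this by default.

```python
import csv
import io


def linestring_to_wkt(geometry):
    """Serialise a GeoJSON LineString as WKT: 'LINESTRING (x1 y1, x2 y2, ...)'."""
    if geometry["type"] != "LineString":
        raise ValueError("expected a LineString geometry")
    # GeoJSON positions are [longitude, latitude, ...]; keep the first two values.
    pairs = ", ".join(f"{lon} {lat}" for lon, lat, *_ in geometry["coordinates"])
    return f"LINESTRING ({pairs})"


# Hypothetical link route, used only for illustration.
route = {
    "type": "LineString",
    "coordinates": [[-0.174, 5.626], [-0.178, 5.629], [-0.185, 5.635]],
}

# Write one row to an in-memory CSV; the WKT cell is quoted automatically
# because it contains commas.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["id", "route"])
writer.writerow(["link-1", linestring_to_wkt(route)])
csv_text = buffer.getvalue()
print(csv_text)
```

The whole route stays in a single cell of a single table, which avoids both the nested semi-colon lists and the multi-table representation described above.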
Summary
Based on the analysis above, there are 3 options:
1. If consistency in the representation of Point and LineString geometries in the CSV format is desirable, then both could be represented using well-known text.
2. If consistency is not important, then Point geometries could be represented using separate longitude and latitude fields and LineString geometries could be represented using well-known text.
3. The detailed routes of links could simply be omitted from the CSV representation, since they are adequately handled by the JSON and GeoJSON formats.
The purpose of this issue is to surface any other options that should be considered, to seek feedback on the preferred option and to explore the implications for tooling.