need some details about the imported OSM data #232

Closed
ehx-v1 opened this issue Apr 22, 2016 · 8 comments

Comments

@ehx-v1
Contributor

ehx-v1 commented Apr 22, 2016

I'm doing an internship at a company, working on an experimental routing solution (which my self-written library is also meant to be used for). As a result, I need to get the company-internal map data format into Neo4j Spatial, and I see two approaches to this:

  • Generate OSM out of the data and import it.
  • Push the data directly into the database and have the RTree generated afterwards.

The OSM route will most probably have some special corner cases that need to be handled, and there will also be some requirements the data has to meet for the RTree to be built.

So: Which way would you recommend, and what are the Must/MustNot's?

@craigtaverner
Contributor

The OSM data format is very specific to OSM and is quite unlike almost any other GIS format out there. I would strongly suspect that the company internal format is more like other common formats than like OSM. The main difference is that OSM is actually a global graph, with all connected geometries actually connected in the graph. Most classic GIS formats keep geometries as separate entities. For example, two land areas that share an edge will each have a copy of that edge in ESRI Shape files, but in OSM they will literally share the same nodes in the shared edge.
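
To make the difference concrete, here is a minimal Python sketch of two unit squares that share one edge. It is not OSM or Neo4j Spatial code, and all names in it are made up for illustration. The squares are first stored shapefile-style, with each polygon holding its own copy of the coordinates, and then OSM-style, with ways that reference shared node ids.

```python
# Shapefile-style: each polygon carries its own copy of every coordinate.
left_square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
right_square = [(1, 0), (2, 0), (2, 1), (1, 1), (1, 0)]

# OSM-style: coordinates live once in a node table, ways only reference ids.
# (Hypothetical ids, chosen for this example.)
nodes = {
    1: (0, 0), 2: (1, 0), 3: (1, 1), 4: (0, 1),
    5: (2, 0), 6: (2, 1),
}
left_way = [1, 2, 3, 4, 1]
right_way = [2, 5, 6, 3, 2]


def shared_node_ids(way_a, way_b):
    """Return the node ids used by both ways, sorted."""
    return sorted(set(way_a) & set(way_b))


def coordinates(way):
    """Resolve a way's node ids back to coordinates."""
    return [nodes[node_id] for node_id in way]


print(shared_node_ids(left_way, right_way))  # [2, 3]
```

In the first form nothing records that the two squares touch; in the second, the shared edge is visible directly because both ways reference nodes 2 and 3.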

For this reason, without knowing anything about your company internal format, I would recommend you write your own importer. Of course a lot has to do with how you intend to use the data within Neo4j. If your goal is to create a single connected graph, as is the case in OSM, then the OSM format is a reasonable intermediate format. But you need to do the work to figure out these connections. You need to, for example, take every pair of polygons and see if they share an edge and then make sure the OSM data you output has that edge actually shared (same nodes used for both). This is a non-trivial exercise, although the JTS library does have a lot of tools to do much of the heavy lifting for you.
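
The connecting-up step could be sketched roughly as follows in Python. This is only an illustration under a strong assumption: shared vertices have exactly identical coordinates. Real data usually needs snapping or a tolerance, which is where a geometry library such as JTS helps. The function names are hypothetical.

```python
from itertools import combinations


def build_shared_nodes(rings):
    """Turn coordinate rings into a node table plus ways of node ids.

    Identical coordinates get the same node id, so a vertex used by
    two polygons is stored once and referenced by both.
    """
    node_ids = {}  # coordinate -> node id
    ways = []
    for ring in rings:
        way = []
        for coord in ring:
            if coord not in node_ids:
                node_ids[coord] = len(node_ids) + 1
            way.append(node_ids[coord])
        ways.append(way)
    nodes = {node_id: coord for coord, node_id in node_ids.items()}
    return nodes, ways


def edges(way):
    """Undirected edges of a way, as a set of frozensets of node ids."""
    return {frozenset(pair) for pair in zip(way, way[1:])}


def shared_edges(ways):
    """Map each pair of way indexes to the edges both ways contain."""
    result = {}
    for (i, a), (j, b) in combinations(enumerate(ways), 2):
        common = edges(a) & edges(b)
        if common:
            result[(i, j)] = common
    return result


rings = [
    [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)],
    [(1, 0), (2, 0), (2, 1), (1, 1), (1, 0)],
]
nodes, ways = build_shared_nodes(rings)
print(ways)  # [[1, 2, 3, 4, 1], [2, 5, 6, 3, 2]]
```

Comparing every pair of polygons is quadratic in the number of polygons, so for a large data set a spatial index to pre-select candidate neighbours would be a sensible addition.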

If I were writing this code myself, I would write a new importer and a new GeometryEncoder inside Neo4j Spatial. I would not use OSM (which deals with the messiness of crowd-sourced data), and I would not convert to a limited format like SHP, which loses (or disallows) connected geometries.

If your company format is disconnected, and you are not planning to connect it up, then perhaps SHP is a convenient (and very widely supported) format to use. The GDAL OGR library will likely be of use in generating and managing shapefiles.

@craigtaverner
Contributor

I should also comment that the OSMImporter code is probably unreasonably complicated for your case. Rather look at the ShapefileImporter as a simpler example to follow when writing your importer.

The OSMImporter does a lot of fancy stuff you might find interesting, if you have the time to unravel its complexities, but I think it is also messy as a result of having to code a number of rules of thumb for dealing with the natural uncertainties found in crowd-sourced data. Hopefully your company internal data does not have that level of complexity.

@ehx-v1
Contributor Author

ehx-v1 commented Apr 25, 2016

thanks a lot!
modelling it after the OSMImporter would definitely be way too complex (considering that I lost track of its structure while simply trying to figure out where relationships are built and where the import ends)

so, all the RTree needs is data having a fitting GeometryEncoder? or are there more constraints upon the data the RTree relies upon to be built, like some necessary properties?

@ehx-v1
Contributor Author

ehx-v1 commented May 19, 2016

From what I understood of the REST API samples, there must be 2 properties the GeometryEncoder maps to lat and lon...
Anything else?

@craigtaverner
Contributor

All the RTree needs is a layer. The layer includes the GeometryEncoder, which formalizes how to map from a node to a Geometry. This mapping can use one, two or more properties, or an entire subgraph connected to that node (in OSM it is a subgraph). There is in fact no limitation other than the fact that every geometry is associated with a node, normally called the geometry node.

The examples and documentation might imply that there is a need for lat/long, but that would only be a symptom of the particular geometry encoder used in those examples. That is not a requirement of the RTree at all.
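
As a purely conceptual sketch, the relationship between an index and its encoder might look like the following. This is plain Python, not the Neo4j Spatial Java API, and the class and property names are invented for illustration. The index only ever asks the encoder for a bounding box, so it does not care which properties the geometry node carries.

```python
class LatLonEncoder:
    """Reads a point from two properties on the geometry node."""

    def __init__(self, lat_key="lat", lon_key="lon"):
        self.lat_key = lat_key
        self.lon_key = lon_key

    def envelope(self, node):
        x, y = node[self.lon_key], node[self.lat_key]
        return (x, y, x, y)  # (min_x, min_y, max_x, max_y)


class CoordinateListEncoder:
    """Reads a geometry from one property holding a list of (x, y) pairs."""

    def __init__(self, key="coords"):
        self.key = key

    def envelope(self, node):
        xs = [x for x, _ in node[self.key]]
        ys = [y for _, y in node[self.key]]
        return (min(xs), min(ys), max(xs), max(ys))


class Layer:
    """Stand-in for a layer: geometry nodes plus the encoder that reads them.

    A real RTree organises the envelopes hierarchically; a flat list is
    enough to show that only the encoder touches node properties.
    """

    def __init__(self, encoder):
        self.encoder = encoder
        self.entries = []  # list of (envelope, node)

    def add(self, node):
        self.entries.append((self.encoder.envelope(node), node))

    def search(self, min_x, min_y, max_x, max_y):
        """Return nodes whose envelope intersects the query box."""
        return [
            node
            for (ex1, ey1, ex2, ey2), node in self.entries
            if ex1 <= max_x and ex2 >= min_x and ey1 <= max_y and ey2 >= min_y
        ]


points = Layer(LatLonEncoder())
points.add({"name": "a", "lat": 55.6, "lon": 13.0})

areas = Layer(CoordinateListEncoder())
areas.add({"name": "b", "coords": [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]})
```

Both layers are searched the same way, for example `points.search(12, 55, 14, 56)` or `areas.search(1, 1, 3, 3)`, even though their nodes store geometry in completely different properties.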

@ehx-v1
Contributor Author

ehx-v1 commented May 25, 2016

I found out the company format is a specific format which is also sold as a service - you might have heard of Advanced Geographic Format. From all I understood, it's best to encode as WKT, perhaps using an intermediate custom XML format. I think from where I am, I can figure out the rest on my own. Thanks a lot!
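
For readers following the same route, a minimal sketch of producing WKT strings is shown here. It assumes the source data has already been parsed into (x, y) coordinate tuples; nothing in it is specific to the company format, and the helper names are hypothetical. Note that WKT writes coordinates in x y order, which for geographic data means longitude before latitude.

```python
def _coords(points):
    """Format (x, y) pairs as a WKT coordinate list."""
    return ", ".join(f"{x} {y}" for x, y in points)


def point_wkt(x, y):
    return f"POINT ({x} {y})"


def linestring_wkt(points):
    return f"LINESTRING ({_coords(points)})"


def polygon_wkt(ring):
    """Exterior ring only; the ring is closed here if the input is open."""
    if ring[0] != ring[-1]:
        ring = list(ring) + [ring[0]]
    return f"POLYGON (({_coords(ring)}))"


print(point_wkt(13, 55.6))                  # POINT (13 55.6)
print(linestring_wkt([(0, 0), (1, 1)]))     # LINESTRING (0 0, 1 1)
print(polygon_wkt([(0, 0), (1, 0), (1, 1), (0, 1)]))
# POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))
```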

@ehx-v1 ehx-v1 closed this as completed May 25, 2016
@ehx-v1 ehx-v1 changed the title need some details about the imported OSM data need some details about the imported -OSM- WKT data May 31, 2016
@ehx-v1 ehx-v1 changed the title need some details about the imported -OSM- WKT data need some details about the imported OSM WKT data May 31, 2016
@ehx-v1 ehx-v1 changed the title need some details about the imported OSM WKT data need some details about the imported <del>OSM</del> WKT data May 31, 2016
@ehx-v1 ehx-v1 changed the title need some details about the imported <del>OSM</del> WKT data need some details about the imported OSM data May 31, 2016
@ehx-v1
Contributor Author

ehx-v1 commented May 31, 2016

Ah never mind, I'll ask on a WKT topic.

@ehx-v1
Contributor Author

ehx-v1 commented May 31, 2016

By the way, in the end handling the XML is more complex than importing the generated WKT. That perfectly shows how nice your Java API is compared to other Java APIs!


2 participants