Issue with transportation theme #2

Closed
dbartles opened this issue Sep 20, 2023 · 5 comments
Labels
bug Something isn't working

Comments

@dbartles

I was able to use this to download places and everything ran as expected. When I change the parameters to download the transportation theme data (segments, connectors, or no ptype parameter specified) I get this error: Original Message: Error creating dataset. Could not read schema from 'overturemaps-us-west-2/release/2023-07-26-alpha.0/theme=transportation/type=segment/20230726_134827_00007_dg6b6_0cfe1913-2081-4429-b418-858bddb304bc'. Is this a 'parquet' file?: Map keys must be annotated as required.

The command that generates this error is pasted below. Again, this works fine for places-themed data, but it breaks when I change the theme and ptype.

docker run -v $(pwd):/examples --name omdownloader ghcr.io/youssef-harby/overturemapsdownloader:latest OMDownloader omaps --theme transportation --ptype connector --bbox /examples/bbox.geojson --output /examples/transportationconnector.parquet

Thanks!

Youssef-Harby added a commit that referenced this issue Sep 20, 2023
@Youssef-Harby
Owner

Hi @dbartles,

Thank you for bringing this issue to my attention.

I've implemented a workaround to resolve the issue you're experiencing with the transportation theme data. The problem was related to the S3 URLs from which the data is sourced. I've switched to the Wherobots S3 URLs to fix the schema reading issue.

You can now re-pull the latest Docker image using the following command, and it should address the problem:

docker pull ghcr.io/youssef-harby/overturemapsdownloader:latest

Please note that the updated Docker image will take approximately 2 hours to build and push. After that, you should be able to proceed without encountering the error.

Would you mind trying again once the new image is available, and let me know if everything works as expected?

Also see this may help

@Youssef-Harby Youssef-Harby added the bug Something isn't working label Sep 20, 2023
@dbartles
Author

Thanks Youssef - that worked to fix that error and the download did complete. However, the download gives some errors and returns what appears to be an empty file after converting to ESRI GDB (which also throws errors).

I am going to rerun my same query again with places themed data and the same BBOX to ensure that there isn't an issue with my BBOX here. Stay tuned and I will let you know.

This is the latter part of the download message I got for the transportation connectors that produced an empty result:
"[ ] | 0% Completed | 102.20 ms
/app/.venv/lib/python3.10/site-packages/dask/dataframe/core.py:8114: UserWarning: Insufficient elements for head. 10 elements requested, only 0 elements available. Try passing larger npartitions to head.
warnings.warn(
[########################################] | 100% Completed | 202.63 ms
[2023-09-20 12:05:10] INFO: Writing GeoDataFrame to "/examples/transportationconnector.parquet"
Empty GeoDataFrame
Columns: [updatetime, version, level, subtype, connectors, road, sources, bbox, geometry, theme, type, geohash]
Index: []"

@Youssef-Harby
Owner

@dbartles, I've just run the same steps provided in the README.md and tried it with the example bbox.geojson located in the /examples folder in the repo root (generate your own from https://geojson.io). I can confirm that the download of the parquet file was successful, and I was able to open it in QGIS without any issues.

Additionally, I tried converting it to a GeoPackage, and that also worked seamlessly. If you're trying to open the dataset in ArcGIS Pro, you might want to consider GeoPackage as an alternative to file geodatabase (GDB). I've found the GeoPackage format to be compatible and efficient across a range of geospatial workflows.

Let's see if the issue you're facing is specifically related to the ESRI GDB conversion process.

Can you share the bbox and the exact commands you are running? That will make this easier for me to debug.


@dbartles
Author

dbartles commented Sep 21, 2023

Hey Youssef, converting to GPKG works, which proves that the download works for the Transportation theme, and everything looks good!

The issue I was having is indeed with the conversion to ESRI file geodatabase. I am actually using QGIS so GPKG file format works fine for me.

Here is the command I am running to convert from .parquet to .gdb:

sudo docker run -v $(pwd):/examples ghcr.io/youssef-harby/overturemapsdownloader:latest ogr2ogr /examples/transportationconnector.gdb /examples/transportationconnector.parquet

The result of this is:

Warning 1: Field sources of unhandled type list<element: map<string, string ('element')>> ignored
Warning 1: Several drivers matching gdb extension. Using OpenFileGDB
ERROR 6: Unsupported geometry type
ERROR 1: Terminating translation prematurely after failed

However when I run (same command as above just with .gdb replaced with .gpkg):

sudo docker run -v $(pwd):/examples ghcr.io/youssef-harby/overturemapsdownloader:latest ogr2ogr /examples/transportationconnector.gpkg /examples/transportationconnector.parquet

I get this output and the resulting GPKG file looks great when loaded into QGIS:

Warning 1: Field sources of unhandled type list<element: map<string, string ('element')>> ignored
Warning 1: The output driver does not seem to natively support StringList type for field connectors. Converting it to String(JSON) instead. -mapFieldType can be used to control field type conversion.

It probably doesn't matter all that much anymore, but this is the content of my bbox.geojson file:

{"type":"FeatureCollection","features":[{"type":"Feature","properties":{},"geometry":{"coordinates":[[[-115.98910693870181,51.439037494527895],[-115.98910693870181,50.69653452393939],[-114.8264333807063,50.69653452393939],[-114.8264333807063,51.439037494527895],[-115.98910693870181,51.439037494527895]]],"type":"Polygon"}}]}
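For anyone who wants to sanity-check a bbox file before a download, here is a minimal sketch that prints the extent of the polygon above using only the Python standard library. The function name `bbox_extent` is hypothetical and not part of OMDownloader; the sketch assumes the file holds a FeatureCollection of Polygon features, as this one does.

```python
import json

# The bbox.geojson content quoted in this comment.
BBOX_GEOJSON = (
    '{"type":"FeatureCollection","features":[{"type":"Feature","properties":{},'
    '"geometry":{"coordinates":[[[-115.98910693870181,51.439037494527895],'
    '[-115.98910693870181,50.69653452393939],'
    '[-114.8264333807063,50.69653452393939],'
    '[-114.8264333807063,51.439037494527895],'
    '[-115.98910693870181,51.439037494527895]]],"type":"Polygon"}}]}'
)


def bbox_extent(geojson_text):
    """Return (west, south, east, north) over all Polygon rings in the text.

    Hypothetical helper for illustration; GeoJSON stores positions as
    [longitude, latitude], so the first value of each pair is the longitude.
    """
    collection = json.loads(geojson_text)
    lons, lats = [], []
    for feature in collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] != "Polygon":
            raise ValueError("expected Polygon, got " + geometry["type"])
        for ring in geometry["coordinates"]:
            for position in ring:
                lons.append(position[0])
                lats.append(position[1])
    if not lons:
        raise ValueError("no coordinates found")
    return min(lons), min(lats), max(lons), max(lats)


extent = bbox_extent(BBOX_GEOJSON)
print("west=%s south=%s east=%s north=%s" % extent)
```

If west is not less than east, or south is not less than north, the polygon is degenerate and an empty result would be expected regardless of the theme.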

Thank you for helping me sort this all out. I am excited to now be able to review and work with the transportation data!

@Youssef-Harby
Owner

@dbartles,
The errors you're encountering when converting the Parquet data to an ESRI file geodatabase (GDB) are indeed related to how the OGR toolset handles certain data types. The specific issue with unhandled nested data types, such as list<element: map<string, string ('element')>>, has been recognized and discussed in the GDAL/OGR community; see the GDAL issue "Parquet driver: unhandled data types from Overture maps datasets" (OSGeo/gdal#8227).

@rouault, a GDAL contributor, addressed this issue in August 2023. According to the commits referenced there, support for reading nested list/map data types as JSON has been added. I believe the fix will be released soon.

Note: You can open a GeoPackage in ArcGIS Pro as it's supported. Additionally, you can export a GeoPackage to GDB using QGIS, or convert a GeoPackage to GDB with ogr2ogr.
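As a rough illustration of the last option, the sketch below assembles an ogr2ogr command line for a GeoPackage-to-GDB conversion. The helper name `build_ogr2ogr_command` is hypothetical. Using `-f OpenFileGDB` assumes a GDAL build whose OpenFileGDB driver can write, and passing `-nlt POINT` for connectors is an untested guess at avoiding the "Unsupported geometry type" error, not a confirmed fix.

```python
import shlex


def build_ogr2ogr_command(src, dst, driver="OpenFileGDB", geometry_type=None):
    """Build an ogr2ogr argument list (hypothetical helper, nothing is executed).

    ogr2ogr expects the destination before the source. `-f` selects the
    output driver and `-nlt` declares the output layer geometry type.
    """
    command = ["ogr2ogr", "-f", driver]
    if geometry_type is not None:
        # Assumption: declaring an explicit geometry type may help drivers
        # that reject layers whose geometry type is unknown.
        command += ["-nlt", geometry_type]
    command += [dst, src]
    return command


command = build_ogr2ogr_command(
    "/examples/transportationconnector.gpkg",
    "/examples/transportationconnector.gdb",
    geometry_type="POINT",
)
# Print a copy-pasteable shell line instead of running it.
print(" ".join(shlex.quote(part) for part in command))
```

The printed line can be run inside the same Docker image used above, or passed to `subprocess.run(command, check=True)` on a machine where ogr2ogr is installed.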

For now, I'll close the issue. If there are further updates or concerns, feel free to comment. Hope this helps, and good luck with your geospatial projects!
