fix: transportation theme #2
Youssef-Harby committed Sep 20, 2023
1 parent 5ae4b48 commit e6b7d2a
Showing 5 changed files with 67 additions and 15 deletions.
53 changes: 51 additions & 2 deletions README.md
@@ -29,10 +29,13 @@ Whether you're a data scientist, a geospatial analyst, or a developer, OvertureM
- [Data Manipulation/Downloading and Conversion](#data-manipulationdownloading-and-conversion-1)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [the docker image manily contains the following tools:](#the-docker-image-manily-contains-the-following-tools)
- [the docker image manily contains the following tools:](#the-docker-image-manily-contains-the-following-tools)
- [Usage](#usage)
- [Download Geospatial Data](#download-geospatial-data)
- [Commands](#commands)
- [Examples of downloading data CLI:](#examples-of-downloading-data-cli)
- [With Docker:](#with-docker)
- [Without Docker you can use the following commands with (`pip install overturemapsdownloader`):](#without-docker-you-can-use-the-following-commands-with-pip-install-overturemapsdownloader)
- [Convert Parquet to GeoPackage](#convert-parquet-to-geopackage)
- [Convert Parquet to MBTiles (will support tippecanoe in the future)](#convert-parquet-to-mbtiles-will-support-tippecanoe-in-the-future)
- [Convert Parquet to ESRI File Geodatabase vector (OpenFileGDB)](#convert-parquet-to-esri-file-geodatabase-vector-openfilegdb)
@@ -111,7 +114,7 @@ docker pull ghcr.io/youssef-harby/overturemapsdownloader:latest
3. Run the following command to download geospatial data:

```bash
docker run -v $(pwd):/examples --name omdownloader ghcr.io/youssef-harby/overturemapsdownloader:latest OMDownloader omaps --theme places --ptype place --bbox /examples/bbox.geojson --output /examples/places.parquet
docker run -v $(pwd):/examples ghcr.io/youssef-harby/overturemapsdownloader:latest OMDownloader omaps --theme places --ptype place --bbox /examples/bbox.geojson --output /examples/places.parquet
```

#### Commands
@@ -130,6 +133,52 @@ options:
- `--output PATH` Output file path (e.g., `places.parquet`)
- `--help` Show this message and exit.

## Examples of downloading data CLI:

### With Docker:

```bash
# admins/locality
docker run -v $(pwd):/examples ghcr.io/youssef-harby/overturemapsdownloader:latest OMDownloader omaps --theme admins --ptype locality --bbox /examples/bbox.geojson --output /examples/locality.parquet

# admins/administrativeBoundary
docker run -v $(pwd):/examples ghcr.io/youssef-harby/overturemapsdownloader:latest OMDownloader omaps --theme admins --ptype administrativeBoundary --bbox /examples/bbox.geojson --output /examples/admins.parquet

# buildings/building
docker run -v $(pwd):/examples ghcr.io/youssef-harby/overturemapsdownloader:latest OMDownloader omaps --theme buildings --ptype building --bbox /examples/bbox.geojson --output /examples/building.parquet

# places/place
docker run -v $(pwd):/examples ghcr.io/youssef-harby/overturemapsdownloader:latest OMDownloader omaps --theme places --ptype place --bbox /examples/bbox.geojson --output /examples/place.parquet

# transportation/connector
docker run -v $(pwd):/examples ghcr.io/youssef-harby/overturemapsdownloader:latest OMDownloader omaps --theme transportation --ptype connector --bbox /examples/bbox.geojson --output /examples/connector.parquet

# transportation/segment
docker run -v $(pwd):/examples ghcr.io/youssef-harby/overturemapsdownloader:latest OMDownloader omaps --theme transportation --ptype segment --bbox /examples/bbox.geojson --output /examples/segment.parquet
```

### Without Docker you can use the following commands with (`pip install overturemapsdownloader`):

```bash
# admins/locality
OMDownloader omaps --theme admins --ptype locality --bbox examples/bbox.geojson --output examples/locality.parquet

# admins/administrativeBoundary
OMDownloader omaps --theme admins --ptype administrativeBoundary --bbox examples/bbox.geojson --output examples/admins.parquet

# buildings/building
OMDownloader omaps --theme buildings --ptype building --bbox examples/bbox.geojson --output examples/building.parquet

# places/place
OMDownloader omaps --theme places --ptype place --bbox examples/bbox.geojson --output examples/place.parquet

# transportation/connector
OMDownloader omaps --theme transportation --ptype connector --bbox examples/bbox.geojson --output examples/connector.parquet

# transportation/segment
OMDownloader omaps --theme transportation --ptype segment --bbox examples/bbox.geojson --output examples/segment.parquet
```
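
The six theme/type pairs above all follow one pattern, so the non-Docker commands can also be generated in a loop. The following is a minimal Python sketch, assuming `OMDownloader` is on the `PATH` after `pip install overturemapsdownloader`. `THEME_TYPES` and `build_command` are hypothetical names introduced here, and the output file is named after the type, which differs from the `admins.parquet` name used above for `administrativeBoundary`.

```python
import shlex

# Theme/type pairs taken from the README examples above.
THEME_TYPES = [
    ("admins", "locality"),
    ("admins", "administrativeBoundary"),
    ("buildings", "building"),
    ("places", "place"),
    ("transportation", "connector"),
    ("transportation", "segment"),
]


def build_command(theme, ptype, bbox="examples/bbox.geojson", out_dir="examples"):
    """Return the argument list for one OMDownloader invocation (hypothetical helper)."""
    return [
        "OMDownloader", "omaps",
        "--theme", theme,
        "--ptype", ptype,
        "--bbox", bbox,
        "--output", f"{out_dir}/{ptype}.parquet",
    ]


if __name__ == "__main__":
    for theme, ptype in THEME_TYPES:
        # Only prints the command line; pass the list to subprocess.run(...) to execute it.
        print(shlex.join(build_command(theme, ptype)))
```

Building an argument list rather than a single string keeps paths with spaces intact if the commands are later run through `subprocess.run`.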

### Convert Parquet to GeoPackage

To convert the downloaded data to GeoPackage format, run the following command:
1 change: 1 addition & 0 deletions config.yml
@@ -18,6 +18,7 @@ s3_credentials: # Optional
 urls:
   Amazon_S3: "s3://overturemaps-us-west-2/release/{release}/theme={theme}/type={ptype}/*"
   Microsoft_Azure: "https://overturemapswestus2.blob.core.windows.net/release/{release}/theme={theme}/type={ptype}/*"
+  Wherobots_S3: "s3://wherobots-public-data/overturemaps-us-west-2/release/{release}/theme={theme}/type={ptype}"

 themes:
   - name: "admins"
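
The `urls` entries are templates with `{release}`, `{theme}`, and `{ptype}` placeholders. The repository's `config.format_url` is not shown in this diff, so the sketch below only assumes plain `str.format` substitution; the `format_url` function here is a hypothetical stand-in, and `"RELEASE"` is a placeholder rather than a real release name.

```python
# Two of the templates from config.yml, copied verbatim.
URLS = {
    "Amazon_S3": "s3://overturemaps-us-west-2/release/{release}/theme={theme}/type={ptype}/*",
    "Wherobots_S3": "s3://wherobots-public-data/overturemaps-us-west-2/release/{release}/theme={theme}/type={ptype}",
}


def format_url(source, release, theme, ptype):
    """Fill one URL template (hypothetical stand-in for config.format_url)."""
    return URLS[source].format(release=release, theme=theme, ptype=ptype)


if __name__ == "__main__":
    # "RELEASE" is a placeholder, not an actual Overture release identifier.
    print(format_url("Wherobots_S3", "RELEASE", "transportation", "segment"))
    print(format_url("Amazon_S3", "RELEASE", "places", "place"))
```

Note that the `Wherobots_S3` template, unlike the other two, has no trailing `/*`, so the resulting path points at the type directory rather than at a glob of its files.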
5 changes: 3 additions & 2 deletions overturemapsdownloader/dask_qrys.py
@@ -11,17 +11,18 @@
def get_df_from_parquet(
parquet_path,
engine="pyarrow",
# columns=["geometry"], / comment to get all columns by default
# columns=["geometry"], # comment to get all columns by default
storage_options={"anon": True},
parquet_file_extensions=False,
):
"""
Reads a Dask DataFrame from a Parquet file.
"""
try:
logging.info(f"Reading Parquet file from {parquet_path}")
df = dd.read_parquet(
parquet_path,
columns=columns,
# columns=columns, # comment to get all columns by default
engine=engine,
index="id",
dtype_backend=engine,
18 changes: 9 additions & 9 deletions overturemapsdownloader/main.py
@@ -31,8 +31,14 @@ def omaps(
         config.update_attribute("global_variables", {"default_theme": theme})
         logging.info(f"Theme: {config.global_variables.default_theme}")
     if ptype:
-        config.update_attribute("global_variables", {"default_type": ptype})
-        logging.info(f"Parquet Type: {config.global_variables.default_type}")
+        if theme == "transportation" and ptype in ["connector", "segment"]:
+            config.update_attribute("global_variables", {"default_type": ptype})
+            query_url = config.format_url("Wherobots_S3", theme=theme, ptype=ptype)
+            logging.info(f"Parquet url: {query_url}")
+        else:
+            config.update_attribute("global_variables", {"default_type": ptype})
+            query_url = config.format_url("Amazon_S3", theme=theme, ptype=ptype)
+            logging.info(f"Parquet url: {query_url}")
     if bbox:
         config.update_attribute("global_variables", {"filter_by_bbox": True})
         config.update_attribute("global_variables", {"bbox_file_path": str(bbox)})
@@ -41,13 +47,7 @@ def omaps(
         config.update_attribute("global_variables", {"output_file_path": str(output)})
         logging.info(f"Output File: {config.global_variables.output_file_path}")

-    s3_url = config.format_url("Amazon_S3", theme=theme, ptype=ptype)
-    logging.info(f"Amazon S3 URL: {s3_url}")
-
-    azure_url = config.format_url("Microsoft_Azure", theme=theme, ptype=ptype)
-    logging.info(f"Microsoft Azure URL: {azure_url}")
-
-    om_dask_to_parquet(config)
+    om_dask_to_parquet(config, query_url)


 if __name__ == "__main__":
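
In the lines shown in this hunk, `query_url` is assigned only inside `if ptype:`, so unless `ptype` always carries a value (for example through a CLI default not visible here), the final `om_dask_to_parquet(config, query_url)` call could hit an unbound name. Both branches also repeat the same `update_attribute` call. One possible way to separate the source choice from the rest is sketched below; `select_url_source` and `WHEROBOTS_TYPES` are hypothetical names, not part of the repository.

```python
# Transportation types that this commit routes to the Wherobots mirror.
WHEROBOTS_TYPES = {"connector", "segment"}


def select_url_source(theme, ptype):
    """Return the key under `urls` in config.yml to use for a theme/type pair.

    Hypothetical helper: it always returns a key, so the caller can build
    the query URL unconditionally.
    """
    if theme == "transportation" and ptype in WHEROBOTS_TYPES:
        return "Wherobots_S3"
    return "Amazon_S3"


if __name__ == "__main__":
    for pair in [("transportation", "segment"), ("places", "place")]:
        print(pair, "->", select_url_source(*pair))
```

With such a helper, the caller could compute `query_url = config.format_url(select_url_source(theme, ptype), theme=theme, ptype=ptype)` once, outside the `if ptype:` block, assuming `config.format_url` accepts these arguments as it does in the diff above.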
5 changes: 3 additions & 2 deletions overturemapsdownloader/om_logic.py
@@ -7,15 +7,16 @@
 from overturemapsdownloader.utils_helper import read_geospatial_data


-def om_dask_to_parquet(config):
+def om_dask_to_parquet(config, query_url):
+    print(f"Query URL: {query_url}")
     bbox_filter = read_geospatial_data(
         config.global_variables.bbox_file_path,
         as_shapely_str=True,
         output_format="Custom",
     )

     df = get_df_from_parquet(
-        parquet_path=config.format_url("Amazon_S3"),
+        parquet_path=query_url,
         # columns=get_columns_from_om_schema_yaml(schema_yaml_path),
     )
