Making maps connectable: stable, non-proprietary IDs and data standards for streets

remix/sharedstreets-builder

SharedStreets Builder

The SharedStreets Builder application converts OpenStreetMap data to SharedStreets protocol buffer tiles.

SharedStreets uses this tool to generate and maintain a complete global OSM-derived tile set. Users can run the tool directly on their own OSM data or use the pregenerated global tileset provided by SharedStreets.

Support for non-OSM data sources has been moved to the sharedstreets-conflator tool.

Notes

The builder application is built on Apache Flink. If processing needs more memory than is available, Flink falls back to a disk-based cache. Processing large OSM data sets may require several hundred gigabytes of free disk space.
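
Because of the disk space requirement noted above, it can help to check free space before starting a long run. The following is a minimal sketch, not part of the builder: the helper name free_gb, the REQUIRED_GB threshold of 200, and the TMP_DIR default of /tmp are all assumptions to adjust for your own setup.

```shell
#!/bin/sh
# Print the free space, in whole gigabytes, on the filesystem holding a directory.
# df -P forces POSIX single-line output; -k reports sizes in 1024-byte blocks.
# free_gb is a hypothetical helper name, not something the builder provides.
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Assumed values: choose a threshold that suits the size of your OSM extract.
REQUIRED_GB=${REQUIRED_GB:-200}
TMP_DIR=${TMP_DIR:-/tmp}

available=$(free_gb "$TMP_DIR")
if [ "$available" -lt "$REQUIRED_GB" ]; then
    echo "warning: only ${available}G free in $TMP_DIR, wanted ${REQUIRED_GB}G" >&2
fi
```

The check only warns and does not stop the run, since the space actually needed depends on the input size.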

Roadmap

Local Development

Using Docker container

The Docker image entrypoint is defined in docker-entrypoint.sh.

Make sure you have Docker installed. Then build and run the image:

docker build -t shst .
docker run --rm \
    -v /tmp \
    -v /data \
    -v $PWD/out/:/out/ \
    shst \
    <MEMORY> \
    <PBF_FILE_URL>

When running the image, you'll note the mounted volumes:

  • /tmp: for storage of intermediate files (Flink uses the system's tmp dir by default)
  • /data: where the fetched contents of the OSM PBF file are written. The data/$filename path is passed to the java command as input in docker-entrypoint.sh
  • $PWD/out/:/out/: where the SharedStreets output files are written, visible both inside and outside the container. The output location is passed to the java command as output in docker-entrypoint.sh

Example:

curl https://github.com/sharedstreets/sharedstreets-builder/raw/master/data/nyc_test.pbf -L -o data/nyc_test.pbf
docker run \
    --rm \
    -v /tmp \
    -v $PWD/data/:/data/ \
    -v $PWD/out/:/out/ \
    shst \
    1G \
    'https://github.com/sharedstreets/sharedstreets-builder/raw/master/data/nyc_test.pbf'
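
After a run like the one above finishes, a quick sanity check is to count the files under the mounted out/ directory. This is a minimal sketch: the helper name count_files and the OUT_DIR variable are hypothetical, and it makes no assumption about the names or layout of the builder's output files.

```shell
#!/bin/sh
# Count regular files under a directory.
# count_files is a hypothetical helper name, not part of the builder.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Assumed location: the host side of the $PWD/out/:/out/ mount.
OUT_DIR=${OUT_DIR:-$PWD/out}
if [ -d "$OUT_DIR" ]; then
    echo "files written to $OUT_DIR: $(count_files "$OUT_DIR")"
else
    echo "no output directory at $OUT_DIR" >&2
fi
```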

Using Java directly

If you want to invoke SharedStreets builder directly, you first need to fetch the desired OSM protobuf (PBF) file and store it locally. You will also need enough free space on the root disk to store the output.

java -jar ./sharedstreets-builder-0.3.jar --input data/[osm_input_file].pbf --output ./[tile_output_directory]
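
Since the input file must already be present locally, a small wrapper can fail early when it is missing. The sketch below assumes the jar path shown above. The function names check_input and run_builder and the JAR variable are hypothetical, and only the documented --input and --output flags are used.

```shell
#!/bin/sh
# Hypothetical wrapper around the documented java invocation.
JAR=${JAR:-./sharedstreets-builder-0.3.jar}

# Return non-zero if the input extract is missing or empty.
check_input() {
    if [ ! -s "$1" ]; then
        echo "error: input file '$1' is missing or empty" >&2
        return 1
    fi
}

# Validate the input, then run the builder with the documented flags.
run_builder() {
    input=$1
    output=$2
    check_input "$input" || return 1
    java -jar "$JAR" --input "$input" --output "$output"
}

# Usage (not executed here):
#   run_builder data/nyc_test.pbf ./out
```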

Running in EC2

The Remix monorepo defines an EC2 instance, "SharedStreets-Tiles-Instance", under terraform/aws/shared-streets-builder. It runs in the getremix account in us-east-1 and is a very large instance (the example below gives the builder 256G of memory). To run:

cd ~/sharedstreets-builder
git pull
sudo docker build . -t shst
curl https://github.com/sharedstreets/sharedstreets-builder/raw/master/data/nyc_test.pbf -L -o data/nyc_test.pbf
sudo docker run \
    -d \
    --log-driver=awslogs --log-opt awslogs-group=SharedStreets-Tiles --log-opt awslogs-create-group=true \
    --rm \
    -v /tmp \
    -v $PWD/data/:/data/ \
    -v $PWD/out/:/out/ \
    shst \
    256G \
    out/nyc_test.pbf

Example datasets

These are OSM datasets found online that can be used for testing SharedStreets builder.

City datasets

Planet datasets

Testing SharedStreets builder locally against these datasets is not an option because of their size.

Other info

Known Gaps

  1. As mentioned above, running the builder locally against the planet dataset is not feasible. There's an outstanding item to validate that the Docker image can run against "the world". It may fail due to memory issues, but the results of a test run on a large EC2 instance are unknown.
  2. The metadata output of the SharedStreets builder has not been validated against what we expect. There are open questions about the hardcoded "tile z-level" and whether it is the level we want to generate.
  3. More generally, how to test the metadata output of the SharedStreets builder is unknown and undocumented.

Resources

These additional resources were useful when understanding and debugging Flink JVM memory issues while trying to run larger and larger datasets against SharedStreets builder.
