
Commit

update process optimizations
takb committed Feb 4, 2021
1 parent e543e0d commit 607b9ab
Showing 14 changed files with 196 additions and 132 deletions.
6 changes: 4 additions & 2 deletions .gitignore
@@ -123,5 +123,7 @@ osm/*
!osm/GeoFabrikSpider.py
!osm/bremen-tests.osm.pbf
IMPORTANT_NOTES.md
ops_settings_docker-*.yml
docker-compose-instance-*.yml
ops_settings_docker*.yml
!ops_settings_docker_standalone.yml
docker-compose-*.yml
!docker-compose-standalone.yml
14 changes: 7 additions & 7 deletions README.md
@@ -3,7 +3,7 @@
[![Tests](https://github.com/GIScience/openpoiservice/workflows/run%20tests/badge.svg)](https://github.com/GIScience/openpoiservice/actions?query=workflow%3A%22run+tests%22)

Openpoiservice (ops) is a flask application which hosts a highly customizable points of interest database derived from
OpenStreetMap.org data and thereby **exploits** it's notion of tags...
OpenStreetMap.org data and thereby **exploits** its notion of tags...

> OpenStreetMap [tags](https://wiki.openstreetmap.org/wiki/Tags) consisting of a key and value describe specific features of
> map elements (nodes, ways, or relations) or changesets. Both items are free format text fields, but often represent numeric
@@ -17,7 +17,7 @@ for instance `wheelchair` or `smoking` may then be used to query the service via
For instance, if you want to request all POIs accessible by wheelchair within a geometry, you could then add
`wheelchair: ['yes', 'dedicated']` in `filters` within the body of your HTTP POST request.

You may pass 3 different types of geometry within the request to the database. Currently "Point" and "LineString" with
You may pass 3 different types of geometry within the request to the database. Currently, "Point" and "LineString" with
a corresponding buffer are supported, as well as a "Polygon". Points of interest will be returned within the given geometry.
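To make this concrete, here is a minimal sketch of such a request using Python's `requests` library. The endpoint path `/pois`, the body fields and the response handling are illustrative assumptions based on the description above, not a definitive reference for the API schema.

```python
# Minimal sketch of a POI request -- the endpoint path and field names are
# illustrative assumptions; adjust them to the actual API schema.
import requests

body = {
    "request": "pois",
    "geometry": {
        # a "Point" with a buffer in metres; "LineString" and "Polygon"
        # geometries follow the same pattern
        "geojson": {"type": "Point", "coordinates": [8.8034, 53.0758]},
        "buffer": 250,
    },
    "filters": {
        "wheelchair": ["yes", "dedicated"],
    },
}

response = requests.post("http://localhost:5000/pois", json=body)
response.raise_for_status()
print(response.json())  # expected to be a GeoJSON-like feature collection
```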

You can control the maximum size of geometries and further restrictions in the settings file of this service.
@@ -26,11 +26,11 @@ You can control the maximum size of geometries and further restrictions in the s

The osm file(s) to be imported are parsed several times to extract points of interest from relations (osm_type 3),
ways (osm_type 2) and nodes (osm_type 1), in that order. Which type the specific point of interest originated from will be
returned in the response - this will help you find the object directly on [OpenStreetMap.org](OpenStreetMap.org).
returned within the response - this will help you find the object directly on [OpenStreetMap.org](https://www.openstreetmap.org).
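As a small illustration of how the returned `osm_type` can be used (assuming each POI in the response exposes `osm_id` and `osm_type` fields), the numeric type can be mapped back to the element type to build a browse URL:

```python
# Sketch: translate the numeric osm_type of a returned POI back into an
# OpenStreetMap browse URL. The response field names are assumptions.
OSM_TYPES = {1: "node", 2: "way", 3: "relation"}

def osm_url(osm_type: int, osm_id: int) -> str:
    """Build a link to the original element on openstreetmap.org."""
    return f"https://www.openstreetmap.org/{OSM_TYPES[osm_type]}/{osm_id}"

# e.g. a POI that originated from a node
print(osm_url(1, 123456))  # -> https://www.openstreetmap.org/node/123456
```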

## Installation

You can either run **openpoiservice** on your host machine in a virtual environment or simply with docker. The Dockerfile
You can either run **openpoiservice** on your host machine in a virtual environment or simply with Docker. The Dockerfile
provided installs a WSGI server (gunicorn) which starts the flask service on port 5000.


@@ -69,9 +69,9 @@ for osm files) as this will be a shared volume.

#### All-in-one docker image

This docker compose will allow you to run openpoiservice with `psql/postgis` image. This will allow you to deploy this project fast.
This docker-compose setup lets you run openpoiservice together with a `psql/postgis` image, allowing you to deploy this project quickly.

**Important :** The database is not exposed, you won't be able to access it from outside the container. If you want to acces it simply add those lines to the database definition inside the `docker-compose-with-postgis.yml`:
**Important:** The database is not exposed, so you won't be able to access it from outside the container. If you want to access it, simply add these lines to the database definition inside the `docker-compose-with-postgis.yml`:

```sh
ports:
@@ -211,7 +211,7 @@ If you keep the structure as follows, you can manipulate this list as you wish.
`column_mappings` in `openpoiservice/server/ops_settings.yml` controls which OSM information will be considered in the database and also if
these may be queried by the user via the API, e.g.

```py
```yaml
wheelchair:

smoking:
15 changes: 8 additions & 7 deletions docker-compose-standalone.yml
@@ -1,6 +1,6 @@
version: '2.2'
services:
ops-api:
api:
container_name: ops-api
build: .
volumes:
@@ -11,29 +11,30 @@ services:
- "5000:5000"
mem_limit: 28g

ops-rebuild:
container_name: ops-rebuild
init:
container_name: ops-init
build: .
environment:
- REBUILD_DB=1
- INIT_DB=1
volumes:
- ./osm:/deploy/app/osm
- ./import-log.json:/deploy/app/import-log.json
- ./ops_settings_docker_standalone.yml:/deploy/app/openpoiservice/server/ops_settings.yml
- ./categories_docker.yml:/deploy/app/openpoiservice/server/categories/categories.yml
mem_limit: 28g
profiles:
- rebuild
- initialize

ops-update:
update:
container_name: ops-update
build: .
environment:
- UPDATE_DB=1
volumes:
- ./osm:/deploy/app/osm
- ./import-log.json:/deploy/app/import-log.json
- ./ops_settings_docker_standalone.yml:/deploy/app/openpoiservice/server/ops_settings.yml
- ./categories_docker.yml:/deploy/app/openpoiservice/server/categories/categories.yml
mem_limit: 28g
profiles:
- update

70 changes: 37 additions & 33 deletions docker-compose.yml
@@ -3,8 +3,11 @@ version: "2.2"
volumes:
postgis-data:

networks:
poi_network:

services:
ops-api:
api:
container_name: ops-api
build: .
volumes:
@@ -14,66 +14,67 @@ services:
ports:
- "5000:5000"
depends_on:
- ops-db
- db
mem_limit: 28g
networks:
- poi_network
profiles:
- api

ops-rebuild:
container_name: ops-rebuild
# Don't forget to change the host name inside ops_settings_docker.yml to the one given to the docker container.
# Also, the port should be set to 5432 (the default value) in the same file since the services are on the same network
db:
container_name: ops-db
image: kartoza/postgis:11.0-2.5
volumes:
- postgis-data:/var/lib/postgresql
environment:
# If you need to create multiple databases, you can add comma-separated databases, e.g. gis,data
- POSTGRES_DB=gis
- POSTGRES_USER=gis_admin # Here it's important to keep the same name as the one configured inside ops_settings_docker.yml
- POSTGRES_PASS=admin # Here it's important to keep the same name as the one configured inside ops_settings_docker.yml
- POSTGRES_DBNAME=gis # Here it's important to keep the same name as the one configured inside ops_settings_docker.yml
- ALLOW_IP_RANGE=0.0.0.0/0
ports:
- 5432:5432
restart: on-failure
networks:
- poi_network

# These two services will not start by default and can be run as necessary:
# docker-compose up init
# docker-compose up update
init:
container_name: ops-init
build: .
environment:
- REBUILD_DB=1
- INIT_DB=1
volumes:
- ./osm:/deploy/app/osm
- ./import-log.json:/deploy/app/import-log.json
- ./ops_settings_docker.yml:/deploy/app/openpoiservice/server/ops_settings.yml
- ./categories_docker.yml:/deploy/app/openpoiservice/server/categories/categories.yml
depends_on:
- ops-db
- db
mem_limit: 28g
networks:
- poi_network
profiles:
- rebuild
- initialize

ops-update:
update:
container_name: ops-update
build: .
environment:
- UPDATE_DB=1
volumes:
- ./osm:/deploy/app/osm
- ./import-log.json:/deploy/app/import-log.json
- ./ops_settings_docker.yml:/deploy/app/openpoiservice/server/ops_settings.yml
- ./categories_docker.yml:/deploy/app/openpoiservice/server/categories/categories.yml
depends_on:
- ops-db
- db
mem_limit: 28g
networks:
- poi_network
profiles:
- update

# Don't forget to change the host name inside ops_settings_docker.yml by the one given to docker container.
# Also port should be set to 5432 (default value) inside the same file since they are on the same network
ops-db:
container_name: ops-db
image: kartoza/postgis:11.0-2.5
volumes:
- postgis-data:/var/lib/postgresql
environment:
# If you need to create multiple database you can add coma separated databases eg gis,data
- POSTGRES_DB=gis
- POSTGRES_USER=gis_admin # Here it's important to keep the same name as the one configured inside ops_settings_docker.yml
- POSTGRES_PASS=admin # Here it's important to keep the same name as the one configured inside ops_settings_docker.yml
- POSTGRES_DBNAME=gis # Here it's important to keep the same name as the one configured inside ops_settings_docker.yml
- ALLOW_IP_RANGE=0.0.0.0/0
ports:
- 5432:5432
restart: on-failure
networks:
- poi_network

networks:
poi_network:
1 change: 1 addition & 0 deletions import-log.json
@@ -0,0 +1 @@
{}
68 changes: 51 additions & 17 deletions manage.py
@@ -1,27 +1,41 @@
# manage.py

import os
import sys
import logging
import unittest
import json
from json import JSONDecodeError

from flask.cli import FlaskGroup
from openpoiservice.server import create_app, db
from openpoiservice.server.db_import import parser
import os
import sys
import logging

logging.basicConfig(level=logging.INFO)
logging.basicConfig(
level=logging.DEBUG if os.environ.get('OPS_DEBUG', False) else logging.INFO,
format='%(levelname)-8s %(message)s',
)
logger = logging.getLogger(__name__)

app = create_app()
cli = FlaskGroup(create_app=create_app)

# File for logging the list of imported files and their change dates
logfile = "import-log.json"


def clear_log():
with open(logfile, "w") as f:
f.write("{}\n")


@cli.command()
def test():
"""Runs the unit tests without test coverage."""

clear_log()
db.drop_all()

tests = unittest.TestLoader().discover('openpoiservice/tests', pattern='test*.py')
tests = unittest.TestLoader().discover("openpoiservice/tests", pattern="test*.py")
result = unittest.TextTestRunner(verbosity=2).run(tests)
if not result.wasSuccessful():
sys.exit(1)
@@ -31,32 +31,45 @@ def test():
@cli.command()
def create_db():
"""Creates the db tables."""

db.create_all()
clear_log()


@cli.command()
def drop_db():
"""Drops the db tables."""

db.drop_all()
clear_log()


@cli.command()
def import_data():
"""Imports osm pbf data to postgis."""

osm_files = []
osm_dir = os.getcwd() + '/osm'
osm_dir = os.getcwd() + "/osm"

for dir_name, subdir_list, file_list in os.walk(osm_dir):
print('Found directory: %s' % dir_name)
for fname in file_list:
if fname.endswith('.osm.pbf') or fname.endswith('.osm'):
osm_files.append(os.path.join(dir_name, fname))

logger.info(f"Starting to import OSM data... {osm_files}")
parser.run_import(osm_files)
if dir_name.endswith("__pycache__"):
continue
logger.info(f"Found directory: {dir_name}")
for filename in file_list:
if filename.endswith(".osm.pbf") or filename.endswith(".osm"):
osm_files.append(os.path.join(dir_name, filename))
osm_files.sort()

import_log = {}
if os.path.isfile(logfile):
with open(logfile) as f:
try:
import_log = json.load(f)
except JSONDecodeError:
pass

# a previous import was logged; check whether the file set has changed, which would require a full rebuild
if len(import_log) and set(import_log.keys()) != set(osm_files):
logger.error(f"File set has changed since last import, full rebuild required. Exiting.")
return

logger.info(f"Starting to import OSM data ({len(osm_files)} files in batch)")
logger.debug(f"Files in import batch: {osm_files}")
parser.run_import(osm_files, import_log)

with open(logfile, "w") as f:
json.dump(import_log, f, indent=4, sort_keys=True)


if __name__ == '__main__':
3 changes: 2 additions & 1 deletion openpoiservice/server/db_import/models.py
@@ -13,7 +13,8 @@ class POIs(db.Model):
osm_type = db.Column(db.Integer, primary_key=True)
osm_id = db.Column(db.BigInteger, primary_key=True)
geom = db.Column(Geography(geometry_type="POINT", srid=4326, spatial_index=True), nullable=False)
delete = db.Column(db.Boolean, nullable=False, index=True)
src_index = db.Column(db.Integer, index=True)
delflag = db.Column(db.Boolean, nullable=False, index=True)

tags = db.relationship("Tags", backref=db.backref("POIs", cascade="delete"), lazy='dynamic')
categories = db.relationship("Categories", backref=db.backref("POIs", cascade="delete"), lazy='dynamic')
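For context, here is a hedged sketch of how the renamed `delflag` column and the new `src_index` column might be queried with Flask-SQLAlchemy during an update run. This is not part of the commit, and the actual update logic lives elsewhere in the codebase.

```python
# Hypothetical usage sketch only; the real update logic is implemented in the
# parser/update modules and may differ from this.
from openpoiservice.server import db
from openpoiservice.server.db_import.models import POIs

def purge_flagged(src_index: int) -> int:
    """Remove POIs from one source file that were flagged for deletion."""
    removed = POIs.query.filter(
        POIs.src_index == src_index,
        POIs.delflag.is_(True),
    ).delete(synchronize_session=False)
    db.session.commit()
    return removed
```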
