This project collects information about apartments available to rent or buy from Willhaben.at, one of the biggest platforms in Austria. You choose which areas of Austria you are interested in and whether you are looking to buy or rent. The data is then saved in a local database that can easily be used to get a good overview of the market.
- Currently the data is only stored in the database. Some kind of export, for example to .csv or Excel files, would be a useful addition.
- Currently only Willhaben.at is supported. It might be of interest to add other platforms as well.
Check out the repo from GitHub:
git clone https://github.com/golgor/apartment_scraper.git
Then install it as an editable install using pip:
pip install -e .
Customize main.py as needed and execute it. You will then find the database in the folder of the package, typically `apartment_scraper/apartment_scraper/test.db`.
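To check what a run actually stored, the database file can be inspected directly. The sketch below assumes that `test.db` is a SQLite file (the README does not state the engine for the local database) and makes no assumption about table names: it lists every table it finds together with its row count. The function name `table_row_counts` is made up for this example.

```python
import sqlite3
from pathlib import Path


def table_row_counts(db_path):
    """Return {table_name: row_count} for every user table in a SQLite file."""
    # Open read-only so that a mistyped path raises an error
    # instead of silently creating a new, empty database file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier, since table names come from the file itself.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
    finally:
        conn.close()
```

Calling `table_row_counts("apartment_scraper/apartment_scraper/test.db")` from the repository root should then give a quick impression of how much data was collected.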
To use the API, build the image using Docker. From the main directory of the repository, execute:
> docker build -t apartment-api .
To start a container from the image in detached mode, execute:
> docker run -d -p 8080:80 apartment-api:latest
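Because the container is started in detached mode, there is no direct feedback on whether the API came up. A small check like the following can help. It is only a sketch: it assumes the API answers with JSON on the root path of the mapped port (`http://localhost:8080/`), which this README only confirms for the Kubernetes deployment. The function names `fetch_json` and `wait_for_json` are made up for this example.

```python
import json
import time
import urllib.request


def fetch_json(url, timeout=5.0):
    """GET `url` and decode the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_json(url, attempts=10, delay=1.0):
    """Retry `fetch_json` until the server answers or `attempts` is used up."""
    last_error = None
    for _ in range(attempts):
        try:
            return fetch_json(url)
        except OSError as exc:
            # URLError and connection errors are OSError subclasses;
            # the container may simply not be ready yet.
            last_error = exc
            time.sleep(delay)
    raise RuntimeError(f"no response from {url}") from last_error
```

With the port mapping from the command above, `wait_for_json("http://localhost:8080/")` would be the call to try.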
You can use kompose to generate Kubernetes deployment files from the docker-compose.yaml. Note the added label on the web service: kompose uses it to also generate a load balancer, so that the pod is reachable externally. Kompose also seems not to work when profiles are used in the docker-compose.yaml, so these have to be commented out before running it. The files were generated with the command `kompose convert`.
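Commenting the profiles out by hand is easy to forget, so the step could be scripted. The following is a rough, text-based sketch rather than a real YAML transformation: it comments out each `profiles:` key together with the list items that belong to it, and leaves everything else untouched. The function name `comment_out_profiles` is made up, and the approach assumes a conventionally indented compose file.

```python
def comment_out_profiles(compose_text):
    """Return the compose file text with every `profiles:` entry commented out."""
    result = []
    key_indent = None  # indentation of the `profiles:` key we are inside, if any
    for line in compose_text.splitlines():
        stripped = line.lstrip()
        indent = len(line) - len(stripped)
        if key_indent is not None:
            # Lines belonging to the key are either indented deeper than it,
            # or list items written at the same indentation as the key.
            is_nested = stripped != "" and indent > key_indent
            is_list_item = stripped.startswith("- ") and indent == key_indent
            if is_nested or is_list_item:
                result.append(line[:indent] + "# " + stripped)
                continue
            key_indent = None  # the profiles block has ended
        if stripped.startswith("profiles:"):
            key_indent = indent
            result.append(line[:indent] + "# " + stripped)
        else:
            result.append(line)
    return "\n".join(result) + "\n"
```

Writing the result to a separate file and pointing kompose at that copy keeps the original docker-compose.yaml usable with profiles.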
Minikube is used to deploy the app locally for testing purposes. To deploy the app locally, the steps below are necessary:
- Start minikube with `minikube start`.
- Verify that the correct context is selected with `kubectl config current-context`. It should be `minikube`.
- From the folder containing the generated files, run `kubectl apply -f database-deployment.yaml,database-service.yaml,postgres-data-persistentvolumeclaim.yaml,web-deployment.yaml,web-service.yaml,web-tcp-service.yaml`.
- Running `kubectl get pods` should show two pods that are either running or starting up.
- We need to initialize the database:
  - Log in to the shell of the database pod with `kubectl exec --stdin --tty database-db897986c-qkdqs -- /bin/bash`. The exact name of the database pod is available using `kubectl get pods`.
  - Run the command `createdb -h localhost -U postgres apartments`.
  - (Optional) Verify everything by running `psql -h localhost -U postgres` and then `\l` to list all the databases. One of the rows returned should be apartments.
- Get the external URL for the load balancer using `minikube service web-tcp --url`. This typically returns something like `http://192.168.49.2:31717`.
- Opening the URL from above in a browser should return `{"Hello":"World"}`.
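The name of the database pod changes with every deployment, so it has to be looked up before the `kubectl exec` step. The lookup can be sketched as below. The prefix `database-` is an assumption taken from the example pod name above, and the helper names `find_pod` and `database_pod_name` are made up for this example.

```python
import subprocess


def find_pod(kubectl_output, prefix):
    """Return the first pod name in `kubectl get pods` output starting with `prefix`."""
    for line in kubectl_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if fields and fields[0].startswith(prefix):
            return fields[0]
    return None


def database_pod_name():
    """Ask kubectl for the current pods and pick the database one (or None)."""
    output = subprocess.run(
        ["kubectl", "get", "pods"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return find_pod(output, "database-")
```

The returned name can then be used in place of the pod name in the `kubectl exec` command from the list above.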