A dockerized Extract, Transform, Load (ETL) pipeline with PostgreSQL, Airflow, and DBT.
- Ingestion of the given raw data into a data lake of your choice.
- Modeling of the data to reduce memory usage and improve the performance of fetch queries.
- An ETL pipeline that enriches the data and loads it into a data warehouse following those models.
- A validator that checks the correctness of the data flowing through the ETL pipeline.
- Finally, an interface that exposes the actionable data for machine learning purposes. A minimal sketch of these stages as an Airflow DAG is shown below.
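To make the stages above concrete, here is a minimal sketch of what such a pipeline could look like as an Airflow DAG. It is not the repository's actual DAG: the `dag_id` suffix, task ids, script path, and dbt project location are all assumptions for illustration.

```python
# Sketch of an ingestion -> dbt run -> dbt test pipeline (paths and names are assumptions).
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="ingestion_data_sketch",   # hypothetical; the repo's DAG is ingestion_data
    start_date=datetime(2022, 1, 1),
    schedule_interval=None,           # triggered manually from the Airflow UI
    catchup=False,
) as dag:
    # Load the raw files into the warehouse (script path is an assumption).
    ingest = BashOperator(
        task_id="ingest_raw_data",
        bash_command="python /opt/airflow/scripts/ingest.py",
    )

    # Build the dbt models on top of the ingested tables.
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/airflow/dbt --profiles-dir /opt/airflow/dbt",
    )

    # Validate the models with dbt's built-in tests.
    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test --project-dir /opt/airflow/dbt --profiles-dir /opt/airflow/dbt",
    )

    ingest >> dbt_run >> dbt_test
```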
Tech stack used in this project: Docker, PostgreSQL, Apache Airflow, and DBT.
Make sure you have Docker installed on your local machine.
- Docker
- Docker Compose
- Clone the repo: `git clone https://github.com/skevin-dev/ad_challenge`
- Navigate to the project folder: `cd ad_challenge`
- Build the Airflow image: `docker build . --tag apache_dbt/airflow:2.3.3`
- Run the containers: `docker-compose up`
- Open the Airflow web UI
Navigate to `http://localhost:8089/` in your browser, then unpause (activate) and trigger the `ingestion_data` DAG.
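If you prefer to script this instead of clicking in the UI, the same DAG can be unpaused and triggered through Airflow's stable REST API. This is only a sketch: it assumes the webserver is exposed on port 8089 as above and that the default `airflow`/`airflow` basic-auth credentials from the docker-compose setup are in place.

```python
# Unpause and trigger the ingestion_data DAG via the Airflow REST API (credentials are assumptions).
import requests

BASE = "http://localhost:8089/api/v1"
AUTH = ("airflow", "airflow")  # assumption: default docker-compose credentials

# Unpause the DAG (equivalent to flipping the toggle in the UI).
requests.patch(f"{BASE}/dags/ingestion_data", auth=AUTH, json={"is_paused": False}).raise_for_status()

# Trigger a run with an empty configuration.
run = requests.post(f"{BASE}/dags/ingestion_data/dagRuns", auth=AUTH, json={"conf": {}})
run.raise_for_status()
print(run.json()["state"])
```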
https://curious-wisp-482466.netlify.app
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
- Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the Branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
Distributed under the MIT License. See LICENSE for more information.
Shyaka Kevin - [email protected]