jhumigas/azfn-python-advanced-etl

This repo demonstrates the use of Azure Durable Functions orchestration. It provides an advanced structure for ETL orchestration using Azure Functions.

Here you will find usage of:

  • Clean Architecture
  • Data Validation
  • Function Chaining in ETL Orchestration

Orchestration with azure function recipe

ETL With Azure Function Pattern

Note

Orchestration is one of the main challenges of data ingestion.

Most of the extraction, processing and loading operations we perform require multiple data stores, saving intermediate data between steps, or consuming external services (such as Azure Machine Learning Service).

We can summarize the orchestration scheme as:

  1. Trigger Orchestrator
  2. Orchestrator function starts
  3. For each subtask, the orchestrator logs its state, schedules the task and, once the task has completed, logs its state again
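
The scheme above can be sketched in plain Python. This is a minimal illustration of function chaining, not the code in this repo and not the Durable Functions SDK: the activity names (`extract`, `transform`, `load`), the `run` driver and the `history` log are all hypothetical. In the real SDK, the orchestrator is also a generator, but it yields `context.call_activity(...)` tasks and the runtime persists the state.

```python
from typing import Any, Callable, Dict, Generator, List, Tuple


# Hypothetical activities standing in for the real ETL steps.
def extract(_: Any) -> List[int]:
    return [1, 2, 3]


def transform(rows: List[int]) -> List[int]:
    return [row * 10 for row in rows]


def load(rows: List[int]) -> int:
    # Pretend the rows were written somewhere and report how many.
    return len(rows)


ACTIVITIES: Dict[str, Callable[[Any], Any]] = {
    "extract": extract,
    "transform": transform,
    "load": load,
}


def orchestrator() -> Generator[Tuple[str, Any], Any, Any]:
    """Chain the activities: each yield asks the runner to schedule one task."""
    rows = yield ("extract", None)
    rows = yield ("transform", rows)
    loaded = yield ("load", rows)
    return loaded


def run(orchestration: Generator[Tuple[str, Any], Any, Any]) -> Tuple[Any, List[str]]:
    """Drive the generator, logging state before and after every task."""
    history: List[str] = []
    try:
        name, payload = next(orchestration)
        while True:
            history.append(f"scheduled:{name}")
            result = ACTIVITIES[name](payload)
            history.append(f"completed:{name}")
            # Hand the result back to the orchestrator and get the next task.
            name, payload = orchestration.send(result)
    except StopIteration as done:
        return done.value, history


if __name__ == "__main__":
    output, log = run(orchestrator())
    print(output)
    print(log)
```

Each activity receives the output of the previous one, which is what makes this a chain. The `history` list plays the role of the state the orchestration logs around every task.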

Prerequisites

Make sure you have:

  • Python
  • Poetry (installation instructions here)
  • A Docker container manager (use colima on macOS)

Optional:

If you are using VS Code as your code editor, you can install the following extensions:

Set up local env

You can run:

make init-env-file # Copy default ./docker/template.env to ./docker/.env

make setup # Install project dependencies

You can then run an orchestration for demonstration purposes by doing the following:

colima start # Start docker service if you are using colima
make start-dev 
make check-up-azfn # To check if azure function hub is up and running

At this point you can monitor the orchestration using the Durable Functions Monitor VS Code extension. The connection string should be:

DefaultEndpointsProtocol=http;AccountName=localstoreaccount;AccountKey=key1;BlobEndpoint=http://localhost:10000/localstoreaccount;QueueEndpoint=http://localhost:10001/localstoreaccount;TableEndpoint=http://localhost:10002/localstoreaccount;
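
If you want to check which local endpoints the connection string points to, a small parser is enough. This is a sketch only: `parse_connection_string` is a hypothetical helper, not part of this repo or of any Azure SDK.

```python
from typing import Dict

# The local connection string from this README, split for readability.
CONNECTION_STRING = (
    "DefaultEndpointsProtocol=http;"
    "AccountName=localstoreaccount;"
    "AccountKey=key1;"
    "BlobEndpoint=http://localhost:10000/localstoreaccount;"
    "QueueEndpoint=http://localhost:10001/localstoreaccount;"
    "TableEndpoint=http://localhost:10002/localstoreaccount;"
)


def parse_connection_string(value: str) -> Dict[str, str]:
    """Split a `Key=Value;Key=Value;` string into a dictionary."""
    # Skip empty segments such as the one after the trailing semicolon.
    parts = (part for part in value.split(";") if part.strip())
    # Split on the first "=" only, since real account keys may contain "=".
    return dict(part.split("=", 1) for part in parts)


if __name__ == "__main__":
    for key, setting in parse_connection_string(CONNECTION_STRING).items():
        print(f"{key}: {setting}")
```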

Then to trigger an orchestration, do the following:

curl http://localhost:8080/api/orchestrator
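
The same request can be sent from Python, for instance from a script or a test. This is a minimal sketch using only the standard library. The helper names `orchestrator_url` and `trigger_orchestration` are hypothetical, and the shape of the response body depends on how the HTTP starter in this repo is written, so it is returned as raw text.

```python
import urllib.request
from typing import Tuple


def orchestrator_url(base_url: str = "http://localhost:8080") -> str:
    """Build the trigger URL used in the curl command above."""
    return base_url.rstrip("/") + "/api/orchestrator"


def trigger_orchestration(
    base_url: str = "http://localhost:8080", timeout: float = 10.0
) -> Tuple[int, str]:
    """Send a GET request to the orchestrator endpoint.

    Returns the HTTP status code and the raw response body.
    Raises urllib.error.URLError if the local environment is not running.
    """
    with urllib.request.urlopen(orchestrator_url(base_url), timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

Call `trigger_orchestration()` once `make start-dev` and `make check-up-azfn` have succeeded.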

Run tests

For unit tests:

make run-unit-tests # Check if unit tests work

For integration tests:

make start-dev # To get local env running
make check-up-azfn # To check if azure function hub is up and running
make run-integration-tests
make stop-dev # To shutdown local env 

Project structure

.
├── README.md              <- The top-level README for developers using this project.
├── Makefile               <- Makefile with commands like `make setup` to install the project
├── docker                 <- Docker Configuration to run a local environment           
│   ├── Dockerfile
│   ├── docker-compose.yml
│   ├── template.env       <- Template of .env file to use for docker configuration
│   └── wait-azfn.sh
├── function_apps          <- Folder containing all the azure functions
├── py_project             <- Python project to import in azure functions
├── pyproject.toml         <- Build poetry configuration holding dependencies and tools configurations
├── poetry.lock
└── tests                  <- Tests
    ├── integration
    └── unit
